Changed Esti Data Lake Storage Gen2 to use a mock for CosmosDB #10173

Open
idanovo wants to merge 15 commits into master from 10156-make-esti-azure-faster

@idanovo (Contributor) commented Feb 22, 2026

Two changes to speed up Esti's run:

  1. Speed up deploy-image CI job: Build Go binaries on the runner (with cached actions/setup-go) instead of inside Docker. A new slim Dockerfile.ci just copies the pre-built binaries into Alpine. Saves ~1.5-2 min on incremental builds.
  2. Changed Data Lake Storage Gen2 to use a mock for CosmosDB (azure-cosmos-emulator). Saves about 3 minutes.
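
The first change could look roughly like the sketch below. This is not the PR's actual Dockerfile.ci: the base image tag, install path, and entrypoint are assumptions, and `slim_dockerfile` is a hypothetical helper name. It also assumes the binaries were built into `dist/` on the runner as static Linux executables (for example with `CGO_ENABLED=0`), so that they run on Alpine's musl libc.

```shell
#!/bin/sh
# Hypothetical sketch: prints a slim Dockerfile that only copies pre-built
# binaries into Alpine. The image tag, install path, and entrypoint are
# assumptions, not the contents of the PR's Dockerfile.ci.
slim_dockerfile() {
  cat <<'EOF'
FROM alpine:3
COPY dist/lakefs dist/lakectl /usr/local/bin/
ENTRYPOINT ["/usr/local/bin/lakefs"]
EOF
}

# Example (not run here):
#   slim_dockerfile > Dockerfile.ci
#   docker build -f Dockerfile.ci -t lakefs:ci .
```

Because no compilation happens inside the image build, the Go build cache on the runner is what makes incremental builds faster.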

github-actions bot added the area/testing and area/ci labels on Feb 22, 2026
github-actions bot added the infrastructure label on Feb 22, 2026
github-actions bot added the area/KV label on Feb 22, 2026
idanovo added the mostly-ai and exclude-changelog labels on Feb 22, 2026
idanovo linked an issue on Feb 23, 2026 that may be closed by this pull request
idanovo requested a review from a team on Feb 23, 2026, 08:32
idanovo marked this pull request as ready for review on Feb 23, 2026, 08:32

@ItamarYuran (Contributor) left a comment

LGTM

- LAKEFS_DATABASE_COSMOSDB_ENDPOINT=http://127.0.0.1:8081
- LAKEFS_DATABASE_COSMOSDB_DATABASE=esti-db
- LAKEFS_DATABASE_COSMOSDB_CONTAINER=esti-container
- LAKEFS_DATABASE_COSMOSDB_KEY=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==

Contributor

Is it cool to have it hard-coded like that?

Contributor, Author

Yes, it's only a random key for the mock DB 🙂

Contributor

Maybe add something like FAKE_KEY_ so that security audits don't mistake this for a real key leak?

Comment on lines +133 to +135
mkdir -p dist
go build -ldflags "-X github.com/treeverse/lakefs/pkg/version.Version=${VERSION}" -o dist/lakefs ./cmd/lakefs
go build -ldflags "-X github.com/treeverse/lakefs/pkg/version.Version=${VERSION}" -o dist/lakectl ./cmd/lakectl

Contributor

Could these binaries be kept at the root? If so, can this be replaced with:

make build-binaries VERSION=${{ needs.gen-code.outputs.tag }}

If not, does it make sense to add support for BINARIES_DIR (or something similar) to the build-binaries target in the Makefile and use that here?
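
A rough sketch of the BINARIES_DIR idea, written as a shell helper rather than as the actual Makefile change. `BINARIES_DIR`, `DRY_RUN`, `run`, and `build_binaries` are hypothetical names used for illustration; only the `go build` invocations come from the diff above.

```shell
#!/bin/sh
# Hypothetical helper: BINARIES_DIR, DRY_RUN, run and build_binaries are
# illustrative names, not part of the lakeFS Makefile or CI workflow.

# Execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build lakefs and lakectl into BINARIES_DIR (default: current directory).
build_binaries() {
  out_dir="${BINARIES_DIR:-.}"
  version="${VERSION:-dev}"
  run mkdir -p "$out_dir"
  for name in lakefs lakectl; do
    run go build \
      -ldflags "-X github.com/treeverse/lakefs/pkg/version.Version=${version}" \
      -o "${out_dir}/${name}" "./cmd/${name}"
  done
}
```

With `BINARIES_DIR=dist` this issues the same three commands as the diff; leaving it unset keeps the binaries at the repository root, and `DRY_RUN=1` prints the commands without running them.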

@Isan-Rivkin (Contributor) left a comment

I vote against using a mock; we don't have anything else that tests CosmosDB.
A CosmosDB mock that isn't even maintained by Azure doesn't give us any confidence.
Speed is a secondary goal.

@idanovo (Contributor, Author) commented Feb 26, 2026

> I vote against using a mock, we don't have anything to test cosmosdb. mock of cosmos that is not even maintained by azure doesn't give us any confidence. speed is secondary goal.

@N-o-Z As far as I remember, we used to test things with a mock, right?
I thought we switched to a real CosmosDB just because we had an issue with the mock.

@idanovo (Contributor, Author) commented Feb 26, 2026

> I vote against using a mock, we don't have anything to test cosmosdb. mock of cosmos that is not even maintained by azure doesn't give us any confidence. speed is secondary goal.

@Isan-Rivkin Can you please review the build and push image step?

idanovo requested a review from Isan-Rivkin on Feb 26, 2026, 11:04

@N-o-Z (Member) commented Feb 26, 2026

> > I vote against using a mock, we don't have anything to test cosmosdb. mock of cosmos that is not even maintained by azure doesn't give us any confidence. speed is secondary goal.
>
> @N-o-Z As far as I remember, we used to test things with mock, right? I thought we switched to a real CosmosDB just because we had an issue with the mock

We use mocks for unit tests, not for integration tests. For DynamoDB we use a local implementation, but that is an official image maintained by AWS.
The thing is, we already have coverage for a production DynamoDB as part of the cloud CI. We don't have the same for CosmosDB.
In this case I strongly agree with Isan.

Labels

area/ci
area/KV: Improvements to the KV store implementation
area/testing: Improvements or additions to tests
exclude-changelog: PR description should not be included in next release changelog
infrastructure: build, deploy and release processes
mostly-ai

Development

Successfully merging this pull request may close these issues.

Esti- Improve Azure ADLS/CosmosDB run time.

6 participants