Stateless support and switch to new deeplake-api.#3130

Merged
khustup2 merged 5 commits into main from stateless-v2
Feb 7, 2026

Conversation

@khustup2 (Contributor) commented Feb 7, 2026

🚀 🚀 Pull Request

Impact

  • Bug fix (non-breaking change which fixes expected existing functionality)
  • Enhancement/New feature (adds functionality without impacting existing logic)
  • Breaking change (fix or feature that would cause existing functionality to change)

Description

Things to be aware of

Things to worry about

Additional Context

claude bot commented Feb 7, 2026

Claude finished @khustup2's task in 2m 29s.

Review Complete

I've reviewed the PR and found several issues that should be addressed:

Issues Found:

  1. Thread safety in cache implementation (cpp/deeplake_pg/dl_catalog.cpp:62)
    • The cache does not verify that the credentials match before returning a cached value
    • Credentials should be passed by const reference
  2. Cache invalidation strategy (cpp/deeplake_pg/dl_catalog.cpp:560)
    • Invalidating the cache on every read error may hurt performance when failures are transient
    • Consider invalidating only on specific error types
  3. Migration error handling (cpp/deeplake_pg/dl_catalog.cpp:154)
    • elog(ERROR) aborts the transaction and may prevent catalog initialization
    • Should use elog(WARNING) and continue
  4. Potential data loss risk (cpp/deeplake_pg/dl_catalog.cpp:183)
    • Automatically deleting remote datasets without user confirmation is risky
    • Should add a GUC setting or require explicit confirmation
  5. Race condition on path change (cpp/deeplake_pg/sync_worker.cpp:205)
    • The cache is not explicitly invalidated when root_path changes
    • May return a stale version from the old path
  6. Thread-local initialization (cpp/deeplake_pg/extension_init.cpp:1201)
    • A static thread_local in the utility hook may have initialization races
    • Should use session-level state instead

All issues have been posted as inline comments on the specific lines.

Test Coverage:
The new test files look appropriate for validating stateless mode resilience and catalog recovery scenarios.

const auto meta_path = join_path(path, k_meta_name);
meta_table = deeplake_api::open_catalog_table(meta_path, std::move(creds)).get_future().get();
}
return meta_table;

Thread safety issue: The cache uses thread_local, but credentials are passed by value and the path comparison doesn't account for concurrent root_path changes across requests.

Fix: Add mutex protection, or check that the credentials match whenever the cache is hit:

Suggested change
- return meta_table;
+ std::shared_ptr<deeplake_api::catalog_table> get_meta_table(const std::string& path, const icm::string_map<>& creds)
+ {
+     if (path != root_path || !meta_table) {
+         // Cache miss or path changed - open and cache
+         root_path = path;
+         const auto meta_path = join_path(path, k_meta_name);
+         meta_table = deeplake_api::open_catalog_table(meta_path, icm::string_map<>(creds)).get_future().get();
+     }
+     return meta_table;
+ }
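
The credentials check that this comment asks for could look roughly like the sketch below. It is not the extension's code: `creds_map` stands in for `icm::string_map<>`, `table_handle` for `deeplake_api::catalog_table`, and the injected `opener` callback for `deeplake_api::open_catalog_table`. All three are assumptions made so the example compiles on its own. The cached handle is reused only when both the path and the credentials equal the ones it was opened with.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-ins (assumptions): creds_map plays the role of icm::string_map<>,
// table_handle the role of deeplake_api::catalog_table.
using creds_map = std::map<std::string, std::string>;

struct table_handle {
    std::string path;
};

class meta_table_cache {
public:
    // Hypothetical opener callback standing in for open_catalog_table().
    using opener =
        std::function<std::shared_ptr<table_handle>(const std::string&, const creds_map&)>;

    explicit meta_table_cache(opener open) : open_(std::move(open)) {}

    // Reuse the cached handle only if path AND credentials both match.
    std::shared_ptr<table_handle> get(const std::string& path, const creds_map& creds)
    {
        if (!table_ || path != path_ || creds != creds_) {
            // Open first: if this throws, the previous cache state is untouched.
            table_ = open_(path, creds);
            path_ = path;
            creds_ = creds;
        }
        return table_;
    }

    // Drop the cached handle so the next get() reopens the table.
    void invalidate() { table_.reset(); }

private:
    opener open_;
    std::string path_;
    creds_map creds_;
    std::shared_ptr<table_handle> table_;
};
```

Storing a copy of the credentials next to the handle costs one map copy per cache miss. A per-thread instance of such a class would need no mutex, while a shared instance would.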

try {
auto table = open_catalog_table(root_path, k_meta_name, std::move(creds));
// Use cached meta table handle to avoid repeated S3 opens
auto table = catalog_table_cache::instance().get_meta_table(root_path, std::move(creds));

Cache invalidation on every error: Invalidating the cache on read errors may cause performance degradation if transient errors occur frequently (e.g., network issues).

Fix: Consider invalidating only on specific error types or after N consecutive failures to avoid unnecessary cache churn.
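
One way to read "after N consecutive failures" is a small counter that is reset by any successful read. The sketch below is illustrative only: the `failure_gate` class and its threshold are hypothetical names, not part of the PR.

```cpp
#include <cassert>

// Hypothetical helper: signals cache invalidation only after `threshold`
// consecutive failures, so one transient error does not force a reopen.
class failure_gate {
public:
    explicit failure_gate(int threshold) : threshold_(threshold) {}

    // Record a failed read. Returns true when the caller should invalidate
    // the cache; the counter restarts after each invalidation.
    bool on_failure()
    {
        if (++consecutive_ >= threshold_) {
            consecutive_ = 0;
            return true;
        }
        return false;
    }

    // Any successful read clears the failure streak.
    void on_success() { consecutive_ = 0; }

private:
    int threshold_;
    int consecutive_ = 0;
};
```

In a catch block around the catalog read, the caller would invalidate the cached handle only when `on_failure()` returns true, and call `on_success()` after a read that completes.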

"Catalog path %s is a non-directory filesystem artifact. Removing it before catalog initialization.",
path.c_str());
if (!std::filesystem::remove(path, ec) && ec) {
elog(ERROR, "Failed to migrate local catalog path %s: %s", path.c_str(), ec.message().c_str());

elog(ERROR) aborts the transaction: In the migration path, elog(ERROR) aborts the entire transaction and may prevent catalog initialization.

Fix: Consider using elog(WARNING) instead and continuing:

Suggested change
- elog(ERROR, "Failed to migrate local catalog path %s: %s", path.c_str(), ec.message().c_str());
+ elog(WARNING, "Failed to migrate local catalog path %s: %s - continuing anyway", path.c_str(), ec.message().c_str());

is_catalog = deeplake_api::is_catalog_table(path, icm::string_map<>(creds)).get_future().get();
deeplake_api::delete_dataset(path, icm::string_map<>(creds)).get_future().get();
} catch (const std::exception& e) {
elog(ERROR, "Failed to migrate remote catalog path %s: %s", path.c_str(), e.what());

Potential data loss: Deleting a remote dataset without user confirmation could cause data loss if the wrong path is configured.

Fix: Add a GUC setting to control migration behavior or require explicit confirmation.
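
A guard of this kind might be shaped like the sketch below. It assumes the migration only considers deleting a path that is not already a catalog table, which the diff excerpt does not confirm. `migration_settings`, `allow_remote_delete`, and `plan_remote_migration` are hypothetical names. In the extension the flag would presumably be backed by a custom boolean GUC, which PostgreSQL extensions register with `DefineCustomBoolVariable`; here a plain struct keeps the example standalone.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical opt-in flag; off by default so nothing is deleted implicitly.
struct migration_settings {
    bool allow_remote_delete = false;
};

enum class migration_action { none, delete_remote };

// Decide what the migration step may do with `path`.
// Assumption: only a path that is NOT already a catalog table is a deletion candidate.
inline migration_action plan_remote_migration(const migration_settings& settings,
                                              bool is_catalog,
                                              const std::string& path)
{
    if (is_catalog) {
        return migration_action::none; // already a catalog table, nothing to migrate
    }
    if (!settings.allow_remote_delete) {
        throw std::runtime_error("Refusing to delete remote dataset at " + path +
                                 ": deletion has not been explicitly enabled");
    }
    return migration_action::delete_remote;
}
```

Separating the decision from the `delete_dataset` call also makes the policy testable without touching remote storage.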

// Ensure catalog exists
pg::dl_catalog::ensure_catalog(root_path, creds);
// Only ensure catalog on first call or when root_path changes
if (!catalog_ensured || root_path != last_root_path) {

Race condition on root_path change: When root_path changes, the cache is not explicitly invalidated before get_catalog_version is called, so a stale version from the old path may be returned.

Fix: Explicitly invalidate cache when path changes:

Suggested change
- if (!catalog_ensured || root_path != last_root_path) {
+ if (!catalog_ensured || root_path != last_root_path) {
+     pg::dl_catalog::ensure_catalog(root_path, creds);
+     catalog_ensured = true;
+     last_root_path = root_path;
+     last_catalog_version = 0; // Reset version when path changes
+ }

Note: The cache uses thread_local so each worker thread should be isolated, but verify this is the intended behavior.
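
The ordering this comment describes (invalidate the cache, ensure the catalog, then reset the tracked version) can be sketched as a standalone helper. The `sync_state` struct and `refresh_on_path_change` function are hypothetical, and the two callbacks stand in for the cache invalidation and for `pg::dl_catalog::ensure_catalog`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical per-worker state mirroring the variables named in the suggestion.
struct sync_state {
    bool catalog_ensured = false;
    std::string last_root_path;
    std::int64_t last_catalog_version = 0;
};

// Returns true when the path is new or changed and the state was refreshed.
// invalidate_cache and ensure_catalog are caller-supplied callbacks (assumptions).
template <typename Invalidate, typename Ensure>
bool refresh_on_path_change(sync_state& state, const std::string& root_path,
                            Invalidate&& invalidate_cache, Ensure&& ensure_catalog)
{
    if (state.catalog_ensured && root_path == state.last_root_path) {
        return false; // same path as last time, keep cached handles
    }
    invalidate_cache();             // drop handles opened against the old path first
    ensure_catalog(root_path);      // then make sure the catalog exists at the new path
    state.catalog_ensured = true;
    state.last_root_path = root_path;
    state.last_catalog_version = 0; // force a version re-read for the new path
    return true;
}
```

Invalidating before `ensure_catalog` means that a failure in the ensure step leaves no handle for the old path behind. The state fields are only updated once both steps succeed, so the next call retries.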

// auto-discovered when pointing to a shared root_path
pg::table_storage::instance().force_load_table_metadata();
// Track the previous root_path to detect actual changes
static thread_local std::string last_root_path;

Thread-local static initialization race: Using static thread_local inside a function called from utility hook may not be thread-safe during initialization across concurrent SET commands.

Fix: Consider moving to a proper session-level state or add explicit initialization guards.

sonarqubecloud bot commented Feb 7, 2026

Quality Gate passed

Issues: 1 new issue, 2 accepted issues
Measures: 0 security hotspots, no coverage data, 0.0% duplication on new code

khustup2 merged commit 41e606e into main on Feb 7, 2026 (6 checks passed)
khustup2 deleted the stateless-v2 branch on February 7, 2026 at 17:36