Tags: datastax/cassandra

cndb-main-release-202505-HF11

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
CNDB-15623: Only use write path for CDC tables in CassandraStreamReceiver if CDC is enabled on the node (#2043)

Repairs use the local write path for streams on CDC-enabled tables,
based on table schema. This interacts poorly with the separation of CNDB
services.

This commit fixes the issue by only using the CDC write path for a stream if CDC 
is enabled in the node's configuration (as well as in the schema). This avoids 
attempting to use the local write path if commitlog-based CDC is not enabled.
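
The gating condition described above can be sketched as a single predicate. This is a minimal illustration, not the actual `CassandraStreamReceiver` code: the class name, method name, and both boolean parameters are hypothetical stand-ins for "CDC is enabled in the table schema" and "CDC is enabled in the node's configuration".

```java
/** Hypothetical sketch; none of these names come from the Cassandra code base. */
final class CdcStreamPathSketch
{
    /**
     * @param tableHasCdc    assumed flag: CDC is enabled in the table schema
     * @param nodeCdcEnabled assumed flag: commitlog-based CDC is enabled in the node's configuration
     * @return true if the stream should go through the local (CDC) write path
     */
    static boolean useCdcWritePath(boolean tableHasCdc, boolean nodeCdcEnabled)
    {
        // Per the commit message, the schema flag alone is no longer sufficient:
        // the node-level setting must be enabled as well.
        return tableHasCdc && nodeCdcEnabled;
    }

    public static void main(String[] args)
    {
        // A CDC table streamed to a node without CDC enabled skips the write path.
        System.out.println(useCdcWritePath(true, false));
    }
}
```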

cndb-main-release-202505-HF10

CNDB-14861: Fix usage of PrimaryKeyWithSource in SAI

The PrimaryKeyWithSource class has been
present for two years in the code base
as an optimization for hybrid vector workloads,
which have to materialize many primary keys
in the search-then-sort query path.

However, the logic is invalid for version
aa (because of a bug where compacted
sstables write a row id per row, not per partition)
and it is also invalid for static columns.
This commit avoids creation of PrimaryKeyWithSource
in those cases.

(cherry picked from commit e942cae)

cndb-main-release-202505-HF9

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
CNDB-15485: Fix ResultRetriever key comparison to prevent dupes in result set (#2023)

### What is the issue
riptano/cndb#15485

### What does this PR fix and why was it fixed
This PR fixes a bug introduced to this branch via
#1884. The bug only impacts
SAI file format `aa` when the index file was produced via compaction,
which is why the modified test simply adds coverage to compact the table
and hit the bug.

The bug happens when an iterator produces the same partition across two
different batch fetches from storage. These keys were not collapsed in
the `key.equals(lastKey)` logic because compacted indexes use a row id
per row instead of per partition, and the logic in
`PrimaryKeyWithSource` considers rows with different row ids to be
distinct. However, when we went to materialize a batch from storage, we
hit this code:

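```java
// Start from the command's filter for the partition of the first key in the batch.
ClusteringIndexFilter clusteringIndexFilter = command.clusteringIndexFilter(firstKey.partitionKey());
if (cfs.metadata().comparator.size() == 0 || firstKey.hasEmptyClustering())
{
    // No clustering information available: read using the partition-level filter.
    return clusteringIndexFilter;
}
else
{
    // Otherwise restrict the read to the clusterings of the keys in this batch.
    nextClusterings.clear();
    for (PrimaryKey key : keys)
        nextClusterings.add(key.clustering());
    return new ClusteringIndexNamesFilter(nextClusterings, clusteringIndexFilter.isReversed());
}
```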

which returned `clusteringIndexFilter` for `aa` because those indexes do
not have the clustering information. Therefore, each batch fetched the
whole partition (which was subsequently filtered to the proper results),
and produced a multiplier effect where we saw `batch` many duplicates.

This fix works by comparing partition keys and clustering keys directly,
which is a return to the old comparison logic from before
#1884. There was actually a
discussion about this in the PR to `main` (#1883 (comment)), but
unfortunately, we missed this case.

A more proper long-term fix might be to remove the logic that creates a
`PrimaryKeyWithSource` for AA indexes. However, I preferred this
approach because it is essentially a `revert` rather than a fix-forward
solution.
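
To illustrate the comparison described above, here is a minimal, self-contained sketch. It assumes a simplified `Key` type with a partition, a clustering, and a row id; these are hypothetical stand-ins, not the real `PrimaryKey` or `PrimaryKeyWithSource` classes, and `collapse` is not the actual `ResultRetriever` logic. The point is only that two keys are treated as the same row when partition and clustering match, regardless of row id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; the types here are simplified stand-ins, not Cassandra classes. */
final class KeyDedupSketch
{
    static final class Key
    {
        final String partition;
        final String clustering; // empty when the index stores no clustering information
        final long rowId;        // may differ per row even within one partition

        Key(String partition, String clustering, long rowId)
        {
            this.partition = partition;
            this.clustering = clustering;
            this.rowId = rowId;
        }
    }

    /** Compares partition and clustering directly and deliberately ignores the row id. */
    static boolean sameRow(Key a, Key b)
    {
        return a != null && b != null
               && a.partition.equals(b.partition)
               && a.clustering.equals(b.clustering);
    }

    /** Drops each key that matches the key immediately before it in the sorted input. */
    static List<Key> collapse(List<Key> sortedKeys)
    {
        List<Key> result = new ArrayList<>();
        Key last = null;
        for (Key key : sortedKeys)
        {
            if (!sameRow(key, last))
                result.add(key);
            last = key;
        }
        return result;
    }

    public static void main(String[] args)
    {
        // Two entries for partition p1 with different row ids and no clustering.
        List<Key> keys = Arrays.asList(new Key("p1", "", 1), new Key("p1", "", 2), new Key("p2", "", 3));
        System.out.println(collapse(keys).size()); // prints 2
    }
}
```

With a row-id-sensitive comparison, the two `p1` entries would count as distinct and the partition would be fetched once per entry, which is the duplication the PR describes.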

cndb-main-release-202510-RC1

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
CNDB-15452: Split SAI metrics query types into disjoint categories (#2015)

It's simpler to understand SAI query metrics when they are split
into granular, non-overlapping categories. Because the categories do
not overlap, any combination of them is meaningful, and they can
also be visualized in stacked charts.

Additionally, a bug was fixed that prevented proper updates of
SortThenFilterQueriesCompleted and FilterThenSortQueriesCompleted
metrics for non-ANN TopK queries and for some non-hybrid queries.
Now those metrics are bumped up by all hybrid topK queries, and
only by those.

cndb-main-release-2025-08-RC2

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
CNDB-14577: Compact all SSTables of a level shard if their number reaches a limit (#1873) (#1925)

CNDB-14577: [UCS by default does not compact many small non-overlapping
sstables with very few rows](riptano/cndb#14577)

This PR limits the number of SSTables for a given compaction level shard
by executing a major compaction of the shard instead of the regular
compaction of overlapping SSTables if the number of SSTables reaches a
threshold.

The threshold is controlled by the `max_sstables_per_shard_factor`
setting:
```md
  `max_sstables_per_shard_factor` Limits the number of SSTables per shard. If the number of sstables in a shard
  exceeds this factor times the shard compaction threshold, a major compaction of the shard will be triggered.
  Some conditions like slow writes can lead to SSTables being very small, and never overlap with enough other SSTables
  to be compacted.
  So this setting is useful to prevent the number of SSTables in a shard from growing too large, which can cause
  problems due to the per-sstable overhead. Also these small SSTables may still have overlaps even if under the
  compaction threshold (eg. due to write replicas) and never compacting them wastes storage space.
  The default value is 10.
```
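
The trigger condition in the setting description can be sketched as follows. This is an illustration under stated assumptions, not the UCS implementation: the class and method names are hypothetical, and the strict "exceeds" comparison follows the wording of the setting documentation (the PR summary says "reaches", so the real boundary behavior may differ).

```java
/** Hypothetical sketch; names are illustrative and not taken from the UCS code. */
final class ShardLimitSketch
{
    /** Default value stated in the setting documentation. */
    static final int DEFAULT_MAX_SSTABLES_PER_SHARD_FACTOR = 10;

    /**
     * @param sstablesInShard          number of SSTables currently in the level shard
     * @param shardCompactionThreshold the shard compaction threshold (assumed to be a plain count)
     * @param factor                   value of max_sstables_per_shard_factor
     * @return true if a major compaction of the shard should be triggered
     */
    static boolean shouldMajorCompactShard(int sstablesInShard, int shardCompactionThreshold, int factor)
    {
        // Widen to long so that factor * threshold cannot overflow int.
        return sstablesInShard > (long) factor * shardCompactionThreshold;
    }

    public static void main(String[] args)
    {
        // With an assumed threshold of 4 and the default factor of 10, the limit is 40 SSTables.
        System.out.println(shouldMajorCompactShard(41, 4, DEFAULT_MAX_SSTABLES_PER_SHARD_FACTOR));
    }
}
```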

---------

### What is the issue
...

### What does this PR fix and why was it fixed
...

Co-authored-by: Christophe Bornet <[email protected]>

hcd-1.2.3

A CC version based on https://github.com/datastax/cassandra/releases/tag/cndb-main-release-202505-HF3

It contains HCD 1.2.3-specific patches.

cndb-main-release-2025-08-RC1

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
CNDB-11666: Batch clusterings into single SAI partition post-filtering reads (#1897)

Port of CASSANDRA-19497.

Co-authored-by: Caleb Rackliffe <[email protected]>
Co-authored-by: Michael Marshall <[email protected]>
Co-authored-by: Andrés de la Peña <[email protected]>

hcd-1.1.2

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Hcd-130 incremental repair failure during compaction (#1743)

### What is the issue
Concurrent and incremental repairs would spin fail or deadlock.

### What does this PR fix and why was it fixed
Concurrent and incremental repairs would spin fail. This patch:
- Removes an optimization failing to observe max parallelism
- Provides an improved algorithm to enforce max parallelism
- Closes transactions on some exceptions failing to be caught
- Removes a deadlock between cfs and the compaction strategy for long
running sequential operations

cndb-main-release-202505-HF3

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
CNDB-14602: Fix bytes-based paging for partition deletions (#1836)

Only preserve the original data size or rows in case of purging.

Fixes DBPE-16935.

Cherry picked from riptano/cndb#14602, 
which was merged to main as c5e2e64