Caissa v1.24

A small, incremental release: gaining Elo at this level is getting harder without a new architecture and much more compute. The new version focuses on new nets, search cleanups, and the first proper SMP tuning, with especially nice gains in DFRC.

Changes

Evaluation

New neural net trained on 17B positions.
Added a castling-rights bonus for FRC/DFRC, combining well with the new nets for larger gains there than in regular chess.
Removed KRvKR and KQvKQ specialty code (was causing low-time blunders).

Search & TT

Store eval in TT as early as possible in QSearch.
In PV nodes, skip QSearch on the TT move (idea by Viz from Stockfish).
Increased NMP start depth (more conservative at shallow depth).
Added a recapture extension.
Increased the “improving” factor in RFP.
Simplified move ordering and eval adjustment code paths.
TT now updates the move field even when the entry is not overwritten.
Removed current-move reporting from the search output.
First dedicated multithreaded tuning (≈31k games, 20+0.2, 8 threads), giving about +5 Elo on 4 threads.
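
The transposition-table change above (refreshing the move even when the entry itself is kept) can be pictured with a small sketch. The entry layout, the `tt_store` name and the replacement rule are assumptions made for illustration; Caissa's actual table is not described in these notes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TTEntry:
    key: int
    depth: int
    score: int
    move: Optional[str]


def tt_store(table: List[Optional[TTEntry]], key: int, depth: int,
             score: int, move: Optional[str]) -> None:
    """Store a search result; hypothetical depth-preferred replacement."""
    slot = key % len(table)
    old = table[slot]
    # Assumed rule: overwrite an empty slot, a different position,
    # or an entry from a search that was not deeper than the new one.
    if old is None or old.key != key or depth >= old.depth:
        table[slot] = TTEntry(key, depth, score, move)
    elif move is not None:
        # The deeper entry is kept, but the newer best move replaces its move.
        old.move = move


table: List[Optional[TTEntry]] = [None] * 1024
tt_store(table, key=42, depth=10, score=35, move="e2e4")
tt_store(table, key=42, depth=3, score=-5, move="d2d4")
entry = table[42]
print(entry)
```

In this sketch the second, shallower store leaves depth and score untouched and only swaps the move, which is the behaviour the changelog line describes.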

Time management & SMP

Time-management tweak, with the biggest gain in high-increment games (≈+4 Elo).

Progression test

SMP LTC (20+0.2, 4 threads)

Elo   | 17.33 +- 3.96 (95%)
Conf  | 20.0+0.20s Threads=4 Hash=128MB
Games | N: 6964 W: 1758 L: 1411 D: 3795
Penta | [5, 639, 1847, 986, 5]

https://furybench.com/test/3969/
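
The Elo and error figures in these result blocks can be checked from the pentanomial (game-pair) counts. The sketch below assumes the usual logistic Elo model and a normal approximation over game pairs; the function names are illustrative, and the test framework's own calculation may differ in detail.

```python
import math


def score_to_elo(score: float) -> float:
    """Logistic Elo difference for an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def pentanomial_elo(penta, z: float = 1.96):
    """Return (elo, half_width) from game-pair counts.

    penta holds the number of game pairs that scored 0, 0.5, 1, 1.5 and 2
    points; z = 1.96 gives an approximate 95% interval.
    """
    pairs = sum(penta)
    # Per-game score of each pair outcome: 0, 0.25, 0.5, 0.75, 1.
    outcomes = [i / 4.0 for i in range(5)]
    mean = sum(n * x for n, x in zip(penta, outcomes)) / pairs
    var = sum(n * (x - mean) ** 2 for n, x in zip(penta, outcomes)) / pairs
    stderr = math.sqrt(var / pairs)
    lower = score_to_elo(mean - z * stderr)
    upper = score_to_elo(mean + z * stderr)
    return score_to_elo(mean), (upper - lower) / 2.0


# Pentanomial counts from the SMP LTC block above.
elo, err = pentanomial_elo([5, 639, 1847, 986, 5])
print(f"Elo {elo:.2f} +- {err:.2f}")
```

For the SMP LTC counts this lands close to the reported 17.33 +- 3.96. Working on pairs rather than single games matters because the two games of a pair share an opening and are therefore correlated.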

LTC (60+0.6)

Elo   | 8.84 +- 2.38 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=128MB
Games | N: 20010 W: 4835 L: 4326 D: 10849
Penta | [19, 2133, 5185, 2656, 12]

https://furybench.com/test/3935/

DFRC LTC (60+0.6)

Elo   | 10.33 +- 2.17 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=128MB
Games | N: 17118 W: 2595 L: 2086 D: 12437
Penta | [28, 1110, 5792, 1583, 46]

https://furybench.com/test/3965/

Caissa v1.23

A smaller, focused update. Gains are hard to come by at this level and my dev time is limited, but 1.23 still gains steady Elo at both STC and LTC, plus a big leap in a quirky “queens-only” test.

Changes

Evaluation
- New neural net.
- Simplify Evaluate: leaner piece/phase accounting; removed the old castling-rights bonus.
Search
- Simplify extensions.
- Fix SkipQuiets so quiets are truly skipped.
- Simplify TT alpha cutoffs.
- Use uint8 for TT entry depth (avoids signed wrap; widens depth range) and adjust a static-eval TT write.
- Tweak correction history bonus.
- QSearch: adjust best score toward beta when the stand-pat score exceeds beta (idea from Stockfish).
- Perform Null Move Pruning only in cut nodes.
Tuning
- Retuned all parameters at long time control (80+0.8).
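
The depth-field change in the Search list is easy to see with fixed-width integers. This sketch assumes the previous field was a signed 8-bit integer, which the notes imply ("signed wrap") but do not state; the helper names are made up for the example.

```python
import ctypes


def stored_depth_signed(depth: int) -> int:
    """Depth as read back from a signed 8-bit field (wraps above 127)."""
    return ctypes.c_int8(depth).value


def stored_depth_unsigned(depth: int) -> int:
    """Depth as read back from an unsigned 8-bit field (valid up to 255)."""
    return ctypes.c_uint8(depth).value


for depth in (100, 127, 128, 200):
    print(depth, stored_depth_signed(depth), stored_depth_unsigned(depth))
```

A depth of 128 reads back as -128 from the signed field, so a very deep entry would look like a very shallow one; the unsigned field keeps it intact.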

Progression test (UHO book)

LTC

Elo   | 15.85 +- 3.27 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=128MB
Games | N: 10000 W: 2532 L: 2076 D: 5392
Penta | [2, 923, 2704, 1359, 12]

STC

Elo   | 7.92 +- 3.56 (95%)
Conf  | 10.0+0.10s Threads=1 Hash=16MB
Games | N: 10006 W: 2466 L: 2238 D: 5302
Penta | [30, 1126, 2475, 1330, 42]

Special sauce: Queens-only test

Using more high-quality queens-only games in training continues to pay off.

Score of Caissa 1.23 BMI2 vs Caissa 1.22 BMI2: 295 - 19 - 463  [0.678] 777
...      Caissa 1.23 BMI2 playing White: 159 - 11 - 218  [0.691] 388
...      Caissa 1.23 BMI2 playing Black: 136 - 8 - 245  [0.665] 389
...      White vs Black: 167 - 147 - 463  [0.513] 777
Elo difference: 129.0 +/- 14.7, LOS: 100.0 %, DrawRatio: 59.6 %
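
The summary line of this match output can be recomputed from the win/loss/draw counts. The sketch below uses the logistic Elo formula and the common normal-approximation formula for LOS (likelihood of superiority); `match_summary` is an illustrative name, and the reported error margin is not reproduced here.

```python
import math


def match_summary(wins: int, losses: int, draws: int):
    """Return (score, elo, los, draw_ratio) for a finished match."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    elo = -400.0 * math.log10(1.0 / score - 1.0)
    # LOS: normal approximation that looks at decisive games only.
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    draw_ratio = draws / games
    return score, elo, los, draw_ratio


score, elo, los, draw_ratio = match_summary(295, 19, 463)
print(f"score {score:.3f}  Elo {elo:.1f}  LOS {los:.1%}  draws {draw_ratio:.1%}")
```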

Thanks to everyone running games and sharing feedback!

Full Changelog: 1.22...1.23

Caissa v1.22

Caissa just got a little smarter and a lot more fun - especially in strange “queens-only” or “rook-only” setups, where it now crushes Stockfish and beats its own previous version by roughly 500 Elo.

Changes

Tweaked how deep singular extensions go.
Added extra negative extensions.
Late-move reductions (LMR) are now fully deterministic.
Quiescence search now averages stand-pat scores.
Time-manager tweaks.
Transposition table clears faster on multi-core CPUs.
Fresh neural net trained on 15B positions.

Special sauce

In positions with only queens and a king on each back rank (e.g. qqqqkqqq/pppppppp/8/8/8/8/PPPPPPPP/QQQQKQQQ w - - 0 1), Caissa 1.22 gains a massive +530 (+/- 40) Elo over 1.21, putting it far ahead of Stockfish in that odd corner of chess.
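
To make the example position concrete, here is a small sketch that counts the pieces in the board field of a FEN string. It is a plain parser written for this note, not part of Caissa or of any test tooling.

```python
from collections import Counter


def count_pieces(fen: str) -> Counter:
    """Count pieces in the board field of a FEN string."""
    board = fen.split()[0]
    ranks = board.split("/")
    if len(ranks) != 8:
        raise ValueError("a FEN board needs exactly 8 ranks")
    counts = Counter()
    for rank in ranks:
        files = 0
        for ch in rank:
            if ch.isdigit():
                files += int(ch)  # a digit means that many empty squares
            else:
                counts[ch] += 1
                files += 1
        if files != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 files")
    return counts


QUEENS_ONLY = "qqqqkqqq/pppppppp/8/8/8/8/PPPPPPPP/QQQQKQQQ w - - 0 1"
counts = count_pieces(QUEENS_ONLY)
print(dict(counts))
```

Each side starts with seven queens, one king and eight pawns, and the `-` castling field shows that no castling is available.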

Measured gains

Book: UHO_Lichess_4852_v1.epd

Elo   | 16.96 +- 3.61 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=128MB
Games | N: 8508 W: 2158 L: 1743 D: 4607
Penta | [4, 801, 2237, 1200, 12]

Elo   | 16.62 +- 3.09 (95%)
Conf  | 8.0+0.08s Threads=1 Hash=8MB
Games | N: 13284 W: 3449 L: 2814 D: 7021
Penta | [43, 1313, 3327, 1884, 75]

Not huge jumps - improvements are getting harder at this level...

Thanks for testing!

@aronpetko

Caissa v1.21

I'm excited to announce Caissa version 1.21, the latest release, with improvements that make play stronger and more stable.

Progression test

TC=60+0.6s, Book=UHO_Lichess_4852_v1.epd

Elo   | 35.09 +- 4.16 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=128MB
Games | N: 6278 W: 1758 L: 1126 D: 3394
Penta | [2, 446, 1624, 1052, 15]

TC=8+0.08s, Book=UHO_Lichess_4852_v1.epd

Elo   | 22.65 +- 3.68 (95%)
Conf  | 8.0+0.08s Threads=1 Hash=8MB
Games | N: 10000 W: 2812 L: 2161 D: 5027
Penta | [31, 974, 2402, 1499, 94]

TC=10+0.1s, Book=DFRC.epd

Elo   | 18.98 +- 5.35 (95%)
Conf  | 10.0+0.10s Threads=1 Hash=16MB
Games | N: 5002 W: 1114 L: 841 D: 3047
Penta | [38, 462, 1258, 675, 68]

Changes

Fixed problems detected by sanitizers that could lead to crashes.
New neural network trained on a total of 13 billion positions.
Improved eval correction (Stockfish style).
Tuned parameters at long time control.
Various search improvements.

Special thanks to @aronpetko for invaluable access to the OpenBench instance.

Caissa v1.20

Progression test

TC=40+0.4s, Book=UHO_Lichess_4852_v1.epd

Elo   | 22.15 +- 5.24 (95%)
Conf  | 40.0+0.40s Threads=1 Hash=64MB
Games | N: 7808 W: 2054 L: 1557 D: 4197
Penta | [7, 696, 2007, 1181, 13]

TC=10+0.1s, Book=UHO_Lichess_4852_v1.epd

Elo   | 18.95 +- 2.35 (95%)
Conf  | 10.0+0.10s Threads=1 Hash=16MB
Games | N: 40014 W: 10644 L: 8464 D: 20906
Penta | [98, 3904, 9948, 5834, 223]

TC=1+0s, Book=UHO_Lichess_4852_v1.epd

Elo   | 71.57 +- 4.99 (95%)
Conf  | 1.0+0.00s Threads=1 Hash=1MB
Games | N: 10240 W: 3832 L: 1752 D: 4656
Penta | [89, 731, 1889, 1833, 578]

Changes

Bigger neural net (11 king buckets instead of 5) trained on a total of 12.6B positions.
Improved performance in ultra-short time controls without increment.
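
For readers unfamiliar with king buckets: the first network layer uses a different set of input weights depending on where the side's king stands. The sketch below shows the general idea only. The bucket layout, the horizontal mirroring, the 768-feature block and the square numbering (a1 = 0, h8 = 63) are all assumptions chosen for illustration and are not taken from Caissa's network.

```python
# Hypothetical 11-bucket layout for the 32 squares on files a-d, rank 1 first.
KING_BUCKETS = [
    0, 1, 2, 3,
    4, 5, 6, 7,
    8, 8, 9, 9,
    10, 10, 10, 10,
    10, 10, 10, 10,
    10, 10, 10, 10,
    10, 10, 10, 10,
    10, 10, 10, 10,
]
NUM_BUCKETS = 11
FEATURES_PER_BUCKET = 2 * 6 * 64  # colors x piece types x squares


def king_bucket(king_sq: int) -> int:
    """Bucket of a king square; files e-h are mirrored onto d-a."""
    file, rank = king_sq % 8, king_sq // 8
    if file >= 4:
        file = 7 - file
    return KING_BUCKETS[rank * 4 + file]


def feature_index(king_sq: int, piece_color: int, piece_type: int, piece_sq: int) -> int:
    """Index of one (king, piece) input feature in the assumed layout."""
    if king_sq % 8 >= 4:
        piece_sq ^= 7  # mirror the piece's file together with the king
    return (king_bucket(king_sq) * FEATURES_PER_BUCKET
            + piece_color * 384 + piece_type * 64 + piece_sq)


print(sorted({king_bucket(sq) for sq in range(64)}))
```

More buckets let the net specialise its first layer for more king placements, at the cost of a larger weight table that needs more training data to fill.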

Caissa v1.19

Progression test

TC=60+0.6, Book=UHO_Lichess_4852_v1.epd

Elo   | 22.12 +- 7.60 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=64MB
Games | N: 3790 W: 1016 L: 775 D: 1999
Penta | [1, 344, 969, 575, 6]

Changes

New neural net trained on a total of 12.5B positions
Various search improvements
Search parameter tuning at LTC

Caissa v1.18

Progression test

TC=8+0.08, Book=UHO_Lichess_4852_v1.epd

Elo   | 21.60 +- 3.48 (95%)
Conf  | 8.0+0.08s Threads=1 Hash=8MB
Games | N: 18582 W: 5085 L: 3931 D: 9566
Penta | [62, 1822, 4474, 2766, 167]

TC=60+0.6, Book=UHO_Lichess_4852_v1.epd

Elo   | 21.98 +- 5.45 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=128MB
Games | N: 7266 W: 1924 L: 1465 D: 3877
Penta | [10, 654, 1850, 1105, 14]

Changes

New neural net trained on a total of 9.3B positions. Introduced more positions from regular chess games (instead of DFRC games).
Various search improvements

Caissa v1.17

Progression test

TC=8+0.08, Book=UHO_Lichess_4852_v1.epd

Elo   | 20.60 +- 4.85 (95%)
Conf  | 8.0+0.08s Threads=1 Hash=8MB
Games | N: 9690 W: 2670 L: 2096 D: 4924
Penta | [38, 947, 2337, 1449, 74]

TC=60+0.6, Book=UHO_Lichess_4852_v1.epd

Elo   | 19.83 +- 6.47 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=64MB
Games | N: 5000 W: 1272 L: 987 D: 2741
Penta | [4, 442, 1332, 709, 13]

TC=8+0.08, Book=DFRC.epd

Elo   | 15.59 +- 6.38 (95%)
Conf  | 8.0+0.08s Threads=1 Hash=8MB
Games | N: 4394 W: 948 L: 751 D: 2695
Penta | [28, 433, 1110, 566, 60]

Changes

New neural net trained on a total of 7.1B positions. Introduced more high-quality games from SPRT tests to the dataset (~220M positions) and random endgame positions scored with 7-man TB (~40M positions). Fine-tuned the previous net for over 50B iterations.
Small speedup
Search improvements:
- Prevent search explosions in LMR
- Simplify LMR history formula
- Additional history bonus based on score difference
- Store eval in TT as soon as possible
- Higher RFP margin if opponent is threatening a capture
Use threats info to generate fewer illegal king moves
Tweak transposition table replacement scheme
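
As a rough picture of the RFP (reverse futility pruning) item in the list above: a node is pruned when the static eval beats beta by a depth-dependent margin, and the margin grows when the opponent threatens a capture. Every constant and name below is a placeholder for illustration, not a value from Caissa.

```python
def rfp_margin(depth: int, improving: bool, opponent_threatens_capture: bool) -> int:
    """Margin the static eval must clear above beta (placeholder constants)."""
    margin = 80 * depth
    if improving:
        margin -= 60  # prune more readily when the eval is trending up
    if opponent_threatens_capture:
        margin += 40  # be more careful when material may be hanging
    return max(margin, 0)


def can_prune_rfp(static_eval: int, beta: int, depth: int, improving: bool,
                  opponent_threatens_capture: bool, max_depth: int = 6) -> bool:
    """True if the node may return early under this simplified RFP rule."""
    if depth > max_depth:
        return False
    margin = rfp_margin(depth, improving, opponent_threatens_capture)
    return static_eval - margin >= beta


print(can_prune_rfp(280, 100, 2, False, False))  # quiet position
print(can_prune_rfp(280, 100, 2, False, True))   # opponent threatens a capture
```

With these placeholder numbers the same eval prunes in the quiet case but not when a capture is threatened, which is the intent of the larger margin.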

Caissa v1.16

Progression test

TC=10+0.1, Book=UHO_4060_v2.epd

Elo   | 36.65 +- 4.71 (95%)
Conf  | 10.0+0.10s Threads=1 Hash=16MB
Games | N: 10000 W: 2917 L: 1866 D: 5217
Penta | [25, 788, 2387, 1711, 89]

TC=60+0.6, Book=UHO_4060_v2.epd

Elo   | 34.15 +- 7.41 (95%)
Conf  | 60.0+0.60s Threads=1 Hash=128MB
Games | N: 3848 W: 1065 L: 688 D: 2095
Penta | [4, 276, 991, 645, 8]

TC=10+0.1, Book=DFRC.epd

Elo   | 27.03 +- 7.27 (95%)
Conf  | 10.0+0.10s Threads=1 Hash=16MB
Games | N: 3568 W: 866 L: 589 D: 2113
Penta | [23, 311, 879, 508, 63]

Changes

New neural net trained on a total of 6.9B positions. Introduced high-quality games from SPRT tests to the dataset (~350M positions).
SPSA parameter tuning at long time controls
Eval correction improvements
Smaller transposition table entries
A few speedups (around 4% in total)
Various search improvements
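
For context on the SPSA item above, here is a minimal sketch of simultaneous perturbation stochastic approximation. A toy quadratic loss stands in for what would, in engine tuning, be a noisy result from games between perturbed parameter sets; the gain constants and function names are assumptions chosen for the example.

```python
import random


def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, seed=0):
    """Minimal SPSA loop; returns the tuned parameter list."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        # Slowly decaying step size and perturbation size.
        a_k = a / (k + 1) ** 0.602
        c_k = c / (k + 1) ** 0.101
        # Perturb every parameter at once with a random +/-1 direction.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two evaluations give a gradient estimate for all parameters.
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta


def toy_loss(params):
    """Stand-in objective with its minimum at (3, -1)."""
    target = (3.0, -1.0)
    return sum((p - t) ** 2 for p, t in zip(params, target))


start = [0.0, 0.0]
tuned = spsa_minimize(toy_loss, start)
print(tuned, toy_loss(tuned))
```

The appeal for engine tuning is that each iteration needs only two evaluations regardless of how many parameters are tuned, which keeps the number of games manageable.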

Releases: Witek902/Caissa
