chore(sqlite, syncthing): improve database maintenance performance#10567

Draft
jtek wants to merge 45 commits into syncthing:main from
jtek:db_maintenance_performance

Conversation


@jtek jtek commented Feb 9, 2026

Purpose

During Syncthing's periodic database cleanup, at least some large folders (with databases of several tens of GiB for a single folder) suffer from very poor performance, effectively stopping all syncing for several hours.

On inspection, many parts of the maintenance use techniques that can indeed put a heavy load on large databases. The problem is made worse by the fact that most of the maintenance runs while holding a mutex that protects against concurrent access to the database: if the maintenance tasks aren't very fast, Syncthing is effectively blocked for the folder being processed.

The purpose of this work is to find a better compromise between the time the maintenance task takes to run and the overall "health" of the database (the current maintenance tasks target both disk-usage efficiency and Syncthing performance by various means).

Testing

Not yet addressed, but no new functionality is introduced, so unless the tests focus on implementation details of the maintenance they shouldn't need to be modified.

jtek and others added 10 commits February 8, 2026 13:26
These are wasteful, as optimize is already enabled on each DB connection.
ANALYZE requires a full scan of the whole DB unless configured otherwise.

Signed-off-by: Lionel Bouton <[email protected]>
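The difference the commit relies on can be demonstrated with Python's stdlib sqlite3 module (a sketch only; the PR itself is Go, and the table name here is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT)")  # hypothetical table
conn.execute("INSERT INTO files VALUES ('a')")

# ANALYZE unconditionally rescans the whole database to rebuild statistics.
conn.execute("ANALYZE")

# "PRAGMA optimize" only re-analyzes tables whose statistics look stale,
# so it is cheap enough to run routinely, e.g. per connection.
conn.execute("PRAGMA optimize")

# ANALYZE materializes its statistics in the sqlite_stat1 table.
exists = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE name = 'sqlite_stat1'"
).fetchone()[0]
print(exists)
```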
Will do less work in each interval.

Signed-off-by: Lionel Bouton <[email protected]>
SELECT COUNT(...) performs a full table scan.
An estimate is enough to choose an approximate chunk size to process.

Signed-off-by: Lionel Bouton <[email protected]>
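For illustration, here is the cost difference in stdlib sqlite3. Using MAX(rowid) as the estimate is my assumption for the sketch, not necessarily the technique the PR uses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO files (name) VALUES (?)",
                 [(f"f{i}",) for i in range(1000)])

# COUNT(*) walks every row of the table...
exact = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]

# ...while MAX(rowid) is answered from the edge of the b-tree in O(log n),
# a good-enough upper bound for sizing processing chunks.
estimate = conn.execute(
    "SELECT COALESCE(MAX(rowid), 0) FROM files"
).fetchone()[0]
print(exact, estimate)
```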
The process adapts to the speed of the DB by targeting a fixed, low (250 ms) processing duration. This also fixes a bug where processing stopped before covering the whole tables
when the folder device sequence froze.

Signed-off-by: Lionel Bouton <[email protected]>
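The adaptation described above could look like this sketch (Python rather than Go; the function name, bounds, and wiring of the 250 ms target are illustrative, not the PR's actual code):

```python
def adapt_chunk_size(chunk_size, elapsed_s, target_s=0.250,
                     lo=64, hi=1_000_000):
    """Scale the next chunk so each pass takes roughly target_s seconds.

    A slow pass shrinks the chunk proportionally; a fast pass grows it,
    clamped to [lo, hi] so a single outlier cannot run away.
    """
    if elapsed_s > 0:
        chunk_size = int(chunk_size * target_s / elapsed_s)
    return max(lo, min(hi, chunk_size))

print(adapt_chunk_size(1000, 0.5))    # pass took 2x the target -> 500
print(adapt_chunk_size(1000, 0.125))  # pass twice as fast -> 2000
```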
The sequential, load-adaptive cleanup doesn't need the row count.

Signed-off-by: Lionel Bouton <[email protected]>
- add timing debugs to better study each maintenance task
- checkpoint(truncate) could probably be a problem, delay it until further study

Signed-off-by: Lionel Bouton <[email protected]>
@tomasz1986 tomasz1986 changed the title Db maintenance performance chore(sqlite, syncthing): improve database maintenance performance Feb 9, 2026
@github-actions github-actions bot added the chore label Feb 9, 2026
jtek added 18 commits February 9, 2026 14:30
The "cursors" for incremental processing were improperly stored in the main DB, which made them shared between folders.
They are now in the folder struct (there's little gain from persisting them between Syncthing starts).

Signed-off-by: Lionel Bouton <[email protected]>
- apply the same slow walk for file_names and file_versions
- factor the chunkSize adaptation in a dedicated func
- force int64 to support the large values the database supports even on 32-bit systems

Signed-off-by: Lionel Bouton <[email protected]>
PRAGMA incremental_vacuum frees the entire freelist by default.
To make it truly incremental, a number of pages must be passed.
Target 1 MiB freed every maintenance run (5 minutes).
This lets databases shrink by a bit under 300 MiB each day.

Signed-off-by: Lionel Bouton <[email protected]>
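A Python sketch of the mechanism (the 1 MiB target matches the commit; 288 five-minute runs per day times 1 MiB is where "a bit under 300 MiB" comes from):

```python
import sqlite3, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "test.db")
conn = sqlite3.connect(path)
# auto_vacuum must be INCREMENTAL (set before the first table is created)
# for incremental_vacuum to have anything to do.
conn.execute("PRAGMA auto_vacuum=INCREMENTAL")
conn.execute("CREATE TABLE blobs (data BLOB)")
conn.execute("INSERT INTO blobs VALUES (zeroblob(1000000))")
conn.execute("DELETE FROM blobs")
conn.commit()

page_size = conn.execute("PRAGMA page_size").fetchone()[0]
pages = max(1, (1 << 20) // page_size)  # free roughly 1 MiB per run
before = conn.execute("PRAGMA freelist_count").fetchone()[0]
# Without the page count, this would release the whole freelist at once.
conn.execute(f"PRAGMA incremental_vacuum({pages})")
after = conn.execute("PRAGMA freelist_count").fetchone()[0]
print(before, after)
```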
The main DB is tiny. Running tidy on it on every periodic call is wasteful.
Limit the call to once per day.

Remove one timing measurement of a call to logger

Signed-off-by: Lionel Bouton <[email protected]>
According to the documentation, once in the lifetime of the DB is enough;
running it once at open should therefore suffice.

Signed-off-by: Lionel Bouton <[email protected]>
The range was improperly computed and the whole tables were processed in
a single pass, defeating the purpose of the slow walk on large databases.
Detail the timings of each pass and add more debug log.

Signed-off-by: Lionel Bouton <[email protected]>
Simplified the SQL that finds the ranges to work on,
as it was a significant part of the function's runtime.

Signed-off-by: Lionel Bouton <[email protected]>
According to the SQLite source code, if readers are active and busy_timeout is 0,
a checkpointer asked to TRUNCATE reverts to PASSIVE.
Setting busy_timeout allows it to wait for the readers to finish.

Signed-off-by: Lionel Bouton <[email protected]>
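In stdlib-sqlite3 terms this looks like the sketch below (Syncthing does this from Go; the 10 s timeout is an arbitrary value for illustration):

```python
import sqlite3, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "wal.db")
conn = sqlite3.connect(path)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("CREATE TABLE t (x)")
conn.execute("INSERT INTO t VALUES (1)")
conn.commit()

# With busy_timeout=0 a TRUNCATE checkpoint silently degrades to PASSIVE
# when readers are active; a nonzero timeout lets it wait them out.
conn.execute("PRAGMA busy_timeout=10000")
busy, log_frames, checkpointed = conn.execute(
    "PRAGMA wal_checkpoint(TRUNCATE)"
).fetchone()
print(busy, log_frames, checkpointed)
```

After a successful TRUNCATE checkpoint the WAL file is reset, so the reported frame count drops to zero.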
Don't process all folders at the same time:
each folder is responsible for determining when its next round of maintenance happens.
This allows relaxing maintenance calls during low activity, and accelerates catching up
after modifications have left orphans to clean up.

Remove some global periodic debug logs to better focus on individual Folder maintenance debugs

Signed-off-by: Lionel Bouton <[email protected]>
Syncthing is resilient to DB transactions lost in a crash, as it resyncs automatically on startup.
NORMAL is thus enough; synchronous FULL can be very costly on slow and/or busy storage.

Signed-off-by: Lionel Bouton <[email protected]>
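The setting itself, in pragma form (SQLite reports 0 = OFF, 1 = NORMAL, 2 = FULL, 3 = EXTRA):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FULL fsyncs at every transaction commit; NORMAL only syncs at critical
# moments, which in WAL mode may lose the last transactions on a crash
# but cannot corrupt the database.
conn.execute("PRAGMA synchronous=NORMAL")
mode = conn.execute("PRAGMA synchronous").fetchone()[0]
print(mode)
```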
Signed-off-by: Lionel Bouton <[email protected]>
If the slow walk reached max speed and covers the whole hash range in a single step,
there's no need for a condition on the range. The debug log can be simplified too.

Signed-off-by: Lionel Bouton <[email protected]>

Mrgaton commented Feb 12, 2026

Why default maintenance to 5m?


jtek commented Feb 12, 2026

Why default maintenance to 5m?

That's a bit complicated to explain. The original maintenance code ran periodically (every 8 hours by default), processing all cleanups in a single pass. On very busy folders this creates very long processing times that block the whole folder.

This PR completely changes the maintenance to process smaller chunks at more frequent intervals (hence the 5 min default), which lets Syncthing process the folder without long freezes. In the most recent versions (just pushed today) the frequency is actually dynamic: it uses the configured interval until it detects there is work to do, then speeds up by a factor of ten to catch up faster.

My own test instances (including the one with 30 million files and 15 TiB) are configured with a 1 min interval, which switches dynamically to a 6 s interval when they detect there are cleanups to do, and they perform very well.
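The dynamic speed-up described above amounts to something like this (a Python sketch; the function name and the fixed factor-of-ten wiring are illustrative):

```python
def next_interval(configured_s, work_found, speedup=10):
    """Use the configured interval while idle; divide it by `speedup`
    as soon as the last run found cleanup work to catch up on."""
    return configured_s / speedup if work_found else configured_s

print(next_interval(300, False))  # 5 min default while idle -> 300
print(next_interval(300, True))   # catching up -> 30
print(next_interval(60, True))    # 1 min configured -> 6
```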

jtek and others added 14 commits February 12, 2026 23:46
lastMaintenanceTime is a remnant of a periodic maintenance running at the same time for every folder.
The load on large busy folders made Syncthing unusable. The period is now configured per folder, and the maintenance is split into smaller chunks targeting a fixed runtime for its processing.

Signed-off-by: Lionel Bouton <[email protected]>
- use CamelCase for attributes/variables
- move duplicated folder attributes to common variables

Signed-off-by: Lionel Bouton <[email protected]>
Use the same algorithm as the one for names/versions.

Signed-off-by: Lionel Bouton <[email protected]>
Signed-off-by: Lionel Bouton <[email protected]>
Signed-off-by: Lionel Bouton <[email protected]>
…scan of each table

- use row count estimates to determine the appropriate interval between chunks to reach the DBMaintenanceInterval
- debug logs to check the intervals needed for each table
- range computations are done with int64 instead of int to minimize int/int64 casts

Signed-off-by: Lionel Bouton <[email protected]>
Comment about expected memory usage: unchanged for folders with fewer than 8000 files.
Capped at 8 GiB per folder, maxing out at 32 million files in a folder.

Signed-off-by: Lionel Bouton <[email protected]>
jtek and others added 3 commits February 18, 2026 21:01
Golang newbie confusing break with continue

Signed-off-by: Lionel Bouton <[email protected]>
- the reason referred to in the code no longer exists,
- simpler code,
- it could only slow down the cleanups by not reusing SQLite's connection cache.

Signed-off-by: Lionel Bouton <[email protected]>