Contributor

@L-M-Sherlock L-M-Sherlock commented Jul 30, 2025

Problem

Populating last_review_time for all existing cards takes a long time, because the field is only filled in when a card is reviewed, so the user would have to review every card.

Source: https://forums.ankiweb.net/t/fill-the-card-last-review-time-field-when-check-database/64792

Solution

This PR adds functionality to the database check process to automatically fix cards with missing last review times by:

  1. Identifying affected cards: During database check, find cards that are not new (have been reviewed) but have no last_review_time recorded
  2. Recovery from revlog: Use the revision log entries to determine the actual last review time for these cards
  3. Automatic repair: Update the cards with the correct last_review_time based on their review history
  4. User feedback: Report the number of fixed cards to the user through the database check results
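
The steps above can be sketched roughly as follows. This is a minimal, in-memory illustration and not the PR's actual implementation: `Card`, `CardType`, and `fill_missing_last_review_time` are simplified, hypothetical stand-ins, and the map of recovered timestamps is assumed to have been built from the revlog beforehand.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified card type (an assumption for this sketch).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum CardType {
    New,
    Learn,
    Review,
    Relearn,
}

/// Hypothetical, pared-down card with only the fields the sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Card {
    id: i64,
    ctype: CardType,
    /// Seconds since the epoch of the last review, if known.
    last_review_time: Option<i64>,
}

/// Fills in missing last review times from `last_review_by_card`
/// (card id -> timestamp recovered from the revlog) and returns the
/// number of cards that were changed.
fn fill_missing_last_review_time(
    cards: &mut [Card],
    last_review_by_card: &HashMap<i64, i64>,
) -> usize {
    let mut fixed = 0;
    for card in cards.iter_mut() {
        // Conservative: skip new cards and cards that already have a value.
        if card.ctype == CardType::New || card.last_review_time.is_some() {
            continue;
        }
        // Only repair when the revlog actually yielded a timestamp.
        if let Some(&secs) = last_review_by_card.get(&card.id) {
            card.last_review_time = Some(secs);
            fixed += 1;
        }
    }
    fixed
}
```

The returned count corresponds to what the database check would report to the user; existing values are never overwritten.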

Key Changes

  • New database check category: Added card_last_review_time_empty counter to track and report fixed cards
  • Improved revlog filtering: Refactored review entry filtering logic into a reusable has_rating_and_affects_scheduling() method that properly excludes manual reschedules and cramming sessions
  • Enhanced card repair: Extended fix_card_properties() to also repair missing last review times by cross-referencing with revlog data
  • Localization support: Added user-facing messages to inform users about the repair process

Benefits

  • Better scheduling accuracy: Cards with proper last review times will be scheduled more accurately
  • Improved statistics: Review statistics will be more accurate when all cards have correct last review times
  • Automatic repair: Users don't need to manually fix corrupted card data - it's handled automatically during database check
  • Transparency: Users are informed about what was fixed during the database check process

The fix is conservative and only updates cards where the data is clearly missing, ensuring no valid data is overwritten.

- if e.button_chosen >= 1 {
+ if e.has_rating_and_affects_scheduling() {
      last_reviewed_at = Some(e.id.as_secs());
  }
Contributor

@user1823 user1823 Jul 30, 2025

I am assuming that this loop goes from earlier revlogs to later revlogs. So, without the following code, if a card is Reset and doesn't have any subsequent rating, the last_reviewed_at will be the timestamp of the revlog before the Reset entry, which is not desired.

Suggested change
- }
+ } else if e.button_chosen == 0 && e.ease_factor == 0 {
+     last_reviewed_at = None;
+ }

Contributor Author

If the card is Reset, its type will be New, so it will be skipped in the filling process.

Contributor

Yes, but it still makes sense to correct the output of the function because we might use it for another purpose later.

Contributor Author

OK. I think it's better to wrap this condition in a new function, because it is also used elsewhere.
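
To make the point of this thread concrete, here is a small sketch of a scan over one card's entries, ordered from oldest to newest, that clears the result when a Reset entry is seen. `Entry` and `last_reviewed_at` are hypothetical simplifications; the reset condition is the one proposed in the suggestion above (`button_chosen == 0 && ease_factor == 0`), and the plain `button_chosen >= 1` check stands in for the PR's rating helper.

```rust
/// Hypothetical, pared-down revlog entry (an assumption for this sketch).
#[derive(Debug)]
struct Entry {
    /// Timestamp of the entry in seconds.
    secs: i64,
    /// 0 means no answer button was pressed (e.g. a manual operation).
    button_chosen: u8,
    ease_factor: u32,
}

/// Returns the time of the last rated entry, or `None` if the card was
/// reset after its last rating. `entries` must be ordered oldest first.
fn last_reviewed_at(entries: &[Entry]) -> Option<i64> {
    let mut last = None;
    for e in entries {
        if e.button_chosen >= 1 {
            // A real rating: remember its timestamp.
            last = Some(e.secs);
        } else if e.button_chosen == 0 && e.ease_factor == 0 {
            // Reset entry: earlier ratings no longer count.
            last = None;
        }
        // Other unrated entries (e.g. manual reschedules) leave `last` as-is.
    }
    last
}
```

With this shape, a card that was rated and then reset yields `None`, while a later rating after the reset yields that rating's timestamp.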

for (card_id, last_revlog_info) in last_revlog_info {
    let card = self.get_card(card_id)?;
    if let Some(mut card) = card {
        if card.ctype != CardType::New && card.last_review_time.is_none() {
Contributor Author

@user1823 you can check this code.

  .execute(params![mtime, usn])?;
- Ok((new_cnt, other_cnt))
+ let mut last_review_time_cnt = 0;
+ let revlog = self.get_all_revlog_entries_in_card_order()?;
Contributor

@user1823 user1823 Jul 31, 2025

Rather than fetching all revlogs and calculating last review time for each card, wouldn't it be faster to fetch the revlogs of only those cards which lack last_review_time?

Something like this: (untested)

SELECT max(id), cid FROM revlog WHERE ease >= 1 AND cid IN (SELECT id FROM cards WHERE type >= 1 AND json_extract(data, '$.lrt') IS NULL) GROUP BY cid
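
The idea behind the query, restricting the scan to cards that lack a last review time before taking the newest rated entry per card, can also be expressed over in-memory rows. The sketch below is only an illustration of those semantics: `RevlogRow` and `latest_rated_entry` are hypothetical names, and the set of affected card ids is assumed to have been computed separately.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical row shape: (revlog id, card id, ease).
/// The revlog id doubles as a millisecond timestamp.
type RevlogRow = (i64, i64, u8);

/// For each card in `cards_missing_lrt`, returns the largest revlog id
/// among its rated entries (ease >= 1), mirroring `max(id) ... GROUP BY cid`.
fn latest_rated_entry(
    rows: &[RevlogRow],
    cards_missing_lrt: &HashSet<i64>,
) -> HashMap<i64, i64> {
    let mut latest: HashMap<i64, i64> = HashMap::new();
    for &(id, cid, ease) in rows {
        // Skip unrated entries and cards that do not need repair.
        if ease < 1 || !cards_missing_lrt.contains(&cid) {
            continue;
        }
        latest
            .entry(cid)
            .and_modify(|newest| *newest = (*newest).max(id))
            .or_insert(id);
    }
    latest
}
```

Note that, like the query, this variant only looks at rated entries and does not account for Reset entries.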

Contributor Author

@L-M-Sherlock L-M-Sherlock Jul 31, 2025

The current code is fast enough on some large collections, but I hope @dae can test it on some huge ones.

This commit introduces the `is_reset` method to the `RevlogEntry` struct, which identifies entries representing reset operations. Additionally, the scheduling logic in `memory_state.rs` and `params.rs` has been updated to utilize this new method, ensuring that reset entries are handled correctly during review scheduling.
This commit adds the `is_cramming` method to the `RevlogEntry` struct, which identifies entries representing cramming operations. The scheduling logic in `params.rs` has been updated to utilize this new method, improving the clarity and maintainability of the code.
…nctions

This commit introduces a new `has_rating` method in the `RevlogEntry` struct to encapsulate the logic for checking if an entry has a rating. The scheduling logic in `params.rs` and the calculation of normal answer counts in `card.rs` have been updated to use this new method, enhancing code clarity and maintainability.
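
As a rough picture of what these helper methods might look like, here is a sketch on a simplified, hypothetical `RevlogEntry`. The conditions are assumptions pieced together from this conversation (the reset condition from the review suggestion, the cramming condition from the doc comment on `is_cramming`), not a copy of the merged code.

```rust
/// Reduced set of review kinds; an assumption for this sketch.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum RevlogReviewKind {
    Review,
    Filtered,
    Manual,
    Rescheduled,
}

/// Hypothetical, pared-down revlog entry.
#[derive(Debug)]
struct RevlogEntry {
    review_kind: RevlogReviewKind,
    button_chosen: u8,
    ease_factor: u32,
}

impl RevlogEntry {
    /// True when the user pressed an answer button.
    fn has_rating(&self) -> bool {
        self.button_chosen > 0
    }

    /// Assumed reset condition: no button and a zero ease factor.
    fn is_reset(&self) -> bool {
        self.button_chosen == 0 && self.ease_factor == 0
    }

    /// Assumed cramming condition: a filtered-deck entry with a zero ease factor.
    fn is_cramming(&self) -> bool {
        self.review_kind == RevlogReviewKind::Filtered && self.ease_factor == 0
    }

    /// Rated entries that are not cramming; unrated manual entries are
    /// already excluded by `has_rating`.
    fn has_rating_and_affects_scheduling(&self) -> bool {
        self.has_rating() && !self.is_cramming()
    }
}
```

Naming each condition keeps call sites readable and lets several modules share one definition.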
Contributor

user1823 commented Jul 31, 2025

The tests can be fixed by adding button_chosen: 0, to those revlogs in the tests that have RevlogReviewKind::Manual.

Or, you can modify revlog() so that it automatically sets button_chosen to 0 if review_kind is Manual or Rescheduled.
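
The second option could look something like the sketch below. The real `revlog()` test helper's signature is not shown in this conversation, so `test_revlog`, the reduced types, and the default button value of 3 are all assumptions made for illustration.

```rust
/// Reduced set of review kinds; an assumption for this sketch.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum RevlogReviewKind {
    Review,
    Manual,
    Rescheduled,
}

/// Hypothetical, pared-down revlog entry used by tests.
#[derive(Debug, PartialEq)]
struct RevlogEntry {
    review_kind: RevlogReviewKind,
    button_chosen: u8,
}

/// Hypothetical test helper: entries for manual operations get no rating,
/// everything else gets an arbitrary default button (3 here).
fn test_revlog(review_kind: RevlogReviewKind) -> RevlogEntry {
    let button_chosen = match review_kind {
        RevlogReviewKind::Manual | RevlogReviewKind::Rescheduled => 0,
        _ => 3,
    };
    RevlogEntry {
        review_kind,
        button_chosen,
    }
}
```

Centralising the rule in the helper avoids having to remember `button_chosen: 0` in every test that builds a manual entry.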

      usn: Usn,
      v1_sched: bool,
- ) -> Result<(usize, usize)> {
+ ) -> Result<(usize, usize, usize)> {
Member

Struct would be better than unnamed tuple.

Contributor Author

Done on 86bbc2c
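
For readers unfamiliar with the suggestion, a named struct in place of the three-element tuple might look like this. The struct name `CardFixStats` and the `total` method are hypothetical; the field names reuse the counters that appear in the diff.

```rust
/// Hypothetical replacement for the `(usize, usize, usize)` return value.
#[derive(Debug, Default, PartialEq, Eq)]
struct CardFixStats {
    new_cnt: usize,
    other_cnt: usize,
    last_review_time_cnt: usize,
}

impl CardFixStats {
    /// Total number of cards touched by any of the fixes.
    fn total(&self) -> usize {
        self.new_cnt + self.other_cnt + self.last_review_time_cnt
    }
}
```

Callers then read `stats.last_review_time_cnt` instead of a positional `.2`, so adding another counter later does not silently shift the meaning of existing positions.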

if let Some(mut card) = card {
    if card.ctype != CardType::New && card.last_review_time.is_none() {
        card.last_review_time = last_revlog_info.last_reviewed_at;
        self.update_card(&card)?;
Member

This has the potential to cause data loss if run before changes on other devices have been synced. Like in #4236 (comment) we could write the changes without mtime/usn bump, but I don't think that's a good idea, as then last_review_time doesn't propagate. I think our best approach here may be calling set_schema_modified() if last_review_time_cnt > 0 (dbcheck.rs:310), forcing the user to either do a one-way sync of the changes to other devices, or revert their changes.

Member

@dae dae Aug 4, 2025

I need to give this a bit more thought. In the past, a forced one-way sync was appropriate for fixing some issues, but this is probably going to happen fairly frequently until people have updated all of their clients, so I'm a bit worried about the amount of one way syncs it'll cause. But perhaps I'm overthinking things. Anyone want to weigh in?

Contributor Author

Done on 6947afd

Contributor

> revert their changes.

I am assuming this means using "Download from AnkiWeb" on the next sync. Isn't this more likely to cause data loss? If there are unsynced changes on both devices and the user uses Check Database, they won't get any option to preserve their changes because Anki will force a one-way sync.

/// The `ease_factor` should be 0 because
/// [`crate::scheduler::states::ReviewState::revlog_kind`] returns
/// `RevlogReviewKind::Filtered` when `days_late() < 0`.
pub(crate) fn is_cramming(&self) -> bool {
Member

These helpers are a nice touch.

Member

dae commented Aug 6, 2025

Thank you both. I'm still a bit worried about the extra full syncs this may cause, but I can't see a better way to handle this right now. Bumping usn without mtime won't help - the sync protocol will still treat that as newer, resulting in other changes being overwritten, so we still need the full sync so that users are aware of the conflict.
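
The approach settled on in this thread, marking the schema as modified only when at least one last review time was filled in, can be pictured with the toy model below. `Collection` here is a hypothetical mock with a single flag, and `after_card_fixes` is an invented name; only `set_schema_modified()` and `last_review_time_cnt` come from the discussion.

```rust
/// Hypothetical mock of a collection that only tracks the schema flag.
#[derive(Debug, Default)]
struct Collection {
    schema_modified: bool,
}

impl Collection {
    /// Stand-in for the real call that forces a one-way sync.
    fn set_schema_modified(&mut self) {
        self.schema_modified = true;
    }

    /// Forces a one-way sync only if the check actually changed cards.
    fn after_card_fixes(&mut self, last_review_time_cnt: usize) {
        if last_review_time_cnt > 0 {
            self.set_schema_modified();
        }
    }
}
```

Guarding on the count means a database check that finds nothing to repair does not trigger a full sync.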

@dae dae merged commit 62e01fe into ankitects:main Aug 6, 2025
1 check passed
@L-M-Sherlock L-M-Sherlock deleted the Fix-Cards-with-Missing-Last-Review-Time-During-Database-Check branch August 6, 2025 09:52