Corruption bug: reuse_logs can reuse a log with an incomplete record #1287

@robofinch

Description

If the experimental reuse_logs option is enabled, a log will be reused so long as:

  • a previous log (MANIFEST or most-recent write-ahead log) exists (and was not compacted)
  • reading that log with log::Reader never reported a corruption error (or reported that bytes were dropped)
  • the records returned by log::Reader were valid (version edits forming a valid version, or valid write batches)
  • the file size of that log could be read, and is not too large
  • the log had a filename of the correct type (MANIFEST-* or *.log) and could successfully be opened as appendable

There's an important edge case not among the above conditions.

A log::Writer could terminate / crash in the middle of writing a record, and the database needs to be resilient against that.
As such, if log::Reader encounters end-of-file and the last record is incomplete (that is, either the last record fragment is shorter than a record header, or the record length indicated in the header is longer than the data actually read in the record fragment), log::Reader assumes that the log::Writer creating that record died while writing it. It does not report a corruption error (or report that bytes were dropped); it merely discards the incomplete record. (Whatever data was being written, the write couldn't have been reported to the user as successful, so it's correct to discard it.)
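
The tail handling described above can be sketched in a few lines of Python. This is a simplified model, not LevelDB's C++ code: it assumes every record is a single FULL fragment, ignores the 32 KiB block structure, and uses `zlib.crc32` in place of LevelDB's masked CRC32C. The names `encode_record` and `read_records` are hypothetical.

```python
import struct
import zlib

HEADER_SIZE = 7  # 4-byte checksum, 2-byte length, 1-byte record type
FULL = 1         # record type for an unfragmented record


def encode_record(payload: bytes) -> bytes:
    """Encode one FULL record: checksum, length, type, then the payload."""
    crc = zlib.crc32(bytes([FULL]) + payload)  # stand-in for masked CRC32C
    return struct.pack("<IHB", crc, len(payload), FULL) + payload


def read_records(data: bytes):
    """Return (records, corruption_reported) for a simplified log."""
    records = []
    pos = 0
    while pos < len(data):
        if len(data) - pos < HEADER_SIZE:
            break  # truncated header at EOF: dropped without an error
        crc, length, rtype = struct.unpack_from("<IHB", data, pos)
        start = pos + HEADER_SIZE
        if len(data) - start < length:
            break  # truncated payload at EOF: dropped without an error
        payload = data[start:start + length]
        if zlib.crc32(bytes([rtype]) + payload) != crc:
            return records, True  # checksum mismatch: corruption
        records.append(payload)
        pos = start + length
    return records, False


log = encode_record(b"put k1") + encode_record(b"put k2")
torn = log[:-3]  # the writer died three bytes before finishing
print(read_records(torn))
```

In this model the torn log yields only the first record and no corruption report, which is exactly the silent discard the issue relies on.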

So... what happens if we append more data to the end of a log whose last record was incomplete? The last record would likely no longer look incomplete, and either that record or a following record would very likely appear corrupted. Moreover, as this could affect the MANIFEST, compactions could then be recorded in a corrupted log, potentially resulting in arbitrarily large amounts of user data being reported as corrupted. (Presumably it wouldn't be too difficult to recover from this error, but I doubt a casual end-user of something using LevelDB would know how to recover their data.)
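
A small Python model can illustrate the failure. It is a sketch under simplifying assumptions, not LevelDB's implementation: single-fragment records only, no 32 KiB blocks, `zlib.crc32` instead of masked CRC32C, and hypothetical helper names. A record is appended directly after a torn one, as would happen if the log were reused.

```python
import struct
import zlib

HEADER_SIZE = 7  # 4-byte checksum, 2-byte length, 1-byte record type
FULL = 1


def encode_record(payload: bytes) -> bytes:
    crc = zlib.crc32(bytes([FULL]) + payload)  # stand-in for masked CRC32C
    return struct.pack("<IHB", crc, len(payload), FULL) + payload


def read_records(data: bytes):
    """Return (records, corruption_reported); a truncated tail is dropped."""
    records, pos = [], 0
    while pos < len(data):
        if len(data) - pos < HEADER_SIZE:
            break  # incomplete header at EOF
        crc, length, rtype = struct.unpack_from("<IHB", data, pos)
        start = pos + HEADER_SIZE
        if len(data) - start < length:
            break  # incomplete payload at EOF
        payload = data[start:start + length]
        if zlib.crc32(bytes([rtype]) + payload) != crc:
            return records, True
        records.append(payload)
        pos = start + length
    return records, False


first = encode_record(b"put k1")
torn_second = encode_record(b"put k2")[:-3]  # writer died 3 bytes early
third = encode_record(b"put k3")             # appended after reusing the log

# Before reuse, the torn tail is silently ignored.
print(read_records(first + torn_second))

# After reuse, the torn record swallows the start of the new record's header,
# so the new record can no longer be read back intact.
records, corruption_reported = read_records(first + torn_second + third)
print(records, corruption_reported)
```

The third record was written in full, and in a real database could have been acknowledged to the user, yet the reader can no longer recover it because the record boundaries are misaligned.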

Solution

Have log::Reader report whether EOF was reached at the end of a complete record, or whether EOF occurred early and the last record is incomplete. (In each place where a log can be reused, the previous log had been read with a log::Reader not too long before.) Then, a log can be reused only if its last record isn't incomplete.
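
As a rough sketch of this proposal, the reader below returns an extra flag saying whether the log ended mid-record, and the reuse check consults it. This is an illustrative Python model, not a patch for LevelDB: `scan_log` and `can_reuse_log` are hypothetical names, records are single FULL fragments, `zlib.crc32` stands in for masked CRC32C, and the other reuse conditions (file size, filename, valid contents) are omitted.

```python
import struct
import zlib

HEADER_SIZE = 7  # 4-byte checksum, 2-byte length, 1-byte record type
FULL = 1


def encode_record(payload: bytes) -> bytes:
    crc = zlib.crc32(bytes([FULL]) + payload)  # stand-in for masked CRC32C
    return struct.pack("<IHB", crc, len(payload), FULL) + payload


def scan_log(data: bytes):
    """Return (records, corruption_reported, tail_incomplete)."""
    records, pos = [], 0
    while pos < len(data):
        if len(data) - pos < HEADER_SIZE:
            return records, False, True  # EOF inside a header
        crc, length, rtype = struct.unpack_from("<IHB", data, pos)
        start = pos + HEADER_SIZE
        if len(data) - start < length:
            return records, False, True  # EOF inside a payload
        payload = data[start:start + length]
        if zlib.crc32(bytes([rtype]) + payload) != crc:
            return records, True, False
        records.append(payload)
        pos = start + length
    return records, False, False  # EOF exactly at a record boundary


def can_reuse_log(data: bytes) -> bool:
    """Allow reuse only if the log is uncorrupted and ends on a full record."""
    _, corruption_reported, tail_incomplete = scan_log(data)
    return not corruption_reported and not tail_incomplete


clean = encode_record(b"put k1") + encode_record(b"put k2")
torn = clean[:-3]
print(can_reuse_log(clean), can_reuse_log(torn))
```

Recovery still discards the torn record as before; the only change is that a log ending mid-record is never appended to, so a fresh log would be started instead.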
