litex/uart: fix TX race condition of unexpected txfull interrupt #2576
Pull Request Overview
The LiteUART peripheral driver's transmit code contained a bug: in a certain scenario, the driver does not expect an interrupt and requests a deferred call instead, but the hardware generates an interrupt anyway. If this happens, a panic such as the following will be issued:
The root cause lies in the fact that the LiteUART hardware peripheral uses a FIFO buffer of a certain depth. Transmission of bytes starts as soon as data is fed into the FIFO. A TX event (interrupt) is asserted on a falling edge of the txfull signal, i.e. as soon as the hardware, having been at capacity, is able to accept more data again.
The driver as implemented in Tock will, in a loop, check whether there is still space in the transmission FIFO (!txfull) and place bytes into the FIFO accordingly. As soon as either the FIFO is full (txfull == true) or the entire buffer has been placed into the FIFO, the loop is aborted. Following that, the code evaluates whether the hardware will issue an interrupt, either to continue sending more data or to inform the client about the finished transmission. If the entire buffer has been placed into the FIFO and txfull was never observed to be asserted, the hardware is assumed not to generate an interrupt.
However, the hardware might be able to transmit some data between inserting a byte into the FIFO and checking whether the FIFO is full, causing txfull to not be asserted during the check while the FIFO was temporarily at capacity.
During normal transmission this is not an issue, since the driver will simply fill the free space with additional data. However, when this occurs on the last byte to send, the driver assumes that the buffer capacity has not been reached, while the hardware was temporarily at capacity (txfull == true) and sent at least one byte prior to reading txfull (which will report txfull == false). As specified, the TX interrupt is asserted on this falling edge and as such both an interrupt and a deferred call will be delivered, causing the aforementioned panic.
This commit introduces an additional check to determine whether an interrupt will be delivered: if either txfull is asserted (an interrupt will be generated as soon as transmission of at least one byte completes) OR if the TX event is already asserted (txfull was temporarily asserted) an interrupt will be delivered and no deferred call is issued.
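The race and the additional check can be illustrated with a small software model. This is only a sketch under stated assumptions: `MockUart`, `drain_one`, `transmit`, the `race_after` parameter and the `Completion` enum are hypothetical names invented for illustration and are not the actual Tock driver code or the LiteUART register interface.

```rust
/// Hypothetical model of the LiteUART TX FIFO (not the real register interface).
struct MockUart {
    fifo_len: usize,
    depth: usize,
    /// Models the pending TX event, latched on a falling edge of txfull.
    tx_event_pending: bool,
}

impl MockUart {
    fn new(depth: usize) -> Self {
        MockUart { fifo_len: 0, depth, tx_event_pending: false }
    }

    fn txfull(&self) -> bool {
        self.fifo_len == self.depth
    }

    /// Driver side: place one byte into the FIFO.
    fn push(&mut self) {
        assert!(!self.txfull());
        self.fifo_len += 1;
    }

    /// Hardware side: one byte finishes transmitting.
    fn drain_one(&mut self) {
        if self.fifo_len == 0 {
            return;
        }
        let was_full = self.txfull();
        self.fifo_len -= 1;
        if was_full {
            // Falling edge of txfull: the TX event becomes pending.
            self.tx_event_pending = true;
        }
    }
}

#[derive(Debug, PartialEq)]
enum Completion {
    /// The driver waits for the hardware interrupt.
    Interrupt,
    /// The driver schedules a deferred call itself.
    DeferredCall,
}

/// Fill the FIFO with up to `len` bytes. If `race_after == Some(n)`, the
/// hardware drains one byte right after the n-th push, i.e. before the
/// driver reads txfull again. `fixed` selects the additional check.
fn transmit(uart: &mut MockUart, len: usize, race_after: Option<usize>, fixed: bool) -> Completion {
    let mut sent = 0;
    while sent < len && !uart.txfull() {
        uart.push();
        sent += 1;
        if race_after == Some(sent) {
            uart.drain_one();
        }
    }
    let interrupt_expected = if fixed {
        // Additional check: also consider an already pending TX event.
        uart.txfull() || uart.tx_event_pending
    } else {
        uart.txfull()
    };
    if interrupt_expected {
        Completion::Interrupt
    } else {
        Completion::DeferredCall
    }
}
```

In this model, `transmit(&mut MockUart::new(4), 4, Some(4), false)` picks a deferred call although the TX event is already pending, which corresponds to the unexpected interrupt; with `fixed == true` the same sequence picks the interrupt path instead.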
I've known of this bug for some time, though it was hard to track down:
Testing Strategy
This pull request was tested by running a kernel/app combination, which previously managed to cause the panic, on a 1 MBaud UART for approx. 20 hours. There's a good chance this fixes it, as it makes sense when looking at the Verilog/Migen side of things. There might be additional issues generating further unexpected interrupts, though I'm pretty confident this solves the issue and I have the interrupts under control now 😄.
TODO or Help Wanted
N/A
Documentation Updated
Updated the relevant files in /docs, or no updates are required.
Formatting
Ran make prepush.