@alistair23
Contributor

Pull Request Overview

As part of #1882 we started talking about safe waits (#1552).

I have a PR with some initial safe wait support (#1886) in that we can be interrupted while looping on a hardware condition. The idea here is that while entering low power mode we can skip the infinite loop if we get an interrupt.

#1886 doesn't address the large number of loops in the driver capsules: they don't have access to Chip, so it's more difficult for them to wait on interrupts. The harder part is what to do when a wait "fails". What happens when returning to full power fails to occur within a specific time? What happens when a bus fails to respond in a reasonable amount of time? How do we gracefully handle this and continue running in some capacity?

This PR starts to implement watchdog functionality. It is not a replacement for a safe wait (in lots of cases a safe wait would be more graceful), but is generally meant to help avoid infinite loops and help debugging. It is also something that has come up as a way to make Tock more reliable.

It is up to Chips and boards to handle the watchdog interrupt and decide how to act.

Currently we tickle the watchdog when iterating on each process and at the start of each kernel loop.
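
The tickling scheme described above can be sketched roughly as follows. This is only an illustration and not the code in this PR: the `WatchDog` trait shape, `CountingWatchdog`, and `kernel_loop_iteration` are hypothetical names, and processes are modeled as plain strings.

```rust
use std::cell::Cell;

/// Hypothetical watchdog interface; the trait added by this PR may differ.
pub trait WatchDog {
    /// Restart the watchdog countdown so that it does not fire.
    fn tickle(&self);
}

/// Stand-in implementation that counts tickles instead of touching hardware.
pub struct CountingWatchdog {
    tickles: Cell<usize>,
}

impl CountingWatchdog {
    pub fn new() -> Self {
        Self { tickles: Cell::new(0) }
    }

    pub fn tickles(&self) -> usize {
        self.tickles.get()
    }
}

impl WatchDog for CountingWatchdog {
    fn tickle(&self) {
        self.tickles.set(self.tickles.get() + 1);
    }
}

/// One pass of a simplified kernel loop: tickle once at the start of the
/// loop, then once more before servicing each process.
pub fn kernel_loop_iteration<W: WatchDog>(watchdog: &W, processes: &[&str]) {
    watchdog.tickle();
    for _process in processes {
        watchdog.tickle();
        // ... schedule and run the process here ...
    }
}
```

With two processes, one pass through this loop tickles the watchdog three times, so a process that never yields is the case the timeout would catch.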

Testing Strategy

None

TODO or Help Wanted

Feedback

Documentation Updated

  • Updated the relevant files in /docs, or no updates are required.

Formatting

  • Ran make format.
  • Fixed errors surfaced by make clippy.

@bradjc bradjc added the rfc Issue designed for discussion and to solicit feedback. label May 29, 2020
@bradjc
Contributor

bradjc commented Jun 12, 2020

My thought would be to add a watchdog trait, in line with systick, mpu, and UKB. Then the kernel could entirely manage this, and if a chip wants to opt out it could just not provide a watchdog implementation.

@alistair23
Contributor Author

Pushed an update where it's now a general trait like the MPU or systick.

@alistair23 alistair23 closed this Jun 23, 2020
@alistair23 alistair23 reopened this Jun 23, 2020
@hudson-ayers
Contributor

I feel that whether to include a watchdog is probably more of a board-specific decision than a chip-specific one. For example, people maintaining out-of-tree boards probably should not be forced to also maintain an out-of-tree version of an otherwise in-tree chip just because the board maintainer has different thoughts on whether to use a watchdog. Maybe kernel_loop() could take in a boolean that indicates whether to use the chip watchdog or the unit version?
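
A rough sketch of that boolean idea, reading "the unit version" as an implementation of the watchdog trait for the unit type `()`. The trait shape, `ChipWatchdog`, `NO_WATCHDOG`, and `select_watchdog` are all hypothetical names used for illustration, not the API in this PR.

```rust
use std::cell::Cell;

/// Hypothetical watchdog trait whose default methods do nothing.
pub trait WatchDog {
    fn tickle(&self) {}
}

/// The unit type keeps the no-op defaults, so it acts as "no watchdog".
impl WatchDog for () {}

static NO_WATCHDOG: () = ();

/// Stand-in for a chip-provided watchdog; it only counts tickles here.
pub struct ChipWatchdog {
    tickles: Cell<u32>,
}

impl ChipWatchdog {
    pub fn new() -> Self {
        Self { tickles: Cell::new(0) }
    }

    pub fn tickles(&self) -> u32 {
        self.tickles.get()
    }
}

impl WatchDog for ChipWatchdog {
    fn tickle(&self) {
        self.tickles.set(self.tickles.get() + 1);
    }
}

/// The board passes a flag; the kernel then uses either the chip's
/// watchdog or the no-op unit implementation.
pub fn select_watchdog(chip: &dyn WatchDog, use_chip_watchdog: bool) -> &dyn WatchDog {
    if use_chip_watchdog {
        chip
    } else {
        &NO_WATCHDOG
    }
}
```

Under this assumption the chip crate stays unchanged, and an out-of-tree board only has to pass `false` to run without the watchdog.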

Also, I am interested to see an example of how watchdog interrupts would be handled.

@ppannuto
Member

ppannuto commented Jul 6, 2020

It is up to Chips and boards to handle the watchdog interrupt and decide how to act.

I'm not quite sure I see in this interface how boards get to configure the watchdog. I imagine this is actually fairly important, as watchdogging is much more of a board-specific question than a chip-specific one.

I do think this is a pretty good interface for the chip part of the watchdog system.

@alistair23 alistair23 force-pushed the alistair/watchdog branch from 57557ec to 64fd1f9 Compare July 7, 2020 19:41
@alistair23
Contributor Author

That is a good point. Currently there is no hookup to boards. That is probably something that the Chips crate will have to expose to a board.

@alistair23 alistair23 force-pushed the alistair/watchdog branch from 64fd1f9 to f497b2b Compare July 7, 2020 20:58
@alistair23
Contributor Author

I have updated this so it passes the tests, removed the old watchdog HIL, and replaced it with this one.

Untested on the SAM4L, though.

@alistair23 alistair23 marked this pull request as ready for review July 7, 2020 20:58
hudson-ayers
hudson-ayers previously approved these changes Jul 15, 2020
Contributor

@hudson-ayers hudson-ayers left a comment

One spelling issue, but I don't think it's blocking.

I do think that we really want this to ultimately be configurable by boards rather than chips, but I don't want to block on that either. Eventually there will need to be some thought about how the handling of the watchdog interrupt can be configured by the board, rather than the chip, which will probably require an additional function in the HIL (something like unsafe fn handle_interrupt(), which would be called within the chip interrupt handler but the contents of which could be chosen by the board).
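
One possible shape for such a board-chosen hook, as a sketch only. It uses a safe method rather than the `unsafe fn` mentioned above to keep the example small, and `WatchdogInterruptHandler`, `ChipWatchdog`, `service_interrupt`, and `ResetOnTimeout` are hypothetical names that do not exist in the HIL.

```rust
use std::cell::Cell;

/// Hypothetical policy supplied by the board for a watchdog interrupt.
pub trait WatchdogInterruptHandler {
    fn handle_interrupt(&self);
}

/// Chip-level watchdog driver, generic over the board's policy.
pub struct ChipWatchdog<'a, H: WatchdogInterruptHandler> {
    handler: &'a H,
}

impl<'a, H: WatchdogInterruptHandler> ChipWatchdog<'a, H> {
    pub fn new(handler: &'a H) -> Self {
        Self { handler }
    }

    /// Would be called from the chip's interrupt service routine.
    pub fn service_interrupt(&self) {
        // Chip-specific work (such as clearing a pending flag) would go here.
        self.handler.handle_interrupt();
    }
}

/// Example board policy: remember that a reset has been requested.
pub struct ResetOnTimeout {
    reset_requested: Cell<bool>,
}

impl ResetOnTimeout {
    pub fn new() -> Self {
        Self { reset_requested: Cell::new(false) }
    }

    pub fn reset_requested(&self) -> bool {
        self.reset_requested.get()
    }
}

impl WatchdogInterruptHandler for ResetOnTimeout {
    fn handle_interrupt(&self) {
        self.reset_requested.set(true);
    }
}
```

The chip keeps ownership of the interrupt plumbing, while what happens on a timeout (reset, log, blink an LED) is decided by whichever handler the board constructs.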

This PR starts to implement watchdog functionality.
This is not a replacement for a safe wait (in lots of cases a safe wait
would be more graceful) but is generally to help avoid infinite loops and
help debugging. It is also something that has come up to help make Tock
more reliable.

It is up to Chips (and in the future hopefully boards) to handle the
watchdog interrupt and decide how to act.

Currently we tickle the watchdog when iterating on each process
and at the start of each kernel loop.

Signed-off-by: Alistair Francis <[email protected]>
@alistair23
Contributor Author

I have fixed the spelling issue.

I agree about configuring from boards. There are a few places where we would like to do this (see #1998) so it's something that should happen eventually. We just need to figure out a good way to do it.

@hudson-ayers
Contributor

I have fixed the spelling issue.

I don't think you pushed the commit

@alistair23
Contributor Author

I did, GitHub is just really slow.

hudson-ayers
hudson-ayers previously approved these changes Jul 15, 2020
@bradjc bradjc added the P-Significant This is a substantial change that requires review from all core developers. label Jul 15, 2020
@bradjc
Contributor

bradjc commented Jul 15, 2020

I think the tension here is that after the scheduler PR is merged a board will be able to select a cooperative scheduler, but not disable the watchdog (if the chip crate has enabled the watchdog). A watchdog timer is probably under the purview of the chip crate (I don't think that boards should have to decide what watchdog to use, and there probably is only one choice anyway). Perhaps this falls to the scheduler PR: just like the scheduler algorithm decides how to use the systick, maybe it should also decide how to use the watchdog.

Overall this PR looks good and moves things forward so we should merge it soon.

If we start using the watchdog in the future, it would be nice to add a function to the trait that allows the kernel to check if the last reset was due to the watchdog. This would help with debugging by allowing the kernel to somehow notify developers that the watchdog is triggering.
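
Such a check could look roughly like the sketch below. `last_reset_was_watchdog`, `ResetCause`, `ChipWatchdog`, and `boot_message` are hypothetical names, and the reset cause is passed in directly rather than read from a real reset-status register.

```rust
/// Simplified model of a chip's reset-status register.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResetCause {
    PowerOn,
    Software,
    Watchdog,
}

/// Hypothetical watchdog trait with the proposed query added.
pub trait WatchDog {
    fn tickle(&self) {}

    /// Assumed addition: was the most recent reset triggered by the watchdog?
    /// Chips that cannot tell keep the default answer of `false`.
    fn last_reset_was_watchdog(&self) -> bool {
        false
    }
}

/// Stand-in chip watchdog that was handed the reset cause at boot.
pub struct ChipWatchdog {
    cause: ResetCause,
}

impl ChipWatchdog {
    pub fn new(cause: ResetCause) -> Self {
        Self { cause }
    }
}

impl WatchDog for ChipWatchdog {
    fn last_reset_was_watchdog(&self) -> bool {
        self.cause == ResetCause::Watchdog
    }
}

/// What the kernel might report to developers early in boot.
pub fn boot_message<W: WatchDog>(watchdog: &W) -> &'static str {
    if watchdog.last_reset_was_watchdog() {
        "warning: previous reset was caused by the watchdog"
    } else {
        "normal boot"
    }
}
```

Giving the query a default implementation means existing chips would not need to change until they can actually report the reset cause.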

bradjc
bradjc previously approved these changes Jul 15, 2020
@bradjc
Contributor

bradjc commented Jul 16, 2020

@alistair23 @hudson-ayers How do you want to merge this? This before or after #1767?

@hudson-ayers
Contributor

I think it is fine for this to go first, I can rebase #1767 after.

@bradjc bradjc added the last-call Final review period for a pull request. label Jul 16, 2020
@bradjc bradjc dismissed stale reviews from hudson-ayers and themself via 8486a76 July 16, 2020 14:36
@bradjc
Contributor

bradjc commented Jul 16, 2020

bors r+

@bors
Contributor

bors bot commented Jul 16, 2020

@bors bors bot merged commit 14fe53a into tock:master Jul 16, 2020
@alistair23 alistair23 deleted the alistair/watchdog branch July 16, 2020 23:13
@krady21 krady21 mentioned this pull request Sep 21, 2020
2 tasks
bors bot added a commit that referenced this pull request Oct 8, 2020
2118: stm32f3: Watchdog Timers r=bradjc a=krady21

### Pull Request Overview

This pull request adds both watchdogs supported by the stm32f3 boards. The difference between the two is thoroughly described [here](https://electronics.stackexchange.com/questions/123080/independent-watchdog-iwdg-or-window-watchdog-wwdg). TL;DR: The independent watchdog is clocked by its own dedicated low-speed clock but is not as precise, while the window watchdog is more precise, has a configurable time window that can be used to detect early or late abnormalities, and can also generate an interrupt just before resetting.

At this point, someone could configure the board to use one, both, or neither of the two watchdogs. I don't know if there's a need for both, but I wanted to consult with you first before deleting one or the other. I also tried to have this configuration done in the boards file, as it was discussed in a previous [pr](#1887 (comment)), but I am not entirely satisfied with how I did it.

### Testing Strategy

This pull request was tested using the stm32f3discovery board.


### TODO or Help Wanted

The main problem with both watchdogs is that neither of them provides a way to suspend it once it is started, except by a full system reset. Since the suspend and resume functions are unimplemented, sleeping in the kernel_loop [function](https://github.com/tock/tock/blob/ad9387a577405675b044d5bde85badf0274995c8/kernel/src/sched.rs#L495-L514) will probably end up causing a watchdog reset.
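
The sleep problem can be illustrated with a small simulation. This is a sketch under assumptions: `SimWatchdog` and `sleep_with_watchdog` are hypothetical, time is counted in abstract ticks, and the idle path is assumed to call suspend, sleep, and then resume in that order.

```rust
/// Simulated watchdog; `can_suspend == false` models hardware that cannot
/// be paused once started.
pub struct SimWatchdog {
    timeout: u32,
    remaining: u32,
    running: bool,
    can_suspend: bool,
}

impl SimWatchdog {
    pub fn new(timeout: u32, can_suspend: bool) -> Self {
        Self { timeout, remaining: timeout, running: true, can_suspend }
    }

    /// Restart the countdown.
    pub fn tickle(&mut self) {
        self.remaining = self.timeout;
    }

    /// Pause the countdown; a no-op when the hardware cannot suspend.
    pub fn suspend(&mut self) {
        if self.can_suspend {
            self.running = false;
        }
    }

    pub fn resume(&mut self) {
        self.running = true;
    }

    /// Advance simulated time; returns true if the watchdog expired,
    /// which on real hardware would reset the system.
    pub fn advance(&mut self, ticks: u32) -> bool {
        if !self.running {
            return false;
        }
        if ticks >= self.remaining {
            self.remaining = 0;
            true
        } else {
            self.remaining -= ticks;
            false
        }
    }
}

/// Assumed idle path: tickle, suspend, sleep, resume.
/// Returns true if the sleep would have ended in a watchdog reset.
pub fn sleep_with_watchdog(watchdog: &mut SimWatchdog, sleep_ticks: u32) -> bool {
    watchdog.tickle();
    watchdog.suspend();
    let reset = watchdog.advance(sleep_ticks);
    watchdog.resume();
    reset
}
```

In this model a watchdog that cannot suspend survives only sleeps shorter than its timeout, which matches the concern described above.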

### Documentation Updated

- [x] Updated the relevant files in `/docs`, or no updates are required.

### Formatting

- [x] Ran `make prepush`.


Co-authored-by: Bogdan Grigoruta <[email protected]>
