qazwsxedcrfvtg14
Contributor

Pull Request Overview

This pull request fixes a bug where the kernel crashes if there is only one app and that app is not ready.

In that case, the scheduler panics when calling next.unwrap():

Testing Strategy

This pull request was tested by running several multi-app combinations and single-app test cases on our platform (ti50).

TODO or Help Wanted

N/A

Documentation Updated

  • Updated the relevant files in /docs, or no updates are required.

Formatting

  • Ran make prepush.

We should always use the head of the queue instead of the next of
the current node, because we will change the location of the current
node.
Member

@lschuermann lschuermann left a comment

I'm trying to understand the intricacies of this change.

processes = List {
    head: ListLink(Some(RoundRobinProcessNode {
        proc: Some($PROCESS),
        next: ListLink(None)
    })),
}

In the current version using for node in self.processes.iter() {, we

  1. create a ListIterator with cur: Some(RoundRobinProcessNode { proc: Some($PROCESS), next: ListLink(None) })
  2. (implicitly) pull the first element from the ListIterator: Iterator through Iterator::next(). This sets cur: None and returns Some(RoundRobinProcessNode { proc: Some($PROCESS), next: ListLink(None) }) to the loop iteration as node.
  3. Given that first_head is None, set first_head = Some(node).
  4. Check node.proc, which is Some(proc). However, proc.ready() = false. We perform self.processes.push_tail(self.processes.pop_head().unwrap()), which will first set head of the list to None, then set next of proc to None, and finally set head of the list to proc.
    This translates to the list effectively not being changed.
  5. Do step 2 & 3.
  6. Given that first_head is Some(first_head), and first_head == node, break out of the loop, returning SchedulingDecision::TrySleep.

In contrast, the newly proposed changes would do the following:

  1. Run self.processes.head(), which returns Some(RoundRobinProcessNode { proc: Some($PROCESS), next: ListLink(None) }) as Some(node).
  2. Do step 3 & 4 from above.
  3. Do step 1.
  4. Do step 6 from above.

From reading over this code, I can't really make out the difference between the two approaches. I suspect I may be missing something, which probably has to do with the list being mutated while we iterate over it.

Could you perhaps try to clarify exactly what is going on here?

In general though, this code is very complicated to understand. I tried to address this some time ago, along with the inefficiencies we have here through our use of push_tail: #2845. Maybe that's worth going back to? The fact that this scheduling code is of quadratic complexity is pretty crazy.

Contributor

@bradjc bradjc left a comment

Good find, and this really emphasizes why we should just not allow code that uses .unwrap(), no matter how much we trust the programmer. It's just too hard to write correct code!

I think what is happening here is that today, if there is only one entry in the process array, the .iter() loop only executes once, and so the loop ends with both a) next == None and b) return Sleep never being reached.

With this patch, the loop will run twice when the processes array is 1 item long, so the return will be hit the second time.

Merging this seems good to me, but I of course would be in favor of removing that .unwrap().
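
As a rough illustration of the suggestion to drop the .unwrap(), here is a minimal sketch of turning an "empty" next into a sleep decision instead of a panic. The Decision enum and the decide function are hypothetical stand-ins written for this example; they are not the kernel's actual types or the code that was merged.

```rust
/// Hypothetical stand-in for the scheduler's decision type (an assumption,
/// not the kernel's real enum).
#[derive(Debug, PartialEq)]
enum Decision {
    RunProcess(u32),
    TrySleep,
}

/// Maps the loop's result to a decision without calling `.unwrap()`.
/// `next` is the id of the process chosen by the loop, if any.
fn decide(next: Option<u32>) -> Decision {
    match next {
        Some(id) => Decision::RunProcess(id),
        // Nothing was selected: sleep instead of panicking.
        None => Decision::TrySleep,
    }
}
```

With this shape, a loop that exits without selecting a process falls through to TrySleep rather than crashing the kernel.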

@qazwsxedcrfvtg14
Contributor Author

IIUC, the "original" idea of the scheduler is to loop over the queued elements "forever" and to break out of the loop when it sees an element twice.

This patch is just making sure the code aligns with the original idea.
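
To make the difference between the two loop shapes concrete, here is a small self-contained model. It is only a sketch: Node, List, Outcome, and both schedule_* functions are hypothetical simplifications written for this example (a bool stands in for proc.ready(), and Outcome::NoDecision marks the point where the real code would reach next.unwrap() with None). The iterator cursor is modeled by hand, assuming it reads the node's next link before the loop body runs, as described in the review above.

```rust
use std::cell::Cell;

/// Hypothetical queue node; `ready` stands in for `proc.ready()`.
struct Node<'a> {
    ready: bool,
    next: Cell<Option<&'a Node<'a>>>,
}

impl<'a> Node<'a> {
    fn new(ready: bool) -> Self {
        Node { ready, next: Cell::new(None) }
    }
}

/// Hypothetical intrusive singly linked list with only a head pointer.
struct List<'a> {
    head: Cell<Option<&'a Node<'a>>>,
}

impl<'a> List<'a> {
    fn new() -> Self {
        List { head: Cell::new(None) }
    }
    fn head(&self) -> Option<&'a Node<'a>> {
        self.head.get()
    }
    fn pop_head(&self) -> Option<&'a Node<'a>> {
        let node = self.head.get()?;
        self.head.set(node.next.get());
        Some(node)
    }
    fn push_tail(&self, node: &'a Node<'a>) {
        node.next.set(None);
        match self.head.get() {
            None => self.head.set(Some(node)),
            Some(mut last) => {
                // Walk to the last node (linear, like `iter().last()`).
                while let Some(n) = last.next.get() {
                    last = n;
                }
                last.next.set(Some(node));
            }
        }
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Run,
    TrySleep,
    /// The loop ended without a decision (where `next.unwrap()` would panic).
    NoDecision,
}

/// Iterator-style loop: the cursor follows `next` links read before rotation.
fn schedule_with_iter<'a>(list: &List<'a>) -> Outcome {
    let mut first_head: Option<&'a Node<'a>> = None;
    let mut cur = list.head();
    while let Some(node) = cur {
        cur = node.next.get(); // advance the cursor before the body runs
        match first_head {
            None => first_head = Some(node),
            Some(first) if std::ptr::eq(first, node) => return Outcome::TrySleep,
            Some(_) => {}
        }
        if node.ready {
            return Outcome::Run;
        }
        // Rotate the not-ready head to the tail.
        if let Some(head) = list.pop_head() {
            list.push_tail(head);
        }
    }
    Outcome::NoDecision
}

/// Head-based loop: always re-reads the current head of the queue.
fn schedule_with_head<'a>(list: &List<'a>) -> Outcome {
    let mut first_head: Option<&'a Node<'a>> = None;
    while let Some(node) = list.head() {
        match first_head {
            None => first_head = Some(node),
            Some(first) if std::ptr::eq(first, node) => return Outcome::TrySleep,
            Some(_) => {}
        }
        if node.ready {
            return Outcome::Run;
        }
        if let Some(head) = list.pop_head() {
            list.push_tail(head);
        }
    }
    Outcome::NoDecision
}
```

In this model, a queue holding a single not-ready node makes schedule_with_iter fall out of the loop after one pass, because the cursor was already None before the rotation, while schedule_with_head sees the same node a second time and returns TrySleep. With two or more nodes, both variants meet the first node again and sleep.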


For the quadratic complexity part, I think the quickest fix is to change push_tail in kernel/src/collections/list.rs:

    pub fn push_tail(&self, node: &'a T) {
        node.next().0.set(None);
        match self.iter().last() {
            Some(last) => last.next().0.set(Some(node)),
            None => self.push_head(node),
        }
    }

IIUC, the complexity of self.iter().last() is linear, but that can be fixed if we store an extra reference to the current tail.
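
A minimal sketch of that idea, assuming a simplified list that keeps an extra tail reference. TailList, its Node type, and the ids helper are hypothetical names used only for this illustration; the real list in kernel/src/collections/list.rs has a different interface, and other operations that modify the list would also need to keep the tail in sync.

```rust
use std::cell::Cell;

/// Hypothetical node type for this sketch.
struct Node<'a> {
    id: u32,
    next: Cell<Option<&'a Node<'a>>>,
}

impl<'a> Node<'a> {
    fn new(id: u32) -> Self {
        Node { id, next: Cell::new(None) }
    }
}

/// Hypothetical list that tracks both head and tail.
struct TailList<'a> {
    head: Cell<Option<&'a Node<'a>>>,
    tail: Cell<Option<&'a Node<'a>>>,
}

impl<'a> TailList<'a> {
    fn new() -> Self {
        TailList { head: Cell::new(None), tail: Cell::new(None) }
    }

    /// Constant-time append: link after the stored tail, no traversal.
    fn push_tail(&self, node: &'a Node<'a>) {
        node.next.set(None);
        match self.tail.get() {
            Some(last) => last.next.set(Some(node)),
            None => self.head.set(Some(node)),
        }
        self.tail.set(Some(node));
    }

    /// Removes the head; clears the tail when the list becomes empty.
    fn pop_head(&self) -> Option<&'a Node<'a>> {
        let node = self.head.get()?;
        self.head.set(node.next.get());
        if self.head.get().is_none() {
            self.tail.set(None);
        }
        node.next.set(None);
        Some(node)
    }

    /// Collects node ids from head to tail (helper for inspection only).
    fn ids(&self) -> Vec<u32> {
        let mut out = Vec::new();
        let mut cur = self.head.get();
        while let Some(n) = cur {
            out.push(n.id);
            cur = n.next.get();
        }
        out
    }
}
```

Rotating the head to the tail then costs a constant number of pointer updates per call, so one pass over n not-ready processes is linear rather than quadratic.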

Another possibility is replacing the list with a static buffer and storing a reference/index to the current head (a simple ring buffer).
However, this approach would not benefit the mlfq scheduler.
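
For comparison, a rough sketch of the ring-buffer alternative: a fixed-size array of slots plus a head index that advances as processes are visited. Ring, next_ready, and the use of plain u32 ids are assumptions made for this example and do not correspond to existing kernel code.

```rust
/// Hypothetical fixed-size round-robin queue; each slot holds an optional
/// process id.
struct Ring<const N: usize> {
    slots: [Option<u32>; N],
    head: usize,
}

impl<const N: usize> Ring<N> {
    fn new(slots: [Option<u32>; N]) -> Self {
        Ring { slots, head: 0 }
    }

    /// Visits each slot at most once, starting at `head`. Returns the first
    /// id for which `ready` is true and leaves `head` just past it, or
    /// returns `None` after one full pass (the "try sleep" case).
    fn next_ready(&mut self, ready: impl Fn(u32) -> bool) -> Option<u32> {
        for _ in 0..N {
            let slot = self.slots[self.head];
            // Advancing the index replaces the pop_head/push_tail rotation.
            self.head = (self.head + 1) % N;
            if let Some(id) = slot {
                if ready(id) {
                    return Some(id);
                }
            }
        }
        None
    }
}
```

Because the pass is bounded by the number of slots, no "seen this element twice" check is needed, and a single not-ready process simply yields None.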

@bradjc bradjc added this pull request to the merge queue Jun 23, 2023
Merged via the queue into tock:master with commit f989880 Jun 23, 2023