@michael2012z
Member

@michael2012z michael2012z commented May 8, 2020

This pull request contains the code to enable the basic functionalities of Cloud-hypervisor on AArch64.

Summary of the major changes:

  • Enabled vm-allocator, device manager, memory manager and vcpu manager for AArch64.
  • Enabled Virtio devices with MMIO transport option.
  • Configured VM: load kernel, set up GIC and generate FDT.
  • Exit the VMM when the guest shuts down on AArch64.
  • Added a guide for testing on Arm64.

What was tested:

  • Booting a VM (Ubuntu 18.04) with up to 4 VCPUs.
  • Booting a VM with up to 256 GiB memory.
  • Serial I/O.
  • Virtio devices with MMIO transport option: virtio-blk and virtio-console devices were tested.

@michael2012z michael2012z changed the title Aarch64 enable Enable AArch64 May 8, 2020
@rbradford
Member

@michael2012z Thanks for your contribution. This is a huge and significant contribution for your first one!

I can see that you have composed the PR of many well structured commits but we may ask you to revisit the way aspects of this PR are structured (possibly even splitting it if necessary.)

* Cloud-hypervisor has forked these 2 crates and made the necessary modifications for X86, but the forks are not yet ready on ARM. The temporary workaround is to modify some Cargo.toml files to use the official versions of kvm-bindings and kvm-ioctls (without serde support) before building on ARM.

In your PR, the lines that change the dependencies are commented out?

For kvm-bindings, our ch branch has everything that upstream master has (plus extras). Are you saying that you have to use the older 0.2.0 tag because master does not work?

For kvm-ioctls I can see we have diverged slightly and we should be able to rebase the changes on top.

@michael2012z
Member Author

Hi, @rbradford

When I tried to build on an ARM machine, some errors were seen:

   Compiling kvm-bindings v0.2.0 (https://github.com/cloud-hypervisor/kvm-bindings?branch=ch#3a678008)
error: cannot find derive macro `Deserialize` in this scope
  --> /usr/local/rust/git/checkouts/kvm-bindings-c6b55506be61a061/3a67800/src/arm64/mod.rs:29:43
   |
29 | #[cfg_attr(feature = "with-serde", derive(Deserialize, Serialize))]
   |                                           ^^^^^^^^^^^

error: cannot find derive macro `Serialize` in this scope
  --> /usr/local/rust/git/checkouts/kvm-bindings-c6b55506be61a061/3a67800/src/arm64/mod.rs:29:56
   |
29 | #[cfg_attr(feature = "with-serde", derive(Deserialize, Serialize))]
   |                                                        ^^^^^^^^^

The ch branch of kvm-bindings has some modifications for Serde, but maybe it's not ready for AArch64. So I have to fall back to the upstream version without Serde to work around the error. The kvm-ioctls dependency was changed for a similar reason.

I kept my local modifications as comments, as a reminder for anyone who wants to try on ARM.

@michael2012z
Member Author

I can see that you have composed the PR of many well structured commits but we may ask you to revisit the way aspects of this PR are structured (possibly even splitting it if necessary.)

Are you suggesting that I split this single PR into several smaller ones?

@rbradford
Member

Are you suggesting that I split this single PR into several smaller ones?

We'll let you know how we'd like to see it split up but don't do any work on this at the moment.

@michael2012z
Member Author

Thanks. I will wait for your advice and do the restructuring then.

I am wondering why the CI jobs hang without finishing. I have seen this happen in other PRs as well.

@rbradford
Member

@michael2012z Again thank you for submitting this PR. Please could you extract the following changes into separate PRs so that we can start making progress on merging this.

Firstly, can you separate out the abstraction and refactoring you did for the IOAPIC (the InterruptController trait etc.)?

Secondly, can you create the most minimal PR that makes the code build (I don't expect it to be functional) with the Rust stable aarch64-unknown-linux-musl toolchain? Please add it to the GitHub Actions so that we can build-test the code.

Longer term we will need to establish a CI to be able to try and run the integration test suite.

With those two PRs reviewed and merged I think the PR will be a lot less intrusive and more easily reviewable. It's also possible that some of the other "portability" fixes would also benefit from being in their own PR.

Thanks, I look forward to seeing your submissions.

@michael2012z
Member Author

Hi, @rbradford

Nice to see your feedback. Besides splitting out the InterruptController and the porting code you mentioned, I am also going to separate more from this PR. The plan is:

  • Split the Dockerfile and build script updates into a separate PR, and fix all build errors on ARM. Just to confirm, do you mean the same thing by "can you create the most minimal PR that makes the code build"?
  • A separate PR for arch/, including the layout design and the porting code for GIC and FDT.
  • A separate PR for the RTC (porting code). Several files are involved, so it is better to put them into their own PR.
  • Refactoring the IOAPIC.
  • Refactoring the boot sequence. A big change was made to how a VM is booted, especially in creating and activating VCPUs. It is better to review and merge this in a separate PR.
  • Other things (memory manager, vm allocator, seccomp, etc.) will remain in the current PR to finally enable AArch64.

@rbradford
Member

rbradford commented May 12, 2020

Just to confirm, do you mean the same thing by "can you create the most minimal PR that makes the code build"?

Using the cross toolchain the compilation succeeds (when started by a GitHub action.)

@egernst

egernst commented May 12, 2020

Is it feasible to augment existing device support documentation in order to clarify the 'support level' per architecture, as we expand the architecture support?

Nice work @michael2012z

@michael2012z
Member Author

@egernst Thanks.

Yes, that document should be updated to reflect the difference between architectures.

@michael2012z michael2012z changed the title Enable AArch64 [WIP] Enable AArch64 May 14, 2020
@michael2012z michael2012z changed the title [WIP] Enable AArch64 Enable AArch64 Jun 10, 2020
@michael2012z michael2012z force-pushed the aarch64_enable branch 2 times, most recently from 51bd18d to f1f5825 Compare June 10, 2020 03:59
@michael2012z
Member Author

Hi, @rbradford, @sboeuf. With much of the code split out, this PR is back with only the ARM-enabling code. Could you help review it?

@egernst Regarding the documents, for now I have written a very small guide for testing on Arm64. More documentation work is not included in this PR. My personal opinion is that it's better to update the device support document once we have CI (in progress now) and have more things stably tested.

Member

@rbradford rbradford left a comment

Looking very good. Could you do a quick pass through the commit messages and:

  • Standardise on aarch64 or arm64 (or whichever :))
  • Try and make the commit messages describe WHY rather than WHAT (the code should ideally show the what).
  • I think some of the commit messages have some strange wrapping issues.

As always look for ways to minimise the amount of compile-time differences even if that means refactoring the existing code.

Comment on lines 104 to 106
pub fn allocate_irq(&mut self) -> Result<u32> {
self.next_irq = self.next_irq.checked_add(1).ok_or(Error::Overflow)?;
Ok(self.next_irq - 1)
Member

Purely as a matter of taste I prefer:

let irq = self.next_irq;
self.next_irq = self.next_irq.checked_add(1).ok_or(Error::Overflow)?;
Ok(irq)

Member Author

This is better than using an additional subtraction.
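For readers following the thread, here is a minimal sketch of the two forms side by side. The `GsiAllocator` struct, its `new` constructor and the single-variant `Error` enum are hypothetical stand-ins, not the project's real types; only the two method bodies mirror the snippets quoted above.

```rust
/// Hypothetical error type; only the `Overflow` variant appears in the thread.
#[derive(Debug, PartialEq)]
pub enum Error {
    Overflow,
}

pub type Result<T> = std::result::Result<T, Error>;

/// Hypothetical, simplified allocator that hands out consecutive IRQ numbers.
pub struct GsiAllocator {
    next_irq: u32,
}

impl GsiAllocator {
    pub fn new(first_irq: u32) -> Self {
        GsiAllocator { next_irq: first_irq }
    }

    /// Form originally submitted: bump the counter, then subtract to
    /// recover the value that was just handed out.
    pub fn allocate_irq_with_subtraction(&mut self) -> Result<u32> {
        self.next_irq = self.next_irq.checked_add(1).ok_or(Error::Overflow)?;
        Ok(self.next_irq - 1)
    }

    /// Form suggested in review: remember the current value first,
    /// then bump the counter. No subtraction is needed.
    pub fn allocate_irq(&mut self) -> Result<u32> {
        let irq = self.next_irq;
        self.next_irq = self.next_irq.checked_add(1).ok_or(Error::Overflow)?;
        Ok(irq)
    }
}
```

Both forms return the same sequence and both leave `next_irq` unchanged when `checked_add` overflows, so the difference is readability rather than behaviour.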


#[cfg(target_arch = "x86_64")]
/// Allocate an IRQ
pub fn allocate_irq(&mut self) -> Result<u32> {
Member

Not for you to fix but if we only support one IOAPIC (@sboeuf - would we need more?) then maybe we could heavily simplify this function and make it look like the ARM version.

ConsoleOutputMode::Off | ConsoleOutputMode::Null => None,
};
let serial = if serial_config.mode != ConsoleOutputMode::Off {
#[cfg(target_arch = "x86_64")]
Member

Please refactor this into two functions, one for each architecture (e.g. add_serial_device(...)). I think it will be much easier to read.

Member Author

Good idea. Thanks.
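A rough sketch of the shape being asked for, under stated assumptions: `DeviceManager`, its `attached` field and `add_console_devices` are hypothetical and heavily simplified, and the bus names are placeholders. Only the function name `add_serial_device` comes from the review comment. The sketch assumes the serial device sits on the port I/O bus on x86_64 and on the MMIO bus on AArch64.

```rust
/// Hypothetical, heavily simplified device manager; not the project's real type.
#[derive(Default)]
pub struct DeviceManager {
    /// Names of the buses the serial device was attached to.
    pub attached: Vec<&'static str>,
}

impl DeviceManager {
    /// x86_64 flavour: assumed here to attach the serial device to the port I/O bus.
    #[cfg(target_arch = "x86_64")]
    fn add_serial_device(&mut self) -> &'static str {
        self.attached.push("io_bus");
        "io_bus"
    }

    /// aarch64 flavour: assumed here to attach the serial device to the MMIO bus.
    #[cfg(target_arch = "aarch64")]
    fn add_serial_device(&mut self) -> &'static str {
        self.attached.push("mmio_bus");
        "mmio_bus"
    }

    /// The caller contains no architecture conditionals: exactly one
    /// `add_serial_device` is compiled in for the current target.
    pub fn add_console_devices(&mut self, serial_enabled: bool) -> Option<&'static str> {
        if serial_enabled {
            Some(self.add_serial_device())
        } else {
            None
        }
    }
}
```

The architecture split lives at the function boundary, so the calling code reads the same on both targets instead of interleaving `#[cfg]` attributes with its logic.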

};

#[cfg(target_arch = "aarch64")]
let device_type = virtio_device.lock().unwrap().device_type();
Member

Could you move this line lower and then use block syntax for the architecture control?

Member Author

@michael2012z michael2012z Jun 11, 2020

virtio_device is consumed in the lines that follow, so I can't move it down.
Instead I chose to move self.id_to_dev_info.insert() up, and then had to move irq_num in front of them.

io_base: GuestAddress,
io_size: GuestUsize,
#[cfg(target_arch = "x86_64")] io_base: GuestAddress,
#[cfg(target_arch = "x86_64")] io_size: GuestUsize,
Member

Rather than all this complexity, what about making it an Option<T> type? You could also do the same everywhere the io_bus is used. That would remove a bunch of conditionals and make the code easier to read.

Member Author

My original reasons for screening out the IO code with conditional compilation were:

  1. IO is not for ARM, so having it in the binary looks odd.
  2. Hiding the IO bus may avoid potential mistakes in the code. It would be wrong for ARM code to use IO operations or the IO bus, and this becomes impossible if the IO code is not present. I also planned to make PciBarRegionType::IORegion x86_64 only when we come to support PCI.

Member

Okay!
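To make the trade-off in this thread concrete, here is a small sketch of both designs. All type, field and method names are hypothetical simplifications, not the project's real code; only the idea of gating the I/O bus with `#[cfg(target_arch = "x86_64")]` versus wrapping it in `Option<T>` comes from the discussion.

```rust
/// Hypothetical stand-in for a device bus; not the project's real type.
#[derive(Debug, Default)]
pub struct Bus {
    pub devices: Vec<String>,
}

/// Approach kept in the PR: the port I/O bus only exists on x86_64,
/// so code that touches it on another target fails to compile.
#[derive(Debug, Default)]
pub struct GatedBuses {
    #[cfg(target_arch = "x86_64")]
    pub io_bus: Bus,
    pub mmio_bus: Bus,
}

impl GatedBuses {
    /// Decided at compile time: the field is either present or absent.
    pub fn has_io_bus(&self) -> bool {
        cfg!(target_arch = "x86_64")
    }
}

/// Reviewer's alternative: the field always exists but may be `None`,
/// so callers need no `cfg` attributes and check at run time instead.
#[derive(Debug)]
pub struct OptionalBuses {
    pub io_bus: Option<Bus>,
    pub mmio_bus: Bus,
}

impl OptionalBuses {
    pub fn new() -> Self {
        let io_bus = if cfg!(target_arch = "x86_64") {
            Some(Bus::default())
        } else {
            None
        };
        OptionalBuses { io_bus, mmio_bus: Bus::default() }
    }

    /// Registers a port I/O device if an I/O bus exists; returns whether it did.
    pub fn register_port_device(&mut self, name: &str) -> bool {
        match self.io_bus.as_mut() {
            Some(bus) => {
                bus.devices.push(name.to_string());
                true
            }
            None => false,
        }
    }
}
```

With the gated struct, a misplaced use of `io_bus` in ARM code is a build error, which is the safety argument made above. With the `Option` version the same mistake compiles and has to be handled as a `None` case at run time, in exchange for fewer conditionals in the source.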

Implemented GSI allocator and system allocator for AArch64.
Renamed some layout definitions to align more code between architectures.

Signed-off-by: Michael Zhao <[email protected]>
Screened IO bus because it is not for AArch64.
Enabled Serial, RTC and Virtio devices with MMIO transport option.

Signed-off-by: Michael Zhao <[email protected]>
Screened IO space as it is not available on AArch64.

Signed-off-by: Michael Zhao <[email protected]>
Added MPIDR which is needed in system configuration.

Signed-off-by: Michael Zhao <[email protected]>
Signed-off-by: Michael Zhao <[email protected]>
X86 and AArch64 shut down a VM in different ways.
X86 exits the VMM event loop through the ACPI device;
AArch64 needs to exit from the CPU loop on a SystemEvent.

Signed-off-by: Michael Zhao <[email protected]>
AArch64 support is at a very early stage. The steps for building and
running on X86 and AArch64 cannot be aligned well yet. Adding AArch64 content
to README.md would produce much divergence.
Adding a guide in the docs/ folder could be a better way to start for now.

Signed-off-by: Michael Zhao <[email protected]>
.clone();

arch::configure_system(
&self.memory_manager.lock().as_ref().unwrap().fd,
Member Author

Echoing comment #1126 (comment)
The fd was added for this. I have now made the fd field public in the memory manager and borrow it back from there.

@michael2012z
Member Author

Thanks for reviewing.
The code has been updated.
I also fixed the commit messages. Some message bodies were removed because the titles are self-explanatory.

@rbradford
Member

@sboeuf Please review

Member

@rbradford rbradford left a comment

I think this looks pretty good to me!


## Prerequisites

On Arm64 machines, Cloud-hypervisor depends on an external library `libfdt-dev` for generating Flatted Device Tree (FDT).
Member

Typo: Flatted should be Flattened

Member

@michael2012z Can you please fix this in a subsequent PR?

Member Author

Thanks, @sboeuf , @rbradford . I will fix it in my next PR.

@rbradford rbradford merged commit 3f18f93 into cloud-hypervisor:master Jun 11, 2020
@michael2012z michael2012z mentioned this pull request Jun 12, 2020
15 tasks
4 participants