Virtualisation
Barry Denby
Griffith College Dublin
January 17, 2019
Virtualisation
I Virtualisation is the main reason why the cloud
can scale applications rapidly
I All resources are virtualised: CPU, memory,
storage, bandwidth etc
I Because they are virtualised it is easy to
allocate and deallocate resources to
applications
I In this lecture we will cover some strategies
and technologies used to virtualise resources
I Lecture heavily based on Cloud Computing:
Theory and Practice
Virtualisation
I Traditional datacentres relied on OSes to
virtualise resources for their applications
I The problem here is that operating systems
only virtualise resources belonging to a single
node
I However in a cloud we need to virtualise
resources for an entire network.
I For this we introduce a software layer beneath
the OS called a hypervisor
I piece of software or hardware that runs virtual
machines
I creates virtual versions of the node resources
I will accept a virtual machine to run on top of
those virtualised resources
Virtualisation: Advantages and
Disadvantages
I Advantages
I More than one virtual machine can be run on a
node at any given time
I Maximised use of resources
I Virtual machine state can be saved and migrated
to a different node and resumed there
I Disadvantages
I Hypervisor consumes some CPU/memory in
providing virtualisation
I Virtual machines only run at near-native speed if the
instruction set of the virtual processor and real
processor match (and the processor supports
virtualisation)
I Provides an extra attack vector that must be
secured
Virtualisation: Approaches
I We will discuss two strategies for virtualising
resources
I Full virtualisation: hardware abstraction
provided by virtual machine monitor (VMM) is
an exact replica of the physical hardware
I no OS modifications are required
I Paravirtualisation: requires modifications of the
guest OSes because the hardware abstraction
provided by the VMM does not support all the
functions the hardware does
I As virtualisation has become popular CPUs
have added hardware support for virtualisation
Virtualisation: Methods
I Virtualisation provides an abstraction or
simulation of a resource
I Simulation is performed by one of four methods
I Multiplexing: for each resource to be
virtualised, create multiple virtual objects.
Grant access to the physical object fairly
amongst the virtualised objects
I e.g. one CPU being shared amongst many virtual
machines
I Aggregation: The opposite of multiplexing.
Multiple physical objects are combined together
to have the appearance of one single object
I e.g. a RAID array combining multiple disks into
one single large area of storage
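The two methods above can be sketched in a few lines. This is only an illustration: the function names are made up, round-robin is just one possible notion of "fair", and the striped layout is an assumption in the style of RAID 0 rather than a description of any particular RAID level.

```python
from itertools import cycle, islice

def round_robin(vms, slices):
    """Multiplexing: hand out `slices` CPU time slices to the VMs in turn."""
    return list(islice(cycle(vms), slices))

def locate_block(logical_block, disks):
    """Aggregation: map a block of the single large volume to
    (disk index, block index on that disk), assuming simple striping."""
    return logical_block % disks, logical_block // disks

schedule = round_robin(["vm0", "vm1", "vm2"], 6)
print(schedule)                   # each VM appears twice, in turn
print(locate_block(5, disks=2))   # logical block 5 lives on disk 1, block 2
```

The caller of locate_block only ever sees one run of logical block numbers, which is the "appearance of one single object" that aggregation provides.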
Virtualisation: Methods
I Emulation: Construct a virtual object from a
different type of physical object
I For example a virtual ARM processor running on
top of a physical x86 one
I This translation is expensive and slow
I Multiplexing and emulation combined: for
example virtual memory in OSes
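To see why emulation is slow, consider a toy interpreter. The two-instruction guest instruction set below is invented for illustration and is not ARM or x86. Every guest instruction costs a dictionary lookup and a branch on the host, where native execution would cost a single hardware instruction.

```python
def emulate(program):
    """Interpret a made-up guest instruction set, one instruction at a time."""
    regs = {"r0": 0, "r1": 0}
    for op, dst, src in program:
        if op == "MOVI":        # load an immediate value into a register
            regs[dst] = src
        elif op == "ADD":       # dst = dst + value of the src register
            regs[dst] += regs[src]
        else:
            raise ValueError(f"unknown guest instruction: {op}")
    return regs

guest_program = [("MOVI", "r0", 2), ("MOVI", "r1", 40), ("ADD", "r0", "r1")]
print(emulate(guest_program))   # r0 ends up holding 42
```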
Virtualisation: Results of simulation
I Abstraction and simplification of resources
I Users are isolated from each other, i.e. one
user cannot affect another
I Replication support: without virtualisation
replication involves acquiring and initialising
new hardware, OSes, and software stack
I Replication in a VMM only requires a full copy of
the VM
I System Security: Virtual Machines are not
allowed to interfere with each other. They are
sandboxed.
I Restriction is strictly enforced by the VMM
Virtualisation: Results of simulation
I Performance and Reliability
I Should new, faster hardware become available and
the appropriate VMM be installed, the VM can take
immediate advantage of the new hardware
I Migration of a VM consists of suspending the VM,
moving it to the new node, and restarting the VM
I VM does not see changes in hardware
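The suspend, move, restart sequence above can be sketched as follows. This is a minimal sketch under strong assumptions: the VMState fields are hypothetical, and a real VM image also holds CPU registers, device state, and far more memory.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class VMState:
    name: str
    program_counter: int
    memory: list

def suspend(vm):
    """Freeze the VM into bytes that can be shipped to another node."""
    return json.dumps(asdict(vm)).encode("utf-8")

def resume(image):
    """Rebuild the VM on the destination node from the shipped bytes."""
    return VMState(**json.loads(image.decode("utf-8")))

vm = VMState(name="web-1", program_counter=1024, memory=[1, 2, 3])
image = suspend(vm)        # on the source node
restored = resume(image)   # on the destination node
```

The restored VM carries on from the same program counter and memory contents, which is why it does not notice the change of hardware.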
Virtualisation: Software stack
I Due to the introduction of the VMM the
software stack is affected
I Software stack in a non-virtualised environment
I The hardware lies at the bottom layer, with an OS layer
on top of this
I libraries and APIs are the next layer followed by
applications
I Software stack in a virtualised environment
I The hardware lies at the bottom layer, with the VMM
layer on top of this
I OS, libraries, and applications now go into
moveable containers called VMs that sit on top of
the VMM
Virtualisation: VMs and virtualised
hardware
I Normal CPU I/O operations that do not
require privileges function in the same way
I Privileged operations are controlled by the
VMM
I Partially because multiplexing for CPU,
storage, and networking is needed
I Also, a privileged operation left to run
unchecked could destroy the state of other
VMs
Virtualisation: VM rights
I Virtual machines are assigned privilege access
rights
I VMM is responsible for granting, revoking, and
enforcing these rights
I Permits tight control on the amount of
resources a VM can use at any time by
modifying the privileges of the VM
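Granting, revoking, and enforcing rights might look like the sketch below. RightsTable and the resource names are hypothetical and do not correspond to any real hypervisor interface.

```python
class RightsTable:
    """Hypothetical VMM-side table of the resources each VM may use."""

    def __init__(self):
        self._rights = {}   # VM name -> set of resource names

    def grant(self, vm, resource):
        self._rights.setdefault(vm, set()).add(resource)

    def revoke(self, vm, resource):
        self._rights.get(vm, set()).discard(resource)

    def check(self, vm, resource):
        """Enforce: raise unless the VM currently holds the right."""
        if resource not in self._rights.get(vm, set()):
            raise PermissionError(f"{vm} may not use {resource}")
        return True
```

Because every access goes through check, revoking a right takes effect on the VM's very next request.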
Virtualisation: VMs and users
I Virtual machines provide user convenience
I Users can build new VMs by making copies of
the current VM
I Permits the VM to run in a remote
environment instead of user provided hardware
I Both of these points are what makes the cloud
work
Virtualisation: Downsides
I The hypervisor and virtualisation require CPU
time, and memory to work
I Always slower than running the OS and
applications on hardware directly
I Usually because of the VMM trapping and
validating privileged instructions
I Initial hardware costs are much higher
I Nodes for running virtual machines are generally of
a higher specification
I But most of this cost is recouped by running many
virtual machines on the same hardware
Virtualisation: Definitions
I Definition of a VMM
I “A VMM (or hypervisor) is a piece of software
that securely partitions the resources of a
computer system into one or more virtual
machines”
I Definition of a Guest Operating System
I “A guest operating system is an operating
system that runs under the control of a VMM
rather than directly on hardware”
Virtualisation: VMMs
I In normal applications there are two modes of
execution
I User mode: execution of non-privileged instructions
(Ring 3 in x86)
I Kernel mode: execution of privileged instructions
(Ring 0 in x86)
I During execution an application will at times
context switch between the two to run a
privileged instruction
I Application must context switch to kernel mode to
execute the instruction and then context switch back
to user mode
I Costly operation
I Kernel enforces whether applications are permitted to run
privileged operations
Virtualisation: VMMs
I There is added complexity to this when running
a VMM
I VMM runs in kernel mode
I VM runs in user mode
I VM kernel mode must be simulated by the VMM if
virtualisation is not supported by the hardware
I produces slowdown due to translation of operation
from VM to OS
Virtualisation: VMMs
I If hardware virtualisation is supported the cost
of translation is reduced
I A special set of instructions is added along with a
new privilege level (Ring -1 on x86)
I The hypervisor is in control of Ring -1
I Can grant VMs access to Ring 0 from here (with
simulation)
I Ensures that current operating systems and
applications are not affected by a hypervisor
Virtualisation: VMM forms
I A VMM can take one of many forms
I Bare-Metal VMM: Thin software layer that
runs on the hardware directly. VMs run on top
of this layer
I Provides the highest performance
I Hypervisor has direct access to hardware
I Reduces the delay in granting VMs access to
privileged instructions
I Hybrid VMM: shares the hardware with a
currently existing OS
I VMM sits alongside the OS (think VMware Player
or VirtualBox)
I Easy access to virtual machines
I Slower than bare metal due to competition with OS
Virtualisation: VMM forms
I Hosted VMM: the VM shares many
components of the host OS
I Simplifies the construction of a VM
I However there is a significant loss of performance
and increased overhead
I Some operations are enforced by the host OS and
not the VMM
I Not used anymore
Virtualisation: Performance Isolation
I A necessary condition for satisfying Quality
of Service (QoS) in a cloud environment
I The runtime performance of one VM should not be
affected by other VMs
I All VMs are competing for the same resources
I Must ensure that all VMs get fair access to those
resources
I In most cases the VMM will restrict the
processing time each VM receives
I To ensure quality of service
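One simple way to restrict processing time is a hard cap per scheduling period. The sketch below assumes an equal share for every VM. The function name and the millisecond figures are illustrative, and a real scheduler may also hand out unused time.

```python
def cap_cpu_time(demands_ms, period_ms):
    """Give each VM at most an equal share of one scheduling period."""
    share = period_ms / len(demands_ms)
    return {vm: min(wanted, share) for vm, wanted in demands_ms.items()}

# vm0 asks for 90 ms of a 100 ms period but is capped at its 50 ms share,
# so it cannot slow down vm1.
print(cap_cpu_time({"vm0": 90, "vm1": 10}, period_ms=100))
```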
Virtualisation: Performance Isolation
I Isolation can be provided in two ways
I Processor virtualisation: each VM gets
access to a virtual version of the processor but
all instructions are run in hardware
I Virtual version is usually restricted in speed
I Only supports OSes that use the same instruction
set as the processor
I Processor emulation: each VM gets access to
a software version of the CPU and all
instructions are emulated
I Can support different processors on a single CPU
I Must translate between instructions
I Significantly slower than virtualisation
I In a traditional OS processes and threads are
multiplexed
Virtualisation: Performance Isolation
I Whereas in a VMM entire OSes are
multiplexed
I OSes are much larger than a process or thread
I Thus a VM context switch is more expensive than
a process or thread context switch
I Each VM can only execute user mode
instructions without interference from the
VMM
I Any kernel mode instructions will be trapped by
the VMM and will be handled by the VMM
I These context switches are expensive in time and
instructions
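The trapping described above can be sketched as a loop that runs user mode instructions directly and hands everything else to the VMM. The instruction names and the PRIVILEGED set are hypothetical stand-ins. On real hardware it is the CPU that raises the trap, not a Python membership test.

```python
PRIVILEGED = {"HLT", "OUT"}   # hypothetical privileged guest instructions

class VMM:
    def __init__(self):
        self.traps = 0        # each trap is an expensive context switch

    def handle_trap(self, vm_name, instr):
        """Validate and emulate a privileged instruction on the VM's behalf."""
        self.traps += 1
        return f"emulated {instr} for {vm_name}"

def run_guest(vmm, vm_name, instructions):
    log = []
    for instr in instructions:
        if instr in PRIVILEGED:
            log.append(vmm.handle_trap(vm_name, instr))   # trap into the VMM
        else:
            log.append(f"ran {instr} directly")           # no VMM involvement
    return log
```

Counting traps makes the overhead visible: a guest that rarely issues privileged instructions runs almost entirely without the VMM.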
Virtualisation: Security Isolation
I Because the VMM simulates dedicated
hardware for each virtual machine the VMM
must enforce security
I VMs are constrained and isolated from each
other
I It appears to the VMs that they are each
running on their own individual physical node
I Should a VM wish to communicate with
another VM this will be marshalled by the
VMM
Virtualisation: VMM implementations
I VMMs are much simpler to construct and specify
than an OS
I There are fewer privileged functions exposed
I Less chance of a security attack
I For example the Xen VMM consists of roughly
60,000 lines of code whereas a comparable OS
would be in the millions
Virtualisation: efficient virtualisation
I Popek and Goldberg devised a set of conditions
necessary for virtualisation to be efficient
I A program running under a VMM should behave
essentially identically to the same program running
directly on equivalent hardware
I The VMM should be in complete control of
resources
I A statistically significant fraction of machine
instructions must be executed without VMM
intervention
Virtualisation: Instruction Virtualisation
I There are some instructions that are not
virtualisable regardless of the VMM used.
I For these instructions one of two strategies
may be used.
I Binary Translation: non-virtualisable instructions
are replaced with other instructions
I Paravirtualisation: The guest OS is modified such
that all non-virtualisable instructions are replaced
by virtualisable instructions that achieve the same
effect
I Binary Translation is often used in full VMM
environments whereas paravirtualisation is the only
option for paravirtualised environments
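Binary translation can be pictured as a rewrite pass over guest code before it runs. In the sketch below POPF stands in as a well-known example of an x86 instruction that does not trap in user mode. The replacement text CALL vmm_popf is a hypothetical name, and real translators work on machine code rather than on strings.

```python
# Hypothetical replacement table: non-virtualisable instruction -> safe substitute
NON_VIRTUALISABLE = {"POPF": "CALL vmm_popf"}

def translate(block):
    """Return a copy of the guest code block with unsafe instructions replaced."""
    return [NON_VIRTUALISABLE.get(instr, instr) for instr in block]

print(translate(["MOV", "POPF", "ADD"]))
```

Paravirtualisation reaches the same end state by a different route: the guest OS source is changed ahead of time, so no rewrite is needed at run time.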
Virtualisation: Full Vs Para
I Full virtualisation gives each VM an exact
virtual copy of hardware
I Paravirtualisation gives each VM a slightly
modified virtual version of the hardware
I Full virtualisation needs the CPU and other components to have modes
or instructions that support virtualisation
I If these are not present the only option is
paravirtualisation
Virtualisation: Shadow copies
I VMMs must maintain shadow copies of
standard OS control structures
I These include page tables for virtual memory
I Every time a VM wishes to modify one of these
structures the VMM must trap that instruction
and handle it
I Introduces significant overhead
I Ensures consistency among VMs
I Prevents memory trashing and corruption
I Introduces significant slowdown to a VM
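A shadow page table can be sketched as a second mapping that the VMM updates whenever it traps a guest page table write. The class and field names are hypothetical, and real page tables are multi-level structures walked by the MMU.

```python
class ShadowPager:
    def __init__(self, guest_to_host):
        self.guest_to_host = guest_to_host   # guest frame -> host frame owned by this VM
        self.guest_table = {}    # what the guest OS believes: virtual page -> guest frame
        self.shadow_table = {}   # what the hardware really uses: virtual page -> host frame

    def guest_write_pte(self, virtual_page, guest_frame):
        """Called when the VMM traps a guest write to its page table."""
        if guest_frame not in self.guest_to_host:
            raise MemoryError("guest tried to map a frame it does not own")
        self.guest_table[virtual_page] = guest_frame
        self.shadow_table[virtual_page] = self.guest_to_host[guest_frame]
```

The ownership check is what prevents one VM from corrupting another's memory, and the extra trap on every update is the overhead noted above.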
Virtualisation: Shadow copies
I However, it is possible for an application to
perform better in a VMM than in an OS
environment
I Through a process of cache isolation, where
cache is evenly divided between VMs
I Workloads that compete for cache when
running concurrently can be run in two separate
VMs and thus use two separate areas of cache
I Thus no cache thrashing, i.e. fewer cache misses
Virtualisation: Hardware Support
I Due to the rising popularity of virtualisation
almost all CPU vendors have embedded
virtualisation modes and instructions in their
processors which bring the following benefits
I Improved performance
I Improved security
I Simplified implementation of hypervisors
I However not all processors were designed with
virtualisation in mind
I We will look at the example of x86 which is a
30+ year old design and the issues it presents
when adding virtualisation
Virtualisation: x86 issues
I There are many problems faced by x86 with
respect to virtualisation
I Ring deprivileging: applications and OSes are
forced by the VMM to run at a privilege level
greater than 0 (i.e. with less privilege)
I VMM should be the only one with access to ring 0
I With 64-bit processors the only option is to run the
OS in ring 3 (same ring as userspace)
I For this to work the hypervisor must trap and
emulate all instructions intended for ring 0
Virtualisation: x86 issues
I Ring Aliasing: Issues arising when a guest
OS or application runs at a different privilege
level under the VMM than it was designed for
I Address Space Compression: VMM stores
system data structures such as interrupt tables
in the guest address space
I As a way of interfacing between the hypervisor and
VM
I The hypervisor must protect these structures but
must permit guest access so the guest can function
Virtualisation: x86 issues
I Non-faulting access to privileged state
I Some CPU instructions can only execute at
privilege level 0
I May load or modify special CPU registers that
affect CPU operation
I The VMM must trap and emulate these
instructions in order to hide from the OS that it
does not have the required privilege level
I This will modify an emulated CPU structure that
the guest will think is the actual CPU
Virtualisation: x86 issues
I Guest System Calls
I SYSENTER and SYSEXIT permit software to
transition in and out of ring 0
I Used to reduce the latency of system calls
I VMM runs in ring zero
I Therefore the VMM is responsible for emulating
every guest execution of SYSENTER
and SYSEXIT
Virtualisation: x86 issues
I Interrupt virtualisation: interrupts destined for
VMs must be caught by the VMM
I VMM must determine which virtual machine is
receiving the interrupt and generate a virtual
interrupt to pass to the VM
I Gets more complex if the OS has the ability to
mask interrupts (pretty much any modern OS)
I Thus the VMM has to monitor and track this
which adds complexity and performance
overheads to the running of the VM
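The bookkeeping this requires can be sketched with one virtual interrupt controller per VM. The class below is a hypothetical simplification: it tracks a single mask bit, while real guests can mask individual interrupt lines and priorities.

```python
from collections import deque

class VirtualInterruptController:
    def __init__(self):
        self.masked = False
        self.pending = deque()   # interrupts held while the guest has them masked
        self.delivered = []      # virtual interrupts passed on to the guest

    def raise_irq(self, irq):
        """The VMM routes an interrupt to this VM as a virtual interrupt."""
        if self.masked:
            self.pending.append(irq)
        else:
            self.delivered.append(irq)

    def set_mask(self, masked):
        """Track the guest's mask state; flush held interrupts on unmask."""
        self.masked = masked
        while not self.masked and self.pending:
            self.delivered.append(self.pending.popleft())
```

Every change the guest makes to its mask has to pass through set_mask, which is the monitoring and tracking cost mentioned above.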
Virtualisation: x86 issues
I Access to hidden state
I Some parts of system state are hidden
I There is no method in x86 to save and restore this
state
I Ring Compression: protects the VMM memory
from trashing by VMs
I Frequent access to privileged resources will
increase VMM overhead
I guest OSes frequently use the task priority register
I VMM must trap and emulate all accesses to this
register and protect it from corruption
I introduces yet more emulation and overheads
Virtualisation: Security
I Like all software systems VMMs have security
concerns
I The VMM is responsible for determining access
to virtual devices from VMs
I It is extremely difficult to get access to a VMM
from a VM but a VMM can be compromised
I In a layered software system like this control is
enforced by the layer closest to the hardware
(in this case the VMM)
Virtualisation: Security
I One way of circumventing a VMM is to insert a
layer below the VMM itself
I This can be done by installing a VMBR
(Virtual Machine Based Rootkit)
I Installs itself between the hardware and the
VMM thus the VMM is now controlled by the
VMBR
I VMM has no idea it is running on a rootkit.
Therefore has no defence from it
I VMBR modifies the boot order of the system to
start itself first so it can run the VMM on top
of itself
Virtualisation: Security
I Software Fault Isolation is used to sandbox
binary code
I As VMs can be insecure or tampered with
I VMs can be tampered with in the same way as if they
were running on hardware directly
I Assumes that we have a trusted runtime and
an untrusted application running on top
Virtualisation: Security
I To enforce this, rules are applied to the binary
code
I It is set to be read only as malware is usually
self modifying
I Code is divided into 32-byte blocks whose
boundaries no instruction can cross
I Disassembly starting at a block boundary will
reach all valid instructions
I All indirect flow control instructions are
replaced by pseudo instructions
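Two of these rules are easy to express as checks. The sketch below assumes that a pseudo instruction forces an indirect jump target onto a 32-byte boundary by masking its low bits. The slides do not spell this out, so treat it as an assumption. The function names are made up for illustration.

```python
BLOCK = 32   # bytes per block, as in the rule above

def crosses_boundary(offset, length):
    """True if an instruction of `length` bytes at `offset` straddles a block boundary."""
    return offset // BLOCK != (offset + length - 1) // BLOCK

def align_target(address):
    """Assumed effect of a pseudo instruction: clear the low bits of a jump target
    so that control can only land on a block boundary."""
    return address & ~(BLOCK - 1)

print(crosses_boundary(30, 4))   # bytes 30..33 span two blocks
print(hex(align_target(0x47)))   # target is forced down to the block start
```

Together the rules mean a validator that disassembles from each block boundary sees every instruction the untrusted code can ever execute.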