L3 Virtualization


Virtualization

Lecture 3
CS451 Cloud Computing
Contents
 Virtualization.
 Layering and virtualization.
 Virtual machine monitor.
 Virtual machine.
 x86 support for virtualization.
 Full and paravirtualization.
 Xen.

 Resources:
 Book, and
 VMware white paper: “Understanding Full Virtualization, Paravirtualization, and Hardware Assist”
https://www.vmware.com/techpapers/2007/understanding-full-virtualization-paravirtualizat-1008.html

2
Motivation
 Three fundamental abstractions are necessary to describe the operation of a computing system:
 (1) interpreters/processors, (2) memory, (3) communication links
 As the scale of a system and the number of its users grow, it becomes very challenging to manage its resources (the three abstractions above)
 Resource management issues:
 provisioning for peak demand → overprovisioning
 heterogeneity of hardware and software
 machine failures
 Virtualization is a basic enabler of cloud computing; it simplifies the management of physical resources for the three abstractions
 For example, the state of a virtual machine (VM) running under a virtual machine monitor (VMM) can be saved and migrated to another server to balance the load
 For example, virtualization allows users to operate in environments they are familiar with, rather than forcing them to use specific ones

3
Motivation (cont’d)
 “Virtualization, in computing, refers to the act of creating a virtual (rather than actual) version of something, including but not limited to a virtual computer hardware platform, operating system (OS), storage device, or computer network resources.” (from Wikipedia)
 Virtualization abstracts the underlying resources, simplifies their use, isolates users from one another, and supports replication, which increases the elasticity of a system.

4
Motivation (cont’d)
 Cloud resource virtualization is important for:
 Performance isolation:
 as we can dynamically assign and account for resources across different applications
 System security:
 as it allows isolation of services running on the same hardware
 Performance and reliability:
 as it allows applications to migrate from one platform to another
 The development and management of services offered by a provider

5
Virtualization
 Virtualization simulates the interface to a physical object by:
 Multiplexing
 creates multiple virtual objects from one instance of a physical object (many virtual objects to one physical object). Example: a processor is multiplexed among a number of processes or threads.
 Aggregation
 creates one virtual object from multiple physical objects (one virtual object to many physical objects). Example: a number of physical disks are aggregated into a RAID disk.
 Emulation
 constructs a virtual object of a certain type from a physical object of a different type. Example: a physical disk emulates Random Access Memory (RAM).
 Multiplexing and emulation
 Examples: virtual memory with paging multiplexes real memory and disk; a virtual address emulates a real address.
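
The aggregation case can be made concrete with a small sketch. It assumes simple block striping (in the style of RAID level 0), which the slide does not specify; the class names `PhysicalDisk` and `StripedDisk` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalDisk:
    """A toy physical disk: a fixed number of blocks held in a dict."""
    num_blocks: int
    blocks: dict = field(default_factory=dict)

    def write(self, block: int, data: bytes) -> None:
        if not 0 <= block < self.num_blocks:
            raise IndexError("block out of range")
        self.blocks[block] = data

    def read(self, block: int) -> bytes:
        if not 0 <= block < self.num_blocks:
            raise IndexError("block out of range")
        return self.blocks.get(block, b"")


class StripedDisk:
    """One virtual disk aggregated from several physical disks (striping)."""

    def __init__(self, disks):
        self.disks = list(disks)
        # Usable capacity is limited by the smallest member disk.
        self.num_blocks = min(d.num_blocks for d in self.disks) * len(self.disks)

    def locate(self, virtual_block: int):
        """Map a virtual block number to (disk index, physical block)."""
        if not 0 <= virtual_block < self.num_blocks:
            raise IndexError("virtual block out of range")
        return virtual_block % len(self.disks), virtual_block // len(self.disks)

    def write(self, virtual_block: int, data: bytes) -> None:
        disk, block = self.locate(virtual_block)
        self.disks[disk].write(block, data)

    def read(self, virtual_block: int) -> bytes:
        disk, block = self.locate(virtual_block)
        return self.disks[disk].read(block)
```

The user of `StripedDisk` sees a single block address space; only `locate` knows that consecutive virtual blocks land on different physical disks.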

6
Layering and Virtualization
 Layering – a common approach to manage system complexity:
 Simplifies the description of the subsystems; each subsystem is abstracted through its interfaces with the other subsystems
 Minimises the interactions among the subsystems of a complex system
 With layering we are able to design, implement, and modify the individual subsystems independently
 Layering in a computer system:
 Hardware
 Software
 Operating system
 Libraries
 Applications

7
Layering and Interfaces

[Figure: layers from top to bottom: Applications, Libraries, Operating System, Hardware. Interfaces: API, ABI (system calls), ISA (system ISA and user ISA). Access paths A1, A2, A3.]

Application Programming Interface (API), Application Binary Interface (ABI), and Instruction Set Architecture (ISA). An application uses library functions (A1), makes system calls (A2), and executes machine instructions (A3) (from book)

8
Interfaces
 Instruction Set Architecture (ISA) – at the boundary between hardware and software; it defines the set of instructions the hardware was designed to execute.
 Application Binary Interface (ABI) – allows the ensemble consisting of the application and the library modules to access the hardware; the ABI does not include privileged system instructions, instead it invokes system calls.
 Application Programming Interface (API) – gives the application access to the ISA; it includes high-level language (HLL) library calls, which often invoke system calls.

9
Code portability
 Binaries created by a compiler for a specific ISA and a specific operating system are not portable
 It is possible, though, to compile an HLL program for a virtual machine (VM) environment, where portable code is produced and distributed and then converted by binary translators to the ISA of the host system
 Dynamic binary translation converts blocks of guest instructions from the portable code to host instructions; because the translated blocks are cached and reused, this leads to a significant performance improvement
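
The caching idea behind dynamic binary translation can be sketched as follows. This is a toy model under stated assumptions: the opcode names, the `GUEST_TO_HOST` mapping, and the `TranslationCache` class are hypothetical and do not correspond to any real translator.

```python
class TranslationCache:
    """Translate guest code blocks to host code once, then reuse them."""

    def __init__(self, translate):
        self._translate = translate  # function: guest block -> host block
        self._cache = {}             # guest block address -> host block
        self.misses = 0
        self.hits = 0

    def fetch(self, address, guest_block):
        host_block = self._cache.get(address)
        if host_block is None:
            # First execution of this block: pay the translation cost.
            self.misses += 1
            host_block = self._translate(guest_block)
            self._cache[address] = host_block
        else:
            # Later executions reuse the cached translation.
            self.hits += 1
        return host_block


# Hypothetical guest-to-host opcode mapping, for illustration only.
GUEST_TO_HOST = {"LOAD": "mov", "ADD": "add", "STORE": "mov", "JUMP": "jmp"}


def toy_translate(guest_block):
    return tuple(GUEST_TO_HOST[op] for op in guest_block)


cache = TranslationCache(toy_translate)
loop_body = ("LOAD", "ADD", "STORE", "JUMP")
for _ in range(1000):  # a hot loop re-executes the same block
    cache.fetch(0x400, loop_body)
print(cache.misses, cache.hits)
```

In this sketch the translation cost is paid once per block, so a loop that runs many times mostly hits the cache.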

10
HLL Language Translations

[Figure: HLL language translations. Left path: HLL code, compiler front-end, intermediate code, compiler back-end, object code, loader, memory image. Right path: HLL code, compiler, portable code, VM loader, VM image, VM compiler/interpreter (one per host), memory image ISA-1 and memory image ISA-2.]

11
Virtual Machine Monitor (VMM / Hypervisor)
 A virtual machine monitor (VMM, or hypervisor) partitions the resources of a computer system into one or more virtual machines (VMs). It allows several operating systems to run concurrently on a single hardware platform
 A VM is an execution environment that runs an OS
 VM – an isolated environment that appears to be a whole computer, but actually only has access to a portion of the computer resources
 A VMM allows:
 Multiple services to share the same platform
 Live migration: the movement of a server from one platform to another
 System modification while maintaining backward compatibility with the original system
 Isolation among the systems, and thus security
 A guest operating system is an OS that runs in a VM under the control of the VMM.

12
VMM Virtualizes the CPU and the Memory
 How a VMM (hypervisor) works:
 Traps the privileged instructions executed by a guest OS and enforces the correctness and safety of the operation
 Traps interrupts and dispatches them to the individual guest operating systems
 Controls the virtual memory management
 Maintains a shadow page table for each guest OS and replicates any modification made by the guest OS in its own shadow page table. This shadow page table points to the actual page frame and is used by the Memory Management Unit (MMU) for dynamic address translation.
 Monitors the system performance and takes corrective actions to avoid performance degradation. For example, the VMM may swap out a VM to avoid thrashing.
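
The shadow page table bookkeeping can be sketched as below. This is a simplified model, assuming single-level page tables and dictionaries in place of hardware structures; `ShadowPager` and its method names are hypothetical.

```python
class ShadowPager:
    """Toy VMM bookkeeping for one guest's shadow page table."""

    def __init__(self, guest_to_machine):
        # VMM-owned map: guest "physical" frame -> real machine frame.
        self.guest_to_machine = dict(guest_to_machine)
        # What the guest believes: virtual page -> guest "physical" frame.
        self.guest_page_table = {}
        # What the MMU would use: virtual page -> machine frame.
        self.shadow_page_table = {}

    def guest_maps_page(self, virtual_page, guest_frame):
        """Called when the VMM intercepts a guest page-table update."""
        if guest_frame not in self.guest_to_machine:
            raise PermissionError("guest frame not owned by this VM")
        self.guest_page_table[virtual_page] = guest_frame
        # Replicate the guest's change in the shadow table.
        self.shadow_page_table[virtual_page] = self.guest_to_machine[guest_frame]

    def guest_unmaps_page(self, virtual_page):
        self.guest_page_table.pop(virtual_page, None)
        self.shadow_page_table.pop(virtual_page, None)

    def translate(self, virtual_page):
        """Translation as the MMU would perform it (shadow table only)."""
        return self.shadow_page_table[virtual_page]
```

The guest only ever sees `guest_page_table`; the check in `guest_maps_page` is where the VMM enforces that a guest cannot map memory it does not own.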

13
Type 1 and 2 Hypervisors

[Figure: Type 1 hypervisor and Type 2 hypervisor.]

 Taxonomy of VMMs:
1. Type 1 hypervisor (bare metal, native): supports multiple virtual machines and runs directly on the hardware (e.g., VMware ESX, Xen, Denali)
2. Type 2 hypervisor (hosted): runs under a host operating system (e.g., user-mode Linux)

14
Performance and Security Isolation
 The run-time behavior of an application is affected by other applications running concurrently on the same platform and competing for CPU cycles, cache, main memory, disk and network access. Thus, it is difficult to predict the completion time!
 Performance isolation – a critical condition for QoS guarantees in shared computing environments
 A VMM is a much simpler and better specified system than a traditional operating system. Example: Xen has approximately 60,000 lines of code; Denali has only about half that, 30,000
 The security vulnerability of VMMs is considerably reduced as the systems expose a much smaller number of privileged functions. For example, the Xen VMM has 28 hypercalls while Linux has hundreds of system calls

15
Conditions for Efficient Virtualization (from Popek and Goldberg)
 Conditions for efficient virtualization:
 A program running under the VMM should exhibit a behavior essentially identical to that demonstrated when running on an equivalent machine directly.
 The VMM should be in complete control of the virtualized resources.
 A statistically significant fraction of machine instructions must be executed without the intervention of the VMM. (Why?)
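
One way to approach the "Why?" is a back-of-the-envelope cost model. The sketch below assumes every native instruction costs one unit and every instruction that needs VMM intervention costs a fixed `trap_cost`; the value 1000 is an arbitrary illustrative figure, not a measurement.

```python
def average_cost(trap_fraction, trap_cost, native_cost=1.0):
    """Average cost per instruction when a fraction of them trap to the VMM."""
    if not 0.0 <= trap_fraction <= 1.0:
        raise ValueError("trap_fraction must be between 0 and 1")
    return (1.0 - trap_fraction) * native_cost + trap_fraction * trap_cost


def slowdown(trap_fraction, trap_cost, native_cost=1.0):
    """Slowdown relative to running directly on the hardware."""
    return average_cost(trap_fraction, trap_cost, native_cost) / native_cost


for fraction in (0.001, 0.01, 0.1, 1.0):
    print(f"{fraction:>6.1%} trapped -> {slowdown(fraction, trap_cost=1000):.1f}x slower")
```

Under these assumptions the slowdown grows linearly with the trapped fraction, so efficiency depends on most instructions running directly on the hardware.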

16
Dual-Mode Operation (recap)
 Dual-mode operation allows the OS to protect itself and other system components
 User mode and kernel mode
 Mode bit provided by hardware
 Ability to distinguish when the system is running user or kernel code
 Some instructions are privileged, only executable in kernel mode
 System call changes mode to kernel, return resets it to user

17
User-mode vs Kernel-mode (recap)
 Kernel code (in particular, interrupt handlers) runs in kernel mode
 the hardware allows all machine instructions to be executed and allows unrestricted access to memory and I/O ports
 Everything else runs in user mode
 The OS relies very heavily on this hardware-enforced protection mechanism

18
Challenges of x86 CPU Virtualization
 Four layers of privileged execution → rings
 User applications run in ring 3
 OS runs in ring 0
 In which ring should the VMM run?
 In ring 0: same privileges as an OS → wrong
 In rings 1, 2, 3: the OS has higher privileges → wrong
 Move the OS to ring 1 and the VMM to ring 0 → OK
 Three classes of machine instructions:
 privileged instructions can be executed only in kernel mode.
 When executed in user mode, they cause a trap and are then handled in kernel mode.
 nonprivileged instructions are the ones that can be executed in user mode
 sensitive instructions can be executed in either kernel or user mode, but they behave differently in each. Sensitive instructions require special precautions at execution time.
 instructions that are sensitive but nonprivileged are hard to virtualize
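
The three classes can be illustrated with a small classification sketch. The instruction sample is an assumption added for illustration (commonly cited examples for x86 before hardware assistance), not a list from the slides, and the `Instruction` type and `handling` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    name: str
    privileged: bool  # traps when executed outside kernel mode
    sensitive: bool   # behaves differently depending on the mode


# A small illustrative sample, not a complete x86 listing.
SAMPLE = [
    Instruction("ADD", privileged=False, sensitive=False),
    Instruction("LGDT", privileged=True, sensitive=True),
    Instruction("MOV to CR3", privileged=True, sensitive=True),
    Instruction("POPF", privileged=False, sensitive=True),
    Instruction("SGDT", privileged=False, sensitive=True),
]


def handling(instr):
    """How a VMM with a deprivileged guest OS can deal with an instruction."""
    if instr.privileged:
        return "trap-and-emulate"  # the trap hands control to the VMM
    if instr.sensitive:
        # No trap occurs, so the VMM never gets a chance to intervene.
        return "needs binary translation, a hypercall, or hardware assist"
    return "run natively"


hard_to_virtualize = [i.name for i in SAMPLE if i.sensitive and not i.privileged]
print(hard_to_virtualize)
```

The sensitive but nonprivileged entries are the problem cases because they execute without trapping, so the VMM never gets control.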

19
Techniques for Virtualizing CPU on x86
 Full virtualization with binary translation
 OS-assisted virtualization or paravirtualization
 Hardware-assisted virtualization

20
Techniques for Virtualizing CPU on x86 – Full Virtualization
 Full virtualization
 a guest OS can run unchanged under the VMM as if it were running directly on the hardware platform. Each VM runs on an exact copy of the actual hardware.
 Binary translation rewrites parts of the code on the fly to replace sensitive but not privileged instructions with safe code to emulate the original instruction
 “The hypervisor translates all operating system instructions on the fly and caches the results for future use, while user level instructions run unmodified at native speed.” (from VMware paper)
 Examples:
 VMware, Microsoft Virtual Server
 Advantages:
 No hardware assistance
 No modifications of the guest OS
 Isolation, security
 Disadvantages:
 Reduced speed of execution

21
Techniques for Virtualizing CPU on x86 – Para-virtualization
 Para-virtualization
 “involves modifying the OS kernel to replace non-virtualizable instructions with hypercalls that communicate directly with the virtualization layer hypervisor.
 The hypervisor also provides hypercall interfaces for other critical kernel operations such as memory management, interrupt handling and time keeping.” (from VMware paper)
 Advantage:
 faster execution, lower virtualization overhead
 Disadvantage:
 poor portability
 Examples:
 Xen, Denali

22
Full Virtualization and Para-virtualization

[Figure: (a) Full virtualization and (b) Paravirtualization. Both panels show the layers Guest OS, hardware abstraction layer, Hypervisor, Hardware.]

23
Techniques for Virtualizing CPU on x86 – Hardware Assisted Virtualization
 Hardware Assisted Virtualization
 “a new CPU execution mode feature that allows the VMM to run in a new root mode below ring 0.
 As depicted in Figure 7, privileged and sensitive calls are set to automatically trap to the hypervisor, removing the need for either binary translation or para-virtualization” (from VMware paper)
 Advantage:
 even faster execution
 Examples:
 Intel VT-x, Xen 3.x

24
VT-x, a Major Architectural Enhancement
 In 2005 Intel released two Pentium 4 models supporting VT-x.
 VT-x supports two modes of operation (Figure (a)):
 VMX root – for VMM operations.
 VMX non-root – supports a VM.
 It also adds a new data structure, the Virtual Machine Control Structure (VMCS), which includes host-state and guest-state areas (Figure (b)).
 VM entry
 the processor state is loaded from the guest-state area of the VM scheduled to run; then control is transferred from the VMM to the VM.
 VM exit
 saves the processor state in the guest-state area of the running VM; then it loads the processor state from the host-state area, and finally transfers control to the VMM.
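
The VM entry and VM exit sequence can be sketched as a toy state machine. It models processor state as a plain dictionary and assumes the VMM has already filled in the host-state area; `ToyProcessor` and the `"ip"` register name are hypothetical and this is not the real VMCS layout.

```python
from dataclasses import dataclass, field


@dataclass
class VMCS:
    """Toy Virtual Machine Control Structure with its two state areas."""
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)


class ToyProcessor:
    def __init__(self, vmm_state):
        self.state = dict(vmm_state)  # current register contents
        self.mode = "root"            # VMX root: the VMM is running

    def vm_entry(self, vmcs):
        """VMX root -> non-root: load the guest state and run the VM."""
        if self.mode != "root":
            raise RuntimeError("VM entry is only possible from VMX root mode")
        self.state = dict(vmcs.guest_state)
        self.mode = "non-root"

    def vm_exit(self, vmcs):
        """VMX non-root -> root: save the guest state, restore the host state."""
        if self.mode != "non-root":
            raise RuntimeError("VM exit is only possible from VMX non-root mode")
        vmcs.guest_state = dict(self.state)
        self.state = dict(vmcs.host_state)
        self.mode = "root"


cpu = ToyProcessor({"ip": 0x1000})
vmcs = VMCS(guest_state={"ip": 0x2000}, host_state={"ip": 0x1000})
cpu.vm_entry(vmcs)
cpu.state["ip"] += 4  # the guest makes some progress
cpu.vm_exit(vmcs)
```

After the exit, the guest's progress is preserved in the guest-state area, so a later VM entry resumes where the guest stopped.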

25
Xen - a VMM based on Paravirtualization
 The goal was to design a VMM capable of scaling to about 100 VMs running standard applications and services without any modifications to the Application Binary Interface (ABI).
 Linux, Minix, NetBSD, FreeBSD and others can operate as paravirtualized Xen guest OSs running on x86, x86-64, Itanium, and ARM architectures.
 Xen domain
 ensemble of address spaces hosting a guest OS and applications running under the guest OS. Runs on a virtual CPU.
 Dom0 - dedicated to execution of Xen control functions and privileged instructions.
 DomU - a user domain.
 Applications make system calls using hypercalls processed by Xen; privileged instructions issued by a guest OS are paravirtualized and must be validated by Xen.

26
Xen

[Figure: Xen architecture. Domain0 runs the management OS; the other domains each run an application on a guest OS; every domain uses Xen-aware device drivers. Xen provides the Domain0 control interface, virtual x86 CPU, virtual physical memory, virtual network, and virtual block devices. Bottom layer: x86 hardware.]

27
Dom0 Components
 XenStore – a Dom0 process.
 Supports a system-wide registry and naming service.
 Implemented as a hierarchical key-value storage.
 A watch function informs listeners of changes to the keys in storage they have subscribed to.
 Communicates with guest VMs via shared memory using Dom0 privileges.
 Toolstack - responsible for creating, destroying, and managing the resources and privileges of VMs.
 To create a new VM, a user provides a configuration file describing memory and CPU allocations and device configurations.
 Toolstack parses this file and writes this information in XenStore.
 Takes advantage of Dom0 privileges to map guest memory, to load a kernel and virtual BIOS and to set up initial communication channels with XenStore and with the virtual console when a new VM is created.
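
A hierarchical key-value store with watches, in the spirit of XenStore, can be sketched in a few lines. This is not the XenStore API: the `ToyStore` class, its methods, and paths such as `/vm/7/memory_mb` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ToyStore:
    """Hierarchical key-value store with change notifications (watches)."""

    def __init__(self):
        self._data = {}                    # full path -> value
        self._watches = defaultdict(list)  # path prefix -> callbacks

    def watch(self, prefix, callback):
        """Subscribe to changes at or below the given path."""
        self._watches[prefix].append(callback)

    def write(self, path, value):
        self._data[path] = value
        for prefix, callbacks in self._watches.items():
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                for callback in callbacks:
                    callback(path, value)

    def read(self, path):
        return self._data[path]

    def children(self, prefix):
        """Names of the entries directly below a path."""
        base = prefix.rstrip("/") + "/"
        return sorted({p[len(base):].split("/", 1)[0]
                       for p in self._data if p.startswith(base)})


store = ToyStore()
events = []
store.watch("/vm/7", lambda path, value: events.append((path, value)))

# A toolstack-like step: write a parsed VM configuration into the store.
for key, value in {"memory_mb": "512", "vcpus": "2"}.items():
    store.write(f"/vm/7/{key}", value)
store.write("/vm/8/vcpus", "1")  # different subtree, so no notification
```

Here the watch on `/vm/7` fires for the two configuration writes but not for the write under `/vm/8`.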

28
Strategies for virtual memory management, CPU multiplexing, and I/O devices

29
Xen Abstractions for Networking and I/O
 Each domain has one or more Virtual Network Interfaces (VIFs) which support the functionality of a network interface card. A VIF is attached to a Virtual Firewall-Router (VFR).
 Split drivers have a front-end in the DomU and the back-end in Dom0; the two communicate via a ring in shared memory.
 Ring - a circular queue of descriptors allocated by a domain and accessible within Xen. Descriptors do not contain data; the data buffers are allocated out-of-band by the guest OS.
 Two rings of buffer descriptors, one for packet sending and one for packet receiving, are supported.
 To transmit a packet:
 a guest OS enqueues a buffer descriptor to the send ring,
 then Xen copies the descriptor and checks safety,
 copies only the packet header, not the payload, and
 executes the matching rules.
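
The ring of descriptors can be sketched as a fixed-size circular queue with a producer index and a consumer index. This is a single-threaded toy model, not Xen's shared-memory ring; `DescriptorRing`, the descriptor fields, and the 4-byte header are hypothetical.

```python
class DescriptorRing:
    """Fixed-size circular queue of buffer descriptors (no payload data)."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.producer = 0  # advanced by the side that enqueues
        self.consumer = 0  # advanced by the side that dequeues

    def outstanding(self):
        return self.producer - self.consumer

    def enqueue(self, descriptor):
        if self.outstanding() == self.size:
            raise BufferError("ring full")
        self.slots[self.producer % self.size] = descriptor
        self.producer += 1

    def dequeue(self):
        if self.outstanding() == 0:
            raise BufferError("ring empty")
        descriptor = self.slots[self.consumer % self.size]
        self.consumer += 1
        return descriptor


# The packet data stays in a guest buffer; only a descriptor enters the ring.
guest_buffers = {3: b"HDR!payload-bytes"}
send_ring = DescriptorRing(4)
send_ring.enqueue({"buffer": 3, "length": len(guest_buffers[3])})

# The consumer side takes the descriptor and looks only at the header.
descriptor = send_ring.dequeue()
header = guest_buffers[descriptor["buffer"]][:4]
```

Because the ring carries descriptors rather than data, the payload never has to be copied into the queue.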

30
Xen I/O

[Figure: (a) Driver domain (bridge, backend, network interface) and guest domain (frontend) connected by an I/O channel and an event channel above Xen and the NIC. (b) Request queue and response queue in a circular ring: consumer request (private pointer in Xen), producer request (shared pointer updated by the guest OS), producer response (shared pointer updated by Xen), consumer response (private pointer maintained by the guest OS), outstanding descriptors, unused descriptors.]

 Xen zero-copy semantics for data transfer using I/O rings.
 (a) The communication between a guest domain and the driver domain over an I/O and an event channel; NIC is the Network Interface Controller.
 (b) The circular ring of buffers.

31
Xen 2.0 Architecture

[Figure: driver domain and guest domain above the Xen VMM and the physical NIC, connected by an I/O channel. Labels: bridge, NIC driver, backend interface, virtual interface; panel (b) adds an offload driver and a high-level virtual interface. (a) The original architecture. (b) The optimized architecture.]

32
The Darker Side of Virtualization
 In a layered structure, a defense mechanism at some layer can be disabled by malware running at a layer below it.
 It is feasible to insert a rogue VMM, a Virtual-Machine Based Rootkit (VMBR), between the physical hardware and an operating system.
 Rootkit - malware with privileged access to a system.
 The VMBR can enable a separate malicious OS to run surreptitiously and make this malicious OS invisible to the guest OS and to the applications running under it.
 Under the protection of the VMBR, the malicious OS could:
 observe the data, the events, or the state of the target system.
 run services, such as spam relays or distributed denial-of-service attacks.
 interfere with the application.

33
The Darker Side of Virtualization (cont’d)
 The insertion of a Virtual-Machine Based Rootkit (VMBR) as the lowest layer of the software stack running on the physical hardware: (a) below an operating system; (b) below a legitimate virtual machine monitor. The VMBR enables a malicious OS to run surreptitiously and makes it invisible to the genuine or the guest OS and to the application.

[Figure: (a) Hardware, virtual machine based rootkit, then the operating system (OS) with its application alongside a malicious OS. (b) Hardware, virtual machine based rootkit, then a virtual machine monitor hosting the guest OS with its application, alongside a malicious OS.]
34
Linux Containers
 A Linux container is a Linux process (or group of processes) that runs in a virtual environment with its own process and network space (lightweight process virtualization).
 Containers share portions of the host kernel
 Containers use:
 Namespaces
 per-process isolation of OS resources (filesystem, network and user IDs)
 Cgroups
 resource management and accounting per process
 Examples for using containers:
 https://www.dotcloud.com/
 https://www.heroku.com/

(Source: opensource.com)
Comparison of traditional virtualization and containers

36
Why do we want to run our application inside containers?

37
Container Advantages
● Lightweight footprint and minimal overhead
● Portability across machines
● Simpler DevOps practices
● Faster Continuous Integration
● Support for microservices architectures
● Isolation

38
Summary
 Virtualization
 Layering and virtualization.
 Virtual machine monitor.
 Virtual machine.
 x86 support for virtualization.
 Xen.

39
