NAME
libnvmm - NetBSD Virtualization API
LIBRARY
library “libnvmm”
SYNOPSIS
#include <nvmm.h>
int nvmm_init(void);
int nvmm_capability(struct nvmm_capability *cap);
int nvmm_machine_create(struct nvmm_machine *mach);
int nvmm_machine_destroy(struct nvmm_machine *mach);
int nvmm_machine_configure(struct nvmm_machine *mach, uint64_t op, void *conf);
int nvmm_vcpu_create(struct nvmm_machine *mach, nvmm_cpuid_t cpuid, struct nvmm_vcpu *vcpu);
int nvmm_vcpu_destroy(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu);
int nvmm_vcpu_configure(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu, uint64_t op, void *conf);
int nvmm_vcpu_getstate(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu, uint64_t flags);
int nvmm_vcpu_setstate(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu, uint64_t flags);
int nvmm_vcpu_inject(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu);
int nvmm_vcpu_run(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu);
int nvmm_hva_map(struct nvmm_machine *mach, uintptr_t hva, size_t size);
int nvmm_hva_unmap(struct nvmm_machine *mach, uintptr_t hva, size_t size);
int nvmm_gpa_map(struct nvmm_machine *mach, uintptr_t hva, gpaddr_t gpa, size_t size, int prot);
int nvmm_gpa_unmap(struct nvmm_machine *mach, uintptr_t hva, gpaddr_t gpa, size_t size);
int nvmm_gva_to_gpa(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu, gvaddr_t gva, gpaddr_t *gpa, nvmm_prot_t *prot);
int nvmm_gpa_to_hva(struct nvmm_machine *mach, gpaddr_t gpa, uintptr_t *hva, nvmm_prot_t *prot);
int nvmm_assist_io(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu);
int nvmm_assist_mem(struct nvmm_machine *mach, struct nvmm_vcpu *vcpu);
DESCRIPTION
libnvmm provides a library for emulator software to handle hardware-accelerated virtual machines in NetBSD. A virtual machine is described by an opaque structure, nvmm_machine. Emulator software should not attempt to modify this structure directly, and should use the API provided by libnvmm to manage virtual machines. A virtual CPU is described by a public structure, nvmm_vcpu.
nvmm_init() initializes NVMM. See NVMM Initialization below for details.
nvmm_capability() gets the capabilities of NVMM. See NVMM Capability below for details.
nvmm_machine_create() creates a virtual machine in the kernel. The mach structure is initialized, and describes the machine.
nvmm_machine_destroy() destroys the virtual machine described in mach.
nvmm_machine_configure() configures, on the machine mach, the parameter indicated in op. conf describes the value of the parameter.
nvmm_vcpu_create() creates a virtual CPU in the machine mach, giving it the CPU id cpuid, and initializes vcpu.
nvmm_vcpu_destroy() destroys the virtual CPU identified by vcpu in the machine mach.
nvmm_vcpu_configure() configures, on the VCPU vcpu of machine mach, the parameter indicated in op. conf describes the value of the parameter.
nvmm_vcpu_getstate() gets the state of the virtual CPU identified by vcpu in the machine mach. flags is the bitmap of the components that are to be retrieved. The components are located in vcpu->state. See VCPU State Area below for details.
nvmm_vcpu_setstate() sets the state of the virtual CPU identified by vcpu in the machine mach. flags is the bitmap of the components that are to be set. The components are located in vcpu->state. See VCPU State Area below for details.
nvmm_vcpu_inject() injects into the CPU identified by vcpu of the machine mach an event described by vcpu->event. See Event Injection below for details.
nvmm_vcpu_run() runs the CPU identified by vcpu in the machine mach, until a VM exit is triggered. The vcpu->exit structure is filled to indicate the exit reason, and the associated parameters if any.
nvmm_hva_map() maps at address hva a buffer of size size in the calling process' virtual address space. This buffer is allowed to be subsequently mapped in a virtual machine.
nvmm_hva_unmap() unmaps the buffer of size size at address hva from the calling process' virtual address space.
nvmm_gpa_map() maps into the guest physical memory beginning on address gpa the buffer of size size located at address hva of the calling process' virtual address space. The hva parameter must point to a buffer that was previously mapped with nvmm_hva_map().
nvmm_gpa_unmap() removes the guest physical memory area beginning on address gpa and of size size from the machine mach.
nvmm_gva_to_gpa() translates, on the CPU vcpu from the machine mach, the guest virtual address given in gva into a guest physical address returned in gpa. The associated page permissions are returned in prot. gva must be page-aligned.
nvmm_gpa_to_hva() translates, on the machine mach, the guest physical address indicated in gpa into a host virtual address returned in hva. The associated page permissions are returned in prot. gpa must be page-aligned.
nvmm_assist_io() emulates the I/O operation described in vcpu->exit on CPU vcpu from machine mach. See I/O Assist below for details.
nvmm_assist_mem() emulates the memory operation described in vcpu->exit on CPU vcpu from machine mach. See Mem Assist below for details.
NVMM Initialization
NVMM initialization is performed by the nvmm_init() function, which must be invoked by emulator software before any other NVMM function.
nvmm_init() opens the NVMM device, and expects to have the proper permissions to do so. In a default configuration, this implies being part of the "nvmm" group. If using a special configuration, emulator software should arrange to have the proper permissions before invoking nvmm_init(), and can drop them after the call has completed.
Note that nvmm_init() may perform non-reentrant operations, and should be called only once.
NVMM Capability
The nvmm_capability structure helps emulator software identify the capabilities offered by NVMM on the host:
struct nvmm_capability {
	uint64_t version;
	uint64_t state_size;
	uint64_t max_machines;
	uint64_t max_vcpus;
	uint64_t max_ram;
	struct {
		...
	} arch;
};
For example, the max_machines field indicates the maximum number of virtual machines supported, while max_vcpus indicates the maximum number of VCPUs supported per virtual machine.
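As a rough illustration of how these limits might be consulted before creating a machine, the sketch below checks a requested configuration against them. It does not use <nvmm.h>: struct cap_limits is a local stand-in that copies only two of the fields listed above, struct vm_request and request_fits() are hypothetical names, and treating max_ram as a byte count is an assumption. In real code the values would come from the structure filled by nvmm_capability().

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for two fields of struct nvmm_capability. */
struct cap_limits {
	uint64_t max_vcpus;	/* VCPUs supported per virtual machine */
	uint64_t max_ram;	/* guest RAM limit, assumed to be in bytes */
};

/* Hypothetical description of the machine the emulator wants. */
struct vm_request {
	uint64_t nvcpus;
	uint64_t ram_bytes;
};

/* Return true if the request stays within the reported limits. */
bool
request_fits(const struct cap_limits *cap, const struct vm_request *req)
{
	if (req->nvcpus == 0 || req->nvcpus > cap->max_vcpus)
		return false;
	if (req->ram_bytes > cap->max_ram)
		return false;
	return true;
}
```

Checking up front lets the emulator report a clear message instead of relying on a later call failing with ENOBUFS or EINVAL.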
Machine Ownership
When a process creates a virtual machine via nvmm_machine_create(), it is considered the owner of this machine. No process other than the owner can operate the virtual machine.
When an owner exits, all the virtual machines associated with it are destroyed, if they were not already destroyed by the owner itself via nvmm_machine_destroy().
Virtual machines are not inherited across fork(2) operations.
Machine Configuration
Emulator software can configure several parameters of a virtual machine by using nvmm_machine_configure(). Currently, no parameters are implemented.
VCPU Configuration
Emulator software can configure several parameters of a VCPU by using nvmm_vcpu_configure(), which can take the following operations:
#define NVMM_VCPU_CONF_CALLBACKS	0
...
The higher operation values depend on the architecture.
Guest-Host Mappings
Each virtual machine has an associated guest physical memory. Emulator software is allowed to modify this guest physical memory by mapping it into some parts of its virtual address space.
Emulator software should follow these steps to achieve that:
- Call nvmm_hva_map() to create in the host's virtual address space an area of memory that can be shared with a guest. Typically, the hva parameter will be a pointer to an area that was previously mapped via mmap(). nvmm_hva_map() will replace the content of the area, and will make it read-write (but not executable).
- Make available in the guest an area of guest physical memory, by calling nvmm_gpa_map() and passing in the hva parameter the value that was previously given to nvmm_hva_map(). nvmm_gpa_map() does not replace the content of any memory, it only creates a direct link from gpa into hva. nvmm_gpa_unmap() removes this link without modifying hva.
The guest will then be able to use the guest physical address passed in the gpa parameter of nvmm_gpa_map(). Each change the guest makes in gpa will be reflected in the host's hva, and vice versa.
It is illegal for emulator software to use munmap() on an area that was mapped via nvmm_hva_map().
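The direct link between gpa and hva can be pictured as a fixed offset: byte N of the guest physical range corresponds to byte N of the host buffer. The sketch below models only that arithmetic. struct gpa_link and link_translate() are hypothetical names used for illustration; the code does not call libnvmm and says nothing about how the kernel implements the mapping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one link created in the style of nvmm_gpa_map(). */
struct gpa_link {
	uint64_t gpa;	/* start of the guest physical range */
	uintptr_t hva;	/* start of the host buffer backing it */
	size_t size;	/* length of the range in bytes */
};

/*
 * Translate a guest physical address into the host address backing it.
 * Returns false when gpa falls outside the linked range.
 */
bool
link_translate(const struct gpa_link *l, uint64_t gpa, uintptr_t *hva)
{
	if (gpa < l->gpa || gpa - l->gpa >= l->size)
		return false;
	*hva = l->hva + (uintptr_t)(gpa - l->gpa);
	return true;
}
```

An emulator that keeps such a record for each mapping can read or patch guest memory through the host pointer without asking the kernel for every access.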
VCPU State Area
A VCPU state area is a structure that entirely defines the content of the registers of a VCPU. Only one such structure exists, for x86:
struct nvmm_x64_state {
	struct nvmm_x64_state_seg segs[NVMM_X64_NSEG];
	uint64_t gprs[NVMM_X64_NGPR];
	uint64_t crs[NVMM_X64_NCR];
	uint64_t drs[NVMM_X64_NDR];
	uint64_t msrs[NVMM_X64_NMSR];
	struct nvmm_x64_state_intr intr;
	struct fxsave fpu;
};
#define nvmm_vcpu_state nvmm_x64_state
Refer to functional examples to see precisely how to use this structure.
A VCPU state area is divided in sub-states. A flags parameter is used to set and get the VCPU state; it acts as a bitmap which indicates which sub-states to set or get.
During VM exits, a partial VCPU state area is provided in exitstate; see Exit Reasons below for details.
VCPU Programming Model
A VCPU is described by a public structure, nvmm_vcpu:
struct nvmm_vcpu {
	nvmm_cpuid_t cpuid;
	struct nvmm_vcpu_state *state;
	struct nvmm_vcpu_event *event;
	struct nvmm_vcpu_exit *exit;
};
This structure is used both publicly by emulator software and internally by libnvmm. Emulator software should not modify the pointers of this structure, because they are initialized to special values by libnvmm.
A call to nvmm_vcpu_getstate() will fetch the desired parts of the VCPU state and put them in vcpu->state. A call to nvmm_vcpu_setstate() will install in the VCPU the desired parts of vcpu->state. A call to nvmm_vcpu_inject() will inject in the VCPU the event in vcpu->event. A call to nvmm_vcpu_run() will fill vcpu->exit with the VCPU exit information.
If emulator software uses several threads, a VCPU should be associated with only one thread, and only this thread should perform VCPU modifications. Emulator software should not modify the state of a VCPU with several different threads.
Exit Reasons
The nvmm_vcpu_exit structure is used to handle VM exits:
/* Generic. */
#define NVMM_VCPU_EXIT_NONE		0x0000000000000000ULL
#define NVMM_VCPU_EXIT_INVALID		0xFFFFFFFFFFFFFFFFULL
/* x86: operations. */
#define NVMM_VCPU_EXIT_MEMORY		0x0000000000000001ULL
#define NVMM_VCPU_EXIT_IO		0x0000000000000002ULL
/* x86: changes in VCPU state. */
#define NVMM_VCPU_EXIT_SHUTDOWN		0x0000000000001000ULL
#define NVMM_VCPU_EXIT_INT_READY	0x0000000000001001ULL
#define NVMM_VCPU_EXIT_NMI_READY	0x0000000000001002ULL
#define NVMM_VCPU_EXIT_HALTED		0x0000000000001003ULL
#define NVMM_VCPU_EXIT_TPR_CHANGED	0x0000000000001004ULL
/* x86: instructions. */
#define NVMM_VCPU_EXIT_RDMSR		0x0000000000002000ULL
#define NVMM_VCPU_EXIT_WRMSR		0x0000000000002001ULL
#define NVMM_VCPU_EXIT_MONITOR		0x0000000000002002ULL
#define NVMM_VCPU_EXIT_MWAIT		0x0000000000002003ULL
#define NVMM_VCPU_EXIT_CPUID		0x0000000000002004ULL
struct nvmm_vcpu_exit {
	uint64_t reason;
	union {
		...
	} u;
	struct {
		...
	} exitstate;
};
The reason field indicates the reason of the VM exit. Additional parameters describing the exit can be present in u. exitstate contains a partial, implementation-specific VCPU state, usable as a fast-path to retrieve certain state values.
It is possible that a VM exit was caused by a reason internal to the host kernel, one that emulator software need not be concerned with. In this case, the exit reason is set to NVMM_VCPU_EXIT_NONE. This gives emulator software a chance to halt the VM in its tracks.
Refer to functional examples to see precisely how to handle VM exits.
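For readers without the demonstrator at hand, the following sketch shows one way a run loop might map an exit reason to its next step. It is self-contained rather than built against <nvmm.h>: the exit values are copied from the listing above, while enum exit_action and exit_action_for() are hypothetical names, and the choice of action for each reason is an assumption of this sketch, not documented library behavior.

```c
#include <stdint.h>

/* Values copied from the listing above; real code gets them from <nvmm.h>. */
#ifndef NVMM_VCPU_EXIT_NONE
#define NVMM_VCPU_EXIT_NONE	0x0000000000000000ULL
#define NVMM_VCPU_EXIT_MEMORY	0x0000000000000001ULL
#define NVMM_VCPU_EXIT_IO	0x0000000000000002ULL
#define NVMM_VCPU_EXIT_SHUTDOWN	0x0000000000001000ULL
#endif

/* Hypothetical actions an emulator's run loop might take. */
enum exit_action {
	ACT_RESUME,	/* nothing to emulate, run the VCPU again */
	ACT_ASSIST_MEM,	/* hand the exit to nvmm_assist_mem() */
	ACT_ASSIST_IO,	/* hand the exit to nvmm_assist_io() */
	ACT_STOP,	/* stop running this VCPU */
	ACT_UNHANDLED	/* reason not covered by this sketch */
};

/* Map the reason field of an exit to the action the loop should take. */
enum exit_action
exit_action_for(uint64_t reason)
{
	switch (reason) {
	case NVMM_VCPU_EXIT_NONE:
		/* A good point to check for a pending stop request. */
		return ACT_RESUME;
	case NVMM_VCPU_EXIT_MEMORY:
		return ACT_ASSIST_MEM;
	case NVMM_VCPU_EXIT_IO:
		return ACT_ASSIST_IO;
	case NVMM_VCPU_EXIT_SHUTDOWN:
		return ACT_STOP;
	default:
		return ACT_UNHANDLED;
	}
}
```

A real loop would call nvmm_vcpu_run(), pass vcpu->exit->reason to a function like this one, and handle the remaining reasons (MSR accesses, CPUID, interrupt windows) according to the emulator's needs.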
Event Injection
It is possible to inject an event into a VCPU. An event can be a hardware interrupt, a software interrupt, or a software exception, defined by:
#define NVMM_VCPU_EVENT_EXCP	0
#define NVMM_VCPU_EVENT_INTR	1
struct nvmm_vcpu_event {
	u_int type;
	uint8_t vector;
	union {
		struct {
			uint64_t error;
		} excp;
	} u;
};
This describes an event of type type, to be sent to vector number vector, with a possible additional error code that is implementation-specific.
It is possible that the VCPU is in a state where it cannot receive this event, if:
- the event is a hardware interrupt, and the VCPU runs with interrupts disabled, or
- the event is a non-maskable interrupt (NMI), and the VCPU is already in an in-NMI context.
Emulator software can manage interrupt and NMI window-exiting via the intr component of the VCPU state. When such window-exiting is enabled, NVMM will cause a VM exit with reason NVMM_VCPU_EXIT_INT_READY or NVMM_VCPU_EXIT_NMI_READY to indicate that the guest is now able to handle the corresponding class of interrupts.
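To make the event descriptor more concrete, the sketch below fills one in for an interrupt and for an exception. It is a stand-alone illustration: the structure and the two type constants are re-declared locally from the definitions above (with u_int spelled as unsigned int) so that it builds without <nvmm.h>, and event_set_intr() and event_set_excp() are hypothetical helpers, not libnvmm functions. It assumes, as the names suggest, that NVMM_VCPU_EVENT_INTR denotes an interrupt and NVMM_VCPU_EVENT_EXCP an exception.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in copies of the definitions above; real code includes <nvmm.h>. */
#define NVMM_VCPU_EVENT_EXCP	0
#define NVMM_VCPU_EVENT_INTR	1

struct nvmm_vcpu_event {
	unsigned int type;
	uint8_t vector;
	union {
		struct {
			uint64_t error;
		} excp;
	} u;
};

/* Describe an interrupt to be delivered on the given vector. */
void
event_set_intr(struct nvmm_vcpu_event *ev, uint8_t vector)
{
	memset(ev, 0, sizeof(*ev));
	ev->type = NVMM_VCPU_EVENT_INTR;
	ev->vector = vector;
}

/* Describe an exception on the given vector, with its error code. */
void
event_set_excp(struct nvmm_vcpu_event *ev, uint8_t vector, uint64_t error)
{
	memset(ev, 0, sizeof(*ev));
	ev->type = NVMM_VCPU_EVENT_EXCP;
	ev->vector = vector;
	ev->u.excp.error = error;
}
```

With the real library, the helper would be applied to vcpu->event and followed by a call to nvmm_vcpu_inject(mach, vcpu). On x86, for instance, a general protection fault would use vector 13.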
Assist Callbacks
In order to assist emulation of certain operations, libnvmm requires emulator software to register, via nvmm_vcpu_configure(), a set of callbacks described in the following structure:
struct nvmm_assist_callbacks {
	void (*io)(struct nvmm_io *);
	void (*mem)(struct nvmm_mem *);
};
These callbacks are used by libnvmm each time nvmm_assist_io() or nvmm_assist_mem() is invoked. Emulator software that does not intend to use either of these assists can put NULL in the callbacks.
I/O Assist
When a VM exit occurs with reason NVMM_VCPU_EXIT_IO, it is necessary for emulator software to emulate the associated I/O operation. libnvmm provides an easy way for emulator software to do that.
nvmm_assist_io() will call the registered io callback function and pass it an nvmm_io structure as argument. This structure describes an I/O transaction:
struct nvmm_io {
	struct nvmm_machine *mach;
	struct nvmm_vcpu *vcpu;
	uint16_t port;
	bool in;
	size_t size;
	uint8_t *data;
};
The callback emulates the operation using this descriptor. There are two cases:
- The operation is an input. In this case, the callback should fill data with the desired value.
- The operation is an output. In this case, the callback should read data to retrieve the desired value.
In either case, port will indicate the I/O port, in will indicate if the operation is an input, and size will indicate the size of the access.
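The two cases can be illustrated with a small io callback that emulates a single one-byte scratch register. This is a sketch under stated assumptions: struct nvmm_io is re-declared locally from the listing above so that the code builds without <nvmm.h> (drop the copy when using the real header), and SCRATCH_PORT, the scratch variable, and the choice to return all-ones for unknown ports are inventions of this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct nvmm_machine;
struct nvmm_vcpu;

/* Stand-in copy of the structure above; real code includes <nvmm.h>. */
struct nvmm_io {
	struct nvmm_machine *mach;
	struct nvmm_vcpu *vcpu;
	uint16_t port;
	bool in;
	size_t size;
	uint8_t *data;
};

#define SCRATCH_PORT	0x80	/* hypothetical port chosen for this sketch */

static uint8_t scratch;		/* last byte the guest wrote to SCRATCH_PORT */

/* io callback: one scratch register; every other port reads as all-ones. */
void
io_callback(struct nvmm_io *io)
{
	if (io->in) {
		/* Input: fill data with the value the guest will read. */
		memset(io->data, 0xFF, io->size);
		if (io->port == SCRATCH_PORT && io->size >= 1)
			io->data[0] = scratch;
	} else if (io->port == SCRATCH_PORT && io->size >= 1) {
		/* Output: read data to learn what the guest wrote. */
		scratch = io->data[0];
	}
}
```

Such a function would be placed in the io member of struct nvmm_assist_callbacks and registered with nvmm_vcpu_configure() before nvmm_assist_io() is used.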
Mem Assist
When a VM exit occurs with reason NVMM_VCPU_EXIT_MEMORY, it is necessary for emulator software to emulate the associated memory operation. libnvmm provides an easy way for emulator software to do that, similar to the I/O Assist.
nvmm_assist_mem() will call the registered mem callback function and pass it an nvmm_mem structure as argument. This structure describes a memory transaction:
struct nvmm_mem {
	struct nvmm_machine *mach;
	struct nvmm_vcpu *vcpu;
	gpaddr_t gpa;
	bool write;
	size_t size;
	uint8_t *data;
};
The callback emulates the operation using this descriptor. There are two cases:
- The operation is a read. In this case, the callback should fill data with the desired value.
- The operation is a write. In this case, the callback should read data to retrieve the desired value.
In either case, gpa will indicate the guest physical address, write will indicate if the access is a write, and size will indicate the size of the access.
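A mem callback follows the same pattern. The sketch below emulates a 16-byte block of device registers at a fixed guest physical address. As before, it is self-contained rather than built against <nvmm.h>: struct nvmm_mem is re-declared from the listing above, gpaddr_t is assumed to be a 64-bit unsigned integer, and MMIO_BASE, MMIO_SIZE, mmio_regs and the all-ones result for out-of-range reads are hypothetical choices made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct nvmm_machine;
struct nvmm_vcpu;

typedef uint64_t gpaddr_t;	/* assumption: 64-bit guest physical address */

/* Stand-in copy of the structure above; real code includes <nvmm.h>. */
struct nvmm_mem {
	struct nvmm_machine *mach;
	struct nvmm_vcpu *vcpu;
	gpaddr_t gpa;
	bool write;
	size_t size;
	uint8_t *data;
};

#define MMIO_BASE	0xD0000000ULL	/* hypothetical device address */
#define MMIO_SIZE	16		/* hypothetical register block size */

static uint8_t mmio_regs[MMIO_SIZE];

/* mem callback: back a small register block with a host array. */
void
mem_callback(struct nvmm_mem *mem)
{
	/* The access must lie entirely inside the register block. */
	bool inside = mem->gpa >= MMIO_BASE &&
	    mem->size <= MMIO_SIZE &&
	    mem->gpa - MMIO_BASE <= MMIO_SIZE - mem->size;

	if (mem->write) {
		/* Write: read data to learn what the guest stored. */
		if (inside)
			memcpy(&mmio_regs[mem->gpa - MMIO_BASE], mem->data,
			    mem->size);
	} else if (inside) {
		/* Read: fill data with the value the guest will load. */
		memcpy(mem->data, &mmio_regs[mem->gpa - MMIO_BASE], mem->size);
	} else {
		memset(mem->data, 0xFF, mem->size);
	}
}
```

The bounds check is written so that neither the subtraction nor the array index can wrap, which matters because gpa and size come from guest-controlled state.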
RETURN VALUES
Upon successful completion, each of these functions returns zero. Otherwise, a value of -1 is returned and the global variable errno is set to indicate the error.
FILES
- https://www.netbsd.org/~maxv/nvmm/nvmm-demo.zip
  Functional example (demonstrator). Contains an emulator that uses the libnvmm API, and a small kernel that exercises this emulator.
- src/sys/dev/nvmm/
  Source code of the kernel NVMM driver.
- src/lib/libnvmm/
  Source code of the libnvmm library.
ERRORS
These functions will fail if:
- [EEXIST] An attempt was made to create a machine or a VCPU that already exists.
- [EFAULT] An attempt was made to emulate a memory-based operation in a guest, and the guest page tables did not have the permissions necessary for the operation to complete successfully.
- [EINVAL] An inappropriate parameter was used.
- [ENOBUFS] The maximum number of machines or VCPUs was reached.
- [ENOENT] A query was made on a machine or a VCPU that does not exist.
- [EPERM] An attempt was made to access a machine that does not belong to the process.
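Since every function reports failure through a return value of -1 and errno, an emulator may want to turn the codes listed above into short diagnostics. The helper below is one possible sketch; nvmm_errno_hint() is a hypothetical name, not part of libnvmm, and the message strings merely paraphrase this section.

```c
#include <errno.h>

/* Return a short hint for an errno value set by a libnvmm function. */
const char *
nvmm_errno_hint(int err)
{
	switch (err) {
	case EEXIST:
		return "machine or VCPU already exists";
	case EFAULT:
		return "guest page tables lack the required permissions";
	case EINVAL:
		return "inappropriate parameter";
	case ENOBUFS:
		return "maximum number of machines or VCPUs reached";
	case ENOENT:
		return "machine or VCPU does not exist";
	case EPERM:
		return "machine does not belong to this process";
	default:
		return "error not listed in ERRORS";
	}
}
```

A caller would typically test the return value of a libnvmm function against -1 and then pass errno to this helper, falling back to strerror(3) for codes it does not cover.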
SEE ALSO
AUTHORS
NVMM was designed and implemented by Maxime Villard.