NAME
kmem - kernel wired memory allocator
SYNOPSIS
#include <sys/kmem.h>
void *
kmem_alloc(size_t size, km_flag_t kmflags);
void *
kmem_zalloc(size_t size, km_flag_t kmflags);
void
kmem_free(void *p, size_t size);
void *
kmem_intr_alloc(size_t size, km_flag_t kmflags);
void *
kmem_intr_zalloc(size_t size, km_flag_t kmflags);
void
kmem_intr_free(void *p, size_t size);
char *
kmem_asprintf(const char *fmt, ...);
char *
kmem_strdupsize(const char *str, size_t *size, km_flag_t kmflags);
void
kmem_strfree(char *str);
options KMEM_SIZE
options KMEM_REDZONE
options KMEM_GUARD
DESCRIPTION
kmem_alloc() allocates kernel wired memory. It takes the following arguments.
- size
- Specify the size of allocation in bytes.
- kmflags
- Either of the following:
KM_SLEEP
- If the allocation cannot be satisfied immediately, sleep until enough memory is available. If KM_SLEEP is specified, then the allocation cannot fail.
KM_NOSLEEP
- Don't sleep. Immediately return NULL if there is not enough memory available. It should only be used when failure to allocate will not have harmful, user-visible effects.
Use of KM_NOSLEEP is strongly discouraged as it can create transient, hard-to-debug failures that occur when the system is under memory pressure.
In situations where it is not possible to sleep, for example because locks are held by the caller, the code path should be restructured to allow the allocation to be made in another place.
The contents of allocated memory are uninitialized.
Unlike Solaris, kmem_alloc(0, flags) is illegal.
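Because kmem_free() needs the original size, callers usually keep the size next to the pointer. The following is a minimal sketch of that pairing, not code from the kernel. The struct sample_buf type and its two helpers are hypothetical, and kmem_alloc(), kmem_free(), km_flag_t and the KM_* values are userland stand-ins built on malloc(3) so that the example compiles outside a kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Userland stand-ins (assumptions) so the sketch builds outside a kernel.
 * In the kernel these names come from <sys/kmem.h>.
 */
typedef int km_flag_t;
#define KM_SLEEP	0x01
#define KM_NOSLEEP	0x02

static void *
kmem_alloc(size_t size, km_flag_t kmflags)
{
	void *p;

	assert(size != 0);		/* kmem_alloc(0, flags) is illegal */
	p = malloc(size);
	if (p == NULL && (kmflags & KM_SLEEP) != 0)
		abort();		/* KM_SLEEP allocations cannot fail */
	return p;
}

static void
kmem_free(void *p, size_t size)
{
	assert(p != NULL);		/* freeing NULL is illegal */
	assert(size != 0);
	free(p);
}

/* Hypothetical record that keeps the allocation size with the pointer. */
struct sample_buf {
	size_t	sb_len;		/* size that was passed to kmem_alloc() */
	char	*sb_data;	/* contents are uninitialized until written */
};

/* Returns 0 on success, -1 if len is zero (zero is never passed on). */
static int
sample_buf_init(struct sample_buf *sb, size_t len)
{
	if (len == 0)
		return -1;
	sb->sb_data = kmem_alloc(len, KM_SLEEP);
	sb->sb_len = len;
	return 0;
}

/* Frees with exactly the size recorded at allocation time. */
static void
sample_buf_fini(struct sample_buf *sb)
{
	kmem_free(sb->sb_data, sb->sb_len);
	sb->sb_data = NULL;
	sb->sb_len = 0;
}
```

Storing the length in the record means the free path never has to guess or recompute the size.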
kmem_zalloc() is the equivalent of kmem_alloc(), except that it initializes the memory to zero.
kmem_asprintf() works like the well-known asprintf() function, but allocates memory using kmem_alloc(). This routine can sleep during allocation. The size of the allocated area is the length of the returned character string, plus one (for the NUL terminator). This must be taken into consideration when freeing the returned area with kmem_free().
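The size rule for strings returned by kmem_asprintf() can be sketched as follows. This is an illustration under stated assumptions: kmem_asprintf() and kmem_free() are userland stand-ins written with vsnprintf(3) and malloc(3), and format_and_release() and last_freed_size are hypothetical names that exist only for this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Userland stand-in (assumption) for the kernel's kmem_asprintf(). */
static char *
kmem_asprintf(const char *fmt, ...)
{
	va_list ap;
	int len;
	char *str;

	/* First pass: measure the formatted length. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	assert(len >= 0);

	str = malloc((size_t)len + 1);
	if (str == NULL)
		abort();

	/* Second pass: format into the buffer. */
	va_start(ap, fmt);
	vsnprintf(str, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return str;
}

/* Hypothetical bookkeeping: remembers the size given to the last free. */
static size_t last_freed_size;

/* Userland stand-in (assumption) for the kernel's kmem_free(). */
static void
kmem_free(void *p, size_t size)
{
	assert(p != NULL);
	last_freed_size = size;
	free(p);
}

/* Hypothetical helper: formats a name, then frees it with the right size. */
static size_t
format_and_release(const char *prefix, int unit)
{
	char *name = kmem_asprintf("%s%d", prefix, unit);
	size_t len = strlen(name);

	/* The allocation is strlen() + 1 bytes: the string plus its NUL. */
	kmem_free(name, len + 1);
	return len;
}
```

Passing only strlen() to kmem_free() would understate the allocation by one byte, which is the mistake the paragraph above warns about.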
kmem_free() frees kernel wired memory allocated by kmem_alloc() or kmem_zalloc() so that it can be used for other purposes. It takes the following arguments.
- p
- The pointer to the memory being freed. It must be the one returned by kmem_alloc() or kmem_zalloc().
- size
- The size of the memory being freed, in bytes. It must be the same as the size argument used for kmem_alloc() or kmem_zalloc() when the memory was allocated.
Freeing NULL is illegal.
kmem_intr_alloc(), kmem_intr_zalloc() and kmem_intr_free() are the equivalents of the above kmem routines that can be called from interrupt context. These routines are intended for special cases only. Normally, pool_cache(9) should be used for memory allocation from interrupt context.
The kmem_strdupsize() function is a utility function that copies the string in the str argument to a new buffer allocated using kmem_alloc(). If the size argument is not NULL, the size of the allocation (the length of the string plus the trailing NUL) is returned through it.
The kmem_strfree() function frees a NUL-terminated string. It computes the allocation size as the length of the string reported by strlen(3) plus one for the NUL, and then calls kmem_free() with that size.
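The two ways of releasing a duplicated string can be compared in a short sketch. All five kmem functions below are userland stand-ins written from the descriptions above, not the kernel's implementation, and copy_twice() and bytes_outstanding are hypothetical names used only to show that the sizes balance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Userland stand-ins (assumptions); the kernel provides these itself. */
typedef int km_flag_t;
#define KM_SLEEP	0x01

/* Hypothetical bookkeeping: bytes allocated but not yet freed. */
static size_t bytes_outstanding;

static void *
kmem_alloc(size_t size, km_flag_t kmflags)
{
	void *p = malloc(size);

	if (p == NULL && (kmflags & KM_SLEEP) != 0)
		abort();
	if (p != NULL)
		bytes_outstanding += size;
	return p;
}

static void
kmem_free(void *p, size_t size)
{
	assert(p != NULL);
	bytes_outstanding -= size;
	free(p);
}

static char *
kmem_strdupsize(const char *str, size_t *size, km_flag_t kmflags)
{
	size_t len = strlen(str) + 1;	/* string plus trailing NUL */
	char *dst = kmem_alloc(len, kmflags);

	if (dst == NULL)
		return NULL;
	memcpy(dst, str, len);
	if (size != NULL)
		*size = len;		/* optional: report allocation size */
	return dst;
}

static void
kmem_strfree(char *str)
{
	kmem_free(str, strlen(str) + 1);
}

/* Hypothetical caller: one copy per release style. */
static size_t
copy_twice(const char *src)
{
	size_t size;
	char *a = kmem_strdupsize(src, &size, KM_SLEEP);
	char *b = kmem_strdupsize(src, NULL, KM_SLEEP);

	kmem_free(a, size);	/* size was reported by kmem_strdupsize() */
	kmem_strfree(b);	/* size is recomputed from the string */
	return size;
}
```

Since kmem_strfree() recomputes the size with strlen(3), it appears safe only while the string still has the length it had when allocated; a caller that shortens the string in place would presumably need to keep the reported size and call kmem_free() instead.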
NOTES
Making KM_SLEEP allocations while holding mutexes or reader/writer locks is discouraged, as the caller can sleep for an unbounded amount of time in order to satisfy the allocation. This can in turn block other threads that wish to acquire locks held by the caller. It should be noted that kmem_free() may also block.
For some locks this is permissible or even unavoidable. For others, particularly locks that may be taken from soft interrupt context, it is a serious problem. As a general rule it is better not to allow this type of situation to develop. One way to circumvent the problem is to make allocations speculative and part of a retryable sequence. For example:
	retry:
		/* speculative unlocked check */
		if (need to allocate) {
			new_item = kmem_alloc(sizeof(*new_item), KM_SLEEP);
		} else {
			new_item = NULL;
		}
		mutex_enter(lock);
		/* check while holding lock for true status */
		if (need to allocate) {
			if (new_item == NULL) {
				mutex_exit(lock);
				goto retry;
			}
			consume(new_item);
			new_item = NULL;
		}
		mutex_exit(lock);
		if (new_item != NULL) {
			/* did not use it after all */
			kmem_free(new_item, sizeof(*new_item));
		}
OPTIONS
KMEM_SIZE
Kernels compiled with the KMEM_SIZE option ensure the size given in kmem_free() matches the actual allocated size. On kmem_alloc(), the kernel will allocate an additional contiguous kmem page of eight bytes in the buffer, will register the allocated size in the first kmem page of that buffer, and will return a pointer to the second kmem page in that same buffer. When freeing, the kernel reads the first page, and compares the size registered with the one given in kmem_free(). Any mismatch triggers a panic.
KMEM_SIZE is enabled by default on DIAGNOSTIC and DEBUG kernels.
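The size bookkeeping described for KMEM_SIZE can be pictured with a small userland model. This is a sketch of the idea only, not the code in sys/kern/subr_kmem.c: sized_alloc(), sized_free() and SIZE_HDR are hypothetical names, malloc(3) stands in for the real allocator, and a mismatch is reported with a return value where the kernel would panic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SIZE_HDR	8	/* eight-byte header holding the size */

/* Hypothetical: allocate size bytes preceded by a header storing size. */
static void *
sized_alloc(size_t size)
{
	uint8_t *base;
	uint64_t recorded = size;

	assert(size != 0);
	base = malloc(SIZE_HDR + size);
	if (base == NULL)
		return NULL;
	memcpy(base, &recorded, sizeof(recorded));	/* register the size */
	return base + SIZE_HDR;				/* caller sees the rest */
}

/*
 * Hypothetical: returns 1 and releases the memory if size matches the
 * registered size; returns 0 and leaves the memory alone on a mismatch
 * (the point at which a kernel built with KMEM_SIZE would panic).
 */
static int
sized_free(void *p, size_t size)
{
	uint8_t *base = (uint8_t *)p - SIZE_HDR;
	uint64_t recorded;

	memcpy(&recorded, base, sizeof(recorded));
	if (recorded != size)
		return 0;
	free(base);
	return 1;
}
```

The same layout explains why an underflow is detectable: a write just below the returned pointer lands in the header and corrupts the registered size.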
KMEM_REDZONE
Kernels compiled with the KMEM_REDZONE option add a dynamic pattern of two bytes at the end of each allocated buffer, and check this pattern when freeing to ensure the caller hasn't written outside the requested area. This option does not introduce a significant performance impact, but has two drawbacks: it only catches write overflows, and catches them only on kmem_free().
KMEM_REDZONE is enabled by default on DIAGNOSTIC kernels.
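A red zone of this kind can be modelled in userland as shown below. The sketch is an assumption-laden illustration: rz_alloc(), rz_free() and rz_pattern() are hypothetical names, the way the two pattern bytes are derived from the buffer address is invented for the example and is not taken from the kernel, and a damaged pattern is reported with a return value where the kernel would panic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RZ_LEN	2	/* two pattern bytes after the requested area */

/* Hypothetical "dynamic" pattern: derived from the buffer address. */
static void
rz_pattern(const void *p, uint8_t out[RZ_LEN])
{
	uintptr_t v = (uintptr_t)p;

	out[0] = (uint8_t)((v >> 4) ^ 0xa5);
	out[1] = (uint8_t)((v >> 12) ^ 0x5a);
}

/* Hypothetical: allocate size bytes followed by the red zone. */
static void *
rz_alloc(size_t size)
{
	uint8_t pat[RZ_LEN];
	uint8_t *p;

	assert(size != 0);
	p = malloc(size + RZ_LEN);
	if (p == NULL)
		return NULL;
	rz_pattern(p, pat);
	memcpy(p + size, pat, RZ_LEN);
	return p;
}

/*
 * Hypothetical: frees the buffer and returns 1 if the red zone is intact,
 * 0 if it was overwritten (where the kernel would panic instead).
 */
static int
rz_free(void *p, size_t size)
{
	uint8_t pat[RZ_LEN];
	int intact;

	rz_pattern(p, pat);
	intact = memcmp((uint8_t *)p + size, pat, RZ_LEN) == 0;
	free(p);
	return intact;
}
```

The model also shows both drawbacks named above: a read past the end never changes the pattern, and a write past the end goes unnoticed until the buffer is freed.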
KMEM_GUARD
Kernels compiled with the KMEM_GUARD option perform CPU-intensive sanity checks on kmem operations, adding very high-overhead runtime verification. It must be enabled together with KMEM_SIZE.
KMEM_GUARD tries to catch the following types of bugs:
- Overflow at time of occurrence, by means of a guard page. An unmapped guard page sits immediately after the requested area; a read/write overflow therefore triggers a page fault.
- Underflow at kmem_free(), by using KMEM_SIZE's registered size. If an underflow occurs, the size stored by KMEM_SIZE will be overwritten, which means that when freeing, the kernel will spot the mismatch.
- Use-after-free at time of occurrence. When freeing, the memory is unmapped, and depending on the value of kmem_guard_depth, the kernel will more or less delay the recycling of that memory. Any subsequent read/write access to the memory will therefore trigger a page fault, provided it has not been recycled yet.
To enable it, boot the system with the -d option, which causes the debugger to be entered early during the kernel boot process. Issue commands such as the following:
	db> w kmem_guard_depth 0t30000
	db> c
This instructs kmem_guard to queue up to 60000 (30000*2) pages of unmapped KVA to catch use-after-free type errors. When kmem_free() is called, memory backing a freed item is unmapped and the kernel VA space pushed onto a FIFO. The VA space will not be reused until another 30k items have been freed. Until reused the kernel will catch invalid accesses and panic with a page fault.
Limitations:
- It has a severe impact on performance.
- It is best used on a 64-bit machine with lots of RAM.
KMEM_GUARD is enabled by default on DEBUG kernels.
RETURN VALUES
On success, kmem_alloc(), kmem_asprintf(), kmem_intr_alloc(), kmem_intr_zalloc(), kmem_strdupsize(), and kmem_zalloc() return a pointer to allocated memory. Otherwise, NULL is returned.
CODE REFERENCES
The kmem subsystem is implemented within the file sys/kern/subr_kmem.c.
SEE ALSO
intro(9), memoryallocators(9), percpu(9), pool_cache(9), uvm_km(9)
CAVEATS
The kmem_alloc(), kmem_asprintf(), kmem_free(), kmem_strdupsize(), kmem_strfree(), and kmem_zalloc() functions cannot be used from interrupt context, from a soft interrupt, or from a callout. Use pool_cache(9) in these situations.
SECURITY CONSIDERATIONS
As the memory allocated by kmem_alloc() is uninitialized, it can contain security-sensitive data left by its previous user. It is the caller's responsibility not to expose it to the world.