NAME
membar_ops, membar_enter, membar_exit, membar_producer, membar_consumer, membar_datadep_consumer, membar_sync - memory ordering barriers
SYNOPSIS
#include <sys/atomic.h>

void membar_enter(void);
void membar_exit(void);
void membar_producer(void);
void membar_consumer(void);
void membar_datadep_consumer(void);
void membar_sync(void);
DESCRIPTION
The membar_ops family of functions prevents reordering of memory operations, as needed for synchronization in multiprocessor execution environments that have relaxed load and store order.

In general, memory barriers must come in pairs: a barrier on one CPU, such as membar_exit(), must pair with a barrier on another CPU, such as membar_enter(), in order to synchronize anything between the two CPUs. Code using membar_ops should generally be annotated with comments identifying how the barriers are paired.

membar_ops affect only operations on regular memory, not on device memory; see bus_space(9) and bus_dma(9) for machine-independent interfaces for handling device memory and DMA operations in device drivers.

Unlike in C11, all memory operations (that is, all loads and stores on regular memory) are affected by membar_ops, not just C11 atomic operations on _Atomic-qualified objects.

membar_enter()
    Any store preceding membar_enter() will happen before all memory operations following it.

    An atomic read/modify/write operation (atomic_ops(3)) followed by a membar_enter() implies a load-acquire operation in the language of C11.

    WARNING: A load followed by membar_enter() does not imply a load-acquire operation, even though membar_exit() followed by a store implies a store-release operation; the symmetry of these names and asymmetry of the semantics is a historical mistake. In the NetBSD kernel, you can use atomic_load_acquire(9) for a load-acquire operation without any atomic read/modify/write.

    membar_enter() is typically used in code that implements locking primitives to ensure that a lock protects its data, and is typically paired with membar_exit(); see below for an example.

membar_exit()
    All memory operations preceding membar_exit() will happen before any store that follows it.

    A membar_exit() followed by a store implies a store-release operation in the language of C11. For a regular store, rather than an atomic read/modify/write store, you should use atomic_store_release(9) instead of membar_exit() followed by the store.

    membar_exit() is typically used in code that implements locking primitives to ensure that a lock protects its data, and is typically paired with membar_enter(). For example:

        /* thread A */
        obj->state.mumblefrotz = 42;
        KASSERT(valid(&obj->state));
        membar_exit();
        obj->lock = 0;

        /* thread B */
        if (atomic_cas_uint(&obj->lock, 0, 1) != 0)
            return;
        membar_enter();
        KASSERT(valid(&obj->state));
        obj->state.mumblefrotz--;

    In this example, if the atomic_cas_uint() operation in thread B witnesses the store obj->lock = 0 from thread A, then everything in thread A before the membar_exit() is guaranteed to happen before everything in thread B after the membar_enter(), as if the machine had sequentially executed:

        obj->state.mumblefrotz = 42;    /* from thread A */
        KASSERT(valid(&obj->state));
        ...
        KASSERT(valid(&obj->state));    /* from thread B */
        obj->state.mumblefrotz--;

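The same locking pattern can be sketched in portable C11, which may help when comparing against the load-acquire and store-release terminology used above. This is only an illustrative analogue, not the NetBSD API: `struct obj`, `obj_trylock()`, `obj_unlock()`, and `obj_frob()` are hypothetical names, and the C11 acquire compare-and-swap and release store stand in for `atomic_cas_uint()` plus `membar_enter()` and for `membar_exit()` plus the store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object: a lock word plus the state it protects. */
struct obj {
	atomic_uint lock;	/* 0 = free, 1 = held */
	int mumblefrotz;	/* protected by lock */
};

/*
 * Try to take the lock.  The acquire ordering on a successful
 * compare-and-swap stands in for atomic_cas_uint() followed by
 * membar_enter().  Pairs with the release store in obj_unlock().
 */
bool
obj_trylock(struct obj *o)
{
	unsigned expected = 0;

	return atomic_compare_exchange_strong_explicit(&o->lock,
	    &expected, 1, memory_order_acquire, memory_order_relaxed);
}

/*
 * Release the lock.  The release store stands in for membar_exit()
 * followed by the store obj->lock = 0.  Pairs with the acquire
 * compare-and-swap in obj_trylock().
 */
void
obj_unlock(struct obj *o)
{
	/* The caller must hold the lock. */
	assert(atomic_load_explicit(&o->lock, memory_order_relaxed) == 1);
	atomic_store_explicit(&o->lock, 0, memory_order_release);
}

/* Decrement the protected field if the lock is free; report success. */
bool
obj_frob(struct obj *o)
{
	if (!obj_trylock(o))
		return false;
	o->mumblefrotz--;
	obj_unlock(o);
	return true;
}
```

Note that each function's comment names the barrier it pairs with, following the annotation advice in the DESCRIPTION.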
    membar_exit() followed by a store, serving as a store-release operation, may also be paired with a subsequent load followed by membar_sync(), serving as the corresponding load-acquire operation. However, you should use atomic_store_release(9) and atomic_load_acquire(9) instead in that situation, unless the store is an atomic read/modify/write which requires a separate membar_exit().

membar_producer()
    All stores preceding membar_producer() will happen before any stores following it.

    membar_producer() has no analogue in C11.

    membar_producer() is typically used in code that produces data for read-only consumers which use membar_consumer(), such as ‘seqlocked’ snapshots of statistics; see below for an example.

membar_consumer()
    All loads preceding membar_consumer() will complete before any loads after it.

    membar_consumer() has no analogue in C11.

    membar_consumer() is typically used in code that reads data from producers which use membar_producer(), such as ‘seqlocked’ snapshots of statistics. For example:

        struct {
            /* version number and in-progress bit */
            unsigned    seq;

            /* read-only statistics, too large for atomic load */
            unsigned    foo;
            int         bar;
            uint64_t    baz;
        } stats;

        /* producer (must be serialized, e.g. with mutex(9)) */
        stats.seq |= 1;     /* mark update in progress */
        membar_producer();
        stats.foo = count_foo();
        stats.bar = measure_bar();
        stats.baz = enumerate_baz();
        membar_producer();
        stats.seq++;        /* bump version number */

        /* consumer (in parallel w/ producer, other consumers) */
    restart:
        while ((seq = stats.seq) & 1)   /* wait for update */
            SPINLOCK_BACKOFF_HOOK;
        membar_consumer();
        foo = stats.foo;    /* read out a candidate snapshot */
        bar = stats.bar;
        baz = stats.baz;
        membar_consumer();
        if (seq != stats.seq)   /* try again if version changed */
            goto restart;

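For comparison, here is a minimal sketch of the same ‘seqlocked’ snapshot written with portable C11 atomics. Since membar_producer() and membar_consumer() have no exact C11 analogue, this sketch assumes that the stronger C11 release and acquire orderings are an acceptable substitute, and it makes every field atomic so that concurrent access is well defined in C11. The names `struct stats`, `struct snapshot`, `stats_update()`, and `stats_tryread()` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared statistics block guarded by a sequence number. */
struct stats {
	atomic_uint seq;	/* version number; odd = update in progress */
	atomic_uint foo;
	atomic_int bar;
	_Atomic uint64_t baz;
};

/* Plain copy handed back to a reader. */
struct snapshot {
	unsigned foo;
	int bar;
	uint64_t baz;
};

/* Producer.  Callers must serialize updates among themselves. */
void
stats_update(struct stats *s, unsigned foo, int bar, uint64_t baz)
{
	unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert((seq & 1) == 0);	/* no other update may be in progress */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	/* Stands in for the first membar_producer(). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->foo, foo, memory_order_relaxed);
	atomic_store_explicit(&s->bar, bar, memory_order_relaxed);
	atomic_store_explicit(&s->baz, baz, memory_order_relaxed);
	/* Release store stands in for membar_producer() then seq++. */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/*
 * Consumer.  Returns true and fills *out if a consistent snapshot was
 * read; returns false if an update was in progress or intervened, in
 * which case the caller should retry.
 */
bool
stats_tryread(struct stats *s, struct snapshot *out)
{
	/* Acquire load stands in for the load then membar_consumer(). */
	unsigned seq = atomic_load_explicit(&s->seq, memory_order_acquire);

	if (seq & 1)
		return false;
	out->foo = atomic_load_explicit(&s->foo, memory_order_relaxed);
	out->bar = atomic_load_explicit(&s->bar, memory_order_relaxed);
	out->baz = atomic_load_explicit(&s->baz, memory_order_relaxed);
	/* Stands in for the second membar_consumer(). */
	atomic_thread_fence(memory_order_acquire);
	return seq == atomic_load_explicit(&s->seq, memory_order_relaxed);
}
```

A reader would typically call `stats_tryread()` in a loop until it returns true, backing off between attempts.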
membar_datadep_consumer()
    Same as membar_consumer(), but limited to loads of addresses dependent on prior loads, or ‘data-dependent’ loads:

        int **pp, *p, v;

        p = *pp;
        membar_datadep_consumer();
        v = *p;
        consume(v);

    membar_datadep_consumer() is typically paired with membar_exit() by code that initializes an object before publishing it. However, you should use atomic_store_release(9) and atomic_load_consume(9) instead, to avoid obscure edge cases in case the consumer is not read-only.

    membar_datadep_consumer() does not guarantee ordering of loads in branches, or ‘control-dependent’ loads; you must use membar_consumer() instead:

        int *ok, *p, v;

        if (*ok) {
            membar_consumer();
            v = *p;
            consume(v);
        }

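The initialize-then-publish pairing described above can be sketched in portable C11 as follows. This is an illustrative analogue only: `struct frotz`, `frotz_publish()`, and `frotz_consume()` are hypothetical names, the release store stands in for membar_exit() followed by a store (or atomic_store_release(9)), and `memory_order_consume` is assumed to be the closest C11 counterpart of a load followed by membar_datadep_consumer() (or atomic_load_consume(9)). Compilers commonly treat it as an acquire load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object that is initialized once and then published. */
struct frotz {
	int value;
};

/*
 * Initialize *f, then publish its address in *slot.  The release
 * store orders the initialization before the pointer becomes visible.
 * Pairs with the consume load in frotz_consume().
 */
void
frotz_publish(_Atomic(struct frotz *) *slot, struct frotz *f, int value)
{
	assert(f != NULL);
	f->value = value;
	atomic_store_explicit(slot, f, memory_order_release);
}

/*
 * Read the published object, or return fallback if nothing has been
 * published yet.  The load of f->value is data-dependent on the load
 * of the pointer.  Pairs with the release store in frotz_publish().
 */
int
frotz_consume(_Atomic(struct frotz *) *slot, int fallback)
{
	struct frotz *f = atomic_load_explicit(slot, memory_order_consume);

	if (f == NULL)
		return fallback;
	return f->value;
}
```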
    Most CPUs do not reorder data-dependent loads (i.e., most CPUs guarantee that cached values are not stale in that case), so membar_datadep_consumer() is a no-op on those CPUs.

membar_sync()
    All memory operations preceding membar_sync() will happen before any memory operations following it.

    membar_sync() is a sequential consistency acquire/release barrier, analogous to atomic_thread_fence(memory_order_seq_cst) in C11.

    membar_sync() is typically paired with membar_sync().

    A load followed by membar_sync(), serving as a load-acquire operation, may also be paired with a prior membar_exit() followed by a store, serving as the corresponding store-release operation. However, you should use atomic_load_acquire(9) instead of load-then-membar_sync() if it is a regular load, or membar_enter() instead of membar_sync() if the load is in an atomic read/modify/write operation.

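One situation where a full barrier paired with another full barrier is needed is a store followed by a load of a different location, as in Dekker-style mutual announcement. The sketch below uses the C11 fence that the text above names as the analogue of membar_sync(). The function name `announce_and_check()` and its flag arguments are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Set our own flag, then report whether the peer's flag is set.
 *
 * The sequentially consistent fence stands in for membar_sync() and
 * pairs with the same fence executed by the peer, which calls this
 * function with the two arguments swapped.  With both fences in
 * place, two concurrent callers cannot both return false.
 */
bool
announce_and_check(atomic_bool *mine, atomic_bool *theirs)
{
	assert(mine != theirs);
	atomic_store_explicit(mine, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(theirs, memory_order_relaxed);
}
```

Weaker barriers would not be enough here, because none of them orders an earlier store before a later load.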
SEE ALSO
HISTORY
The membar_ops functions first appeared in NetBSD 5.0. The data-dependent load barrier, membar_datadep_consumer(), first appeared in NetBSD 7.0.