NAME
HAMMER — HAMMER file system
SYNOPSIS
To compile this driver into the kernel, place the following line in your kernel configuration file:
options HAMMER
Alternatively, to load the driver as a module at boot time, place the following line in loader.conf(5):
hammer_load="YES"
To mount via fstab(5):
/dev/ad0s1d[:/dev/ad1s1d:...] /mnt hammer rw 2 0
DESCRIPTION
The HAMMER
file system provides facilities to store file
system data onto disk devices and is intended to replace
ffs(5) as the default file system for
DragonFly.
Among its features are instant crash recovery, large file systems spanning multiple volumes, data integrity checking, data deduplication, fine grained history retention and snapshots, pseudo-filesystems (PFSs), mirroring capability and unlimited number of files and links.
All functions related to managing HAMMER
file systems are provided by the
newfs_hammer(8),
mount_hammer(8),
hammer(8),
sysctl(8),
chflags(1), and
undo(1) utilities.
For a more detailed introduction refer to the paper and slides
listed in the SEE ALSO section. For some
common usages of HAMMER
see the
EXAMPLES section below.
Description of HAMMER
features:
Instant Crash Recovery
After a non-graceful system shutdown,
HAMMER
file systems will be brought back into a
fully coherent state when mounting the file system, usually within a few
seconds.
In the unlikely case that a HAMMER mount fails due to the redo recovery
(stage 2 recovery) log being corrupted, a workaround to skip this stage can
be applied by setting the following tunable:
vfs.hammer.skip_redo=<value>
Possible values are:
0 - Run redo recovery normally and fail to mount in the case of error (default).
1 - Run redo recovery but continue mounting if an error appears.
2 - Completely bypass redo recovery.
Related commands: mount_hammer(8)
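As a hypothetical example, stage 2 recovery could be bypassed entirely by setting the tunable in loader.conf(5) (or at runtime via sysctl(8)):

```shell
# Completely bypass redo recovery (value 2); use only as a
# workaround when the redo log is known to be corrupted.
vfs.hammer.skip_redo=2
```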
Large File Systems & Multi Volume
A HAMMER
file system can be up to 1
Exabyte in size. It can span up to 256 volumes, each volume occupies a
DragonFly disk slice or partition, or another
special file, and can be up to 4096 TB in size. Minimum recommended
HAMMER
file system size is 50 GB. For volumes over 2
TB in size gpt(8) and
disklabel64(8) normally need to be used.
Related hammer(8) commands: volume-add, volume-del, volume-list,
volume-blkdevs; see also newfs_hammer(8)
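As an illustration (device name and mount point hypothetical), a volume can be added to and listed for a mounted file system:

```shell
# Add another volume to the HAMMER file system mounted on /home
hammer volume-add /dev/ad2s1d /home
# List the volumes backing the file system
hammer volume-list /home
```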
Data Integrity Checking
HAMMER places a strong focus on data integrity;
CRC checks are made for all major structures and data.
HAMMER snapshots implement features to make data
integrity checking easier: the atime and mtime fields are locked to the
ctime for files accessed via a snapshot. The st_dev
field is based on the PFS shared-uuid and not on any
real device. This means that archiving the contents of a snapshot with e.g.
tar(1) and piping it to something like
md5(1) will yield a consistent result. The consistency is also
retained on mirroring targets.
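This consistency can be checked, for example, by hashing a snapshot's contents on both the source and a mirroring target; with a snapshot softlink such as /snaps/snap1 (hypothetical path), both sides should yield the same digest:

```shell
# Archive the snapshot and hash the stream; repeating this on a
# mirror target should produce an identical digest.
tar -cf - /snaps/snap1 | md5
```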
Data Deduplication
To save disk space, data deduplication can be used. Deduplication identifies data blocks which occur multiple times and stores only one copy; multiple references are made to this copy.
Related hammer(8) commands: dedup, dedup-simulate, cleanup, config
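For instance, the potential space saving can be estimated before actually deduplicating (mount point hypothetical):

```shell
# Estimate how much space deduplication would reclaim
hammer dedup-simulate /home
# Perform the actual deduplication pass
hammer dedup /home
```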
Transaction IDs
The HAMMER
file system uses 64-bit
transaction ids to refer to historical file or directory data. Transaction
ids used by HAMMER
are monotonically increasing over
time. In other words: when a transaction is made,
HAMMER
will always use higher transaction ids for
following transactions. A transaction id is given in hexadecimal format
0x016llx, such as 0x00000001061a8ba6.
Related hammer(8) commands: snapshot, snap, snaplo, snapq, snapls, synctid
History & Snapshots
History metadata on the media is written with every sync
operation, so that by default the resolution of a file's history is 30-60
seconds until the next prune operation. Prior versions of files and
directories are generally accessible by appending
‘@@’ and a transaction id to the name.
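For example, assuming a file named README (hypothetical), the transaction ids recorded for it can be listed and a prior version read directly:

```shell
# List transaction ids recorded for the file
hammer history README
# Access the file as of a given transaction id
cat README@@0x00000001061a8ba6
```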
The common way of accessing history, however, is by taking snapshots.
Snapshots are softlinks to prior versions of directories and their
files. Their data will be retained across prune operations for as long as
the softlink exists. Removing the softlink enables the file system to
reclaim the space again upon the next prune & reblock operations. In
HAMMER
Version 3+ snapshots are also maintained as
file system meta-data.
Related hammer(8) commands: cleanup, history, snapshot, snap, snaplo,
snapq, snaprm, snapls, config, viconfig; see also undo(1)
Pruning & Reblocking
Pruning is the act of deleting file system history. By default
only history used by the given snapshots and history from after the latest
snapshot will be retained. By setting the per-PFS parameter prune-min,
history is guaranteed to be retained for at least this time interval. All
other history is deleted. Reblocking will reorder all elements and thus
defragment the file system and free space for reuse. After pruning, a file
system must be reblocked to recover all available space. Reblocking is
needed even when using the nohistory mount_hammer(8) option or
chflags(1) flag.
Related hammer(8) commands: cleanup, snapshot, prune, prune-everything,
rebalance, reblock, reblock-btree, reblock-inodes, reblock-dirs,
reblock-data
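For instance, prune-min can be set on a PFS with the pfs-update command (path and interval hypothetical):

```shell
# Guarantee that at least 30 days of history survive pruning
hammer pfs-update /home prune-min=30d
```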
Pseudo-Filesystems (PFSs)
A pseudo-filesystem, PFS for short, is a sub file system in a
HAMMER
file system. All disk space in a
HAMMER
file system is shared between all PFSs in it,
so each PFS is free to use all remaining space. A
HAMMER
file system supports up to 65536 PFSs. The
root of a HAMMER
file system is PFS# 0, it is called
the root PFS and is always a master PFS.
A non-root PFS can be either master or slave. Slaves are always read-only, so they can't be updated by normal file operations, only by hammer(8) operations like mirroring and pruning. Upgrading slaves to masters and downgrading masters to slaves are supported.
It is recommended to use a null mount to
access a PFS, except for the root PFS; this way no tools are confused by the
PFS root being a symlink and inodes not being unique across a
HAMMER file system.
Many hammer(8) operations operate per PFS; these include mirroring, offline deduplication, pruning, reblocking and rebalancing.
Related hammer(8) commands: pfs-master, pfs-slave, pfs-status, pfs-update,
pfs-destroy, pfs-upgrade, pfs-downgrade; see also mount_null(8)
Mirroring
Mirroring is the copying of all data in a file system, including
snapshots and other historical data. In order to allow inode numbers to be
duplicated on the slaves, the HAMMER mirroring feature
uses PFSs. A master or slave PFS can be mirrored to a slave PFS, i.e. for
mirroring multiple slaves per master are supported, but multiple masters per
slave are not. HAMMER
does not support multi-master
clustering and mirroring.
Related hammer(8) commands: mirror-copy, mirror-stream, mirror-read,
mirror-read-stream, mirror-write, mirror-dump
Fsync Flush Modes
The HAMMER
file system implements several
different fsync() flush modes; the mode used is set via the
vfs.hammer.flush_mode sysctl, see
hammer(8) for details.
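For example, the current mode can be inspected, or a different one selected, with sysctl(8) (the meaning of each value, and the value chosen here, are per hammer(8)):

```shell
# Query the current fsync flush mode
sysctl vfs.hammer.flush_mode
# Select a different mode; see hammer(8) for value semantics
sysctl vfs.hammer.flush_mode=1
```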
Unlimited Number of Files and Links
There is no limit on the number of files or links in a
HAMMER
file system, apart from available disk
space.
NFS Export
HAMMER
file systems support NFS export.
NFS export of PFSs is done using null mounts (for a
file/directory in the root PFS a null mount is not
needed). For example, to export the PFS
/hammer/pfs/data, create a null mount, e.g. to
/hammer/data, and export the latter path.
Don't export a directory containing a PFS (e.g.
/hammer/pfs above). Only the null mount of a PFS root
(e.g. /hammer/data above) should be exported (a
subdirectory may be escaped if it is exported).
File System Versions
As new features have been introduced to HAMMER, a
version number has been bumped. Each HAMMER file
system has a version, which can be upgraded to support new features.
Related hammer(8) commands: version, version-upgrade; see also
newfs_hammer(8)
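For example (mount point hypothetical), the version can be displayed and upgraded:

```shell
# Show the file system's current version
hammer version /home
# Upgrade the file system to a newer version
hammer version-upgrade /home
```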
EXAMPLES
Preparing the File System
To create and mount a HAMMER
file system
use the
newfs_hammer(8) and
mount_hammer(8) commands. Note that all
HAMMER
file systems must have a unique name on a
per-machine basis.
newfs_hammer -L HOME /dev/ad0s1d
mount_hammer /dev/ad0s1d /home
Similarly, multi volume file systems can be created and mounted by specifying additional arguments.
newfs_hammer -L MULTIHOME /dev/ad0s1d /dev/ad1s1d
mount_hammer /dev/ad0s1d /dev/ad1s1d /home
Once created and mounted, HAMMER file systems need
periodic clean up: making snapshots, pruning and reblocking, in order to
retain access to history and to keep the file system from filling up. For
this it is recommended to use the hammer(8) cleanup metacommand.
By default, DragonFly is set up to run
hammer cleanup nightly via periodic(8).
It is also possible to perform these operations individually via crontab(5). For example, to reblock the /home file system every night at 2:15 for up to 5 minutes:
15 2 * * * hammer -c /var/run/HOME.reblock -t 300 reblock /home \
    >/dev/null 2>&1
Snapshots
The
hammer(8) utility's snapshot
command provides
several ways of taking snapshots. They all assume a directory where
snapshots are kept.
mkdir /snaps
hammer snapshot /home /snaps/snap1
(...after some changes in /home...)
hammer snapshot /home /snaps/snap2
The softlinks in /snaps point to the state of the /home directory at the time each snapshot was taken, and could now be used to copy the data somewhere else for backup purposes.
By default, DragonFly is set up to create
nightly snapshots of all HAMMER
file systems via
periodic(8) and to keep them for 60 days.
Pruning
A snapshot directory is also the argument to the
hammer(8) prune command, which frees
historical data from the file system that is not pointed to by any snapshot
link, is not from after the latest snapshot, and is older than
prune-min.
rm /snaps/snap1
hammer prune /snaps
Mirroring
Mirroring is set up using HAMMER
pseudo-filesystems (PFSs). To associate the slave with the master, its
shared UUID should be set to the master's shared UUID as output by the
hammer pfs-master command.
hammer pfs-master /home/pfs/master
hammer pfs-slave /home/pfs/slave shared-uuid=<master's shared uuid>
The /home/pfs/slave link is unusable until an initial mirroring operation has taken place.
To mirror the master's data, either pipe a
mirror-read command into a mirror-write
or, as a short-cut, use the mirror-copy
command (which works across a ssh(1) connection as well).
The initial mirroring operation has to be done using the PFS path (as
mount_null(8) can't access it yet).
hammer mirror-copy /home/pfs/master /home/pfs/slave
It is also possible to have the target PFS auto-created by
issuing the same mirror-copy command; if the target
PFS doesn't exist, you will be prompted whether you would like to create it.
You can skip the prompt by using the -y flag:
hammer -y mirror-copy /home/pfs/master /home/pfs/slave
After this initial step a null mount can be
set up for /home/pfs/slave. Further operations can
use null mounts.
mount_null /home/pfs/master /home/master
mount_null /home/pfs/slave /home/slave
hammer mirror-copy /home/master /home/slave
NFS Export
To NFS export from the HAMMER file system
/hammer the directory
/hammer/non-pfs (without PFSs) and the PFS
/hammer/pfs/data, the latter is
null mounted to /hammer/data.
Add to /etc/fstab (see fstab(5)):
/hammer/pfs/data /hammer/data null rw
Add to /etc/exports (see exports(5)):
/hammer/non-pfs /hammer/data
DIAGNOSTICS
- hammer: System has insufficient buffers to rebalance the tree. nbuf < %d
- Rebalancing a HAMMER PFS uses quite a bit of memory and can't be done on low-memory systems. It has been reported to fail on 512 MB systems. Rebalancing isn't critical for HAMMER file system operation; it is done by hammer rebalance, often as part of hammer cleanup.
SEE ALSO
chflags(1), md5(1), tar(1), undo(1), exports(5), ffs(5), fstab(5), disklabel64(8), gpt(8), hammer(8), mount_hammer(8), mount_null(8), newfs_hammer(8), periodic(8), sysctl(8)
Matthew Dillon, The HAMMER Filesystem, June 2008, http://www.dragonflybsd.org/hammer/hammer.pdf.
Matthew Dillon, Slideshow from NYCBSDCon 2008, October 2008, http://www.dragonflybsd.org/presentations/nycbsdcon08/.
Michael Neumann, Slideshow for a presentation held at KIT (http://www.kit.edu), January 2010, http://www.ntecs.de/talks/HAMMER.pdf.
FILESYSTEM PERFORMANCE
The HAMMER
file system has a front-end
which processes VNOPS and issues necessary block reads from disk, and a
back-end which handles meta-data updates on-media and performs all meta-data
write operations. Bulk file write operations are handled by the front-end.
Because HAMMER defers meta-data updates, virtually no
meta-data read operations will be issued by the front-end while writing large
amounts of data to the file system or even when creating new files or
directories; and even though the kernel prioritizes reads over writes, the
fact that writes are cached by the drive itself tends to lead to excessive
priority given to writes.
There are four bioq sysctls, shown below with default values, which can be adjusted to give reads a higher priority:
kern.bioq_reorder_minor_bytes: 262144
kern.bioq_reorder_burst_bytes: 3000000
kern.bioq_reorder_minor_interval: 5
kern.bioq_reorder_burst_interval: 60
If a higher read priority is desired, it is recommended that kern.bioq_reorder_minor_interval be increased to 15, 30, or even 60, and that kern.bioq_reorder_burst_bytes be decreased to 262144 or 524288.
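Applied with sysctl(8), the adjustment above would look like this (values taken from the recommendation; tune to taste):

```shell
# Favor reads over cached writes
sysctl kern.bioq_reorder_minor_interval=30
sysctl kern.bioq_reorder_burst_bytes=262144
```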
HISTORY
The HAMMER
file system first appeared in
DragonFly 1.11.
AUTHORS
The HAMMER
file system was designed and
implemented by Matthew Dillon
<dillon@backplane.com>,
data deduplication was added by Ilya Dryomov. This
manual page was written by Sascha Wildner and
updated by Thomas Nikolajsen.