NAME
mkuzip - compress disk image for use with geom_uzip(4) class
SYNOPSIS
mkuzip [-dSsvZ] [-A compression_algorithm] [-C compression_level] [-j compression_jobs] [-o outfile] [-s cluster_size] infile
DESCRIPTION
The mkuzip utility compresses a disk image file so that the geom_uzip(4) class will be able to decompress the resulting image at run-time. This allows for a significant reduction in the size of the disk image at the expense of some CPU time required to decompress the data each time it is read. The mkuzip utility works in two phases:
- An infile image is split into clusters; each cluster is compressed.
- The resulting set of compressed clusters is written to the output file. In addition, a “table of contents” header is written which allows for efficient seeking.
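The two phases can be pictured with a short Python sketch. It is illustrative only: the function name `compress_clusters`, the returned offset list, and the in-memory layout are assumptions made for this example and do not reproduce the on-disk format that mkuzip actually writes. The sketch splits the input into fixed-size clusters, compresses each one with zlib, and records where each compressed cluster starts so that a reader can seek directly to it.

```python
import zlib

CLUSTER_SIZE = 16384  # the documented default cluster size, in bytes


def compress_clusters(image: bytes, cluster_size: int = CLUSTER_SIZE, level: int = 9):
    """Return (offsets, payload) for an image split into compressed clusters.

    offsets[i] is where cluster i starts inside payload, and the final
    entry marks the end of the last cluster. This layout is hypothetical.
    """
    if cluster_size <= 0 or cluster_size % 512 != 0:
        raise ValueError("cluster_size should be a positive multiple of 512")

    offsets = [0]
    chunks = []
    # Phase 1: split the image into clusters and compress each one.
    for start in range(0, len(image), cluster_size):
        compressed = zlib.compress(image[start:start + cluster_size], level)
        chunks.append(compressed)
        offsets.append(offsets[-1] + len(compressed))

    # Phase 2: concatenate the compressed clusters; the offsets play the
    # role of the "table of contents".
    return offsets, b"".join(chunks)
```

With the offsets in hand, cluster `i` can be recovered with `zlib.decompress(payload[offsets[i]:offsets[i + 1]])` without touching any other cluster.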
The options are:
-A
[lzma | zlib | zstd]- Select a specific compression algorithm. If this option is not provided, the default is zlib.
The lzma algorithm provides noticeably better compression than zlib on the same data set. It has vastly slower compression speed and moderately slower decompression speed.
The zstd algorithm provides better compression than zlib on the same data set. It also has faster compression and decompression speed than zlib. At very high compression “level” settings, it does not offer quite as high a compression ratio as lzma. However, its decompression speed does not suffer at high compression “levels”.
-C
compression_level- Select the integer compression level used to parameterize the chosen compression algorithm.
For any given algorithm, a lower number selects a faster compression mode and a higher number selects a slower one. Typically, for the same algorithm, a higher compression_level provides a better final compression ratio.
For lzma, the range of valid compression levels is 0-9. The mkuzip default for lzma is 6.
For zlib, the range of valid compression levels is 1-9. The mkuzip default for zlib is 9.
For zstd, the range of valid compression levels is currently 1-19. The mkuzip default for zstd is 9.
-d
- Enable de-duplication. When the option is enabled, mkuzip detects identical blocks in the input and replaces each subsequent occurrence of such a block with a pointer to the very first one in the output. Setting this option results in a moderate decrease in compressed image size, typically around 3-5% of the final size of the compressed image.
-j
compression_jobs- Specify the number of compression jobs that mkuzip runs in parallel to speed up compression. When this option is not specified, the number of jobs is set to the value of the hw.ncpu sysctl(8) variable.
[-L]
- Legacy flag that is equivalent to “-A lzma”.
-o
outfile- Name of the output file. The default is to use the input name with the suffix .uzip for zlib(3) compression or .ulzma for lzma(3) compression.
-S
- Print a summary of the compression ratio and the output file size after the file has been processed.
-s
cluster_size- Split the image into clusters of cluster_size bytes, 16384 bytes by default. The cluster_size should be a multiple of 512 bytes.
-v
- Display verbose messages.
-Z
- Disable zero-block detection and elimination. When this option is set, mkuzip compresses blocks of zero bytes just as it would any other block. When the option is not set, mkuzip detects and compresses zero blocks in a space-efficient way. Setting -Z increases compressed image sizes slightly, typically by less than 0.1%.
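The effect of the -d and -Z options can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function name `plan_clusters`, the tuple-based plan, and the use of SHA-256 to recognize identical clusters are not taken from mkuzip, which may detect duplicates and zero blocks differently. The sketch only shows the idea of storing a repeated cluster once and treating all-zero clusters specially.

```python
import hashlib
import zlib


def plan_clusters(image: bytes, cluster_size: int = 16384,
                  dedup: bool = True, detect_zero: bool = True):
    """Classify each cluster as ("zero", None), ("dup", first_index)
    or ("data", compressed_bytes).

    The plan format is hypothetical and exists only for this example.
    """
    zero_cluster = bytes(cluster_size)
    seen = {}  # digest of cluster contents -> index of first occurrence
    plan = []
    for index, start in enumerate(range(0, len(image), cluster_size)):
        cluster = image[start:start + cluster_size]
        # Zero-block detection: an all-zero cluster needs no payload at all.
        if detect_zero and cluster == zero_cluster[:len(cluster)]:
            plan.append(("zero", None))
            continue
        digest = hashlib.sha256(cluster).digest()
        # De-duplication: point back at the first identical cluster.
        if dedup and digest in seen:
            plan.append(("dup", seen[digest]))
            continue
        seen.setdefault(digest, index)
        plan.append(("data", zlib.compress(cluster, 9)))
    return plan
```

Calling `plan_clusters(image, dedup=False, detect_zero=False)` corresponds loosely to running without -d and with -Z: every cluster is compressed and stored on its own.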
IMPLEMENTATION NOTES
The compression ratio largely depends on the compression algorithm, level, and cluster size used. For large cluster sizes (16 kB and higher), typical overall image compression ratios with zlib(3) are only 1-2% less than those achieved with gzip(1) over the entire image. However, larger cluster sizes lead to higher overhead in the geom_uzip(4) class, as the class has to decompress the whole cluster even if only a few bytes from that cluster have to be read.
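The read overhead described above can be made concrete with a Python sketch. The helper names `build_image` and `read_bytes` and the offset-list layout are hypothetical and are not the geom_uzip(4) implementation; the point is only that a small read forces decompression of every cluster it touches, so the cost grows with the cluster size.

```python
import zlib


def build_image(data: bytes, cluster_size: int):
    """Compress data cluster by cluster; return (offsets, payload)."""
    offsets, chunks = [0], []
    for start in range(0, len(data), cluster_size):
        chunk = zlib.compress(data[start:start + cluster_size], 9)
        chunks.append(chunk)
        offsets.append(offsets[-1] + len(chunk))
    return offsets, b"".join(chunks)


def read_bytes(offsets, payload, cluster_size, position, length):
    """Read `length` bytes at `position` (assumed to lie inside the image).

    Returns the requested bytes and the number of bytes that had to be
    decompressed to obtain them.
    """
    out = bytearray()
    decompressed = 0
    first = position // cluster_size
    last = (position + length - 1) // cluster_size
    for index in range(first, last + 1):
        # The whole cluster is decompressed even if only part is wanted.
        cluster = zlib.decompress(payload[offsets[index]:offsets[index + 1]])
        decompressed += len(cluster)
        out += cluster
    skip = position - first * cluster_size
    return bytes(out[skip:skip + length]), decompressed
```

In this model, a 16-byte read that falls inside one cluster decompresses 4096 bytes with a 4 kB cluster size and 32768 bytes with a 32 kB cluster size.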
Additionally, the threshold at 16-32 kB where a larger cluster size does not benefit overall compression ratio is an artifact of the zlib(3) algorithm in particular. Lzma and Zstd will continue to provide better compression ratios as cluster sizes are increased, at high enough compression levels. The same tradeoff continues to apply: reads in geom_uzip(4) become more expensive the greater the cluster size.
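One way to explore this trade-off for a particular image is to compress it cluster by cluster at several cluster sizes and compare the totals. The Python sketch below does that with the standard `zlib` and `lzma` modules. The function names and the chosen cluster sizes are assumptions for this example, and the figures will not match mkuzip output exactly: among other differences, Python's `lzma.compress` wraps each cluster in an .xz container by default, which adds fixed per-cluster overhead.

```python
import lzma
import zlib


def clustered_size(data: bytes, cluster_size: int, compress) -> int:
    """Total compressed size when each cluster is compressed independently."""
    return sum(
        len(compress(data[start:start + cluster_size]))
        for start in range(0, len(data), cluster_size)
    )


def compare(data: bytes, cluster_sizes=(4096, 16384, 65536)):
    """Return {cluster_size: (zlib_bytes, lzma_bytes)} for the given data."""
    return {
        size: (
            clustered_size(data, size, lambda c: zlib.compress(c, 9)),
            clustered_size(data, size, lambda c: lzma.compress(c, preset=6)),
        )
        for size in cluster_sizes
    }
```

A typical use is to read a disk image into memory, call `compare(image)`, and print each cluster size alongside the two totals; the results depend entirely on the data being compressed.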
The mkuzip utility inserts a short shell script at the beginning of the generated image, which makes it possible to “run” the image just like any other shell script. The script tries to load the geom_uzip(4) class if it is not loaded, configure the image as an md(4) disk device using mdconfig(8), and automatically mount it using mount_cd9660(8) on the mount point provided as the first argument to the script.
De-duplication is a FreeBSD-specific feature. While it does not require any changes to the on-disk compressed image format, it did require matching changes to geom_uzip(4) to handle the resulting images correctly.
To make use of zstd mkuzip images, the kernel must be configured with ZSTDIO. It is enabled by default in many GENERIC kernels provided as binary distributions by FreeBSD. The status on any particular system can be verified by checking sysctl(8) kern.features.geom_uzip_zstd for “1”.
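For scripted checks, the sysctl(8) lookup can be wrapped in a small helper. The Python sketch below is one possible approach and not part of mkuzip: the function name `geom_uzip_zstd_enabled` and the injectable `run` parameter are assumptions for this example. It calls `sysctl -n kern.features.geom_uzip_zstd` and treats anything other than an output of “1” as unsupported, including systems where the sysctl or the command is missing.

```python
import subprocess


def geom_uzip_zstd_enabled(run=subprocess.run) -> bool:
    """Return True if sysctl reports kern.features.geom_uzip_zstd as 1.

    `run` is injectable so the check can be exercised without FreeBSD.
    """
    try:
        result = run(
            ["sysctl", "-n", "kern.features.geom_uzip_zstd"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # The sysctl command is not available on this system.
        return False
    return result.returncode == 0 and result.stdout.strip() == "1"
```

A build script could call `geom_uzip_zstd_enabled()` before choosing “-A zstd” and fall back to another algorithm when it returns False.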
EXIT STATUS
The mkuzip utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
gzip(1), xz(1), zstd(1), zlib(3), geom(4), geom_uzip(4), md(4), mdconfig(8), mount_cd9660(8)
AUTHORS
Maxim Sobolev <sobomax@FreeBSD.org>