
VINUM(4) Device Drivers Manual VINUM(4)

vinum - Logical Volume Manager

pseudo-device vinum

vinum is a logical volume manager inspired by, but not derived from, the Veritas Volume Manager. It combines disk partitions into volumes, providing concatenated, striped (RAID-0) and RAID-5 plex organizations, as well as mirroring (RAID-1) by attaching multiple plexes to a volume.

vinum is currently supplied as a KLD module, and does not require configuration. As with other klds, it is absolutely necessary to match the kld to the version of the operating system. Failure to do so will cause vinum to issue an error message and terminate.

It is possible to configure vinum in the kernel, but this is not recommended. To do so, add this line to the kernel configuration file:

pseudo-device vinum

The current version of vinum, both the kernel module and the user program vinum(8), includes significant debugging support. It is not recommended to remove this support at the moment, but if you do, you must remove it from both the kernel and the user components. To do this, edit the files /usr/src/sbin/vinum/Makefile and /sys/dev/raid/vinum/Makefile and remove the -DVINUMDEBUG option from the CFLAGS variable in each. If you have configured vinum into the kernel, either specify the line

options VINUMDEBUG

in the kernel configuration file or remove the -DVINUMDEBUG option from /usr/src/sbin/vinum/Makefile as described above.

If the kernel and user VINUMDEBUG settings do not match, vinum(8) will fail with a message explaining the problem and what to do to correct it.

vinum was previously available in two versions: a freely available version which did not contain RAID-5 functionality, and a full version including RAID-5 functionality, which was available only from Cybernet Systems Inc. The present version of vinum includes the RAID-5 functionality.

vinum is part of the base DragonFly system and does not require installation. To start it, run the vinum(8) program, which will load the kld if it is not already present. Before using vinum, it must be configured; see vinum(8) for information on how to create a vinum configuration.

Normally, you start a configured version of vinum at boot time. Set the variable start_vinum in /etc/rc.conf to “YES” to start vinum at boot time. (See rc.conf(5) for more details.)
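The corresponding /etc/rc.conf fragment is a single shell-style assignment:

```shell
# /etc/rc.conf: start a configured vinum at boot time (see rc.conf(5))
start_vinum="YES"
```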

If vinum is loaded as a kld (the recommended way), the vinum stop command will unload it (see vinum(8)). You can also do this with the kldunload(8) command.

The kld can only be unloaded when idle, in other words when no volumes are mounted and no other instances of the vinum(8) program are active. Unloading the kld does not harm the data in the volumes.

Use the vinum(8) utility to configure and start vinum objects.

ioctl(2) calls are intended for the use of the vinum(8) configuration program only. They are described in the header file /sys/dev/raid/vinum/vinumio.h.

Conventional disk special devices have a disk label in the second sector of the device. See disklabel(5) for more details. This disk label describes the layout of the partitions within the device. vinum does not subdivide volumes, so volumes do not contain a physical disk label. For convenience, vinum implements the ioctl calls DIOCGDINFO (get disk label), DIOCGPART (get partition information), DIOCWDINFO (write partition information) and DIOCSDINFO (set partition information). DIOCGDINFO and DIOCGPART refer to an internal representation of the disk label which is not present on the volume. As a result, the -r option of disklabel(8), which reads the “raw disk”, will fail.

In general, disklabel(8) serves no useful purpose on a vinum volume.

vinum ignores the DIOCWDINFO and DIOCSDINFO ioctls, since there is nothing to change. As a result, any attempt to modify the disk label will be silently ignored.

Since vinum volumes do not contain partitions, their names do not need to conform to the standard rules for naming disk partitions. For a physical disk partition, the last letter of the device name specifies the partition identifier (a to p). vinum volumes need not conform to this convention, but if they do not, newfs(8) will complain that it cannot determine the partition. To solve this problem, use the -v flag to newfs(8). For example, if you have a volume named concat, use the following command to create a UFS file system on it:

newfs -v /dev/vinum/concat

vinum assigns default names to plexes and subdisks, although they may be overridden. We do not recommend overriding the default names. Experience with the Veritas™ volume manager, which allows arbitrary naming of objects, has shown that this flexibility does not bring a significant advantage, and it can cause confusion.

Names may contain any non-blank character, but it is recommended to restrict them to letters, digits and the underscore character. The names of volumes, plexes and subdisks may be up to 64 characters long, and the names of drives may be up to 32 characters long. When choosing volume and plex names, bear in mind that automatically generated plex and subdisk names are longer than the name from which they are derived.
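These recommendations can be sketched as a small shell helper (the function name check_name is ours, purely illustrative; vinum itself is more permissive and accepts any non-blank character):

```shell
# check_name: accept a name made of letters, digits and underscores,
# at most 64 characters (the volume/plex/subdisk limit given above).
# Illustrative only; not part of vinum or vinum(8).
check_name() {
    name=$1
    case $name in
        ''|*[!A-Za-z0-9_]*) return 1 ;;    # empty, or a disallowed character
    esac
    [ "${#name}" -le 64 ]                  # enforce the 64-character limit
}
```

With this, `check_name concat` succeeds while `check_name 'bad name'` fails; for drive names the limit would be 32 instead of 64.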

Assume the vinum objects described in the CONFIGURATION FILE section of vinum(8). The directory /dev/vinum then looks like this:

# ls -lR /dev/vinum
total 5
crwxr-xr--  1 root  wheel   91,   2 Mar 30 16:08 concat
crwx------  1 root  wheel   91, 0x40000000 Mar 30 16:08 control
crwx------  1 root  wheel   91, 0x40000001 Mar 30 16:08 controld
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 drive
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 plex
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 rvol
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 sd
crwxr-xr--  1 root  wheel   91,   3 Mar 30 16:08 strcon
crwxr-xr--  1 root  wheel   91,   1 Mar 30 16:08 stripe
crwxr-xr--  1 root  wheel   91,   0 Mar 30 16:08 tinyvol
drwxrwxrwx  7 root  wheel       512 Mar 30 16:08 vol
crwxr-xr--  1 root  wheel   91,   4 Mar 30 16:08 vol5

/dev/vinum/drive:
total 0
crw-r-----  1 root  operator    4,  15 Oct 21 16:51 drive2
crw-r-----  1 root  operator    4,  31 Oct 21 16:51 drive4

/dev/vinum/plex:
total 0
crwxr-xr--  1 root  wheel   91, 0x10000002 Mar 30 16:08 concat.p0
crwxr-xr--  1 root  wheel   91, 0x10010002 Mar 30 16:08 concat.p1
crwxr-xr--  1 root  wheel   91, 0x10000003 Mar 30 16:08 strcon.p0
crwxr-xr--  1 root  wheel   91, 0x10010003 Mar 30 16:08 strcon.p1
crwxr-xr--  1 root  wheel   91, 0x10000001 Mar 30 16:08 stripe.p0
crwxr-xr--  1 root  wheel   91, 0x10000000 Mar 30 16:08 tinyvol.p0
crwxr-xr--  1 root  wheel   91, 0x10000004 Mar 30 16:08 vol5.p0
crwxr-xr--  1 root  wheel   91, 0x10010004 Mar 30 16:08 vol5.p1

/dev/vinum/sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100002 Mar 30 16:08 concat.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20010002 Mar 30 16:08 concat.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100003 Mar 30 16:08 strcon.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20110003 Mar 30 16:08 strcon.p1.s1
crwxr-xr--  1 root  wheel   91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100001 Mar 30 16:08 stripe.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100004 Mar 30 16:08 vol5.p0.s1
crwxr-xr--  1 root  wheel   91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20110004 Mar 30 16:08 vol5.p1.s1

/dev/vinum/vol:
total 5
crwxr-xr--  1 root  wheel   91,   2 Mar 30 16:08 concat
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 concat.plex
crwxr-xr--  1 root  wheel   91,   3 Mar 30 16:08 strcon
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 strcon.plex
crwxr-xr--  1 root  wheel   91,   1 Mar 30 16:08 stripe
drwxr-xr-x  3 root  wheel       512 Mar 30 16:08 stripe.plex
crwxr-xr--  1 root  wheel   91,   0 Mar 30 16:08 tinyvol
drwxr-xr-x  3 root  wheel       512 Mar 30 16:08 tinyvol.plex
crwxr-xr--  1 root  wheel   91,   4 Mar 30 16:08 vol5
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 vol5.plex

/dev/vinum/vol/concat.plex:
total 2
crwxr-xr--  1 root  wheel   91, 0x10000002 Mar 30 16:08 concat.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 concat.p0.sd
crwxr-xr--  1 root  wheel   91, 0x10010002 Mar 30 16:08 concat.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 concat.p1.sd

/dev/vinum/vol/concat.plex/concat.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100002 Mar 30 16:08 concat.p0.s1

/dev/vinum/vol/concat.plex/concat.p1.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20010002 Mar 30 16:08 concat.p1.s0

/dev/vinum/vol/strcon.plex:
total 2
crwxr-xr--  1 root  wheel   91, 0x10000003 Mar 30 16:08 strcon.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 strcon.p0.sd
crwxr-xr--  1 root  wheel   91, 0x10010003 Mar 30 16:08 strcon.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 strcon.p1.sd

/dev/vinum/vol/strcon.plex/strcon.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100003 Mar 30 16:08 strcon.p0.s1

/dev/vinum/vol/strcon.plex/strcon.p1.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20110003 Mar 30 16:08 strcon.p1.s1

/dev/vinum/vol/stripe.plex:
total 1
crwxr-xr--  1 root  wheel   91, 0x10000001 Mar 30 16:08 stripe.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 stripe.p0.sd

/dev/vinum/vol/stripe.plex/stripe.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100001 Mar 30 16:08 stripe.p0.s1

/dev/vinum/vol/tinyvol.plex:
total 1
crwxr-xr--  1 root  wheel   91, 0x10000000 Mar 30 16:08 tinyvol.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 tinyvol.p0.sd

/dev/vinum/vol/tinyvol.plex/tinyvol.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1

/dev/vinum/vol/vol5.plex:
total 2
crwxr-xr--  1 root  wheel   91, 0x10000004 Mar 30 16:08 vol5.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 vol5.p0.sd
crwxr-xr--  1 root  wheel   91, 0x10010004 Mar 30 16:08 vol5.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 vol5.p1.sd

/dev/vinum/vol/vol5.plex/vol5.p0.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr--  1 root  wheel   91, 0x20100004 Mar 30 16:08 vol5.p0.s1

/dev/vinum/vol/vol5.plex/vol5.p1.sd:
total 0
crwxr-xr--  1 root  wheel   91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr--  1 root  wheel   91, 0x20110004 Mar 30 16:08 vol5.p1.s1
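The minor numbers in the listing above are not arbitrary: they appear to encode the object type in the top nibble (0 for volumes, 1 for plexes, 2 for subdisks, 4 for the control devices), the subdisk and plex indices in the middle bits, and the volume index in the low bits. This decoding is inferred from the listing, not a documented interface, but it can be sketched as:

```shell
# Decode a vinum device minor number, as inferred from the listing above:
#   bits 28-31: object type (0 volume, 1 plex, 2 subdisk, 4 control)
#   bits 20-27: subdisk number within the plex
#   bits 16-19: plex number within the volume
#   bits  0-15: volume number
decode_minor() {
    m=$(( $1 ))
    printf 'type=%d vol=%d plex=%d sd=%d\n' \
        $(( (m >> 28) & 0xf )) \
        $(( m & 0xffff )) \
        $(( (m >> 16) & 0xf )) \
        $(( (m >> 20) & 0xff ))
}

decode_minor 0x20100002    # concat.p0.s1: prints type=2 vol=2 plex=0 sd=1
```

For example, 0x10010004 decodes to plex 1 of volume 4, matching vol5.p1 in the listing.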

In the case of unattached plexes and subdisks, the naming is reversed. Subdisks are named after the disk on which they are located, and plexes are named after the subdisk.

This mapping is still to be determined.

Each vinum object has a state associated with it. vinum uses this state to determine the handling of the object.

Volumes may have the following states:

down: The volume is completely inaccessible.
up: The volume is up and at least partially functional. Not all plexes may be available.

Plexes may have the following states:

referenced: A plex entry which has been referenced as part of a volume, but which is currently not known.
faulty: A plex which has gone completely down because of I/O errors.
down: A plex which has been taken down by the administrator.
initializing: A plex which is being initialized.

The remaining states represent plexes which are at least partially up.

corrupt: A plex entry which is at least partially up. Not all subdisks are available, and an inconsistency has occurred. If no other plex is uncorrupted, the volume is no longer consistent.
degraded: A RAID-5 plex entry which is accessible, but one subdisk is down, requiring recovery for many I/O requests.
flaky: A plex which is really up, but which has a reborn subdisk which we do not completely trust, and which we do not want to read if we can avoid it.
up: A plex entry which is completely up. All subdisks are up.

Subdisks can have the following states:

empty: A subdisk entry which has been created completely. All fields are correct, and the disk has been updated, but the data on the disk is not valid.
referenced: A subdisk entry which has been referenced as part of a plex, but which is currently not known.
initializing: A subdisk entry which has been created completely and which is currently being initialized.

The following states represent invalid data.

obsolete: A subdisk entry which has been created completely. All fields are correct, the config on disk has been updated, and the data was valid, but since then the drive has been taken down, and as a result updates have been missed.
stale: A subdisk entry which has been created completely. All fields are correct, the disk has been updated, and the data was valid, but since then the drive has crashed and updates have been lost.

The following states represent valid, inaccessible data.

crashed: A subdisk entry which has been created completely. All fields are correct, the disk has been updated, and the data was valid, but since then the drive has gone down. No attempt has been made to write to the subdisk since the crash, so the data is valid.
down: A subdisk entry which was up, which contained valid data, and which was taken down by the administrator. The data is valid.
reviving: The subdisk is currently in the process of being revived. We can write but not read.

The following states represent accessible subdisks with valid data.

reborn: A subdisk entry which has been created completely. All fields are correct, the disk has been updated, and the data was valid, but since then the drive has gone down and up again. No updates were lost, but it is possible that the subdisk has been damaged. We won't read from this subdisk if we have a choice. If this is the only subdisk which covers this address space in the plex, we set its state to up under these circumstances, so this status implies that there is another subdisk to fulfil the request.
up: A subdisk entry which has been created completely. All fields are correct, the disk has been updated, and the data is valid.

Drives can have the following states:

referenced: At least one subdisk refers to the drive, but it is not currently accessible to the system. No device name is known.
down: The drive is not accessible.
up: The drive is up and running.

DEBUGGING PROBLEMS WITH VINUM

Solving problems with vinum can be a difficult affair. This section suggests some approaches.

It is relatively easy (too easy) to run into problems with the vinum configuration. If you do, the first thing you should do is stop configuration updates:

vinum setdaemon 4

This will stop updates and any further corruption of the on-disk configuration.

Next, look at the on-disk configuration with the vinum dumpconfig command, for example:

# vinum dumpconfig
Drive 4:        Device /dev/da3s0h
                Created on crash.lemis.com at Sat May 20 16:32:44 2000
                Config last updated Sat May 20 16:32:56 2000
                Size:        601052160 bytes (573 MB)
volume obj state up
volume src state up
volume raid state down
volume r state down
volume foo state up
plex name obj.p0 state corrupt org concat vol obj
plex name obj.p1 state corrupt org striped 128b vol obj
plex name src.p0 state corrupt org striped 128b vol src
plex name src.p1 state up org concat vol src
plex name raid.p0 state faulty org disorg vol raid
plex name r.p0 state faulty org disorg vol r
plex name foo.p0 state up org concat vol foo
plex name foo.p1 state faulty org concat vol foo
sd name obj.p0.s0 drive drive2 plex obj.p0 state reborn len 409600b driveoffset 265b plexoffset 0b
sd name obj.p0.s1 drive drive4 plex obj.p0 state up len 409600b driveoffset 265b plexoffset 409600b
sd name obj.p1.s0 drive drive1 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 0b
sd name obj.p1.s1 drive drive2 plex obj.p1 state reborn len 204800b driveoffset 409865b plexoffset 128b
sd name obj.p1.s2 drive drive3 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 256b
sd name obj.p1.s3 drive drive4 plex obj.p1 state up len 204800b driveoffset 409865b plexoffset 384b

The configuration on all disks should be the same. If this is not the case, please save the output to a file and report the problem. There is probably little that can be done to recover the on-disk configuration, but if you keep a copy of the files used to create the objects, you should be able to re-create them. The create command does not change the subdisk data, so this will not cause data corruption. You may need to use the resetconfig command if you have this kind of trouble.

In order to analyse a panic which you suspect comes from vinum you will need to build a debug kernel. See the online handbook at http://www.dragonflybsd.org/docs/user/list/DebugKernelCrashDumps/ for more details of how to do this.

Perform the following steps to analyse a vinum problem:

  1. Copy the following files to the directory in which you will be performing the analysis, typically /var/crash:

    • /sys/dev/raid/vinum/.gdbinit.crash,
    • /sys/dev/raid/vinum/.gdbinit.kernel,
    • /sys/dev/raid/vinum/.gdbinit.serial,
    • /sys/dev/raid/vinum/.gdbinit.vinum and
    • /sys/dev/raid/vinum/.gdbinit.vinum.paths
  2. Make sure that you build the vinum module with debugging information. The standard Makefile builds a module with debugging symbols by default. If the version of vinum in /boot/kernel does not contain symbols, you will not get an error message, but the stack trace will not show the symbols. Check the module before starting kgdb(1):
    $ file /boot/kernel/vinum.ko
    /boot/kernel/vinum.ko: ELF 32-bit LSB shared object, Intel 80386,
      version 1 (SYSV), dynamically linked, not stripped

    If the output shows that /boot/kernel/vinum.ko is stripped, you will have to find a version which is not. Usually this will be either in /usr/obj/usr/src/sys/SYSTEM_NAME/usr/src/sys/dev/raid/vinum/vinum.ko (if you have built vinum with a “make world”) or /sys/dev/raid/vinum/vinum.ko (if you have built vinum in this directory). Modify the file .gdbinit.vinum.paths accordingly.

  3. Either take a dump or use remote serial gdb(1) to analyse the problem. To analyse a dump, say /var/crash/vmcore.5, link /var/crash/.gdbinit.crash to /var/crash/.gdbinit and enter:
    cd /var/crash
    kgdb kernel.debug vmcore.5

    This example assumes that you have installed the correct debug kernel at /var/crash/kernel.debug. If not, substitute the correct name of the debug kernel.

    To perform remote serial debugging, link /var/crash/.gdbinit.serial to /var/crash/.gdbinit and enter

    cd /var/crash
    kgdb kernel.debug

    In this case, the .gdbinit file performs the functions necessary to establish the connection. The remote machine must already be in debug mode: enter the kernel debugger and select gdb (see ddb(4) for more details). The serial .gdbinit file expects the serial connection to run at 38400 bits per second; if you run at a different speed, edit the file accordingly (look for the remotebaud specification).

    The following example shows a remote debugging session using the debug command of vinum(8):

    GDB 4.16 (i386-unknown-dragonfly), Copyright 1996 Free Software Foundation, Inc.
    Debugger (msg=0xf1093174 "vinum debug") at ../../i386/i386/db_interface.c:318
    318                 in_Debugger = 0;
    #1  0xf108d9bc in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6dedee0 "",
        flag=0x3, p=0xf68b7940) at
        /usr/src/sys/dev/raid/vinum/vinumioctl.c:102
    102             Debugger ("vinum debug");
    (kgdb) bt
    #0  Debugger (msg=0xf0f661ac "vinum debug") at ../../i386/i386/db_interface.c:318
    #1  0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
          flag=0x3, p=0xf688e6c0) at
          /usr/src/sys/dev/raid/vinum/vinumioctl.c:109
    #2  0xf01833b7 in spec_ioctl (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:424
    #3  0xf0182cc9 in spec_vnoperate (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:129
    #4  0xf01eb3c1 in ufs_vnoperatespec (ap=0xf6923e0c) at ../../ufs/ufs/ufs_vnops.c:2312
    #5  0xf017dbb1 in vn_ioctl (fp=0xf1007ec0, com=0xc008464b, data=0xf6923ed0 "",
          p=0xf688e6c0) at vnode_if.h:395
    #6  0xf015dce0 in ioctl (p=0xf688e6c0, uap=0xf6923f84) at ../../kern/sys_generic.c:473
    #7  0xf0214c0b in syscall (frame={tf_es = 0x27, tf_ds = 0x27, tf_edi = 0xefbfcff8,
          tf_esi = 0x1, tf_ebp = 0xefbfcf90, tf_isp = 0xf6923fd4, tf_ebx = 0x2,
          tf_edx = 0x804b614, tf_ecx = 0x8085d10, tf_eax = 0x36, tf_trapno = 0x7,
          tf_err = 0x2, tf_eip = 0x8060a34, tf_cs = 0x1f, tf_eflags = 0x286,
          tf_esp = 0xefbfcf78, tf_ss = 0x27}) at ../../i386/i386/trap.c:1100
    #8  0xf020a1fc in Xint0x80_syscall ()
    #9  0x804832d in ?? ()
    #10 0x80482ad in ?? ()
    #11 0x80480e9 in ?? ()

    When entering from the debugger, it is important that the source of frame 1 (listed by the .gdbinit file at the top of the example) contains the text “Debugger ("vinum debug");”.

    This is an indication that the address specifications are correct. If you get some other output, your symbols and the kernel module are out of sync, and the trace will be meaningless.

For an initial investigation, the most important information is the output of the bt (backtrace) command above.

If you find any bugs in vinum, please report them to Greg Lehey <grog@lemis.com>. Supply the following information:

  • The output of the vinum list command (see vinum(8)).
  • Any messages printed in /var/log/messages. All such messages will be identified by the text “vinum” at the beginning.
  • If you have a panic, a stack trace as described above.

disklabel(5), disklabel(8), newfs(8), vinum(8)

vinum first appeared in FreeBSD 3.0. The RAID-5 component of vinum was developed by Cybernet Inc. (http://www.cybernet.com/), for its NetMAX product.

Greg Lehey <grog@lemis.com>.

vinum is a new product. Bugs can be expected. The configuration mechanism is not yet fully functional. If you have difficulties, please look at the section DEBUGGING PROBLEMS WITH VINUM before reporting problems.

Kernels with the vinum pseudo-device appear to work, but are not supported. If you have trouble with this configuration, please first replace the kernel with a non-vinum kernel and test with the kld module.

Detection of differences between the version of the kernel and the kld is not yet implemented.

The RAID-5 functionality is new in FreeBSD 3.3. Some problems have been reported with vinum in combination with soft updates, but these are not reproducible on all systems. If you are planning to use vinum in a production environment, please test carefully.

December 12, 2014 DragonFly-5.6.1