Category Archives: ZFS

ZFS support in libvirt

An upcoming release of libvirt, 1.2.8, which should be released in early September, will include initial support for managing ZFS volumes.

That means it's possible to boot VMs that use ZFS volumes as disks. Additionally, it allows controlling volumes through the libvirt API. Currently, the supported operations are:

  • list volumes in a pool
  • create and delete volumes
  • upload and download volumes

It's not possible to create and delete pools yet; I hope to implement that in the next release.

Defining a pool

Assume we have some pools and want to use one of them in libvirt:

# zpool list
NAME       SIZE   ALLOC    FREE  FRAG  EXPANDSZ   CAP  DEDUP  HEALTH  ALTROOT
filepool  1,98G   56,5K   1,98G    0%         -    0%  1.00x  ONLINE  -
test       186G   7,81G    178G    0%         -    4%  1.00x  ONLINE  -

Let's take filepool and define it in libvirt. This can be done using the following virsh command:

virsh # pool-define-as --name zfsfilepool --source-name filepool --type zfs
Pool zfsfilepool defined

virsh # pool-start zfsfilepool
Pool zfsfilepool started

virsh # pool-info zfsfilepool
Name:           zfsfilepool
UUID:           5d1a33a9-d8b5-43d8-bebe-c585e9450176
State:          running
Persistent:     yes
Autostart:      no
Capacity:       1,98 GiB
Allocation:     56,50 KiB
Available:      1,98 GiB

virsh #

As you can see, we specify the type of the pool, its source name (as it appears in the zpool list output), and the name it will have in libvirt. We also need to start the pool using the pool-start command.
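
For those who prefer XML, an equivalent pool definition should look roughly like this (a sketch following libvirt's storage pool format; only the ZFS-specific elements are shown):

<pool type='zfs'>
  <name>zfsfilepool</name>
  <source>
    <name>filepool</name>
  </source>
</pool>

Saved to a file, this can be loaded with pool-define instead of pool-define-as.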

Managing volumes

Let's create a couple of volumes in our new pool.

virsh # vol-create-as --pool zfsfilepool --name vol1 --capacity 1G
Vol vol1 created

virsh # vol-create-as --pool zfsfilepool --name vol2 --capacity 700M
Vol vol2 created

virsh # vol-list zfsfilepool
Name                 Path
------------------------------------------------------------------------------
vol1                 /dev/zvol/filepool/vol1
vol2                 /dev/zvol/filepool/vol2

virsh #

Dropping a volume is also easy:

virsh # vol-delete --pool zfsfilepool vol2
Vol vol2 deleted

Uploading and downloading data

Let's upload an image to our new volume:

virsh # vol-upload --pool zfsfilepool --vol vol1 --file /home/novel/FreeBSD-10.0-RELEASE-amd64-memstick.img 

... and download it back:

virsh # vol-download --pool zfsfilepool --vol vol1 --file /home/novel/zfsfilepool_vol1.img

Note: if you check, for example, the MD5 sums of the two files, they will differ, because the downloaded file has the same size as the volume. However, if you trim the trailing zeros, the checksums match:

$ md5 FreeBSD-10.0-RELEASE-amd64-memstick.img zfsfilepool_vol1.img 
MD5 (FreeBSD-10.0-RELEASE-amd64-memstick.img) = e8e7cbd41b80457957bd7981452ecf5c
MD5 (zfsfilepool_vol1.img) = a77c3b434b01a57ec091826f81ebbb97
$ truncate -r FreeBSD-10.0-RELEASE-amd64-memstick.img zfsfilepool_vol1.img
$ md5 FreeBSD-10.0-RELEASE-amd64-memstick.img zfsfilepool_vol1.img
MD5 (FreeBSD-10.0-RELEASE-amd64-memstick.img) = e8e7cbd41b80457957bd7981452ecf5c
MD5 (zfsfilepool_vol1.img) = e8e7cbd41b80457957bd7981452ecf5c
$
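
The size difference itself is easy to confirm with zfs and stat (a sketch; the dataset name follows the pool layout shown in the vol-list output above):

# size of the zvol as reported by ZFS (we created it with --capacity 1G)
zfs get -H -o value volsize filepool/vol1

# size of the original image in bytes, for comparison
stat -f %z FreeBSD-10.0-RELEASE-amd64-memstick.img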

Booting a VM from volume

Finally, we get to the most important part. To use a volume as a disk device for a VM, the 'devices' section of the domain XML should be updated with something like this:


<disk type='volume' device='disk'>
  <source pool='zfsfilepool' volume='vol1'/>
  <target dev='vdb' bus='virtio'/>
</disk>
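
If the domain is already defined, the same disk can also be attached with attach-device instead of editing the whole XML (a sketch; the domain name 'myvm' and the file zfsdisk.xml holding the snippet above are hypothetical):

virsh # attach-device myvm zfsdisk.xml --config

Using --live instead would hot-plug the disk into a running guest.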

A few notes

Note #1: this code is just a few weeks old, so quite likely there are some rough edges. Feel free to report to novel%freebsd.org if you spot any problems.

Note #2: this code is FreeBSD-only for now. However, it should not be hard to make it work on Linux with ZFS on Linux (zfsonlinux.org). Its developers were kind enough to add some useful missing flags to some of the CLI tools, but those changes are not yet available in any released version. There are a few more minor differences between ZFS on Linux and FreeBSD, but they should not be hard to address. I was planning to get to this as soon as a ZFS on Linux release with the necessary flags is available. However, if you are interested in that and ready to help with testing, feel free to poke me so it can be done sooner.


ZFS v28 in FreeBSD 9-CURRENT!

As expected (and as previously announced and tested), ZFS v28 has been committed to FreeBSD HEAD!

New features include:

  • RAID-Z3 (triple parity - one more parity drive than RAID-6)
  • Deduplication
  • Better recovery support during import (forced log rewind, read-only import)
  • Snapshot-level diff (like a regular diff, but operating on file systems; see the sketch after this list)
  • zpool split (split a RAID-1 / mirrored set of drives into separate, independent zpools)
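
For illustration, these two features map onto the following commands (a minimal sketch; the pool and dataset names are hypothetical):

# take a snapshot, then show what changed in the live file system since then
zfs snapshot tank/home@monday
zfs diff tank/home@monday tank/home

# detach one disk from each mirror in 'tank' into a new, independent pool
zpool split tank tank2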

As always, testers are welcome!

Read more...

FreeBSD on 4K sector drives

All major FreeBSD filesystems (UFS, ZFS, ext2) support 4K sectors, and so does the lower layer, GEOM, but currently there's an issue of communicating this configuration between all the layers. Part of the problem is that current drives (and the situation will probably not change during this new decade) advertise two sector sizes, both 512 bytes and 4K, and the system needs to interpret them correctly. All this will be resolved once a consensus on the topic is reached, but until that happens (hopefully soon), there is a set of easy workarounds, which I'll describe here.
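
The workarounds themselves are behind the link, but one widely used approach for ZFS at the time was the gnop trick, which makes the provider report 4K sectors while the pool is being created (a sketch; the device name ada0 is hypothetical):

# create a transparent GEOM provider that advertises 4K sectors
gnop create -S 4096 ada0

# create the pool on top of it, then re-import it without the nop layer
zpool create tank ada0.nop
zpool export tank
gnop destroy ada0.nop
zpool import tank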

Read more...

vfs.hirunningspace and disk write latency performance

A while ago I increased the default value of the vfs.hirunningspace tunable, which greatly helps performance when the disk system supports tagged queueing (e.g. NCQ), allowing many more requests to be offloaded to the controller and/or the drive(s). But deep queues bring their own problems, especially in pathological cases.
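
For reference, the tunable can be inspected and changed at runtime with sysctl (a sketch; the value shown is only an example, not a recommendation):

# show the current value, in bytes
sysctl vfs.hirunningspace

# raise it on the running system
sysctl vfs.hirunningspace=1048576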

Read more...

So how is FreeBSD 9 shaping up?

It's still too early to talk about the FreeBSD 9.0 release, but so far there have been some interesting developments in the system, and a nice core feature set is shaping up. I'm still maintaining the "What's cooking" page, and this post is basically an (incomplete) summary of it at this point in time.

Of course, in addition to these features, there are non-stop modifications to all parts of the system, from drivers for new hardware to overall performance enhancements.

Read more...

HP "LeftHand"

I've seen an HP "LeftHand" / StorageWorks P4000 SAN device recently and came away with quite a good impression of it. One thing that occurred to me is: why didn't anyone try this before? Certainly both Linux (to a lesser extent) and FreeBSD (to a somewhat greater one) contain the pieces for it, and have for some years now. In fact, several people have built such setups privately or internally for their companies, but there was apparently never a concentrated effort to sell it.

Read more...

HP "LeftHand"

I've seen a HP "LeftHand" / StorageWorkd P4000 SAN device recently and got quite good impressions off of it. One thing that occured to me is - why didn't anyone try this before? Certainly both Linux (to lesser extent) and FreeBSD (to a somewhat greater) contain the pieces for it, and have contained for some years now. In fact, several people did such setups privately or internally for their companies but there was apparently never a concentrated effort to sell it.

Read more...