Category Archives: pefs

XTS support in pefs

I’ve replaced the CTR encryption mode with XTS. The Salsa20 stream cipher was also removed. CTR mode was an inappropriate design choice for a filesystem: it allowed encrypted data to be easily manipulated by an attacker, and could even reveal plaintext in cases where previous snapshots of the encrypted data were available to the attacker, e.g. filesystem-level snapshots. There should be no visible performance degradation from switching to XTS.
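The malleability problem described above can be sketched in a few lines. This is purely illustrative Python; the keystream bytes below are a stand-in for real AES-CTR output, since the argument only relies on ciphertext = plaintext XOR keystream:

```python
# Sketch: why CTR mode is malleable and leaks across snapshots.
import os

def ctr_encrypt(plaintext: bytes, keystream: bytes) -> bytes:
    # CTR turns a block cipher into a stream cipher: C = P XOR keystream
    return bytes(p ^ k for p, k in zip(plaintext, keystream))

keystream = os.urandom(16)   # same file block => same keystream in CTR

# 1. Bit-flipping: an attacker flips ciphertext bits without the key,
#    and the corresponding plaintext bits flip after decryption.
p = b"pay $0000100 now"
c = bytearray(ctr_encrypt(p, keystream))
c[5] ^= ord('0') ^ ord('9')            # tamper with one ciphertext byte
tampered = ctr_encrypt(bytes(c), keystream)
assert tampered == b"pay $9000100 now"

# 2. Snapshot leak: two versions of the same block encrypted under the
#    same keystream cancel it out, revealing the XOR of the plaintexts.
p1, p2 = b"secret version 1", b"secret version 2"
c1 = ctr_encrypt(p1, keystream)
c2 = ctr_encrypt(p2, keystream)
assert bytes(a ^ b for a, b in zip(c1, c2)) == \
       bytes(a ^ b for a, b in zip(p1, p2))
print("CTR malleability and keystream-reuse leak demonstrated")
```

XTS avoids both problems by tweaking each block with its position, so tampering garbles a whole block instead of flipping chosen bits.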

CTR mode compatibility is intentionally not available, to prevent further misuse, so an upgrade by hand is necessary.

I’ve also committed real support for sparse files and file extending, which should make the filesystem faster in common use cases. The new version also contains a fix for a race in the rename operation.

I would like to ask people interested in getting such functionality into FreeBSD to give pefs a try; any feedback is welcome.

Installation instructions can be found in my message to the freebsd-current mailing list.

pefs and l2filter moved to github

I’ve just moved pefs and l2filter development to GitHub. I hope it helps people follow development.

The pefs repository can be used to compile and run pefs without applying any patches.

pefs changelog:
* support running on msdosfs
* enable dircache only on file systems that are known to support it
* add man page
* add pefs getkey command
* initial implementation of a pefs PAM module

The l2filter repository contains only patches. There is a fresh patch against 8-STABLE with some minor improvements compared to the 7-STABLE version. The 9-CURRENT patch is a bit outdated at the moment, as I’m waiting for Luigi Rizzo to finish his ipfw refactoring work first.

Besides that, I’ve moved my blog to a new address.
Please update your bookmarks. I do not intend to update the freebsdish blog any more.

pefs dircache benchmark

I’ve recently added directory caching to pefs (source tarballs available here and here).

Although it is a directory listing cache (like dirhash for UFS), it also acts as an encrypted file name cache, so there is no need to decrypt the same names over and over. That was a really big issue, because the directory listing has to be reread on almost every vnode lookup operation, which made operations on directories with 1000 or more files too time consuming.

The cache is updated at two points: during the vnode lookup operation and during the readdir call. The vnode generation attribute is used to monitor directory changes (the same way NFS works) and to expire the cache when it changes. There is no per-operation monitoring because that would violate the stacked-filesystem nature (and also complicate the code). There are some issues regarding the handling of large directories within dircache. First of all, the results of consecutive readdir calls are considered inconsistent, i.e. the cache expires if the user-provided buffer is too small to fit the entire directory listing. Also, during a vnode lookup the search doesn’t terminate once a matching directory entry is found; it traverses the rest of the directory to update the cache.

There is a vfs.pefs.dircache_enable sysctl to control cache validity. Setting it to zero forces the cache to always be treated as invalid, so dircache functions only as a file name encryption cache.

At the moment caching is only enabled for name decryption, but there are operations like rm or rmdir that perform name encryption on every call to pass data to the underlying filesystem. Enabling caching for such operations is not going to be hard, but I want the code to stabilize a bit before moving further.

I’ve performed two types of tests: dbench and handling directories with a large number of files. I used pefs mounted on top of tmpfs to measure pefs overhead rather than disk I/O performance. The Salsa20 algorithm with a 256-bit key was chosen because it is the fastest available. Before each run the underlying tmpfs filesystem was remounted. Each test was run three times, and the average of the results is shown in the charts (deviation was less than 2%). Also note that I used a kernel with some extra debugging compiled in (INVARIANTS, lock debugging).

dbench doesn’t show an extremely large difference between pefs with dircache and old pefs without it: 143.635 MB/s against 116.746 MB/s; that’s an 18% improvement, which is very good, IMHO. Also interesting is that the result gets just a bit lower after setting vfs.pefs.dircache_enable=0: 141.289 MB/s against 143.635 MB/s.

Dbench uses directories with a small number of entries (usually ~20), which perfectly explains the results achieved. Handling large directories is where dircache shines. I used the following trivial script for testing; it creates 1000 (or 2000) files, listing the directory after each one, and finally removes the files:
for i in `jot 1000`; do
	touch test-$i
	ls -Al >/dev/null
done
find . -name test-\* -exec rm '{}' +

The chart speaks for itself. And per-file overhead looks much closer to the expected linear growth after running the same test with 3000 files:

Encrypting private directory with pefs

pefs is a kernel-level cryptographic filesystem. It works transparently on top of other filesystems and doesn’t require root privileges. There is no need to allocate another partition or take additional care with backups, resizing the partition when it fills up, etc.

To install pefs, grab the source tarball here (mirror 1, mirror 2). Unpack it into /usr/src, then compile and install:

# make -C /usr/src/sys/modules/salsa20 obj all install clean
# make -C /usr/src/sys/modules/pefs obj all install clean
# make -C /usr/src/sbin/pefs obj all install clean

Note: pefs is being developed on amd64 9-CURRENT and was tested on i386 8-CURRENT some time ago (before branching). It should also work on 7-STABLE, but I’m not able to test it. I would appreciate any feedback and will try to fix all incompatibilities.

Create a new directory to encrypt. Let it be ~/Private:

% mkdir ~/Private

And mount pefs on top of it (root privileges are necessary to mount a filesystem unless you have the vfs.usermount sysctl set to a non-zero value):

% pefs mount ~/Private ~/Private

At this point ~/Private behaves like a read-only filesystem because no keys are set up yet. To make it useful, add a new key:

% pefs addkey ~/Private

After entering a passphrase, you can check active keys:

% pefs showkeys ~/Private
0 b0bed3f7f33e461b aes256-ctr

As you can see, the AES algorithm is used by default (in CTR mode with a 256-bit key). It can be changed with the pefs addkey -a option.

You should take into account that pefs doesn’t save any metadata, which means there is no way for the filesystem to “verify” a key. To work around this, key chaining can be used (pefs showchain, setchain, delchain). I’m going to show how it works in upcoming posts.

Let’s give it a try:

% echo "Hello WORLD" > ~/Private/test

% ls -Al ~/Private
total 1
-rw-r--r-- 1 gleb gleb 12 Oct 1 12:55 test

% cat ~/Private/test
Hello WORLD

Here is what it looks like at the lower filesystem level:

% pefs unmount ~/Private

% ls -Al ~/Private
total 1
-rw-r--r-- 1 gleb gleb 12 Oct 1 12:55 .DU6eudxZGtO8Ry_2Z3Sl+tq2hV3O75jq

% hd ~/Private/.DU6eudxZGtO8Ry_2Z3Sl+tq2hV3O75jq
00000000 7f 1e 1b 05 fc 8a 5c 38 fc d8 2d 5f |......\8..-_|

Your result is going to be different, because pefs uses a random tweak value to encrypt each file. This tweak is stored in the encrypted file name. Using the tweak also means that identical files have different encrypted content.

pefs crypto primitives

This is the second part of a series of posts devoted to pefs (a stacked cryptographic filesystem). The first part was about benchmarking encryption overhead.

Here I’m going to shed some light on the cryptographic primitives used in pefs.

Supported data encryption algorithms: AES, Camellia, Salsa20.

Salsa20 is a new stream cipher by Daniel Bernstein from the eSTREAM portfolio. Available AES and Camellia key sizes are 128, 192 and 256 bits. Adding another block cipher is trivial (choosing one with a 128-bit block size would save you from writing a few more lines of code). Stream ciphers are a bit different, as they do not usually support encryption at an arbitrary offset in the key stream (which is what CTR mode provides for block ciphers). Most of them also lack support for tweaked encryption. Salsa20 fits here well, mostly because it’s designed on top of a hash function. Another possible candidate for inclusion is Skein, a hash function by Bruce Schneier et al. submitted to the SHA-3 competition; using Skein as a stream cipher is suggested by its authors.

File names are always encrypted using AES-128 in CBC mode. An encrypted file name consists of a unique per-file tweak, a checksum and the name itself:
XBase64(checksum || E(tweak || filename))

The checksum is the VMAC of the encrypted tweak and file name:
checksum = VMAC(E(tweak || filename))
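The layout above can be sketched in code. These are stand-ins, not the real primitives: HMAC-SHA256 truncated to 64 bits replaces VMAC, a keyed keystream XOR replaces AES-128-CBC, and URL-safe base64 stands in for XBase64. Only the structure matches pefs:

```python
# Structural sketch of pefs encrypted names:
#   XBase64(checksum || E(tweak || filename))
#   checksum = MAC(E(tweak || filename))
import base64, hashlib, hmac, os

ENC_KEY = os.urandom(16)
MAC_KEY = os.urandom(16)

def toy_encrypt(data: bytes) -> bytes:
    # Placeholder cipher (keyed keystream XOR); decrypt == encrypt.
    ks = b""
    while len(ks) < len(data):
        ks += hashlib.sha256(ENC_KEY + len(ks).to_bytes(4, "big")).digest()
    return bytes(a ^ b for a, b in zip(data, ks))

def encrypt_name(filename: bytes) -> str:
    tweak = os.urandom(8)                         # unique per file
    enc = toy_encrypt(tweak + filename)           # E(tweak || filename)
    checksum = hmac.new(MAC_KEY, enc, hashlib.sha256).digest()[:8]  # 64-bit MAC
    b64 = base64.urlsafe_b64encode(checksum + enc).decode().rstrip("=")
    return "." + b64                              # dot-file, like pefs names

def name_matches_key(name: str, mac_key: bytes) -> bool:
    # The checksum lets pefs find which key a name belongs to
    # without decrypting anything (Encrypt-then-Authenticate).
    data = name[1:]
    raw = base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))
    checksum, enc = raw[:8], raw[8:]
    expect = hmac.new(mac_key, enc, hashlib.sha256).digest()[:8]
    return hmac.compare_digest(checksum, expect)

name = encrypt_name(b"test")
assert name_matches_key(name, MAC_KEY)
assert not name_matches_key(name, os.urandom(16))
assert encrypt_name(b"test") != encrypt_name(b"test")  # random tweak => unique names
print("encrypted name:", name)
```

The last assertion shows the point of the tweak: encrypting the same name twice yields different results.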

The main reason for not providing alternatives is to keep it simple: “complexity is insecurity”. Data encryption is entirely different: encrypted data is not parsed in any way by pefs, and the user expects to be able to use their secure-fast-best-name cipher.

The name has this structure to work around some of CBC’s shortcomings. The random tweak value is placed at the beginning of the first encrypted block. That gives us unique encrypted file names and eliminates the need to deal with an initial IV. (For those who care: the initial IV is zero and the name is padded with zeros.)

An Encrypt-then-Authenticate construction is used. In addition to being the most secure variant at the moment, it allows checking whether a name was encrypted with a given key without performing decryption. VMAC was chosen because of its performance characteristics and its ability to produce a 64-bit MAC (without truncating the original result, as in the HMAC case). 64 bits is almost mandatory here, because a larger MAC would result in a much longer file name, and a bigger MAC can hardly improve security. But the real reason is that no real “authentication” is performed. It’s designed to be just a cryptographic checksum (that sounds incorrect, but I can’t find better wording): breaking VMAC wouldn’t break the encrypted data, and this VMAC doesn’t “authenticate” the encrypted data. Its main purpose is to find the key a file is encrypted with.

An encrypted directory name also contains a tweak, but it’s used solely to randomize the first CBC cipher block and keep the name structure uniform.

Data encryption utilizes the tweak to get unique per-file ciphertext (for files with the same data). Block ciphers (AES, Camellia) operate in counter (CTR) mode, with 64 bits of the counter block containing the tweak and the other 64 bits containing the file offset. Salsa20 supports tweaked encryption out of the box. Choosing such a mode of operation eliminates the need to perform cipher-block-sized I/O, which significantly simplifies the entire filesystem code: I/O, page mapping, no change of file size, etc. (it removes a ton of error-prone code; believe me, I’ve tried it the other way).
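The counter-block layout described above can be sketched as follows. Whether pefs counts the offset in bytes or in cipher blocks isn’t stated here, so the block-index interpretation below is an assumption:

```python
# Sketch of the 128-bit CTR counter block: 64 bits of per-file tweak
# plus 64 bits derived from the file offset (block index assumed).
import struct

AES_BLOCK = 16

def counter_block(tweak64: int, file_offset: int) -> bytes:
    block_index = file_offset // AES_BLOCK
    return struct.pack(">QQ", tweak64, block_index)

# Same file (same tweak), different blocks -> different counter blocks,
# so identical plaintext blocks encrypt differently within a file.
assert counter_block(0x1234, 0) != counter_block(0x1234, 16)
# Different files (different tweaks) at the same offset also differ,
# so identical files produce different ciphertext.
assert counter_block(0x1234, 0) != counter_block(0x5678, 0)
# Offsets within one cipher block share a counter block, which is what
# allows encryption at arbitrary offsets without read-modify-write I/O.
assert counter_block(0x1234, 3) == counter_block(0x1234, 15)
print("counter block layout OK")
```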

As one has probably guessed, there are three keys involved: one each for name encryption, VMAC and data encryption. These are derived from the user-supplied key using the HKDF algorithm (an HMAC-based key derivation function, IETF draft). The kernel part expects a cryptographically strong key from userspace; this key is generated by PBKDF2 on top of HMAC-SHA512.
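The two-stage derivation can be sketched with the standard library. The iteration count, salt, subkey sizes and the info labels below are illustrative assumptions, not pefs’s actual parameters:

```python
# PBKDF2-HMAC-SHA512 turns the passphrase into a strong master key
# (done in userspace); HKDF-Expand then derives the three subkeys
# (name encryption, VMAC, data encryption) in the kernel.
import hashlib, hmac

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # HKDF-Expand per RFC 5869: chained HMAC blocks, truncated to length.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha512).digest()
        okm += block
        counter += 1
    return okm[:length]

master = hashlib.pbkdf2_hmac("sha512", b"passphrase", b"salt", 50000)
name_key = hkdf_expand(master, b"pefs-name", 16)   # AES-128 for names
mac_key  = hkdf_expand(master, b"pefs-vmac", 16)
data_key = hkdf_expand(master, b"pefs-data", 32)   # e.g. AES-256 / Salsa20

assert len({name_key, mac_key, data_key}) == 3     # three distinct keys
print("derived", len(master), "byte master key and 3 subkeys")
```

Distinct info labels guarantee the subkeys are independent even though they come from one master key.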

Standard implementations of the ciphers are used, but I do not use the opencrypto framework, so no hardware acceleration is available. opencrypto is not used mainly because it lacks support for CTR mode. It’s also unlikely to ever support Salsa20 the way it’s used in pefs. opencrypto is rather heavyweight, so using it solely for name encryption would probably worsen performance (hardware initialization costs for encrypting short chunks with different keys) and add an extra dependency.

Besides that, pefs supports multiple keys, mixing files encrypted with different keys in a single directory, a transparent (unencrypted) mode, key chaining (adding a series of keys by entering just one of them) and more. I’m going to write about it soon.

pefs benchmark

pefs is a stacked cryptographic filesystem for FreeBSD. It started as a Google Summer of Code 2009 project.

I’ve just come across a performance comparison of eCryptfs against a plain ext4 filesystem on Ubuntu, a benchmark I was going to perform on my own.

I run dbench benchmarks regularly while working on pefs, but I use it mostly as a stress-testing tool. I haven’t reached the point where I can start working on improving performance yet, but measuring pefs overhead is going to be interesting.

Unfortunately, I fail to interpret the dbench results from the article. They used dbench 4, while I’m using dbench 3 from ports. Nevertheless, a result of 4-8 MB/s looks too strange to me.

I benchmarked 4 and 16 dbench clients on ZFS, on pefs with Salsa20 encryption (256-bit key) on top of the same ZFS partition, and on pefs with AES encryption (128-bit key, CTR mode). I executed the benchmark three times in each setup.

First of all, cipher throughput:
salsa20 ~205.5 MB/s
aes128  ~81.3 MB/s
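A quick sanity check on those two numbers gives the ~2.5x ratio mentioned in the results discussion:

```python
# Raw cipher throughput ratio from the numbers above.
salsa20_mbs = 205.5
aes128_mbs = 81.3
ratio = salsa20_mbs / aes128_mbs
assert 2.4 < ratio < 2.6
print(f"salsa20 is {ratio:.2f}x faster than aes128")
```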

Benchmark results:

dbench zfs/pefs 4 clients

dbench zfs/pefs 16 clients

In both cases (4 and 16 clients) the CPU was the limiting factor; the disks were mostly idle. This explains the divergence in the ZFS results: I’ve actually benchmarked ZFS ARC cache performance. Because of unpredictable ZFS inner workings, one can get the best aes128 result surprisingly close to the worst salsa20 one (salsa20 is ~2.5 times faster than aes128).

The graph comparing average values:
dbench zfs/pefs 4 clients

The conclusion is that pefs is 2x slower. But that shouldn’t be solely because of encryption. From my previous testing I can conclude that it’s mostly filesystem overhead:

  • The current pefs implementation avoids data caching (to prevent double caching and restrain one’s paranoia). I had a version using buffer management for I/O (bread/bwrite); its performance was awful, something like 20-30 MB/s with Salsa20 encryption.
  • Sparse files (and file resizing) are implemented very poorly: the code takes an exclusive lock and fills gaps with zeros, even though a gap is likely to be filled by the application very soon.
  • The lookup operation is very expensive. It calls readdir and decrypts the name of every directory entry.
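The cost of that last point can be sketched with a rough model (the counts below are illustrative, not measured): each lookup rereads the whole directory, so creating n files one by one costs on the order of n²/2 name decryptions, while a name cache cuts that to n.

```python
# Rough cost model for lookups that scan the whole directory.
def decryptions_without_cache(n):
    # creating the i-th file scans the i entries already present
    return sum(range(n))

def decryptions_with_cache(n):
    # each name is decrypted once, then served from the cache
    return n

assert decryptions_without_cache(1000) == 499500
assert decryptions_with_cache(1000) == 1000
print("1000 files:", decryptions_without_cache(1000),
      "vs", decryptions_with_cache(1000), "decryptions")
```

This quadratic behavior is the same effect the dircache benchmark above measures on large directories.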

The eCryptfs IOzone benchmark also shows a 2x difference.