Monthly Archive for January, 2007

Pawel Jakub Dawidek: UFS and BIO_DELETE

BIO_DELETE is yet another I/O request. The two most famous I/O requests are of course BIO_READ and BIO_WRITE.

BIO_DELETE can basically be used to say “I don’t need this range of data anymore”. The underlying provider that receives such a request can do various things, depending on its purpose. The geli(8) GEOM class, which implements disk encryption, can, for example, fill the range of data with random bits. A memory- or swap-backed md(4) device could just free the memory. Unfortunately there is currently no support for BIO_DELETE on the file system side. When UFS frees some blocks, it should send BIO_DELETE.
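To make the idea concrete, here is a minimal sketch of a memory-backed provider that frees storage on a delete request. It is a toy model in Python, not GEOM code: the `MemoryProvider` class, its `start()` method and the string request codes are all made up for illustration.

```python
# Toy model only: these names are illustrative and are not the GEOM API.
BIO_READ, BIO_WRITE, BIO_DELETE = "read", "write", "delete"
SECTOR_SIZE = 512


class MemoryProvider:
    """A memory-backed provider that stores only sectors that were written."""

    def __init__(self):
        self.sectors = {}  # sector number -> bytes

    def start(self, cmd, sector, count=1, data=None):
        if cmd == BIO_WRITE:
            for i in range(count):
                chunk = data[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE]
                self.sectors[sector + i] = chunk.ljust(SECTOR_SIZE, b"\0")
            return None
        if cmd == BIO_READ:
            # Sectors never written (or already deleted) read back as zeros.
            zero = bytes(SECTOR_SIZE)
            return b"".join(self.sectors.get(sector + i, zero) for i in range(count))
        if cmd == BIO_DELETE:
            # "I don't need this range anymore": drop the backing memory.
            for i in range(count):
                self.sectors.pop(sector + i, None)
            return None
        raise ValueError(f"unsupported request: {cmd}")


provider = MemoryProvider()
provider.start(BIO_WRITE, sector=8, count=2, data=b"x" * 1024)
held_after_write = len(provider.sectors)
provider.start(BIO_DELETE, sector=8, count=2)
held_after_delete = len(provider.sectors)
print(held_after_write, held_after_delete)
```

A different provider could react to the same request in its own way, for example by overwriting the range instead of freeing it.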

And this is basically what I recently implemented. Actually, it was surprisingly easy to implement, but it is not implemented correctly yet. The problem is that UFS uses the buffer cache for writes. If it sends a delayed write request, which is meant to update the blocks used by the given inode and update the free-block bitmap, I can’t send BIO_DELETE immediately. Sending BIO_DELETE right away means that if a system crash or a power failure occurs between the BIO_DELETE and the inode update, we may end up with a file pointing to garbage.
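The ordering constraint can be sketched as follows: remember each freed range together with the delayed write it depends on, and issue the delete only once that write has completed. This is a toy model, not UFS or buffer cache code; `DeferredDeleter`, `write_id` and the callback are hypothetical names.

```python
# Toy model of the ordering constraint; not UFS or buffer cache code.
class DeferredDeleter:
    def __init__(self, send_delete):
        self.send_delete = send_delete   # callback that issues the delete
        self.pending = {}                # write id -> list of (offset, length)

    def free_blocks(self, write_id, offset, length):
        """Record a freed range tied to a delayed metadata write."""
        self.pending.setdefault(write_id, []).append((offset, length))

    def write_completed(self, write_id):
        """The inode and bitmap update is on disk; deletes are now safe."""
        for offset, length in self.pending.pop(write_id, []):
            self.send_delete(offset, length)


sent = []
deleter = DeferredDeleter(lambda off, ln: sent.append((off, ln)))
deleter.free_blocks(write_id=1, offset=4096, length=8192)
before = list(sent)       # nothing is sent while the write is only queued
deleter.write_completed(1)
after = list(sent)        # the delete goes out once the metadata is stable
print(before, after)
```

If a crash happens before `write_completed()` runs, the delete was never sent, so the old inode still points at intact data.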

Why is BIO_DELETE worth implementing? Because once it’s in place we can implement a lot of really nice features on top of it. Let me name a few:

- an md swap-backed temporary file system on which memory is freed when you delete a file,

- block deallocation for gvirstor; gvirstor is a GEOM class implemented by Ivan Voras during Google SoC, which lets you initially define a very large virtual provider with limited physical storage available, and add more storage when needed; unfortunately, when a file is deleted, gvirstor currently has no way to reassign those blocks elsewhere,

- a GEOM compression layer; actually it could be implemented without BIO_DELETE, but a compression layer is about saving space, right? Actually, adding compression to gvirstor may not be a bad idea,

- maybe gjournal can hold its journal in free blocks and migrate the journal when blocks are allocated? :) Silly idea, but with BIO_DELETE in place we can tell which blocks are really used by the file system, and this is very powerful information.
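As an illustration of the gvirstor case above, here is a minimal sketch of a thinly provisioned provider that hands physical chunks back to a free pool on delete. It is a toy model under made-up names and sizes (`ThinProvider`, chunk counts), not gvirstor's actual design.

```python
# Toy model of thin provisioning; names and sizes are illustrative, not gvirstor's.
class ThinProvider:
    def __init__(self, virtual_chunks, physical_chunks):
        self.virtual_chunks = virtual_chunks
        self.free = list(range(physical_chunks))  # unassigned physical chunks
        self.map = {}                             # virtual chunk -> physical chunk

    def write(self, vchunk):
        """Assign a physical chunk on the first write to a virtual chunk."""
        if not 0 <= vchunk < self.virtual_chunks:
            raise IndexError("virtual chunk out of range")
        if vchunk not in self.map:
            if not self.free:
                raise RuntimeError("out of physical storage")
            self.map[vchunk] = self.free.pop()
        return self.map[vchunk]

    def delete(self, vchunk):
        """Delete request: return the physical chunk to the free pool."""
        pchunk = self.map.pop(vchunk, None)
        if pchunk is not None:
            self.free.append(pchunk)


thin = ThinProvider(virtual_chunks=1000, physical_chunks=2)
thin.write(10)
thin.write(500)
thin.delete(10)          # without this, the next write would run out of space
thin.write(999)          # reuses the chunk freed above
print(sorted(thin.map))
```

Without the `delete()` call the third write raises, which is the situation the file system puts the provider in when it frees blocks silently.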

Pawel Jakub Dawidek: Endian-independent UFS.

NetBSD has a UFS implementation that can mount UFS file systems created on an architecture with different endianness. In other words, one can do newfs on sparc64 and mount it on i386 or vice versa. Very cool. It works by detecting which endianness the file system uses and byteswapping fields in UFS structures at run-time as needed. I wanted to see how hard it would be to implement something similar on FreeBSD. After one night of hacking, mounting file systems read-only seems to work fine. I decided to work a bit lower than NetBSD, and I replace bread()s with special functions that byteswap fields when needed. This saves quite a lot of code, but not everything can be implemented that way. I can byteswap the superblock, cylinder groups and inodes, but I can’t do the same for dirents, because ufs_readdir() uses plain VOP_READ() to read directory entries, so for dirents I need to do the same thing NetBSD does. My method is most likely slower than NetBSD’s, because when the file system reads one block of inodes, I byteswap them all even though only one inode may be used later. We will see at some point if the performance impact is too high. On the other hand, you probably don’t want to use this functionality very often.
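The detect-then-swap idea can be sketched in a few lines: try both byte orders and keep the one in which the magic number matches. This is only an illustration in Python. The magic value mirrors FS_UFS2_MAGIC, but the three-field record is invented and is not the real superblock layout.

```python
import struct

# The magic value mirrors FS_UFS2_MAGIC; the three-field record is invented
# for illustration and is NOT the real UFS superblock layout.
MAGIC = 0x19540119
RECORD = "IIQ"  # magic, block size, block count


def read_record(raw):
    """Decode a record written in either byte order, returning native ints."""
    for order in ("<", ">"):
        magic, bsize, nblocks = struct.unpack(order + RECORD, raw)
        if magic == MAGIC:
            return {"order": order, "bsize": bsize, "nblocks": nblocks}
    raise ValueError("bad magic: not a recognised record")


big = struct.pack(">" + RECORD, MAGIC, 16384, 123456)
little = struct.pack("<" + RECORD, MAGIC, 16384, 123456)
decoded_big = read_record(big)
decoded_little = read_record(little)
print(decoded_big, decoded_little)
```

Both inputs decode to the same field values; only the detected byte order differs.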

Pawel Jakub Dawidek: Regression tests for file systems.

I’ve spent some time working on a test suite that verifies whether a file system works correctly. It mostly checks POSIX compliance and works for FreeBSD/UFS, FreeBSD/ZFS, Solaris/UFS and Solaris/ZFS. The list of system calls tested is as follows: chflags, chmod, chown, link, mkdir, mkfifo, open, rename, rmdir, symlink, truncate, unlink. There are 3438 regression tests in 184 files and believe me, this was really boring work, but very educational on the other hand. All those tests are already committed to FreeBSD’s HEAD branch under src/tools/regression/fstest/. During the work I also updated many manual pages. At some point I’m planning to make this test suite work on Darwin and Linux, but I’m not sure yet when exactly.
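To give a flavour of what such a test checks, here is a minimal sketch in Python that exercises a few of the listed system calls and compares the resulting errno with what POSIX requires. It is not taken from fstest, and the `expect_errno` helper is a hypothetical name.

```python
import errno
import os
import tempfile


def expect_errno(expected, func, *args):
    """Return True if func(*args) fails with the expected errno."""
    try:
        func(*args)
    except OSError as exc:
        return exc.errno == expected
    return False


results = {}
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "dir")
    os.mkdir(path, 0o755)
    # mkdir on an existing path must fail with EEXIST.
    results["mkdir EEXIST"] = expect_errno(errno.EEXIST, os.mkdir, path)
    # unlink on a missing file must fail with ENOENT.
    missing = os.path.join(root, "missing")
    results["unlink ENOENT"] = expect_errno(errno.ENOENT, os.unlink, missing)
    # rmdir on an empty directory succeeds and removes the entry.
    os.rmdir(path)
    results["rmdir removes"] = not os.path.exists(path)

for name, ok in results.items():
    print("ok" if ok else "not ok", name)
```

Each check runs against whatever file system backs the temporary directory, so pointing it at a different mount tests a different file system.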

The main motivation for this work was that there is no free POSIX compliance test suite, AFAIK. Shame on you.

Pawel Jakub Dawidek: ZFS progress.

This is my first entry about ZFS. I’m not going to describe what ZFS is, etc.; I need some place to write about my progress and this seems to be the right place.

As you may or may not know, I’m porting the ZFS file system to FreeBSD. The port is almost finished. Something like 98% of the functionality is implemented and works. You can read more about ZFS on the OpenSolaris page and more about my port on various FreeBSD mailing lists. Today I finished NFS support, so you can now NFS mount ZFS file systems from FreeBSD. The remaining part I coded today was readdir functionality for GFS. GFS (Generic pseudo-FileSystem) is a Solaris framework for virtual file systems. ZFS uses this framework to create the .zfs/ directory where snapshots are placed. From now on you can list the .zfs/ and .zfs/snapshot/ directories via NFS too. This was the only missing piece of NFS support.

Eric Anholt: my first X extension

Today I got to implement my first X request, from design through working code. It's a simple one, but the results are nice.

Up until now, DRI clients haven't been able to report damage. As a result, screencast programs haven't been able to use XDamage to do minimal screen-scraping when GL apps are running. I added a request where you simply hand an XFixes region of the area to be damaged plus a drawable, and the server reports that damage to that drawable. The DRI drivers in cooperation with the GL library now use this at SwapBuffers time to report the drawing they do to the screen.

I tested this using gstreamer, which let me easily create a screencast that displayed on the same screen. The command I used was:

gst-launch ximagesrc ! videoscale ! video/x-raw-rgb, width=400 ! ximagesink sync=false

There was only one issue: the gstreamer folks still haven't committed my improvement #342463 to ximagesrc. So if you run glxgears for example, it overwhelms ximagesrc with damage rectangles, and within a few frames it's ground to a halt trying to catch up. Someone needs to commit this (or just OK me to give myself a commit bit :) ). I ended up sticking a sleep in gears to slow things down enough for now with stock ximagesrc.
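One generic way for a consumer to cope with a flood of damage rectangles is to coalesce them into a single bounding box before capturing. The sketch below illustrates that idea only. It is not the XDamage protocol, and it is not a description of what improvement #342463 does. Rectangles are assumed to be `(x, y, width, height)` tuples.

```python
# Illustrative only: rectangles are (x, y, width, height) tuples, and this is
# not the XDamage protocol or the ximagesrc implementation.
def bounding_box(rects):
    """Collapse many damage rectangles into one rectangle covering them all."""
    if not rects:
        return None
    left = min(x for x, y, w, h in rects)
    top = min(y for x, y, w, h in rects)
    right = max(x + w for x, y, w, h in rects)
    bottom = max(y + h for x, y, w, h in rects)
    return (left, top, right - left, bottom - top)


damage = [(10, 10, 50, 50), (40, 30, 100, 20), (0, 70, 20, 20)]
merged = bounding_box(damage)
print(merged)
```

The trade-off is that a bounding box may cover pixels that were never damaged, in exchange for one capture per frame instead of many.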

I didn't actually do this for screencasting, though that was originally why I'd thought about doing this extension. We'd run into a problem with RandR 1.2, which allows for each CRTC to have different rotation, so each CRTC may need a shadow buffer for its rotation of the framebuffer. This means the server wants more control over rotation, which was already a mess to support in the previous architecture of the DRI driver doing the shadow buffer update on its own using rotation information we jammed into the sarea. Now we can nuke a bunch of support code and do shadow updates for rotation entirely within the server, with no concern for 3D.

Next things that could be done to build on this:

  • Make AIGLX support the new DRI method for damage reporting so screencasters work on it, too.
  • Merge to server-1.3 branch once we branch that off of randr-1.2-for-server-1.2 so it hits the next X Server release after 1.2.
  • Fix the driver to report damage for glCopyPixels and front buffer rendering.
  • Fix libXvMC to report damage for its direct rendering.

Eric Anholt: jhbuilding xorg

For a while people have been using various hacky scripts to build all of modular xorg. I've never liked them -- when I'm building something, I just want to update one piece, and whatever it depended on. jhbuild, therefore, has been a good tool, except that it can't deal with the Mesa build disaster without using a custom module type.
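The "one piece, and whatever it depended on" behaviour comes down to a depth-first walk of the dependency graph. Here is a minimal sketch of that walk. The module names and edges are hypothetical and do not reflect a real jhbuild moduleset, and `build_order` is not a jhbuild function.

```python
# Hypothetical dependency graph: module names and edges are for illustration
# and do not reflect a real jhbuild moduleset.
DEPS = {
    "xserver": ["libX11", "mesa"],
    "libX11": ["xproto"],
    "mesa": ["libdrm", "xproto"],
    "libdrm": [],
    "xproto": [],
    "xterm": ["libX11"],
}


def build_order(target, deps):
    """Return target and its dependencies, each listed after what it needs."""
    order, seen, active = [], set(), set()

    def visit(name):
        if name in seen:
            return
        if name in active:
            raise ValueError(f"dependency cycle at {name}")
        active.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        active.discard(name)
        seen.add(name)
        order.append(name)

    visit(target)
    return order


order = build_order("xserver", DEPS)
print(order)
```

Modules that the target does not depend on, such as `xterm` in this made-up graph, are left alone.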

So, a while back, I went ahead and made that module type. However, getting review from jamesh by prodding on IRC has been a failure, and 4 months later it's still uncommitted. So this is a plea: Can anyone get this committed to gnome, or will I be forced to maintain a fork of jhbuild for something so trivial?

What this is blocking is Coverity checking. For Coverity, they want something simple to build the whole project. jhbuild would be great for that, I think, except that right now manual intervention is required to get Mesa built, and that's in the middle of building the xorg stack.