ivoras’ FreeBSD blog

June 13, 2008

How many failed drives can take down a RAID5 array?

Filed under: FreeBSD — ivoras @ 2:49 pm

The answer, of course, is “two or more”. And it’s not nice when it happens.
Two of the drives on a nice shiny FC array failed at approximately the same time (possibly within about two minutes of each other), and both were in the same RAID5 array. Definitely not good. On the other hand, I confirmed that PostgreSQL can run off NFS (both server and client on FreeBSD) without problems so far (this is the temporary setup until we get a new array).
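
Why the second failure is the fatal one can be seen from the parity arithmetic. The following is only a minimal sketch, not how any real controller works: it assumes a single stripe made of three data blocks plus one XOR parity block, and the names (xor_blocks, stripe) and block contents are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus their parity block (hypothetical contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# One drive lost (drive 1): XOR of the survivors gives back the missing block.
survivors = stripe[:1] + stripe[2:]
rebuilt = xor_blocks(survivors)
assert rebuilt == b"BBBB"

# Two drives lost (drives 0 and 1): the survivors only tell us the XOR of the
# two missing blocks, and many different pairs of blocks share that XOR.
survivors = stripe[2:]
known = xor_blocks(survivors)
assert known == xor_blocks(data[:2])
```

With one parity block per stripe there is one equation per stripe, so only one unknown block can be solved for.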

June 8, 2008

Weekend hack: adfsd, a kqueue-assisted rsync tool

Filed under: FreeBSD — ivoras @ 10:44 pm

I created a small program to help me synchronize files in near real time between two directories (the idea is that one of them is on an NFS server). There are no replicated file systems for FreeBSD, and the canonical way to do this is to use rsync or something like it. The problem is that rsync always traverses both directory structures, compares the files, and only then copies them (via a variety of smart algorithms, but it’s still very resource-intensive).

So I created a daemon that uses kqueue(2) to monitor which files have changed and feeds only those files to rsync (it’s not exactly a new idea; I’m sure somebody has already mentioned it somewhere on the FreeBSD lists). This is in many ways a suboptimal solution, since it needs to keep an open file descriptor for every monitored file (which ties up kernel resources and memory), so it won’t scale to really large directories, which are the ones that would benefit most from this approach. It will work reasonably well for a smaller number of files (up to several tens of thousands), provided the kern.maxfiles and kern.maxfilesperproc sysctls and the login session limits (if applicable) are raised.
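
To give a rough idea of the approach, here is a sketch in Python rather than the daemon's actual C code. It is an illustration under assumptions: the function names are invented, and passing the changed paths to rsync through --files-from=- is just one plausible way to do it, not necessarily what the daemon does. watch_once() needs a kqueue-capable system such as FreeBSD.

```python
import os
import select

def build_rsync_command(changed, src_root, dest_root):
    """Return (argv, stdin_text) for an rsync run limited to the changed files.

    The paths are sent on stdin via --files-from=- and are relative to src_root.
    """
    relative = sorted(os.path.relpath(path, src_root) for path in changed)
    argv = ["rsync", "-a", "--files-from=-", src_root.rstrip("/") + "/", dest_root]
    return argv, "\n".join(relative) + "\n"

def watch_once(paths, timeout=1.0):
    """Wait up to `timeout` seconds and return the set of paths that changed.

    One open descriptor per watched file is required, which is exactly the
    scaling limit described above.
    """
    kq = select.kqueue()
    fd_to_path = {}
    changes = []
    try:
        for path in paths:
            fd = os.open(path, os.O_RDONLY)
            fd_to_path[fd] = path
            changes.append(select.kevent(
                fd,
                filter=select.KQ_FILTER_VNODE,
                flags=select.KQ_EV_ADD | select.KQ_EV_CLEAR,
                fflags=(select.KQ_NOTE_WRITE | select.KQ_NOTE_EXTEND |
                        select.KQ_NOTE_DELETE | select.KQ_NOTE_RENAME),
            ))
        # Register all watches and collect whatever fires before the timeout.
        events = kq.control(changes, len(changes), timeout)
        return {fd_to_path[event.ident] for event in events}
    finally:
        for fd in fd_to_path:
            os.close(fd)
        kq.close()
```

The result of build_rsync_command() could be run with subprocess.run(argv, input=stdin_text, text=True, check=True). A real daemon would keep the descriptors and the kqueue open between polls instead of re-registering on every call, otherwise changes made between two calls are missed.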

Anyone who’s interested can download the adfs daemon and try it. This was hacked together over the weekend so it probably has some problems. I’ll fix those problems that prevent me from using it, but I’ll update the online archive only if there are interested users.

The real power of “scripting” / interpreted languages

Filed under: FreeBSD — ivoras @ 6:03 pm

I’m writing a small project in C (will talk about it later) and I really miss the expressiveness that dynamic languages like Python offer. There’s one more thing in addition to the elusive “elegance” and similar intangible properties: the ability to easily use and implement better algorithms. Yes, since Turing it has been obvious that the actual programming language in use is more or less syntactic sugar, but you wouldn’t exactly like to spend your days programming infinite tapes of symbols, would you? :)

In this (again, emphasis on “small”) project, there were a couple of opportunities where I could have used a fast data-access structure like a hash table (since I need to store and retrieve a lot of data entries) or dynamic memory allocation (since I don’t want to artificially limit the number of these entries). But I just didn’t feel like writing all that code to implement a hash table in C (or pulling in a heavy external library), dealing with memory reallocation and tracking all those pointers. Yes, I’m lazy. In a more abstract language I could just instantiate a dictionary and say d[i] = something, and this would actually be very efficient and take care of memory allocation automagically. Since I limited myself to basic C, I chose simpler structures like linked lists and evil static arrays on the stack. Ironically, these structures would be comparatively much less efficient in Python.
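
To make the difference concrete, here is a small sketch that puts the two approaches side by side in Python. The Node class and list_lookup() are hypothetical stand-ins for the kind of linked list one writes by hand in C, and the entries are made-up test data.

```python
class Node:
    """One cell of a singly linked list, as one might hand-write in C."""
    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node

def list_lookup(head, key):
    """Walk the list from the head; returns (value, nodes visited)."""
    steps = 0
    node = head
    while node is not None:
        steps += 1
        if node.key == key:
            return node.value, steps
        node = node.next
    return None, steps

# Build both structures from the same hypothetical entries.
entries = [(i, "entry-%d" % i) for i in range(1000)]

head = None
for key, value in reversed(entries):
    head = Node(key, value, head)   # prepend, so the list ends up in key order

d = {}
for key, value in entries:
    d[key] = value                  # the "d[i] = something" from the text

value, steps = list_lookup(head, 999)
assert value == d[999]              # same answer either way
assert steps == 1000                # but the list had to visit every node
```

The dictionary finds the entry with a hash computation and, on average, a constant number of probes, and it grows its own storage as entries are added.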

Of course, at its root this boils down to a choice between using pre-packaged routines and writing your own (aka the NIH problem), but in this case the pre-packaged ones would actually make my simple program faster and more efficient, despite the overhead of an interpreted, dynamic language.

There are many more similar cases: programmers write bubble sorts in C because they are easy to implement, while at a higher level of abstraction they could just write mylist.sort() and get QuickSort or some other efficient algorithm for free.
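
As a sketch of that comparison, here is the hand-written bubble sort next to the one-liner. The bubble_sort() function is my own illustration; note that in CPython specifically, list.sort() is Timsort (a merge/insertion sort hybrid) rather than QuickSort, but the point is the same: O(n log n) for free.

```python
def bubble_sort(items):
    """The easy-to-write O(n^2) sort: swap adjacent out-of-order pairs."""
    result = list(items)
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:
            break                   # no swaps in a full pass: already sorted
    return result

mylist = [5, 2, 9, 1, 5, 6]
by_hand = bubble_sort(mylist)

mylist.sort()                       # the built-in, efficient sort
assert by_hand == mylist
```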

Does anyone know of a library / collection of algorithms for C similar to glib, only BSD-licensed? (Yes, I know about the C++ algorithms; I don’t want to use C++.)

June 1, 2008

FreeBSD on Subversion!

Filed under: FreeBSD — ivoras @ 11:12 am

The day has finally come – FreeBSD is using Subversion instead of CVS for the base source tree! Congratulations to everyone involved, especially Peter Wemm :)

FreeBSD’s source CVS is one of the oldest and biggest in existence; it’s approximately 12 years old and has apparently had something like 180,000 commits over the years, or on average slightly more than 41 commits daily. A checkout of the RELENG_7 branch holds more than 42,000 files (in 482 MB as du sees it).
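
For the curious, the daily average follows directly from the two approximate figures above. This is just the back-of-the-envelope arithmetic, assuming 365.25 days per year:

```python
commits = 180_000                 # approximate total, as quoted above
years = 12                        # approximate repository age, as quoted above

per_day = commits / (years * 365.25)
print(round(per_day, 2))          # slightly more than 41
```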

This move was discussed extensively during the DevSummit at BSDCan 2008. There have been many issues with CVS over the years. Most are minor enough to be overlooked, but some are just nasty: the inability of CVS to move or rename files, bad handling of branching when there is constant new development and additions to the directory tree, and non-atomic commits. These have frequently required manual intervention in the CVS repository (“repocopy” is one of the operations relatively frequently requested of the CVS admins).

Old infrastructure, of which cvsup / csup is probably the most important part, will continue to work as code will continually be mirrored from SVN to CVS, until suitable replacements or upgrades to the above tools are created. This is also the reason the name “CVS” will be present for some time in the system infrastructure, until all of it is updated. Ports will continue to use CVS for the foreseeable future.

To make a source base this large work efficiently on SVN, version 1.5 had to be used, since it spreads its database files across a tree of subdirectories on the file system instead of keeping them all in one huge directory.
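
The effect of such a layout can be sketched in a few lines. This is an illustration under assumptions, not Subversion's code: it assumes the sharded FSFS layout as I understand it, where revision files are grouped into numbered subdirectories of 1000 revisions each, and revision_path() with its shard_size parameter is a made-up helper.

```python
def revision_path(repo_root, revision, shard_size=1000):
    """Return where a revision file would live in a sharded layout (assumed)."""
    shard = revision // shard_size          # revisions 0..999 go to shard 0, etc.
    return "/".join([repo_root, "db", "revs", str(shard), str(revision)])

# With roughly 180,000 revisions, no directory holds more than 1000 files.
example = revision_path("/repo", 179500)
shards_needed = (180_000 + 999) // 1000
```

Under these assumptions, 180,000 revisions end up in 180 directories of 1000 files each, instead of one directory with 180,000 entries.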

Also see the official announcement of Subversion and Peter Wemm’s notes about Subversion (very useful to developers).
