Category Archives: FreeBSD

FreeBSD 9.1 ports feature freeze

The FreeBSD 9.1 schedule has been published, http://www.freebsd.org/releases/9.1R/schedule.html. Historically we have done a Feature Freeze at RC1, we are going to try do it with RC2 this time, tentatively scheduled for August 3, subject to schedule slippage.

At the time the the Release Engineering team announces RC2 is ready, we will then enforce “Feature Safe” commits only. This means no sweeping changes will be allowed, see http://www.freebsd.org/portmgr/implementation.html#sweeping_changes

Once portmgr@ is satisfied that the requisite packages are built to ship with FreeBSD 9.1, the ports tree will be re-opened for business.

Thomas
on behalf of portmgr

Reading rate control information from userland..

I've begun working on exporting the ath(4) driver rate control information. The "current" way to extract this information has been to issue a sysctl to the rate control module, which dumps debugging state to the kernel console. This is quite suboptimal. Instead, I'm now packaging that up via a driver ioctl() call. It's temporary (hah, famous last words in the software industry :) and it's driver specific for now. The thirty second summary:

  • There's a new ioctl to ath(4) to query the rate control module for a single associated MAC address (or the BSS MAC when running as a STA);
  • Since the rate control is currently done at the driver level rather than at the VAP level, the call is to the driver rather than via the VAP (wlanX) interface;
  • There's no easy way to get "all" station details whilst maintaining correct locking.

The last point deserves a little more explanation. I've introduced (well, _using_ now) a per-node lock when doing rate control updates. I acquire this lock when copying the rate control data out, so the snapshot is consistent.

So to fetch the state for a node, the following occurs:


  • Call the net80211 layer to find an ieee80211_node for the given mac address - that involves locking the node table and getting a reference for the node (if found); 
  • Then locking the ath_node associated with it;
  • Copy the data out;
  • Unlock the ath_node;
  • Decrement the ieee80211_node reference counter (which requires the node table lock.)


Now, the node table lock only occurs whilst fetching the node reference. It isn't held whilst doing the actual rate control manipulation. Compare to what I'd do if I wanted to walk the node table. The net80211 API for doing this holds the node lock whilst waking the node list. This means that I'll end up holding the node table lock whilst acquiring the ath_node lock. Now, that's fine - however, if I then decide somewhere else to try and do any ieee80211 operation whilst holding the ath_node lock, I may find myself with a lock ordering problem.

So for now the API will just support doing a single lookup for a given MAC, rather than trying to pull all of the rate control table entries down at once.

Here's an example output from the command:


adrian@marilyn:~/work/freebsd/ath/head/src/tools/tools/ath/athratestats]> ./athratestats -i ath1 -m 06:16:16:03:40:d0
static_rix (-1) ratemask 0xf
[ 250] cur rate 5  Mb since switch: packets 1 ticks 43028655
[ 250] last sample (11  Mb) cur sample (0 ) packets sent 10708
[ 250] packets since sample 16 sample tt 6275
[1600] cur rate 11  Mb since switch: packets 15 ticks 43025720
[1600] last sample (5  Mb) cur sample (0 ) packets sent 2423
[1600] packets since sample 7 sample tt 12713
[ 2  Mb: 250]        9:9        (100%) (EWMA 100.0%) T       11 F    0 avg  2803 last 42176930
[ 5  Mb: 250]     3139:3139     (100%) (EWMA 100.0%) T     3273 F    0 avg  1433 last 43028656
[ 5  Mb:1600]       29:29       (100%) (EWMA 100.0%) T       39 F    0 avg  5303 last 42192044
[11  Mb: 250]     7560:7560     (100%) (EWMA 100.0%) T     7838 F    0 avg  1857 last 43026094
[11  Mb:1600]     2394:2394     (100%) (EWMA 100.0%) T     2581 F    0 avg  2919 last 43026411

Ports tree has been migrated to Subversion

The migration to Subversion is done and the SVN->CVS exporter is running.

Before committing please read the Ports Subversion Primer, http://wiki.freebsd.org/PortsSubversionPrimer. Please feel to add missing parts of fix it if something is wrong.

For those who like to mirror the repository, the svn mirror seed will be available in /pub/FreeBSD/development/subversion/ on a mirror near you. First places it will likely be are http://freebsd.isc.org/pub/FreeBSD/development/subversion/ and http://ftp.dk.freebsd.org/pub/FreeBSD/development/subversion/. Be aware that the uncompressed repository is about 14GB.

Many thanks to simon@ for all the work he did this weekend to make the switch happen!

Alexander Leidinger » FreeBSD 2012-07-13 15:10:36

In mid-April a woman from the marketing department of No Starch Press contacted me and asked if I am interested to do a public review of the FreeBSD Device Drivers book by Joseph Kong (no link to a book shop, go and have a look in your preferred one). Just this simple question, no strings attached.

I had my nose in some device drivers in the past, but I never wrote one, and never had a look at the big picture. I was interested to know how everything fits together, so this made me a good victim for a review (novice enough to learn something new and to have a look if enough is explained, and experienced enough to understand what is going on in the FreeBSD kernel).

Some minutes after I agreed to review it (but with a little notice that I do not know how long I need to review it), I had the PDF version of the book. That was faster than I expected (maybe I am too old-school and used to have paper versions of books in my hands).

Let the review begin… but bear with me, this is the first time I do a real public review of a book (instead of a technical review for an author). And as this is my very own personal opinion, I will not allow comments here. This page is all about my opinion while reading the book, questions I have while reading the book shall serve as a hint about the quality of the book and they should be answered in the book, not here.

In short, the book is not perfect, but it is a good book. There is room for improvement, but on a very high level. If you want to write a device driver for FreeBSD, this book is a must. I suggest to read it completely, even chapters which do not belong to the type of driver you want to write (specially the case studies of real drivers). The reason is that each chapter has some notes which may not only apply to the chapter in question, but to all kinds of device drivers. The long review follows now.

The first chapter is titled “Building and running modules�. The author begins with description of the usual device driver types (NIC driver, pseudo-device, …) and how they can be added to the kernel (statically linked in or as a module). The first code example is a small and easy kernel module, so that we do not have to reboot the system we use to develop a driver (except we make a fault during driver development which causes the machine to panic or hang). Every part of the example is well explained. This is followed by an overview about character devices (e.g. disks) and a simple character-device driver (so far a pseudo-device, as we do not have real hardware we access) which is not only as-well explained as the module-example, but there is also a note where the code was simplified and what should be done instead.

After reading this chapter you should be able to write your own kernel module in 5 minutes (well, after 5 minutes it will not be able to do a lot — just a “hello world� – but at least you can already load/unload/execute some code into/from/in the kernel).

I have not tried any example myself, but I compiled a lot of modules and drivers I modified in the past and remember to have seen the described parts.

The second chapter explains how to allocate and free memory in the kernel. There is the possibility to allocate maybe-contiguous memory (the normal case, when your hardware does not do DMA or does not have the requirement that the memory region it makes DMA from/too needs to be contiguous), and really contiguous. For the size argument of the freeing of the the contiguous memory there is the sentence “Generally, size should be equal the amount allocated.�. Immediately I wanted to know what happens if you specify a different size (as a non-native english speaker I understand this sentence in a way that I am allowed to specify a different size and as such are able to free only parts of the allocated memory). Unfortunately this is not answered. I had a look into the source, the kernel frees memory pages, so the size argument (and addr argument) will be rounded to include a full page. This means theoretically I am able to free parts of the allocated memory, but this is a source-maintenance nightmare (needs knowledge about the machine specific page boundaries and you need to make sure that you do the absolutely correct size calculations).  To me this looks more like as long as nobody is pointing a gun at my head and tells me to use a different size, specifying the same size as made during the allocation of this memory region is the way to go.

After reading this chapter you should know how to kill the system by allocating all the RAM in the kernel.

Again, I did not try to compile the examples in this chapter, but the difference of the memory allocation in the kernel compared with memory allocation in the userland is not that big.

The third chapter explains the device communication and control interfaces (ioctl/sysctl) of a driver. The ioctl part teached me some parts I always wanted to know when I touched some ioctls, but never bothered to find out before. Unfortunately this makes me a little bit nervous about the way ioctls are handled in the FreeBSD linuxulator, but this is not urgent ATM (and can probably be handled by a commend in the right place). The sysctl part takes a little bit longer to follow through, but there is also more to learn about it. If you just modify an existing driver with an existing sysctl interface, it probably just comes down to copy&paste with little modifications, but if you need to make more complex changes or want to add a sysctl interface to a driver, this part of the book is a good way to understand what is possible and how everything fits together. Personally I would have wished for a more detailed guide when to pick the ioctl interface and when the sysctl interface than what was written in the conclusion of the chapter, but it is probably not that easy to come up with a good list which fits most drivers.

After reading this chapter you should be able to get data in and out of the kernel in 10 minutes.

As before, I did not compile the examples in this chapter. I already added ioctls and sysctls in various places in the FreeBSD kernel.

Chapter 4 is about thread synchronization – mutexes, shared/exclusive locks, reader/writer locks and condition variables. For me this chapter is not as good as the previous ones. While I got a good explanation of everything, I missed a nice overview table which compares the various methods of thread synchronization. Brendan Gregg did a nice table to give an overview of DTrace variable types and when to use them. Something like this would have been nice in this chapter too. Apart from this I got all the info I need (but hey, I already wrote a NFS client for an experimental computer with more than 200000 CPUs in 1998, so I’m familiar with such synchronization primitives).

Delayed execution is explained in chapter 5. Most of the information presented there was new to me. While there where not much examples presented (there will be some in a later chapter), I got a good overview about what exists. This time there was even an overview when to use which type of delayed execution infrastructure. I would have preferred to have this overview in the beginning of the chapter, but that is maybe some kind of personal preference.

In chapter 6 a complete device driver is dissected. It is the virtual null modem terminal driver. The chapter provides real-world examples of event-handlers, callouts and taskqueues which where not demonstrated in chapter five. At the same time the chapter serves as a description of the functions a TTY driver needs to have.

Automated device detection with Newbus and the corresponding resource allocation (I/O ports, device memory and interrupts) are explained in chapter 7. It is easy… if you have a real device to play with. Unfortunately the chapter missed a paragraph or two about the suspend and resume methods. If you think about it, it is not hard to come up with what they are supposed to do, but a little explicit description of what they shall do, in what state the hardware should be put and what to assume when being called would have been nice.

Chapter 8 is about interrupts. It is easy to add an interrupt handler (or to remove one), the hard part is to generate an interrupt. The example code uses the parallel port, and the chapter also contains a little explanation how to generate an interrupt… if you are not afraid to touch real hardware (the parallel port) with a resistor.

In chapter 9 the lpt(4) driver is explained, as most of the topics discussed so far are used inside. The explanation how everything is used is good, but what I miss sometimes is why they are used. The most prominent (and only) example here for me is why are callouts used to catch stray interrupts? That callouts are a good way of handling this is clear to me, the big question is why can there be stray interrupts. Can this happen only for the parallel port (respectively a limited amount of devices), or does every driver for real interrupt driven hardware need to come with something like this? I assume this is something specific to the device, but a little explanation regarding this would have been nice.

Accessing I/O ports and I/O memory for devices are explained in chapter 10 based upon a driver for a LED device (turn on and off 2 LEDs on an ISA bus). All the functions to read and write data are well explained, just the part about the memory barrier is a little bit short. It is not clear why the CPU reordering of memory accesses matter to what looks like function calls. Those function calls may be macros, but this is not explained in the text. Some little examples when to use the barriers instead of an abstract description would also have been nice at this point.

Chapter 11 is similar to chapter 10, just that a PCI bus driver is discussed instead of an ISA bus driver. The differences are not that big, but important.

In chapter 12 it is explained how to do DMA in a driver. This part is not easy to understand. I would have wanted to have more examples and explanations of the DMA tag and DMA map parts. I am also surprised to see different supported architectures for the flags BUS_DMA_COHERENT and BUS_DMA_NOCACHE for different functions. Either this means FreeBSD is not coherent in those parts, or it is a bug in the book, or it is supposed to be like this and the reasons are not explained in the book. As there is no explicit note about this, it probably leads to confusion of readers which pay enough attention here. It would also have been nice to have an explanation when to use those flags which are only implemented on a subset of the architectures FreeBSD supports. Anyway, the explanations give enough information to understand what is going on and to be able to have a look at other device drivers for real-live examples and to get a deeper understanding of this topic.

Disk drivers and block I/O (bio) requests are described in chapter 13. With this chapter I have a little problem. The author used the word “undefined� in several places where I as a non-native speaker would have used “not set� or “set to 0″. The word “undefined� implies for me that there may be garbage inside, whereas from a technical point of view I can not imagine that some random value in those places would have the desired result. In my opinion each such place is obvious, so I do not expect that an experienced programmer would lose time/hairs/sanity over it, but inexperienced programmers which try to assemble the corresponding structures on the (uninitialized) heap (for whatever reason), may struggle with this.

Chapter 14 is about the CAM layer. While the previous chapter showed how to write a driver for a disk device, chapter 14 gave an overview about how to an HBA to the CAM layer. It is just an overview, it looks like CAM needs a book on its own to be fully described. The simple (and most important) cases are described, with the hardware-specific parts being an exercise for the person writing the device driver. I have the impression it gives enough details to let someone with hardware (or protocol), and more importantly documentation for this device, start writing a driver.

It would have been nice if chapter 13 and 14 would have had a little schematic which describes at which level of the kernel-subsystems the corresponding driver sits. And while I am at it, a schematic with all the driver components discussed in this book at the beginning as an overview, or in the end as an annex, would be great too.

An overview of USB drivers is given in chapter 15 with the USB printer driver as an example for the explanation of the USB driver interfaces. If USB would not be as complex as it is, it would be a nice chapter to start driver-writing experiments (due to the availability of various USB devices). Well… bad luck for curious people. BTW, the author gives pointers to the official USB docs, so if you are really curious, feel free to go ahead. :)

Chapter 16 is the first part about network drivers. It deals with ifnet (e.g. stuff needed for ifconfig), ifmedia (simplified: which kind of cable and speed is supported), mbufs and MSI(-X). As in other chapters before, a little overview and a little picture in the beginning would have been nice.

Finally, in chapter 17, the packet reception and transmission of network drivers is described. Large example code is broken up into several pieces here, for more easy discussion of related information.

One thing I miss after reaching the end of the book is a discussion of sound drivers. And this is surely not the only type of drivers which is not discussed, I can come up with crypto, firewire, gpio, watchdog, smb and iic devices within a few seconds. While I think that it is much more easy to understand all those drivers now after reading the book, it would have been nice to have at least a little overview of other driver types and maybe even a short description of their driver methods.

Conclusion: As I wrote already in the beginning, the book is not perfect, but it is good. While I have not written a device driver for FreeBSD, the book provided enough insight to be able to write one and to understand existing drivers. I really hope there will be a second edition which addresses the minor issues I had while reading it to make it a perfect book.

Share

Book Review: FreeBSD Device Drivers

Joseph Kong strikes again! When I saw Designing BSD Rootkits I was quite excited because:

  1. it was on some topics I had interests in but had not as much time as I would have liked to to dig in;
  2. it was focussed on my main operating system.

I bought the book and although it was rather small (~140 pages) enjoyed learning quite a large amount of things about the FreeBSD kernel.

So, when I was contacted by No Stash Press to know if I would be interested in a review copy of their upcoming book FreeBSD Device Drivers by the same author in exchange of writing what I though about it, I had no hesitation and accepted the deal.

Book cover

Just like the preceding book, this one is based on examples, each chapter focussing on a different aspect of device drivers, and going progressively further and further in the details of the kernel, explaining how things work and why they do work that way so that the reader can better understand the involved mechanisms, even if he is not an Operating Systems guru.

Precious information about the FreeBSD kernel (like what you have in The Design and Implementation of the FreeBSD Operating-System but in a more up-to-date and friendlier fashion) are spread all along the book, and — apart from the long C listings — the style is quite pleasant to read.

To sum-up, this book is definitively a must have for anybody interested in how are designed FreeBSD device drivers, not mentioning those who are interested in writing their very own ones for the FreeBSD operating system!

Many thanks Joseph for this awesome book!

Ports tree migration to Subversion

The FreeBSD ports tree will migrate from CVS to Subversion soon. The anticipated date for the migration is July 14th. This will have no impact for ports tree users as there will be a SVN to CVS exporter.

Please note that cvsup will still work after the migration. Nevertheless c(v)sup is pretty dated so you may want to see if portsnap(8) will fit your needs.

Beat and Thomas
on behalf of portmgr@

The return of the FreeBSD desktop

I have a confession to make: I haven't used FreeBSD as a desktop OS for years. The reason is twofold:

  1. Since 2005, my work has required me to run Linux (Debian and Ubuntu at Linpro, RedHat at the University of Oslo) and, briefly, Windows at Kongsberg Maritime. I eventually stopped using stationary computers, resorting instead to a (company-provided) laptop running either Ubuntu, or Windows with Ubuntu in VirtualBox.
  2. More importantly, around the time I started at Linpro, it became increasingly difficult to maintain a FreeBSD desktop. The modularization of X.org and the increasing complexity of desktop environments mean that the number of packages required for a complete desktop system has grown from a bit over 100 to well over 600 (in addition to the kernel and base operating system, which is monolithic in FreeBSD). The FreeBSD ports system does not scale well, and the lack of a proper binary update procedure makes it almost impossible to keep that many packages up-to-date.

This is about to change. Thanks to what may very well be the most important innovations in recent FreeBSD history, I am now once again running a FreeBSD desktop.

I am referring, of course, to PKGNG and its companion, Poudrière.

(Actually, I've been running PC-BSD for a while, but I'm not a big fan of PBIs, and they don't really address the updating problem. This is about building a FreeBSD desktop from scratch, and keeping it up-to-date.)

Here is the procedure I followed:

I created a new VirtualBox VM with a 128 GB disk—this may not seem like much, but most of my storage requirements are met by the host system, which has two 2 TB disks, and an older file server with over 2 TB of storage, including ~40 GB of source code.

Instead of the default NAT configuration, I set up a bridged network interface with promiscuous mode enabled; I run my own DNS and DHCP servers (on a soekris, of course), so within my home network, the VM has its own fixed IP address and DNS name.

I then installed FreeBSD 9.0-RELEASE amd64 from the disc1 ISO. I did not select any packages, nor did I install the ports tree. However, as soon as the system booted, I downloaded and extracted an up-to-date ports tree using portsnap and built pkg and poudriere.

I had initially planned to rely entirely on the pkgbeta repository, but there are two problems with this: firstly, all the packages there are built with the default options, which in many cases make no sense at all; and secondly and most importantly, it did not at that time have a full set of Gnome 2 packages.

I therefore set up my own Poudrière. I ran a first build with only ports-mgmt/pkg and ports-mgmt/poudriere, then updgraded those two packages and started over with a slightly larger set of packages, including mail/postfix, security/sudo, shells/zsh and other essentials which don't take long to build. Finally, I added emulators/virtualbox-ose-additions, x11/xorg, x11/gnome2, and a number of desktop applications (e.g. Emacs) and development tools (e.g. Subversion) which I use. Then it was just a matter of

% sudo pkg install x11/xorg x11/gnome2 editors/emacs

Whenever I need an additional package, I add it to the package list and re-run Poudrière. I don't really need to do that—I could just as easily build it straight from the ports tree—but it ensures that I get a “clean” package that I can safely use on another VM or machine, should I ever need to, and that it is rebuilt when updated, along with all the other packages I use:

% sudo poudriere ports -u
% sudo poudriere bulk -f ~/poudriere-packages -j 83amd64 -k

I had some minor configuration trouble. First, X.org's autoconfiguration feature still doesn't work very well, at least in VirtualBox, so I used the xorg.conf from a previous post, with only minor modifications:

Section "InputDevice"
        Identifier      "Generic Keyboard"
        Driver          "kbd"
        Option          "XkbRules"      "xorg"
        Option          "XkbModel"      "pc105"
        Option          "XkbLayout"     "us"
EndSection

Section "InputDevice"
        Identifier      "VBox Mouse"
        Driver          "vboxmouse"
        Option          "CorePointer"
EndSection

Section "Device"
        Identifier      "VBox Video"
        Driver          "vboxvideo"
EndSection

Section "Monitor"
        Identifier      "VBox Monitor"
EndSection

Section "Screen"
        Identifier      "Default Screen"
        Device          "VBox Video"
        Monitor         "VBox Monitor"
EndSection

Section "ServerLayout"
        Identifier      "Default Layout"
        Screen          "Default Screen"
        InputDevice     "Generic Keyboard"
        InputDevice     "VBox Mouse"
EndSection

With the VirtualBox guest additions installed and this xorg.conf in place, everything works beautifully—mouse integration, clipboard integration and dynamic desktop resizing.

The second issue I had was with GDM's greeter, which did not display my user in the user list and would not let me type in my user name. The first part is understandable, as it only displays users who have logged in at least once already. The second is more surprising, but I did not have the energy to try to figure out why. Instead, I worked around it by disabling the user list:

% sudo -u gdm gconftool-2 --type bool \
    --set /apps/gdm/simple-greeter/disable_user_list true

Shared folders are still not implemented for FreeBSD guests, but I rarely need to transfer files between the host and the guest; when I do, I use FileZilla to SFTP them over, and there is always SMB for more complex use.

If I were running on real hardware or a weaker host (this is a four-core i7 with 16 GB RAM), I might use a subset of PC-BSD's tricked-out sysctl.conf, but I don't need audio and I haven't experienced any interactivity issues yet.

That's it—try it yourself, and share your experience below!

Alexander Leidinger » FreeBSD 2012-06-15 20:18:07

FreeBSD is on its way to move from CVS to SVN  for the version control system for the Ports Collection. The decision was made to keep the complete history, so the complete CVS repository has to be converted to SVN.

As CVS has no way to record a copy or move of files inside the repository, we copied the CVS files inside the repository in case we wanted to copy or move a file (the so called “repocopy�). While this allows to see the full history of a file, the drawback is that you do not really know when a file was copied/moved if you are not strict at recording this info after doing a copy. Guess what, we where not.

Now with the move to SVN which has a build-in way for copies/moves, it would be nice if we could record this info. In an internal discussion someone told its not possible to detect a repocopy reliably.

Well, I thought otherwise and an hour later my mail went out how to detect one. The longest time was needed to write how to do it, not to come up with a solution. I do not know if someone picked up this algorithm and implemented something for the cvs2svn converter, but I decided to publish the algorithm here if someone needs a similar functionality somewhere else. Note, the following is tailored to the structure of the Ports Collection. This allows to speed up some things (no need to do all steps on all files). If you want to use this in a generic repository where the structure is not as regular as in our Ports Collection, you have to run this algorithm on all files.

It also detects commits where multiple files where committed at once in one commit (sweeping commits).

Preparation

  • check only category/name/Makefile
  • generate a hash of each commitlog+committer
  • if you are memory-limited use ha/sh/ed/dirs/cvs-rev and store pathname in the list cvs-rev (pathname = “category-nameâ€?) as storage
  • store the hash also in pathname/cvs-rev

If you have only one item in ha/sh/ed/dirs/cvs-rev in the end, there was no repocopy and no sweeping commit, you can delete this ha/sh/ed/dirs/cvs-rev.

If you have more than … let’s say … 10 (subject to tuning) pathnames in ha/sh/ed/dirs/cvs-rev you found a sweeping commit and you can delete the ha/sh/ed/dirs/cvs-rev.

The meat

The remaining ha/sh/ed/dirs/cvs-rev are probably repocopies. Take one ha/sh/ed/dirs/cvs-rev and for each pathname (there may be more than 2 pathnames) in there have a look at pathname/. Take the first cvs-rev of each and check if they have the same hash. Continue with the next rev-number for each until you found a cvs-rev which does not contain the same hash. If the number of cvs-revs since the beginning is >= … let’s say … 3 (subject to tuning), you have a candidate for a repocopy. If it is >=  … 10 (subject to tuning), you have a very good indicator for a repocopy. You have to proceed until you have only one pathname left.

You may detect multiple repocopies like A->B->C->D or A->B + A->D + A->C here.

Write out the repocopy candidate to a list and delete the ha/sh/ed/dirs/cvs-rev for each cvs-rev in a detected sequence.

This finds repocopy candidates for category/name/Makefile. To detect the correct repocopy-date (there are maybe cases where another file was changed after the Makefile but before the repocopy), you now have to look at all the files for a given repocopy-pair and check if there is a matching commit after the Makefile-commit-date. If you want to be 100% sure, you compare the complete commit-history of all files for a given repocopy-pair.

Share

Writing a GEOM GATE module, part 1

For various reasons, including the common good, I'd like to write a short-ish tutorial about writing GEOM GATE modules for FreeBSD. I hope to do this in a series of a few blog posts, dealing with the steps of writing such a module, this being the first post in the series. Finally, I'd like to do it while also creating a usable module, so this will not be an academic excercise but (hopefully) useful work.

I'd like to start with explaing what GEOM and GEOM GATE are.

Read more...

Don’t let anyone tell you that FreeBSD doesn’t "do" 802.11n:

This is from my FreeBSD-HEAD 802.11n access point, currently doing ~ 130MBit/s TCP:



# athstats -i ath0
41838297 data frames received
31028383 data frames transmit
78260 short on-chip tx retries
3672 long on-chip tx retries
197 tx failed 'cuz too many retries
MCS13 current transmit rate
8834 tx failed 'cuz destination filtered
477 tx frames with no ack marked
239517 rx failed 'cuz of bad CRC
10 rx failed 'cuz of PHY err
10 OFDM restart
42043 beacons transmitted
143 periodic calibrations
-0/+0 TDMA slot adjust (usecs, smoothed)
45 rssi of last ack
51 avg recv rssi
-96 rx noise floor
812 tx frames through raw api
41664029 A-MPDU sub-frames received
42075948 Half-GI frames received
42075981 40MHz frames received
13191 CRC errors for non-last A-MPDU subframes
129 CRC errors for last subframe in an A-MPDU
2645042 Frames transmitted with HT Protection
351457 Number of frames retransmitted in software
23299 Number of frames exceeding software retry
30674735 A-MPDU sub-frame TX attempt success
374408 A-MPDU sub-frame TX attempt failures
8676 A-MPDU TX frame failures

443 listen time
6435 cumulative OFDM phy error count
161 ANI forced listen time to zero
3672 missing ACK's
78260 RTS without CTS
1469003 successful RTS
239605 bad FCS
2 average rssi (beacons only)
Antenna profile:
[0] tx 1466665 rx 1
[1] tx 0 rx 41838296

A tale of two sequence numbers, or "when QoS seqno and CCMP PN don’t match up"..

Many moons ago (say, 3 or 4 weeks - so hm, most-of-a-moon-ago actually) I found a rather curious failure condition in the ath(4) TX aggregation path. The colourful history is documented in FreeBSD kern/166190. In short - there are situations where sequence numbers were allocated in a different order to how frames were being added to the block-ack window tracking, and if you got unlucky, you'd cause the stack to think a frame was (far) outside the BAW.

The 30 second explanation:

Imagine you allocated four frames - sequence numbers 1, 2, 3 and 4. They have to be added to the block-ack window in precisely that order. Ie:

  1. Starting condition: Window is at 0:63 (64 frame window, starting at 0, so ending at 63)
  2. Add 1: Window is now at 0:63, starting at 1
  3. Add 2: Window is now at 0:63, starting at 2
  4. Add 3: Window is now at 0:63, starting at 3.
The reason the window pointer isn't moving along is because although you've sent the frames (or you're about to), you can't advance it until the other end has ACKed it (via a block-ack or a normal ACK.) For more information, google how 802.11n aggregation works.

The important bit here is that the window is still 0:63 and the starting point is now '3'. This continues all the way to trying to queue frame 64, where it will be outside of the current BAW and not be allowed to be transmitted. It'll sit in the software queue and wait until frame '0' has been ACKed and the BAW has been advanced to be 1:64 - at which point 63 will fall inside the window and will be transmitted.

So yes, the sender is tracking two things - the BAW and what the starting point is that they've added to the BAW.

Now, imagine instead of (1, 2, 3, 4) on the software queue, I somehow get preempted (or race between two sending threads, when using SMP) between 'allocated seqno' and 'queue to software queue'. In the existing code, a lock was held when:
  • Allocating a sequence number, then it was dropped; then
  • Adding it to the software queue.
Now because there was a period where no lock was being held, it's quite possible that what ends up on the software queue is (2, 1, 3, 4.) So:
  1. Starting condition: Window is at 0:63
  2. Add 2: Window is now 0:63, starting at 2.
  3. Add 1: Window is 0:63, starting at 2; 1 is outside of the BAW (it's treated as a 'wraparound', so imagine it's 4095 seqno's away) so TX stalls.
This was the cause of the TX stalls that I was seeing originally in kern/166190. I "fixed" it by only allocating sequence numbers when the frame was about to be transmitted for the first time, and then adding it to the BAW right there. Since both sequence number allocation and adding to the BAW happened inside the same lock, everything was sweet.

Except, I totally forgot about CCMP PN. So under high enough UDP TX loads (say, > 200MBit), I'd hit the same race, but between 802.11 sequence numbers and CCMP PN sequence numbers.

CCMP PN is assigned during 802.11 encapsulation time, in the driver. In the ath(4) case, it's done during transmit and before being queued to the software queue. And it was being done outside of any locking.  So it's very possible that frames would end up on the software queue with 802.11 and CCMP PN sequence numbers out of lock-step.

What would happen?

Simply - after the 802.11n reordering occured on the receive side, the CCMP PN replay detector would notice sequence numbers out of order, and start tossing said out of order frames. Lots of packet loss ensued.

So, I sat down and started trying to address it. The simplest thing - wrap the whole encapsulation path between ieee80211_crypto_encap(), 802.11 sequence number assignment and software/hardware queueing behind the TID (well, hardware TXQ) lock. It took some time; I had to revert two earlier commits which introduced the delayed sequence number allocation.

This didn't fix things. So I was back to square one.

I started looking at all the places where the frames were being queued to the software queue and .. well, let's just say I spent Sunday swearing _at myself_ for all the weird and wonderful stupid mistakes I had made when writing/porting this code over.

The short version follows (the long version is "read the sys/dev/ath/ commit logs and the PR history"):
  1. When I was queueing frames to the software queue, I'd check how deep the hardware queue was. If the hardware queue was shallow/empty, I'd direct dispatch up to two frames to the hardware to get things 'busy'. That will (hopefully) let further frames come along in the meantime and be aggregated. However, I was queueing the new frame to the hardware rather than queueing the new frame to the tail of the queue, and queueing the head frame of the queue to the hardware. That led to some out of order behaviour.
  2. ath_tx_xmit_aggr() would check if the sequence number was within the block-ack window and if it wasn't, it'd queue the frame to the tail of the queue. This meant that any new frames that came along would be queued to the end of the queue, even if they had been dequeued from the head of the queue. This lead to frames on the software queue being out of order.
    1. Frames on the software queue don't have to be in-sequence (as retries are prepended to the beginning of the list, and new frames are appended to the end) however they have to be in-order. If they end up being out of order, the BAW logic fails.
So, now that I allocate sequence numbers at packet queue time, I have to be triply sure that what ends up on the software queue is correctly in order, or the BAW logic will cause traffic stalls and potentially duplicate sequence number issues. Yes, this means that the old behaviour, whilst it now works right with all the right locking, requires me to correctly handle putting frames on the software queue. (Or, as I like to say, "keeping the bastard (me) honest.")

TL;DR - 802.11n aggregation works again. Now, to fix those pesky "queue full and I want to send a BAR frame so I can unblock the full queue and transmit" problems. At least that one is more tractable and easier to solve. Or is it.


Wish me luck: taking the git plunge

Wish me luck.

I just typed the following commands:

  • git clone https://github.org/freebsd/freebsd-head.git
  • cd freebsd-head
  • git branch army
  • git checkout army
and started my own local branch.  Wish me luck.  Oh, I've said it, but with git, I think I might need it.  I actually didn't type exactly the above, but that's what I should have typed.  git reset seems to have been my friend.  We'll see next time I pull from github...  and if I can get the rebase command to work for me better than mercurial has been working for me at work.

Now, to figure out how to do  a similar thing to patch queues, except I really want to just correct commits rather than use a whole different set of commands.

[CFT] Xorg 7.7 ready for testing!

Hi Fans,

The FreeBSD Xorg Team is pleased to announce Xorg 7.7 Release. We are very happy to be able to Call for testing shortly after the Xorg team annouced 7.7 release. This CFT is also open for discussion on how we should move forward with xorg release as we are facing some issues and we would like to ask for your opinion. Right now we have 2 existing xorg versions in our Ports Tree. The situation is quite bad due to our poor graphic card support. That means we do not have much choice but to take it as how it is now. But with regards to mesa support, we have to face some new challanges.

With the new mesa 8.0 release, accelerated support for a number of older graphic cards was dropped. At the moment we are not sure how to deal with that.We are thinking of just replacing mesa 7.11 with 8.0 or making a new flag like WITH_MESA= 7.11.2 / 8.0 in combination with WITH_NEW_XORG, and let the mesa 7.6.1 set as default together with the old xorg version. Obviosly the latter option make the already complex situation more complex. The problem is, users, especially  the new ones can easily get confused with it. Another issue is, the packages.We can’t deliver a package set with the new Xorg releases. This means users with new hardware will have to compile everything by themselves. Though I’m totally fine with compiling, not everyone has the CPU power to compile everything. What I’m trying to say is, I would love to see the newer xorg released as the default version, but i know this will break a lot of old hardware. The thing is, when we want to try to become a Modern Operating System, I dont see any other way to make the new xorg as default but to give Users the chance to compile the old xorg with a flag like WITH_OLD_XORG.

Some notes regarding KMS support:
KMS Support has been completely migrated to FreeBSD 10. The MFC to 9 will come soon, that means so long its not MFC’d to 9-Stable, users need to get the latest patch from our x11 mailing list.

This testing includes
* libdrm 2.4.34 (including KMS support)
* mesa 8.0.3
* full Xorg 7.7 release

Checkout Xorg Development Repo:
You will need to install devel/subversion in order to checkout the xorg repo. Next, you will need to add WITH_NEW_XORG=yes in your /etc/make.conf if you want to try out the new Xorg and mesa. Note
that if you are not qualified for the KMS patch, you shouldn’t use WITH_NEW_XORG=yes because the old intel driver doesn’t build with the new X server. If you are qualified, you should also set WITH_KMS=yes
in /etc/make.conf. Nvidia and ATI users should set WITH_NEW_XORG=yes.

svn co https://trillian.chruetertee.ch/svn/ports/trunk

A small merge script to merge the svn checkout into the real portstree can be found here:

http://people.freebsd.org/~miwi/xorg/xorgmerge

The script is a modified version of the old kdemerge script. Please set the KDEDIR variable to the path of your X.org ports. After merging, run one of the following command, depending on which
tool you use to manage your installed packages.

portupgrade -af \*
portmaster -a

After installing these, you will have to rebuild all xf86-* ports. We will bump all releated ports during the commit to the ports tree.

Roadmap:
Our current plan is to let the CFT running for a while, and see what the outcome of the discussion above is. We hope to get a lot of feedback to solve as many problems as possible. Also we are working on the
libglut to freeglut migration, this will definitely complete before we import Xorg 7.7. So we still have enough time.  We are looking forward for your feedback.

- miwi on behalf of the FreeBSD X11 Team

FreeBSD, Netflix, CDN

The big news this week is the Netflix Openconnect platform, which was just announced. It uses off the shelf hardware - and FreeBSD + nginx.


The question is how you could spin it.


You could say "Netflix chose FreeBSD because they can keep their changes proprietary." Sure, they could. But they're not making appliances that they're selling - they're owning the infrastructure and servers. It's unclear whether they'd have to contribute back any Linux changes if they ran Linux on their open connect platform. They're making a conscious, public decision to distribute their changes back to FreeBSD - even though they don't have to.


You could say "Netflix chose FreeBSD because the people inside the company knew FreeBSD." Sure, they may have. The same thing could be said about why start-ups and tech companies choose Linux. A lot of the time its because they're chasing enterprise support from Redhat. But technology startups using Ubuntu or Debian tend not to be paying support fees - they hire smart people who know the technology. So, yes - "using what they know."


According to the Netflix Openconnect website:


"This was selected for its balance of stability and features, a strong development community and staff expertise. We will contribute changes we make as part of our project to the community through the FreeBSD committers on our team."



Let's pull this apart a little.

  1. "Balance of stability and features." FreeBSD has long been derided for how slowly it moves in some areas. The FreeBSD developers tend to be a conservative bunch, trying to find the balance between new feature development and maintaining both stability and backwards compatibility.
  2. "Strong community." FreeBSD has a strong technical development community and Netflix finds this very important. They're also willing to join and participate in the community like many other companies do.
  3. "Staff expertise." So yes, their staff are familiar with FreeBSD. They're also familiar with Linux. They chose a platform which they have the expertise to develop, use and improve. They didn't just choose an unfamiliar platform because of marketing brochures or sales promises. I don't see any negatives here. I'm sure that Google engineers chose Linux to begin with because they were familiar with Linux.
  4. "Contribute changes we make as part of our project to the community." Netflix  has committed to push improvements and fixes back to the upstream project They contributed some bug fixes in the 10GE Intel driver and IPv6 stack this week. This is collaborative open source working the way it should.
Why would Netflix push back changes and improvements into a project when they're not required to? That's something you should likely ask them. But the same good practice arguments hold for both Linux AND BSD projects:
  • The project is a constantly moving target. If you don't push your changes back upstream, you risk carrying around increasingly larger changes as your project and your BSD upstream project diverge. This will just make things more difficult in the long run.
  • By pushing your changes upstream, you make it easier to move with the project - including adopting improvements and new features. If you keep large changes to yourself, you will likely find it increasingly difficult to update your software to the newer upstream versions. And that upstream project is likely adding bug fixes, improvements and new features - which at some point you may wish to leverage. By pushing your changes upstream, you make it a lot easier to move to future versions of the upstream project, allowing you to leverage all those fixes and improvements without too much engineering time.
  • By participating, you encourage others to adopt your technology. By pushing your changes and improvements upstream, you decrease the amount of software you have to maintain yourself (and keep patching as the upstream project moves along.) But you also start to foster technology adoption. The FreeBSD jail project started out of the desire by a hosting company to support virtualisation. Since then, the Jail infrastructure has been adopted by many other companies and individuals.
  • When others use your technology, they also find and fix bugs in your technology; they may even improve it. The FreeBSD jail support has been extended to include IPv6 support, shared memory support and integrates into the VIMAGE (virtualised networking) stack (which, by the way, came from Ironport/Cisco.) As a company, you may find that the community will do quite a lot of the work that you would normally have to hire engineers to do yourself. This saves time and saves money.
  • When companies contribute upstream, it encourages other companies to also contribute upstream. A common issue is "reinventing the wheel", where companies end up having to reinvent the same technology privately because no-one has contributed it upstream. They solve the same problems, they implement the same new features .. and they all spend engineering time and resources to do so.
  • And when companies contribute upstream, it encourages (private) developers to contribute. Open source developers love to see their code out there in the wild, in places they never quite thought of. It's encouraging to see companies build products with their code and contribute back bug fixes and improvements. It fosters a sense of community and participation, of "give and take", rather than just "take". This is exactly the kind of thing that keeps developers coming back to contribute more - and it attracts new developers. Honestly, who wouldn't want to say that some popular device is running code that they wrote in their spare time?
So, you could rant and rave about the conspiracy side of Linux versus FreeBSD. You could rant on about GPL versus BSD. Or, you could see the more useful side of things. You could see a large company who didn't have to participate at all, agreeing to contribute back their improvements to an open source operating system. You could see that by doing so, the entire open source ecosystem benefits - not just FreeBSD. There's nothing stopping Linux or other BSD projects from keeping an eye on the improvements made by Netflix and incorporating those improvements into their own project. And it's another case of a company participating and engaging the open source community - and having that community engage them right back.

Good show, Netflix. Good show.

Florent Thoumie » FreeBSD 2012-06-06 12:50:30

It’s been a while since I’ve written anything in here so I’ll try to remember how things work…

A few weeks, like a lot of people, I ordered a RaspberryPi to use as a mediacenter. Unfortunately it still hasn’t arrived and I got tired of my good old xbox that’s starting to be quite limiting in terms of what I can ask it. Anyway, I had an old-ish desktop in the attic that I decided to use in place of the RPi in the meantime. The thing is, it was running FreeBSD 8.1-BETA1, which also was quite old. I could have simply reinstalled everything but: a) I didn’t have an install cd and b) I decided to give pkgng a proper try (as opposed to just running it in a vm).

First step: upgrading to 9.0-RELEASE. Pretty easy thanks to freebsd-update, I did have to go via 8.1-RELEASE because of some metadata integrity check failing. Didn’t add too long to the process, so I’m not complaining.

Second step: updating the ports tree with portsnap (my tool of choice) and installing ports-mgmt/pkg.

Last step (ok this one is fairly long and more complicated): point PACKAGESITE to pkgbeta.freebsd.org and force-upgrade all packages (pkg update && pkg upgrade -fy).

Tada!

Backtracking slightly, the last step didn’t go without any trouble, pkgng complained about ports having file conflicts. In some cases, ports had been merged/deleted (so the easy fix was to pkg delete them first). In other cases, we were going from a versioned port to another one (e.g. mysql50-client to mysql55-client) so the register origin for the package had to be changed with pkg set –o databases/mysql50–client:databases/mysql55–client. Finally pkg upgrade -fy wasn’t complaining anymore.

So that was a few days ago, yesterday a new package set was uploaded. Since I pointed PACKAGESITE to http://pkgbeta.freebsd.org/freebsd-9-i386/latest/, pkg update picked up on the new packageset and more packages were available for upgrade. pkg upgrade worked without a hitch but a couple of apps that I had installed via ports (and not yet available as packages) weren’t upgraded along with graphics/png, which caused x11/slim to look for a non-existent png library. Note that I only used pkg(ng), not portmaster or portupgrade which would have picked up on the update since PORTREVISION for x11/slim was bumped. I am told that preserving shared libraries might be added to pkgng v1.1.

Anyway, congratulations to bapt@ and everybody else working on this! There are still a few rough edges but it’s working pretty darn well already!