UDP RSS update: ixgbe(4) turned out to have issues

I started digging deeper into the RSS performance on my home test platform. Four cores and one (desktop) socket isn't all that much, but it's a good starting point for this.

It turns out that there was some lock contention inside netisr. Which made no sense, as RSS should be keeping all the flows local to each CPU.

After a bunch of digging, I discovered that the NIC was occasionally receiving packets into the wrong ring. Have a look at this:

Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100034:
m=0xfffff80047713d00; flowid=0x21f7db62; rxr->me=3
Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100034:
m=0xfffff8004742e100; flowid=0x21f7db62; rxr->me=3
Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100034:
m=0xfffff800474c2e00; flowid=0x21f7db62; rxr->me=3
Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100034:
m=0xfffff800474c5000; flowid=0x21f7db62; rxr->me=3
Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100034:
m=0xfffff8004742ec00; flowid=0x21f7db62; rxr->me=3
Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100032:
m=0xfffff8004727a700; flowid=0x335a5c03; rxr->me=2
Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100032:
m=0xfffff80006f11600; flowid=0x335a5c03; rxr->me=2
Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100032:
m=0xfffff80047279b00; flowid=0x335a5c03; rxr->me=2
Sep 12 08:04:32 adrian-hackbox kernel: ix0: ixgbe_rxeof: 100032:
m=0xfffff80006f0b700; flowid=0x335a5c03; rxr->me=2

The RX flowid was correct - I hashed the packets in software too and verified the software hash equaled the hardware hash. But they were turning up on the wrong receive queue. "rxr->me" is the queue id; the hardware should be hashing on the last 7 bits. 0x3 -> ring 3, 0x2 -> ring 2.
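
To make the mismatch concrete, here is a minimal Python sketch of the expected mapping. It assumes the low 7 bits of the flowid index a 128-entry redirection table that is filled round-robin across four rings; that fill pattern and the names `NUM_RINGS`, `redirection_table` and `expected_ring` are illustrative assumptions, not taken from the driver.

```python
# Sketch only: assumes a 128-entry redirection table filled round-robin
# over NUM_RINGS receive rings. The real table contents are driver policy.
NUM_RINGS = 4
TABLE_SIZE = 128  # 2**7 entries, indexed by the low 7 bits of the hash

redirection_table = [i % NUM_RINGS for i in range(TABLE_SIZE)]


def expected_ring(flowid: int) -> int:
    """Return the ring a packet with this RSS hash should land on."""
    return redirection_table[flowid & (TABLE_SIZE - 1)]


# (flowid, ring the packet actually arrived on), copied from the log above
observed = [(0x21F7DB62, 3), (0x335A5C03, 2)]

for flowid, actual in observed:
    want = expected_ring(flowid)
    status = "ok" if want == actual else "WRONG RING"
    print(f"flowid={flowid:#010x}: expected ring {want}, got {actual}: {status}")
```

Under that assumption both flows in the log are flagged: the hash ending in 0x62 belongs on ring 2 but arrived on ring 3, and the one ending in 0x03 belongs on ring 3 but arrived on ring 2.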

It also only happened when I was sending traffic to more than one receive ring. Everything was okay if I just transmitted to a single receive ring.

Luckily for me, some developers from Verisign saw some odd behaviour in their TCP stress testing and had dug in a bit further. They were seeing corrupted frames on the receive side that looked a lot like internal NIC configuration state. They figured out that the ixgbe(4) driver wasn't initialising the flow director and receive units correctly - the FreeBSD driver was not correctly setting up the amount of memory each was allocated on the NIC and they were overlapping. They also found a handful of incorrectly handled errors and double-freed mbufs.

So, with that all fixed, their TCP problem went away and my UDP tests started properly behaving themselves. Now all the flows are ending up on the right CPUs.

The flow director code was also dynamically programming flows into the NIC to try and rebalance traffic. Trouble is, I think it's a bit buggy and it's likely not working well with large receive offload (LRO).

What's it mean for normal people? Well, it's fixed in FreeBSD-HEAD now. I'm hoping I or someone else will backport it to FreeBSD-10 soon. It fixes my UDP tests - now I hit around 1.3 million packets per second transmit and receive on my test rig; the server now has around 10-15% CPU free. It also fixed issues that Verisign were seeing with their high transaction rate TCP tests. I'm hoping that it fixes the odd corner cases that people have seen with Intel 10 gigabit hardware on FreeBSD and makes LRO generally more useful and stable.

Next up - some code refactoring, then finishing off IPv6 RSS!


FreeBSD 10.1-BETA1 Now Available

The first BETA build of the 10.1-RELEASE release cycle is now available on the FTP servers for the amd64, armv6, i386, ia64, powerpc, powerpc64 and sparc64 architectures.

The image checksums are included in the original announcement email.

Installer images and memory stick images are available here.

If you notice problems you can report them through the Bugzilla PR system or on the -stable mailing list.

If you would like to use SVN to do a source based update of an existing system, use the "stable/10" branch.

A list of changes since 10.0-RELEASE is available on the stable/10 release notes page.

Pre-installed virtual machine images for 10.1-BETA1 are also available for amd64 and i386 architectures.  The images are located here.

The disk images are available in QCOW2, VHD, VMDK, and raw disk image formats. The image download size is approximately 135 MB, which decompresses to a 20GB sparse image.

The partition layout is:
  • 512k - freebsd-boot GPT partition type (bootfs GPT label)
  • 1GB  - freebsd-swap GPT partition type (swapfs GPT label)
  • ~17GB - freebsd-ufs GPT partition type (rootfs GPT label)

Note to consumers of the dvd1.iso image: The packages included on the DVD do not have a corresponding pkg(8) repository due to an incompatibility with pkg-1.2.x and pkg-1.3.x. This will be fixed for BETA2.

The packages will not be recognized by bsdconfig(8), but they can be installed manually.

To install packages from the dvd1.iso installer, create and mount the /dist directory:

# mkdir -p /dist
# mount -t cd9660 /dev/cd0 /dist

Next, install pkg(8) from the DVD:

# env REPOS_DIR=/dist/packages/repos pkg add \

At this point, pkg-add(8) can be used to install additional packages from the DVD. Please note that the REPOS_DIR environment variable should be set each time the DVD is used as the package repository, otherwise conflicts with packages fetched from the upstream mirrors may occur. For example, to install Subversion, Gnome, and Xorg, run:

# env REPOS_DIR=/dist/packages/repos pkg add \
  /dist/packages/freebsd:10:*:*/subversion [...]

The freebsd-update(8) utility supports binary upgrades of amd64 and i386 systems running earlier FreeBSD releases.  Systems running earlier
FreeBSD releases can upgrade as follows:

# freebsd-update upgrade -r 10.1-BETA1

During this process, freebsd-update(8) may ask the user to help by merging some configuration files or by confirming that the automatically
performed merging was done correctly.

# freebsd-update install

The system must be rebooted with the newly installed kernel before continuing.

# shutdown -r now

After rebooting, freebsd-update needs to be run again to install the new userland components:

# freebsd-update install

It is recommended to rebuild and install all applications if possible, especially if upgrading from an earlier FreeBSD release, for example, FreeBSD 8.x. Alternatively, the user can install misc/compat9x and other compatibility libraries. Afterwards, the system must be rebooted into the new userland:

# shutdown -r now

Finally, after rebooting, freebsd-update needs to be run again to remove stale files:

# freebsd-update install

Love FreeBSD?  Support this and future releases with a donation to the FreeBSD Foundation!

FreeBSD 10.1-BETA1 Available

The first BETA build for the FreeBSD 10.1 release cycle is now available. ISO images for the amd64, armv6, i386, ia64, powerpc, powerpc64 and sparc64 architectures are available on most of our FreeBSD mirror sites.

Receive side scaling: testing UDP throughput

I think it's about time I shared some more details about the RSS stuff going into FreeBSD and how I'm testing it.

For now I'm focusing on IPv4 + UDP on the Intel 10GE NICs. The TCP side of things is done (and the IPv6 side of things works too!) but enough of the performance walls show up in the IPv4 UDP case that it's worth sticking to it for now.

I'm testing on a pair of 4-core boxes at home. They're not special - and they're very specifically not trying to be server-class hardware. I'd like to see where these bottlenecks are even at low core count.

The test setup in question:

Testing software:

  • http://github.com/erikarn/freebsd-rss
  • It requires libevent2 - an updated copy; previous versions of libevent2 didn't handle FreeBSD specific errors gracefully and would early error out of the IO loop.


Server:

  • CPU: Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz (3292.59-MHz K8-class CPU)
  • There's no SMT/HTT, but I've disabled it in the BIOS just to be sure
  • 4GB RAM
  • FreeBSD-HEAD, amd64
  • NIC: 82599EB 10-Gigabit SFI/SFP+ Network Connection
  • ix0:

# for now redirect processing just makes the lock overhead suck even more.
# disable it.



# experiment with deferred dispatch for RSS

kernel config:

include GENERIC

device netmap
options RSS
options PCBGROUP

# in-system lock profiling

# Flowtable - the rtentry locking is a bit .. slow.
options   FLOWTABLE

# This debugging code has too much overhead to do accurate
# testing with.
nooptions         INVARIANTS
nooptions         INVARIANT_SUPPORT
nooptions         WITNESS
nooptions         WITNESS_SKIPSPIN

The server runs the "rss-udp-srv" process, which behaves like a multi-threaded UDP echo server on port 8080.
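
For readers who have not looked at the test tools, the echo behaviour can be sketched in a few lines of Python. This is not the rss-udp-srv code: it is a single-threaded, non-RSS-aware illustration, and the names `echo_once` and `serve` are made up for this sketch. The real server is multi-threaded and built on libevent2.

```python
import socket


def echo_once(sock: socket.socket) -> None:
    """Receive one datagram and send it straight back to its sender."""
    data, peer = sock.recvfrom(2048)
    sock.sendto(data, peer)


def serve(port: int = 8080) -> None:
    """Echo UDP datagrams on the given port until interrupted."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        while True:
            echo_once(sock)


# To try it by hand, call serve() and send datagrams to port 8080.
```

Every echoed frame costs one receive and one send on the same socket, which is the per-frame work the results below are measuring.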


The client box is slightly more powerful to compensate for (currently) not using completely affinity-aware RSS UDP transmit code.

  • CPU: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz (3192.68-MHz K8-class CPU)
  • SMT/HTT: Disabled in BIOS
  • 8GB RAM
  • FreeBSD-HEAD amd64
  • Same kernel config, loader and sysctl config as the server
  • ix0: configured as,,,,

The client runs 'udp-clt' programs to source and sink traffic to the server.

Running things

The server-side simply runs the listen server, configured to respond to each frame:

$ rss-udp-srv 1

The client-side runs four copies of udp-clt, each from a different IP address. These are run in parallel (I do it in different screens, so I can quickly see what's going on):

$ ./udp-clt -l -r -p 8080 -n 10000000000 -s 510
$ ./udp-clt -l -r -p 8080 -n 10000000000 -s 510
$ ./udp-clt -l -r -p 8080 -n 10000000000 -s 510
$ ./udp-clt -l -r -p 8080 -n 10000000000 -s 510

The IP addresses are chosen so that the 2-tuple Toeplitz hash, using the default Microsoft key, maps them to different RSS buckets that live on individual CPUs.
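
Picking such addresses can be done offline. The following Python sketch computes the Toeplitz hash over the IPv4 2-tuple (source address followed by destination address) with the 40-byte key published in Microsoft's RSS documentation. The helper names, the example addresses and the choice of taking the low 2 bits as the bucket (four buckets) are assumptions for illustration; the kernel's actual bucket count and mapping may differ.

```python
import socket

# The default 40-byte RSS key from Microsoft's RSS documentation.
MS_RSS_KEY = bytes([
    0x6D, 0x5A, 0x56, 0xDA, 0x25, 0x5B, 0x0E, 0xC2,
    0x41, 0x67, 0x25, 0x3D, 0x43, 0xA3, 0x8F, 0xB0,
    0xD0, 0xCA, 0x2B, 0xCB, 0xAE, 0x7B, 0x30, 0xB4,
    0x77, 0xCB, 0x2D, 0xA3, 0x80, 0x30, 0xF2, 0x0C,
    0x6A, 0x42, 0xB7, 0x3B, 0xBE, 0xAC, 0x01, 0xFA,
])


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """XOR together the 32-bit key window at every set bit of the input."""
    key_bits = len(key) * 8
    if len(data) * 8 > key_bits - 32:
        raise ValueError("input too long for this key")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for b in range(8):
            if byte & (0x80 >> b):
                shift = key_bits - 32 - (i * 8 + b)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def hash_ipv4_2tuple(src: str, dst: str) -> int:
    """Hash source address then destination address, both in network order."""
    return toeplitz_hash(MS_RSS_KEY, socket.inet_aton(src) + socket.inet_aton(dst))


def rss_bucket(hash_value: int, bucket_bits: int = 2) -> int:
    """Hypothetical mapping: low bits of the hash select the bucket."""
    return hash_value & ((1 << bucket_bits) - 1)


# Example addresses are placeholders; substitute the real client/server IPs.
server = "10.11.2.1"
for client in ("10.11.2.2", "10.11.2.3", "10.11.2.4", "10.11.2.5"):
    h = hash_ipv4_2tuple(client, server)
    print(f"{client} -> {server}: hash={h:#010x} bucket={rss_bucket(h)}")
```

Trying candidate client addresses until each one prints a different bucket gives a set that spreads the flows across all the CPUs.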

Results: Round one

When the server is responding to each frame, the following occurs. The numbers are "number of frames generated by the client (netstat)", "number of frames received by the server (netstat)", "number of frames seen by rss-udp-srv", "number of responses transmitted from rss-udp-srv", "number of frames seen by the server (netstat)":
  • 1 udp-clt process: 710,000; 710,000; 296,000; 283,000; 281,000
  • 2 udp-clt processes: 1,300,000; 1,300,000; 592,000; 592,000; 575,000
  • 3 udp-clt processes: 1,800,000; 1,800,000; 636,000; 636,000; 600,000
  • 4 udp-clt processes: 2,100,000; 2,100,000; 255,000; 255,000; 255,000

So, it's not actually linear past two cores. The question here is: why?

There are a couple of parts to this.

Firstly - I had left turbo boost on. What this translated to:

  • One core active: ~ 30% increase in clock speed
  • Two cores active: ~ 30% increase in clock speed
  • Three cores active: ~ 25% increase in clock speed
  • Four cores active: ~ 15% increase in clock speed.

Secondly and more importantly - I had left flow control enabled. This made a world of difference.

The revised results are mostly linear - with more active RSS buckets (and thus CPUs) things seem to get slightly more efficient:
  • 1 udp-clt process: 710,000; 710,000; 266,000; 266,000; 266,000
  • 2 udp-clt processes: 1,300,000; 1,300,000; 512,000; 512,000; 512,000
  • 3 udp-clt processes: 1,800,000; 1,800,000; 810,000; 810,000; 810,000
  • 4 udp-clt processes: 2,100,000; 2,100,000; 1,120,000; 1,120,000; 1,120,000
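
A quick way to check the "mostly linear" claim is to divide the echoed-frame counts by the number of active buckets. The sketch below only does arithmetic on the figures listed above; the variable names are made up for illustration.

```python
# Frames echoed by the server in the revised results above, keyed by the
# number of udp-clt processes (one RSS bucket / CPU each).
echoed_frames = {1: 266_000, 2: 512_000, 3: 810_000, 4: 1_120_000}

baseline = echoed_frames[1]
for n, frames in echoed_frames.items():
    per_bucket = frames / n
    speedup = frames / baseline
    print(f"{n} process(es): {per_bucket:,.0f} frames per bucket, "
          f"{speedup:.2f}x the single-bucket figure")
```

The per-bucket figure stays between 256,000 and 280,000 frames, and the three- and four-bucket runs come out slightly above the single-bucket one.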

Finally, let's repeat the process but only receiving, instead of also echoing the packet back to the client:

$ rss-udp-srv 0
  • 1 udp-clt process: 710,000; 710,000; 204,000
  • 2 udp-clt processes: 1,300,000; 1,300,000; 378,000
  • 3 udp-clt processes: 1,800,000; 1,800,000; 645,000
  • 4 udp-clt processes: 2,100,000; 2,100,000; 900,000

The receive-only workload is actually worse off than the transmit + receive workload!

What's going on here?

Well, a little digging shows that in both instances - even with a single udp-clt thread running which means only one CPU on the server side is actually active! - there's active lock contention.

Here's an example dtrace output for measuring lock contention with only one active process, where one CPU is involved (and the other three are idle):

Receive only, 5 seconds:

root@adrian-hackbox:/home/adrian/git/github/erikarn/freebsd-rss # dtrace -n 'lockstat:::adaptive-block { @[stack()] = sum(arg1); }'
dtrace: description 'lockstat:::adaptive-block ' matched 1 probe


Transmit + receive, 5 seconds:

dtrace: description 'lockstat:::adaptive-block ' matched 1 probe




Somehow it seems there's less lock contention / blocking going on when both transmit and receive are running!

So then I dug into it using the lock profiling suite. This is for 5 seconds with receive-only traffic on a single RSS bucket / CPU (all other CPUs are idle):

# sysctl debug.lock.prof.enable=1 ; sleep 5 ; sysctl debug.lock.prof.enable=0

root@adrian-hackbox:/home/adrian/git/github/erikarn/freebsd-rss # sysctl debug.lock.prof.enable=1 ; sleep 5 ; sysctl debug.lock.prof.enable=0
debug.lock.prof.enable: 1 -> 1
debug.lock.prof.enable: 1 -> 0

root@adrian-hackbox:/home/adrian/git/github/erikarn/freebsd-rss # sysctl debug.lock.prof.stats | head -2 ; sysctl debug.lock.prof.stats | sort -nk4 | tail -10
     max  wait_max       total  wait_total       count    avg wait_avg cnt_hold cnt_lock name
    1496         0       10900           0          28    389      0  0      0 /usr/home/adrian/work/freebsd/head/src/sys/dev/usb/usb_device.c:2755 (sx:USB config SX lock)
       0         0          31           1          67      0      0  0      4 /usr/home/adrian/work/freebsd/head/src/sys/kern/sched_ule.c:888 (spin mutex:sched lock 2)
       0         0        2715           1       49740      0      0  0      7 /usr/home/adrian/work/freebsd/head/src/sys/dev/random/random_harvestq.c:294 (spin mutex:entropy harvest mutex)
       1         0          51           1         131      0      0  0      2 /usr/home/adrian/work/freebsd/head/src/sys/kern/sched_ule.c:1179 (spin mutex:sched lock 1)
       0         0          69           2         170      0      0  0      8 /usr/home/adrian/work/freebsd/head/src/sys/kern/sched_ule.c:886 (spin mutex:sched lock 2)
       0         0       40389           2      287649      0      0  0      8 /usr/home/adrian/work/freebsd/head/src/sys/kern/kern_intr.c:1359 (spin mutex:sched lock 2)
       0         2           2           4          12      0      0  0      2 /usr/home/adrian/work/freebsd/head/src/sys/dev/usb/usb_device.c:2762 (sleep mutex:Giant)
      15        20        6556         520        2254      2      0  0    105 /usr/home/adrian/work/freebsd/head/src/sys/dev/acpica/Osd/OsdSynch.c:535 (spin mutex:ACPI lock (0xfffff80002b10f00))
       4         5      195967       65888     3445501      0      0  0  28975 /usr/home/adrian/work/freebsd/head/src/sys/netinet/udp_usrreq.c:369 (sleep mutex:so_rcv)

Notice the lock contention for the so_rcv (socket receive buffer) handling? What's going on here is pretty amusing - it turns out that because there's so much receive traffic going on, the userland process receiving the data is being preempted by the NIC receive thread very often - and when this happens, there's a good chance it's going to be within the small window that the receive socket buffer lock is held. Once this happens, the NIC receive thread processes frames until it gets to one that requires it to grab the same sock buffer lock that is already held by userland - and it fails - so the NIC thread sleeps until the userland thread finishes consuming a packet. Then the CPU flips back to the NIC thread and continues processing a packet.

When the userland code is also transmitting frames it's increasing the amount of time in between socket receives and decreasing the probability of hitting the lock contention condition above.

Note there's no contention between CPUs here - this is entirely contention within a single CPU.

So for now I'm happy that the UDP IPv4 path is scaling well enough with RSS on a single core. The main performance problem here is the socket receive buffer locking (and, yes, copyin() / copyout().)


PC-BSD at Fossetcon

Fossetcon will take place September 11–13 at the Rosen Plaza Hotel in Orlando, FL. Registration for this event ranges from $10 to $85 and includes meals and a t-shirt.

There will be a BSD booth in the expo area on both Friday and Saturday from 10:30–18:30. As usual, we’ll be giving out a bunch of cool swag, PC-BSD DVDs, and FreeNAS CDs, as well as accepting donations for the FreeBSD Foundation. Dru Lavigne will present “ZFS 101” at 11:30 on Saturday. The BSDA certification exam will be held at 15:00 on Saturday.

PC-BSD 10.0.3 Quarterly Package Update Released

The PC-BSD team is pleased to announce the availability of the next PC-BSD quarterly package update, version 10.0.3!

This update includes a number of important bug-fixes, as well as newer packages and desktops, such as Chromium 37.0.2062.94, Cinnamon 2.2.14, Lumina 0.6.2 and more. This release also includes a CD-sized ISO of TrueOS, for users who want to install a server without X. For more details and updating instructions, refer to the notes below.

We are already hard at work on the next major release of PC-BSD, 10.1, due later this fall, which will include FreeBSD 10.1-RELEASE under the hood. Users interested in following along with development should sign up for our Testing mailing list.

PC-BSD Notable Changes

* Cinnamon 2.2.14
* Chromium 37.0.2062.94
* NVIDIA Driver 340.24
* Lumina desktop 0.6.2-beta
* Pkg 1.3.7
* Various fixes to the Appcafe Qt UI
* Bugfixes to Warden / jail creation
* Fixed a bug with USB media not always being bootable
* Fixed several issues with Xorg setup
* Improved Boot-Environments to allow “beadm activate” to set default
* Support for jail “bulk” creation via Warden
* Fixes for relative ZFS dataset mount-point creation via Warden
* Support for full-disk (GELI) encryption without an unencrypted /boot partition


Along with our traditional PC-BSD DVD ISO image, we have also created a CD-sized ISO image of TrueOS, our server edition.

This is a text-based installer which includes FreeBSD 10.0-Release under the hood. It includes the following features:

* ZFS on Root installation
* Boot-Environment support
* Command-Line versions of PC-BSD utilities, such as Warden, Life-Preserver and more.
* Support for full-disk (GELI) encryption without an unencrypted /boot partition

We have some additional features also in the works for 10.1 and later, stay tuned this fall for more information.


Due to some changes with how pkgng works, it is recommended that all users update via the command-line using the following steps:

# pkg update -f
# pkg upgrade pkg
# pkg update -f
# pkg upgrade
# pc-extractoverlay ports
# reboot

PKGNG may need to re-install many of your packages to fix an issue with shared library version detection. If you run into issues doing this, or have conflicts, please open a bug report with the output of the above commands.

If you run into shared library issues running programs after upgrading, you may need to do a full-upgrade with the following:

# pkg upgrade -f

Getting media

10.0.3 DVD/USB media can be downloaded from this URL via HTTP or Torrent.

Reporting Bugs
Found a bug in 10.0.3? Please report it (with as much detail as possible) to our new RedMine Database.

BSDDay Argentina Trip Report: Damian Vicino

The Foundation recently sponsored Damian Vicino to attend BSDDay Argentina. Here is his trip report:

BSDday is the only BSD conference in South America as far as I know. The event's inception was in 2008 by 2 BSD Users Groups in Buenos Aires City. I participated as part of the organisation committee from 2009 - 2012. In 2013, the event had no edition because of some big changes in the lives of the people participating in the organisation committee. In my case, I moved out of the country (and the continent). Thanks to the FreeBSD Foundation, I was able to return to South America for a few weeks this year to re-float the committee and the event, making possible the run of a 5th edition.

We started the preparation a few months before by coordinating remotely, but there was a lot of stuff to be done in-place, so I traveled 10 days earlier. In the days before the event, I coordinated with Universidad de Buenos Aires to finish the arrangements for the space to run the event and the supplies needed for the event. I worked as the main contact for the university and dealt with all the paperwork; being the largest university in Argentina, there is a lot of paperwork for everything. An interesting institutional plus this year is that the Faculty of Science and Department of Computer Science of Universidad de Buenos Aires officially declared BSDday an Event of Interest. Simultaneously, Hernan Constante and Matias Celani were coordinating accommodations for one of the speakers who traveled from Mar del Plata and making arrangements to have food & coffee for the event. Thanks to them for their help, and also to Alejandro Lazaro, who helped remotely in every way he could since he also moved out of Buenos Aires.

The quantity of proposals for talks received this year was about half the usual. We contacted previous speakers for feedback and we decided to include discussion spaces to find out why and how we can make it better for next year. On August 9th, a few minutes before the event started, the first speaker had a family emergency. We decided to delay the opening talk and use the time for a first open discussion about the event and its future. The attendance was the lowest ever, so we focused the first discussion space on this topic. There appears to be a consensus that August is not a good month for the conference, because of the power outages in Buenos Aires in winter. From previous years, we knew that November is not good either. Another apparent reason is the break in continuity of the event (in 2013). Everyone in the room actively participated in the open discussion spaces. We noticed from discussions that the demographics of the event had changed. This time, we had a group of desktop users, mostly from FreeBSD, while in previous years we had mostly sysadmins from OpenBSD working in large companies or ISPs.

After the discussion, I did the opening talk with the help of Hernan Constante. The talk was also open to discussion so it ran a little longer than programmed; lucky for us, having only 1 track, it didn't affect the schedule much. The second talk was scheduled for 40 minutes, but was extended up to 2 hours and ended up on a different topic than the one it started with. We were tempted to stop it, but people were asking so many questions that we let it flow. We then had 4 more talks (including mine) and 2 more spaces for open discussion about anything-BSD where we collected opinions about the event, about BSD in Argentina, and the future of BSD advocacy actions. Since we didn't have sponsors for the food/coffee/supplies, we asked if anyone wanted to contribute at the end of the event. We were glad to see that everyone in the room put in money and we almost covered every expense for the event in this way. After the event, about 90% of the people moved to the bar across the street to share some beers and we kept discussing until the bar closed and kicked us out.

The week after the event, I met again with some organisers to discuss ideas for next year and do some analysis of what happened this year. One week later, I met with some companies and professionals to check sponsoring possibilities for next year's edition.

Last week, I collected and processed the materials we obtained from the event: videos, photos, and slides from every presentation. I still need to recover a few videos that we had to download to the computer of one of the organisers (who left the country before me). In the following weeks, we will upload the videos, slides and pictures and formally close this year's event in order to start working on the 6th edition, expected to happen in 2015.

Once again, thank you very much to the FreeBSD Foundation for helping me with the expenses for this trip, to the University of Buenos Aires Faculty of Science and Computer Science department for giving us the space and support, to Hernan Constante, Alejandro Lazaro, Matias Celani, the speakers, and all those who helped to make this event possible once again.

New Lumina source repo and FreeBSD port

By popular demand, the source tree for the Lumina project has just been moved to its own repository within the main PC-BSD project tree on GitHub.

In addition to this, an official FreeBSD port for Lumina was just committed to the FreeBSD ports tree which uses the new repo.


By the way, here is a quick usage summary for those that are interested in how “light” Lumina 0.6.2 is on PC-BSD 10.0.3:

System: Netbook with a single 1.6GHz Atom processor and 2GB of memory (Fresh installation of PC-BSD 10.0.3 with Lumina 0.6.2)

Usage: ~0.2–0.4% CPU and ~120MB active memory use (no apps running except an xterm with “top” after a couple minutes for the PC-BSD tray applications to start up and settle down)


10.0.3-RC2 Available for Testing

PC-BSD 10.0.3-RC2 ISO images are now available for testing.

Users on the EDGE package set or on 10.0.3-RC1 can update to the newer set with the following commands:

# pkg update -f
# pkg upgrade
# pc-extractoverlay ports

This update brings in the newer pkgng 1.3.7, which may need to re-install many of your packages in order to properly fix an issue with shared-library version detection in previous pkgng releases.

The current plan is to release 10.0.3 early next week, so please let us know of any issues right away via our RedMine bug tracker.

pkg(8) is now the only package management tool

The ports tree has been modified to only support pkg(8) as the package management system for all supported versions of FreeBSD.

If you were still using pkg_install (pkg_* tools), you will have to upgrade your system.

The simplest way is

cd /usr/ports/ports-mgmt/pkg
make install

then run


You will see lots of warnings. Don’t be scared, they are expected: pkg_* databases used to get mangled easily. pkg2ng is able to deal with it most of the time.

If however you encounter a problem then please report to [email protected]

A tag has been applied to the ports tree if you need to get the latest ports tree before the EOL of pkg_install:


A branch has been created in case some committers want to provide updates for pkg_install users:


Please note that this branch is not officially maintained and that we strongly recommend that you migrate to pkg(8).

The ports tree is now stage only

The ports tree is now fully staged (only 2% has been left unstaged; those ports are marked as broken and will be removed from the ports tree if no PR to stage them is pending in bugzilla).

I would like to thank all committers and maintainers for their work on staging!
It allowed us to convert more than 23k packages to support stage in only 11 months!

Staging is a very important state. It allows us to run quality testing scripts on the packages right now (which has already allowed us to fix tons of hidden problems) and it allows us to build packages as a regular user!

It also opens the gates to new features that users have been requesting for many years:

  • flavors
  • multiple packages

Expect those features to happen in the near future.

helping out with VC4

I've had a couple of questions about whether there's a way for others to contribute to the VC4 driver project.  There is!  I haven't posted about it before because things aren't as ready as I'd like for others to do development (it has a tendency to lock up, and the X implementation isn't really ready yet so you don't get to see your results), but that shouldn't actually stop anyone.

To get your environment set up, build the kernel (https://github.com/anholt/linux.git vc4 branch), Mesa (git://anongit.freedesktop.org/mesa/mesa) with --with-gallium-drivers=vc4, and piglit (git://anongit.freedesktop.org/git/piglit).  For working on the Pi, I highly recommend having a serial cable and doing NFS root so that you don't have to write things to slow, unreliable SD cards.

You can run an existing piglit test that should work, to check your environment: env PIGLIT_PLATFORM=gbm VC4_DEBUG=qir ./bin/shader_runner tests/shaders/glsl-algebraic-add-add-1.shader_test -auto -fbo -- you should see a dump of the IR for this shader, and a pass report.  The kernel will make some noise about how it's rendered a frame.

Now the actual work:  I've left some of the TGSI opcodes unfinished (SCS, DST, DPH, and XPD, for example), so the driver just aborts when a shader tries to use them.  How they work is described in src/gallium/docs/source/tgsi.rst. The TGSI-to-QIR code is in vc4_program.c (where you'll find all the opcodes that are implemented currently), and vc4_qir.h has all the opcodes that are available to you and helpers for generating them.  Once it's in QIR (which I think should have all the opcodes you need for this work), vc4_qpu_emit.c will turn the QIR into actual QPU code like you find described in the chip specs.
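
As a starting point for anyone picking up one of these, here is a rough Python reference for what the four opcodes compute on 4-component vectors. It follows the per-channel definitions in tgsi.rst as best they can be summarised here, so treat it as an assumption and check each against that file before lowering it to QIR. The function names are made up for this sketch and are not part of Mesa.

```python
import math
from typing import Tuple

Vec4 = Tuple[float, float, float, float]


def xpd(a: Vec4, b: Vec4) -> Vec4:
    """Cross product of the xyz parts; w is set to 1."""
    return (
        a[1] * b[2] - b[1] * a[2],
        a[2] * b[0] - b[2] * a[0],
        a[0] * b[1] - b[0] * a[1],
        1.0,
    )


def dph(a: Vec4, b: Vec4) -> Vec4:
    """Homogeneous dot product: dot3(a, b) + b.w, replicated to all channels."""
    d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + b[3]
    return (d, d, d, d)


def dst(a: Vec4, b: Vec4) -> Vec4:
    """Distance vector: (1, a.y * b.y, a.z, b.w)."""
    return (1.0, a[1] * b[1], a[2], b[3])


def scs(a: Vec4) -> Vec4:
    """Sine/cosine of a.x: (cos, sin, 0, 1)."""
    return (math.cos(a[0]), math.sin(a[0]), 0.0, 1.0)
```

Each of these reduces to multiplies, adds and subtracts plus constants, except SCS, which also needs sine and cosine.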

You can dump the shaders being generated by the driver using VC4_DEBUG=tgsi,qir,qpu in the environment (that gets you 3/4 stages of code dumped -- at times you might want some subset of that just to quiet things down).

Since we've still got a lot of GPU hangs, and I don't have reset working, you can't even complete a piglit run to find all the problems or to test whether your changes are good.  What I can offer currently is that you could run PIGLIT_PLATFORM=gbm VC4_DEBUG=norast ./piglit-run.py tests/quick.py results/vc4-norast; piglit-summary-html.py --overwrite summary/mysum results/vc4-norast which will get you a list of all the tests (which mostly failed, since we didn't render anything), some of which will have failed on an assertion.  Now that you know which tests were assertion failing from the opcode you worked on, you can run them manually, like PIGLIT_PLATFORM=gbm /home/anholt/src/piglit/bin/shader_runner /home/anholt/src/piglit/generated_tests/spec/glsl-1.10/execution/built-in-functions/vs-asin-vec4.shader_test -auto (copy-and-pasted from the results) or PIGLIT_PLATFORM=gbm PIGLIT_TEST="XPD test 2 (same src and dst arg)" ./bin/glean -o -v -v -v -t +vertProg1 --quick (also copy and pasted from the results, but note that you need the other env var for glean to pick out the subtest to run).

Other things you might want eventually: I do my development using cross-builds instead of on the Pi, install to a prefix in my homedir, then rsync that into my NFS root and use LD_LIBRARY_PATH/LIBGL_DRIVERS_PATH on the Pi to point my tests at the driver in the homedir prefix.  Cross-builds were a *huge* pain to set up (debian's multiarch doesn't ship the .so symlink with the library, and the -dev packages that do install them don't install simultaneously for multiple arches), but it's worth it in the end.  If you look into cross-build, what I'm using is rpi-tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64/bin/arm-linux-gnueabihf-gcc and you'll want --enable-malloc0returnsnull if you cross-build a bunch of X-related packages.

FreeBSD Foundation announces IPsec Enhancement Project

The Internet Protocol Security (IPsec) suite is used to implement virtual private networks on FreeBSD and other operating systems. As the networking world continues its transition from 1 to 10, to 40 gigabit per second speeds, and faster, improvements in IPsec’s cryptographic building blocks are necessary to keep pace. The FreeBSD Foundation is pleased to announce that long-time FreeBSD developer John-Mark Gurney is adding modern AES modes to FreeBSD’s cryptographic framework and IPsec. This project is co-sponsored by the FreeBSD Foundation and Netgate, a leading vendor of BSD-based firewalls and networking gear.

The project adds new encryption modes while also importing infrastructure updates from OpenBSD giving FreeBSD users unprecedented support for high performance, encrypted communications.  New modes include AES-CTR and AES-GCM with hardware acceleration using Intel’s AES-NI instructions. According to John-Mark, “on a modern 64-bit x86 CPU one core can process about 1 gigabyte per second of data” using the new AES-GCM mode.

Concurrent with this project, FreeBSD committer and pfSense employee Ermal Luçi will update the FreeBSD IPsec stack to take advantage of the new cryptographic modes.

Jim Thompson, a co-owner of both Netgate and ESF (the company behind pfSense), said “We are pleased to contribute to this project.  Our interest in high-performance IPsec is obvious, however we also recognize the importance of contributing this capability to the FreeBSD project. Not only because our own software is based on FreeBSD, but for the benefit it brings to the entire community.  We plan to have AES-GCM support for IPsec with AES-NI acceleration available in the 2.2 release of pfSense software.”

The project is currently in progress, with a planned completion at the end of September 2014.