Category Archives: FreeBSD

DNS improvements in FreeBSD 11

Erwin Lansing just posted a summary of the DNS session at the FreeBSD DevSummit that was held in conjunction with BSDCan 2014 in May. It gives a good overview of the current state of affairs, including known bugs and plans for the future.

I’ve been working on some of these issues recently (in between $dayjob and other projects). I fixed two issues in the last 48 hours, and am working on two more.

Reverse lookups in private networks

Fixed in 11.

In its default configuration, Unbound 1.4.22 does not allow reverse lookups for private addresses (RFC 1918 and the like). NLNet backported a patch from the development version of Unbound which adds a configuration option, unblock-lan-zones, which disables this filtering. But that alone is not enough, because the reverse zones are actually signed (EDIT: the problem is more subtle than that, details in comments); Unbound will attempt to validate the reply, and will reject it because the zone is supposed to be empty. Thus, for reverse lookups to work, the reverse zones for all private address ranges must be declared as insecure:

server:
    # Unblock reverse lookups for LAN addresses
    unblock-lan-zones: yes
    domain-insecure:      10.in-addr.arpa.
    domain-insecure:     127.in-addr.arpa.
    domain-insecure: 254.169.in-addr.arpa.
    domain-insecure:  16.172.in-addr.arpa.
    # ...
    domain-insecure:  31.172.in-addr.arpa.
    domain-insecure: 168.192.in-addr.arpa.
    domain-insecure: 8.e.ip6.arpa.
    domain-insecure: 9.e.ip6.arpa.
    domain-insecure: a.e.ip6.arpa.
    domain-insecure: b.e.ip6.arpa.
    domain-insecure: d.f.ip6.arpa.

FreeBSD 11 now has both the unblock-lan-zones patch and an updated local-unbound-setup script which sets up the reverse zones. To take advantage of this, simply run the following command to regenerate your configuration:

# service local_unbound setup

This feature will be available in FreeBSD 10.1.

Building libunbound writes to /usr/src (#190739)

Fixed in 11.

The configuration lexer and parser were included in the source tree instead of being generated at build time. Under certain circumstances, make(1) would decide that they needed to be regenerated. At best, this inserted spurious changes into the source tree; at worst, it broke the build.

Part of the reason for this is that Unbound uses preprocessor macros to place the code generated by lex(1) and yacc(1) in its own separate namespace. FreeBSD’s lex(1) is actually Flex, which has a command-line option to achieve the same effect in a much simpler manner, but to take advantage of this, the lexer needed to be cleaned up a bit.

Allow local domain control sockets

Work in progress

An Unbound server can be controlled by sending commands (such as “reload your configuration file”, “flush your cache”, “give me usage statistics”) over a control socket. Currently, this can only be a TCP socket. Ilya Bakulin has developed a patch, which I am currently testing, that allows Unbound to use a local domain (aka “unix”) socket instead.

Allow unauthenticated control sockets

Work in progress

If the control socket is a local domain socket instead of a TCP socket, there is no need for encryption and little need for authentication. In the local resolver case, only root on the local machine needs access, and this can be enforced by the ownership and file permissions of the socket. A second patch by Ilya Bakulin makes encryption and authentication optional so there is no need to generate and store a client certificate in order to use unbound-control(8).

Cross building ports with qemu-user and poudriere-devel on #FreeBSD

I’ve spent the last few months banging though the bits and pieces of the work that Stacey Son implemented for QEMU to allow us to more or less chroot into a foreign architecture as though it were a normal chroot.  This has opened up a lot of opportunities to bootstrap the non-x86 architectures on FreeBSD.

Before I get started, I’d like to thank Stacey Son, Ed Maste, Juergen Lock, Peter Wemm, Justin Hibbits, Alexander Kabaev, Baptiste Daroussin and Bryan Drewery for the group effort in getting us the point of working ARMv6, MIPS32 and MIPS64 builds.  This has been a group effort for sure.

This will require a 10stable or 11current machine, as this uses Stacey’s binary activator patch to redirect execution of binaries through QEMU depending on the ELF header of the file.  See binmiscctl(8) for more details.

Mechanically, this is a pretty easy setup.  You’ll need to install ports-mgmt/poudriere-devel with the qemu-user option selected.  This will pull in the qemu-user code to emulate the environment we need to get things going.

I’ll pretend that you want an ARMv6 environment here.  This is suitable to build packages for the Rasberry PI and Beagle Bone Black.  Run this as root:

binmiscctl add armv6 –interpreter “/usr/local/bin/qemu-arm” –magic
“x7fx45x4cx46x01x01x01x00x00x00x00x00x00x00x00x00x02
x00x28x00″ –mask “xffxffxffxffxffxffxffx00xffxffxffxff
xffxffxffxffxfexffxffxff” –size 20 –set-enabled

This magic will load the imgact_binmisc.ko kernel module.  The rest of the command line instructs the kernel to redirect execution though /usr/local/bin/qemu-arm if the ELF header of the file matches an ARMv6 signature.

Build your poudriere jail (remember to install poudriere-devel for now as it has not been moved to stable at the time of this writing) with the following command:

poudriere jail -c -j 11armv632 -m svn -a armv6 -v head

Once this is done, you will be able to start a package build via poudriere bulk as you normally would:

poudriere bulk -j 11armv632 -a

or

poudriere bulk -j 11armv632 -f <my_file_of_ports_to_build>

Currently, we are running the first builds in the FreeBSD project to determine what needs the most attention first.  Hopefully, soon we’ll have something that looks like a coherent package set for non-x86 architectures.

For more information, work in progress things and possible bugs in qemu-user code, see Stacey’s list of things at:

https://wiki.freebsd.org/QemuUserModeHowTo

https://wiki.freebsd.org/QemuUserModeToDo

BSDCan 2014 – Ports and Packages WG

Baptiste Daroussin started the session with a status update on package building. All packages are now built with poudriere. The FreeBSD Foundation sponsored some large machines on which it takes around 16 hours to build a full tree. Each Wednesday at 01:00UTC the tree is snapshot and an incremental build is started for all supported released, the 2 stable branches (9 and 10) and quarterly branches for 9.x-RELEASE and 10.x-RELEASE. The catalogue is signed on a dedicated signing machine before upload. Packages can be downloaded from 4 mirrors (us-west, us-east, UK, and Russia) and feedback so far has been very positive.

He went on to note that ports people need better coordination with src people on ABI breakage. We currently only support i386 and amd64, with future plans for ARM and a MIPS variant. Distfiles are not currently mirrored (since fixed), and while it has seen no progress, it’s still a good idea to build a pkg of the ports tree itself.

pkg 1.3 will include a new solver, which will help 'pkg upgrade' understand that an old packages needs to be replaced with a newer one, with no more need for 'pkg set' and other chicanery. Cross building ports has been added to the ports tree, but is waiting for pkg-1.3. All the dangerous operations in pkg have now been sandboxed as well.

EOL for pkg_tools has been set for September 1st. An errata notice has gone out that adds a default pkg.conf and keys to all supported branches, and nagging delays have been added to ports.

Quarterly branches based on 3 month support cycle has been started on an evaluation basis. We’re still unsure about the manpower needed to maintain those. Every quarter a snapshot of the tree is created and only security fixed, build and runtime fixed, and upgrades to pkg are allowed to be committed to it. Using the MFH tag in a commit message will automatically send an approval request to portmgr and an mfh script on Tools/ makes it easy to do the merge.

Experience so far has been good, some minor issues to the insufficient testing. MFHs should only contain the above mentioned fixes; cleanups and other improvements should be done in separate commits only to HEAD. A policy needs to be written and announced about this. Do we want to automatically merge VuXML commits, or just remove VuXML from the branch and only use the one in HEAD?

A large number of new infrastructure changes have been introduces over the past few months, some of which require a huge migration of all ports. To speed these changes up, a new policy was set to allow some specific fixes to be committed without maintainer approval. Experience so far has been good, things actually are being fixed faster than before and not many maintainers have complained. There was agreement that the list of fixes allowed to be committed without explicit approval should be a specific whitelist published by portmgr, and not made too broad in scope.

Erwin Lansing quickly measured the temperature of the room on changing the default protocol for fetching distils from MASTER_SITE_BACKUP from ftp to http. Agreement all around and erwin committed the change.

Ben Kaduk gave an introduction and update on MIT’s Athena Environment with some food for thought. While currently not FreeBSD based, he would like to see it become so. Based on debian/ubuntu and rolled out on hundreds of machines, it now has it’s software split into about 150 different packages and metapackages.

Dag-Erling Smørgrav discussed changes to how dependencies are handled, especially splitting dependencies that are needed at install time (or staging time) and those needed at run time. This may break several things, but pkg-1.3 will come with better dependency tracking solving part of the problem.

Ed Maste presented the idea of “package transparency”, loosely based on Google’s Certificate Transparency. By logging certificate issuance to a log server, which can be publicly checked, domain owners can search for certificates issued for their domains, and notice when a certificate is issued without their authority. Can this model be extended to packages? Mostly useful for individually signed packages, while we currently only sign the catalogue. Can we do this with the current infrastructure?

Stacy Son gave an update on Qemu user mode, which is now working with Qemu 2.0.0. Both static and dynamic binaries are supported, though only a handful of system call are supported.

Baptiste introduced the idea of having pre-/post-install scripts be a library of services, like Casper, for common actions. This reduces the ability of maintainers to perform arbitrary actions and can be sandboxed easily. This would be a huge security improvement and could also enhance performance.

Cross building is coming along quite well and most of the tree should be able to be build by a simple 'make package'. Major blockers include perl and python.

Bryan Drewery talked about a design for a PortsCI system. The idea is that committer easily can schedule a build, be it an exp-run, reference, QAT, or other, either via a web interface or something similar to a pull request, which can fire off a build.

Steve Wills talked about using Jenkins for ports. The current system polls SVN for commits and batches several changes together for a build. It uses 8 bhyve VMs instances, but is slow. Sean Bruno commented that there are several package building clusters right now, can they be unified? Also how much hardware would be needed to speed up Jenkins? We could duse Jenkins as a fronted for the system Bryan just talked about. Also, it should be able to integrate with phabricator.

Erwin opened up the floor to talk about freebsd-version(1) once more. It was introduced as a mechanism to find out the version of user land currently running as uname -r only represents the kernel version, and would thus miss updates of the base system that do no touch the kernel. Unfortunately, freebsd-version(1) cannot really be used like this in all cases, it may work for freebsd-update, but not in general. No real solution was found this time either.

The session ended with a discussion about packaging the base system. It’s a target for FreeBSD 11, but lots of questions are still to be answered. What granularity to use? What should be packages into how many packages? How to handle options? Where do we put the metadata for this? How do upgrades work? How to replace shared libraries in multiuser mode? This part also included the quote of the day: “Our buildsystem is not a paragon of configurability, but a bunch of hacks that annoyed people the most.”

Thanks to all who participated in the working group, and thanks again to DK Hostmaster for sponsoring my trip to BSDCan this year, and see you at the Ports and Packages WG meet up at EuroBSDCon in Sofia in September.

FreeBSD Developer Summit – BSDCan 2014 – Ports and Packages WG

Baptiste Daroussin started the session with a status update on package building. All packages are now built with poudriere. The FreeBSD Foundation sponsored some large machines on which it takes around 16 hours to build a full tree. Each Wednesday at 01:00UTC the tree is snapshot and an incremental build is started for all supported released, the 2 stable branches (9 and 10) and quarterly branches for 9.x-RELEASE and 10.x-RELEASE. The catalogue is signed on a dedicated signing machine before upload. Packages can be downloaded from 4 mirrors (us-west, us-east, UK, and Russia) and feedback so far has been very positive.

He went on to note that ports people need better coordination with src people on ABI breakage. We currently only support i386 and amd64, with future plans for ARM and a MIPS variant. Distfiles are not currently mirrored (since fixed), and while it has seen no progress, it’s still a good idea to build a pkg of the ports tree itself.

pkg 1.3 will include a new solver, which will help 'pkg upgrade' understand that an old packages needs to be replaced with a newer one, with no more need for 'pkg set' and other chicanery. Cross building ports has been added to the ports tree, but is waiting for pkg-1.3. All the dangerous operations in pkg have now been sandboxed as well.

EOL for pkg_tools has been set for September 1st. An errata notice has gone out that adds a default pkg.conf and keys to all supported branches, and nagging delays have been added to ports.

Quarterly branches based on 3 month support cycle has been started on an evaluation basis. We’re still unsure about the manpower needed to maintain those. Every quarter a snapshot of the tree is created and only security fixed, build and runtime fixed, and upgrades to pkg are allowed to be committed to it. Using the MFH tag in a commit message will automatically send an approval request to portmgr and an mfh script on Tools/ makes it easy to do the merge.

Experience so far has been good, some minor issues to the insufficient testing. MFHs should only contain the above mentioned fixes; cleanups and other improvements should be done in separate commits only to HEAD. A policy needs to be written and announced about this. Do we want to automatically merge VuXML commits, or just remove VuXML from the branch and only use the one in HEAD?

A large number of new infrastructure changes have been introduces over the past few months, some of which require a huge migration of all ports. To speed these changes up, a new policy was set to allow some specific fixes to be committed without maintainer approval. Experience so far has been good, things actually are being fixed faster than before and not many maintainers have complained. There was agreement that the list of fixes allowed to be committed without explicit approval should be a specific whitelist published by portmgr, and not made too broad in scope.

Erwin Lansing quickly measured the temperature of the room on changing the default protocol for fetching distils from MASTER_SITE_BACKUP from ftp to http. Agreement all around and erwin committed the change.

Ben Kaduk gave an introduction and update on MIT’s Athena Environment with some food for thought. While currently not FreeBSD based, he would like to see it become so. Based on debian/ubuntu and rolled out on hundreds of machines, it now has it’s software split into about 150 different packages and metapackages.

Dag-Erling Smørgrav discussed changes to how dependencies are handled, especially splitting dependencies that are needed at install time (or staging time) and those needed at run time. This may break several things, but pkg-1.3 will come with better dependency tracking solving part of the problem.

Ed Maste presented the idea of “package transparency”, loosely based on Google’s Certificate Transparency. By logging certificate issuance to a log server, which can be publicly checked, domain owners can search for certificates issued for their domains, and notice when a certificate is issued without their authority. Can this model be extended to packages? Mostly useful for individually signed packages, while we currently only sign the catalogue. Can we do this with the current infrastructure?

Stacy Son gave an update on Qemu user mode, which is now working with Qemu 2.0.0. Both static and dynamic binaries are supported, though only a handful of system call are supported.

Baptiste introduced the idea of having pre-/post-install scripts be a library of services, like Casper, for common actions. This reduces the ability of maintainers to perform arbitrary actions and can be sandboxed easily. This would be a huge security improvement and could also enhance performance.

Cross building is coming along quite well and most of the tree should be able to be build by a simple 'make package'. Major blockers include perl and python.

Bryan Drewery talked about a design for a PortsCI system. The idea is that committer easily can schedule a build, be it an exp-run, reference, QAT, or other, either via a web interface or something similar to a pull request, which can fire off a build.

Steve Wills talked about using Jenkins for ports. The current system polls SVN for commits and batches several changes together for a build. It uses 8 bhyve VMs instances, but is slow. Sean Bruno commented that there are several package building clusters right now, can they be unified? Also how much hardware would be needed to speed up Jenkins? We could duse Jenkins as a fronted for the system Bryan just talked about. Also, it should be able to integrate with phabricator.

Erwin opened up the floor to talk about freebsd-version(1) once more. It was introduced as a mechanism to find out the version of user land currently running as uname -r only represents the kernel version, and would thus miss updates of the base system that do no touch the kernel. Unfortunately, freebsd-version(1) cannot really be used like this in all cases, it may work for freebsd-update, but not in general. No real solution was found this time either.

The session ended with a discussion about packaging the base system. It’s a target for FreeBSD 11, but lots of questions are still to be answered. What granularity to use? What should be packages into how many packages? How to handle options? Where do we put the metadata for this? How do upgrades work? How to replace shared libraries in multiuser mode? This part also included the quote of the day: “Our buildsystem is not a paragon of configurability, but a bunch of hacks that annoyed people the most.”

Thanks to all who participated in the working group, and thanks again to DK Hostmaster for sponsoring my trip to BSDCan this year, and see you at the Ports and Packages WG meet up at EuroBSDCon in Sofia in September.

FreeBSD Developer Summit – BSDCan 2014 – DNS WG

The DNS Working Group at the FreeBSD Developer Summit at BSDCan this year was off to a good start by noticing that DNSSEC validation could not work on the University of Ottawa’s wireless network. The university’s resolvers added additional records to the root zone, thus failing validation at the root. This led to some discussion on how to provide a user-friendly way to explain this in an understandable way to the user and giver the user a choice of turning off validation or find another network. This certainly is going to be a major problem when turning on validation by default as broken resolvers are very common at hotels, coffee shops, etc. etc.

On a more positive note, all the FreeBSD projects zones are DNSSEC signed and all project-owned servers have SSHFP records in the zone. Dog food was eaten.

Dag-Erling Smørgrav started off by giving an overview of the current state of affairs. ldns and unbound are imported into base in HEAD and 10.x. unbound is meant to act as a local resolver only and as it is not linked to libevent, it will not scale to anything else. For a network-wide resolver or any other configuration, it is recommended to install unbound from ports. DES further went into some of the implementation details on how the base unbound is installed to make sure it does not conflict with an unbound installed from ports.

DES explained some issues he encountered with local and RFC1918 zones which are filtered by default by unbound. Others reported no issues with the right configuration options, so more investigation is needed.

Some people reported having difficulty getting patches accepted upstream by NLNetLabs, which gave some cause for concern as we clearly want a good and active working relationship with our DNS vendor. Others reported no problem working with NLNetLabs, quite the opposite, they are very interested to see the work going on in operation systems, so we’ll just need to build upon that relationship and make sure to invite them to the next WG meeting. Patches that are currently being worked on, DES has some code cleanups, Björn a DNS64 feature, should be submitted through the “normal” submission process and review with NLNetLabs and we’ll see how that goes.

Erwin Lansing started the brainstorm session on future work. Some command line tools would be nice to have; drill does most things one wants, but people are too used to writing dig and dig has many more options; Peter Wemm would like to see contrib scripts line ldns-dane, which are just really easy to use; the control socket should be a unix socket, there’s a patch floating around and should be submitted upstream.

The “Starbucks” problem came up again, with a proposal to turn on val-permissive-mode by default. Another solution may be by looking at how unbound-trigger does its magic.

After a coffee break, Peter Losher, ISC, went over some of the recent changes at ISC. BIND10 development has been handed over to a new project and ISC will concentrate on BIND9 and a stand-alone project for the DHCP component. BIND 9.10 was recently released and plans are in place for 9.11. ISC is open to suggestions and feature requests.

Peter brought up the topic of clientID for which a IETF draft (draft-edns0-client-subnet) is available. This would help client find the nearest CDN node, etc. ISC wants this to be an opt-out in operating systems as it will peel off a layer of anonymisation, and should be controllable by the user.

Next up was Michael Bentkofsky, Verisign, who, while not involved in the project himself, gave an introduction into the getDNS API, which is a replacement for getaddrinfo and allows the stub resolver to get validation information down at the client level. It’s available in ports. The discussion went into more of a brainstorm on how applications should get DNS and DNSSEC information and who gets to make decisions about its security. There should be a clear separation between policy and mechanism, where application programmers should not have to worry about this; it should be a system policy. There should be a higher level API where an application basically can ask the operating system for a “connection” and the operation system takes care of everything behind the scenes, DNS, DNSSEC, SSL, DANE, etc. and just return a socket, with some information on how the connection was established and which security mechanisms were used. In FreeBSD, it would make sense to let the Casper daemon hand out the different sub-tasks to ensure all lookups, cryptography, etc. are properly compartmentalised. One potential problem with passing on additional information is that all DNS lookups currently go through nsswitch, which would need to grow knowledge about that data as well. Are people still using other mechanisms for hostname lookups besides the hosts file and DNS? We can probably just remove nsswitch for the hostname lookups.

The session ended with some aims for the 11.0 release. We’ll need to have a wider discussion about the aforementioned removal of nsswitch out of the hostname lookups. We’ll also need a better understanding of what API capabilities applications may need. Can Casper provide all these? Can it run unbound behind the scenes to do all the DNS “stuff” for it? Can we capsicumize unbound and will that be accepted upstream? Enough food for thought and even more for writing code.

Thanks again to DK Hostmaster for sponsoring my trip to BSDCan this year, and see you at the DNS WG meet up at EuroBSDCon in Sofia in September.

2014Q3 Branched

The 2014Q3 branch has just been branched and the package builder has been
updated to use that branch. This means that the next update on the quarterly
packages will be on the 2014Q3 branch.

What happened during the last 3 months:
- 177 different committers have participated
- 9918 commits happened
- diffstat says: 23646 files changed, 554070 insertions(+), 577210 deletions(-)

What does that means for users:
- default Java is now 1.7
- massive conversion to stagedir (93% of the ports are now properly staged)
- massive improvement of the usage of libtool (which reduces a lot overlinking)
- new USES: mono, objc, drupal, gecko, cpe, gssapi, makeinfo
- new Keywords for plist: @sample, @shell
- LibreOffice has been updated to 4.2.5
- Firefox has been updated to 30.0
- Firefox-esr has been updated to 24.6
- Default postgresql has moved from 9.0 to 9.2
- nginx has been updated to 1.6.0
- Default lua is 5.2
- subversion has been split into multiple ports for each features
- On FreeBSD 9-STABLE and 10-STABLE the default xorg 1.12.4 (for default binary
packages it is still 1.7.7)
- Improved QA checking in the infrastructure
- Info files are handle correctly even if base has been built WITHOUT_INFO
- Ancient emacs version has been cleaned out

fossil in prison

Since my last post about how I do host fossil I have been asked write about the new setup I do have

The jail content

I have created a minimal jail:

$ find /usr/local/jails/fossil -print
/usr/local/jails/fossil/var
/usr/local/jails/fossil/var/tmp
/usr/local/jails/fossil/libexec
/usr/local/jails/fossil/libexec/ld-elf.so.1
/usr/local/jails/fossil/bin
/usr/local/jails/fossil/bin/sh
/usr/local/jails/fossil/bin/fossil
/usr/local/jails/fossil/lib
/usr/local/jails/fossil/lib/libc.so.7
/usr/local/jails/fossil/lib/libssl.so.7
/usr/local/jails/fossil/lib/libreadline.so.8
/usr/local/jails/fossil/lib/libz.so.6
/usr/local/jails/fossil/lib/libcrypto.so.7
/usr/local/jails/fossil/lib/libncurses.so.8
/usr/local/jails/fossil/lib/libedit.so.7
/usr/local/jails/fossil/data
/usr/local/jails/fossil/dev

/bin/sh is necessary to get the exec.start jail argument to work /var/tmp is necessary to get fossil to open his temporary files (I created it with 1777 credential) /data is a empty directory where the fossil files will be stored

Jail configuration

The configuration file is the following:

fossil {
	path = "/usr/local/jails/fossil";
	host.hostname = "fossil.etoilebsd.net";
	mount.devfs;
	ip4.addr="127.0.0.1";
	exec.start = "/bin/fossil server -P 8084 --localhost --files *.json,*.html,*.js,*.css,*.txt --notfound /index.html /data &";
	exec.system_jail_user = "true";
	exec.jail_user = "www";
	exec.consolelog = "/var/log/jails/fossil.log" ;
}

More about fossil itself

In /data I created an index.html which is an almost empty html with a bit of Javascript.

When loading the javascript will request a list.txt file.

This file contain the list of repositories I want to show publically (one per line).

For each of them the javascript will use the json interface of fossil (meaning your fossil has to be built with json) and gather the name and the description of the repo to print them on the index.

Starting/Stopping the service

2 simple command are necessary to manage the service:

Starting up:

# jail -c fossil

Stopping:

# jail -r fossil

The service is only listening on the localhost, it is up to you to create your reverse proxy, in my case I do use nginx with the following config:

server {
	server_name fossil.etoilebsd.net;
	listen       [::]:443 ssl;
	listen       443 ssl;
	ssl_certificate     ssl/fossil.crt;
	ssl_certificate_key ssl/fossil.key;

	location / {
		client_max_body_size 10M;
		proxy_buffering off;
		proxy_pass http://127.0.0.1:8084/;
		proxy_set_header HTTPS on;
		proxy_set_header   Host             $host;
		proxy_set_header   X-Real-IP        $remote_addr;
		proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
	}
}

Hacking on Receive Side Scaling (RSS) on FreeBSD

RSS is a Microsoft invention that tries to keep a given TCP or UDP flow (and I think IP, but I haven't yet tried that) on a given CPU core. The idea is to try and keep both flow-local data and flow-local locking on a single CPU core, increasing the chances that data is hot in the CPU core cache and reducing the chance of lock overhead.

You can find the RSS overview and programming details here:

http://msdn.microsoft.com/en-us/library/windows/hardware/ff567236(v=vs.85).aspx

RSS and supporting technology has been making its way into FreeBSD for quite some time but it's not in any real shape that application developers can take advantage of.

Firstly, there's "PCBGROUPS", which looks to group PCB (protocol control block) data for a connection local to a CPU. Instead of there being one global PCB table for the system (well, VIMAGE for FreeBSD - each virtual image instance has its own PCB table) with one lock protecting it, there's now multiple PCB tables, one per "thing". Here, the thing is whatever the kernel developer thinks is worth grouping them by.

http://www.ece.rice.edu/~willmann/pubs/paranet_usenix.pdf

Now, until the RSS work went in, this code was in FreeBSD but sat unused. A kernel developer could provide the hooks needed to map TCP (and maybe UDP later) flows to a "thing" and have that map to a PCB group table - but it required some glue to stamp incoming connections and outgoing packets with some identifier (which we call a "flowid" in FreeBSD) with something that can map to said "thing". Then whenever a PCB lookup was needed, it would first try the lookup in the table mapped to by the mapping between the "flowid" and "thing" - if it was successful, it wouldn't have to use the global PCB table to do the lookup.

This is only good for established connections - creating and destroying a connection still requires manipulating that global PCB table and the single PCB table lock. I'm going to ignore fixing that for now, as that is a bigger issue.

Then Robert Watson added the RSS work done under contract to Juniper Networks, Inc. RSS provides one kind of mapping between the flowid from the NIC and which CPU to run work on. So that part worked great - but there wasn't any way for the application user to take advantage of it. Additionally, there's no driver awareness of it yet - I'll discuss this shortly.

So I grabbed a bunch of this work whilst at Netflix and tried to make sense of it. It turns out that if you can keep the work local to a CPU, a lot of the lock contention in the networking stack melts away. Here's what's going on:

  • The receive thread(s) in the NIC driver processing packets are typically doing direct dispatch to the network stack - so they're running the receive side of the TCP stack;
  • .. and the receive side of the network stack includes ACKs, which triggers the transmit side of the network stack;
  • There's typically some deferred thread(s) in the NIC driver transmitting packets to each NIC queue;
  • There's also application threads trying to queue data to the TCP socket, which also can dig into the socket and TCP stack state, which involves grabbing locks;
  • And there's also timers firing to update state, and doing this involves grabbing locks.
Without RSS and without lining everything up on CPU cores, all the above can run on different cores. Whenever any of them try running at the same time, lock contention can occur and that particular task can stop. If the lock contention blocks the transmit or receive NIC threads, then not only is that connection affected - the whole NIC processing is affected.

There's still lock contention in the network stack - especially if you're doing a lot of new, short connections. The good folk at Verisign are working on that particular corner of the problem so I'm happy to defer to them.

So, I ended up doing a bunch of little pieces to get this lined up right:
  • The per-CPU timer callwheels can now be optionally pinned to their CPU cores, so timer events running on CPU X actually do run on CPU X (yes, that was amusing to find..);
  • There's support in the TCP stack for per-CPU timers, but it's not enabled by default;
  • ... and it also didn't query RSS, netisr or anything to figure out how to map a flowid to a given CPU to run a timer on;
  • Then to make matters worse, incoming TCP sessions didn't have a flowid assigned to the PCB until after the first data packet was read - which meant that the initial timer work would all assume CPU 0 and any queries on that particular PCB would return flowid=0 - so it would not find it in the right PCBGROUP.
So those are fixed in FreeBSD-HEAD. The per-CPU TCP timer and pinned-CPU timers aren't enabled by default - I'll only flip that on when I'm confident that the RSS stuff is working.

So that lets all the RSS stuff correctly work. But there wasn't a nice way to query the per-connection flowid or RSS information. So I then extended netstat to have 'R' as a flag - it returns the flowid and the flowid type. I'll add RSS information once I have a nice way to extract it out in bulk. It's still a good diagnostic tool to ensure that the IPv4/IPv6 hashing is working correctly.

Then I had to teach a driver about RSS so I could actually test it all out. I have some igb(4) hardware at home, so I did the minimal work required to teach it about the RSS key and assigning things to the correct CPUs. It's still incomplete but it's good enough to get off the ground. I'll go into more details about the driver requirements in a follow-up blogpost.

Finally, how are application developers supposed to use it? I'll cover that particular bit in another follow-up blog post as there's quite a lot to cover there.

Playing nice with others. git(1) and patches on #FreeBSD

I’ve been spending a lot of time massaging a branch of patches and other assorted bits and pieces for QEMU user mode on github

This led me down the path of being a good git user and contributor, so I’ll leave these notes for myself and others in the event you come into a situation where you need FreeBSD to play nice with people who are very git(1) centric.

After an update by [email protected] to the devel/git port, you can now install git(1) and have it work out of the box.  The most frustrating thing, after using git for like 5 minutes, is to figure out how to extract a patch out of it and send it all pretty-like to the mailing list(s) that would be consuming the patch.

In its simplest incarnation, you can simply reference a commit hash and us it to generate a patch via git format-patch, but this will give you the entire commit diff between the referenced version and HEAD.  This, in my case generated approximately 3000 patch files.

e.g. git format-patch –output-directory ~/patches –to=”[email protected]” c60a1f1b2823a4937535ecb97ddf21d06cfd3d3b

What I want, is a diff of one revision, which requires a start and ending hash:

format-patch –output-directory ~/patches –to=”[email protected]” c60a1f1b2823a4937535ecb97ddf21d06cfd3d3b…c6ad44bb288c1fe85d4695b6a48d89823823552b

Now I send this to the mailing lists via my client.  Here is where I kind of head-desked a bit.  If you are like me and run a mail server yourself and you use SSL with self-signed certs, then this little bit if for you.  I lost about an hour trying to figure this little bit out.

The way to dump patches from your patch director (~/patches) is to use:

git send-email patches/*

This will use the following variables in your git environment:

sendemail.smtpserver=mail.ignoranthack.me
sendemail.smtpencryption=ssl
sendemail.smtpuser=[email protected]
sendemail.smtpserverport=465
sendemail.smtpsslcertpath=
sendemail.annotate=yes

Notice the empty “sendemail.smtpcertpath” variable.  Without that set to EMPTY, git would repeatedly fail on the self-signed cert that I use.  So, I’m pretty sure something still isn’t setup correctly.  However, it must be set to EMPTY and not undefined.  Else, you will repeatedly fail with certificate validation errors.

Getting to know your portmgr@ – Steve Wills

It is my pleasure to introduce Steve Wills, the newest member of the portmgr team. Steve has done a tremendous work on the ports tree, especially in the field of testing and quality. Here is a short interview to get to know him better.

Name
Steve Wills

Committer name
swills

Inspiration for your IRC nick
Boring, it’s my userid.

TLD of origin
.us

Current TLD (if different from above)
Same.

Occupation
Sysadmin.

When did you join portmgr@
2014

Blog
Used to have one, use twitter more now (@swills)

Inspiration for using FreeBSD
Simplicty and learning.

Who was your first contact in FreeBSD
Can’t recall, it was ages ago.

Who was your mentor(s)
pgollucci

What was your most embarrassing moment in FreeBSD
Trying to migrate Ruby default version from 1.8 to 1.9 and having to roll back.

Boxers / Briefs / other
Heh, question assume survey taker is male, which I am, but I think we need to
work on diversity (but not in that “hey, let’s work on diversity and get some
women” way, but more in that we make something everyone wants to use)

What is your role in your circle of friends
The FreeBSD user. ;)

vi(m) / emacs / other
vi(m)

What keeps you motivated in FreeBSD
New users, new committers.

Favourite musician/band
I listen to a decent variety of stuff, but I suppose the thing I come back to
most is NIN.

What book do you have on your bedside table
I have an iPad by my bed, which I bought to read, but mostly I browse news on
it.

coffee / tea / other
Don’t drink caffeine, so don’t drink coffee much. I do drink good beer tho.

Do you have a guilty pleasure
Good dark chocolate. :)

How would you describe yourself
Mostly standard in many ways, husband, father, FreeBSD hacker, sysadmin, in
that order.

sendmail / postfix / other
Sendmail, tho dma is nice too.

Do you have a hobby outside of FreeBSD
Used to play guitar, still have one, don’t find time to pick it up much any
more.

What is your favourite TV show
Futurama

Claim to Fame
Ported Acidwarp from DOS to svgalib.

What did you have for breakfast today
Everything bagel with plain cream chese.

What sports team do you support
The only sport I watch is University of North Carolina Basketball.

What else do you do in the world of FreeBSD
ruby ports, perl ports sometimes

What can you tell us about yourself that most people don’t know
I was an employee at Red Hat way way back

Any parting words you want to share
Not really.

What is your .sig at the moment
Null.

Steve

Frederic Culot takes over as portmgr-secretary@

It is with great pleasure that the FreeBSD Ports Management Team announces that Frederic (culot@) Culot will take over responsibilities of team secretary effective immediately.

Frederic became a ports committer in October 2010, and joined the ranks of portmgr-lurkers@ in March 2014 as the shadow secretary.

Please drop him a note and congratulate him (or offer condolences).

 

Thomas
on behalf of portmgr@

 

Have you never been to BSDCan?

I remember a time when I’d never been to a conference related to my passions. Once I went, things changed. I realized that making strong working relationships with others who share my passion is important. Not only does this solidify the community of which you are a member, it also helps you personally. Every conference […]

The Short List #8: Using #lldb with a core file on #FreeBSD

Debugging qemu this evening and it took me a minute or two to figure out the syntax for debugging a core file with lldb.

lldb mips-bsd-user/qemu-mips -c /mipsbuild/qemu-mips.core

Make sure you have permissions to access both the binary and the core, else you get a super unhelpful error of:

error: Unable to find process plug-in for core file ‘/mipsbuild/qemu-mips.core’

But, after that, you can start poking around:

Core file ‘/mipsbuild/qemu-mips.core’ (x86_64) was loaded.

Process 0 stopped

* thread #1: tid = 0, 0x00000000601816fa qemu-mips`_kill + 10, name = ‘qemu-mips’, stop reason = signal SIGILL

frame #0: 0x00000000601816fa qemu-mips`_kill + 10

qemu-mips`_kill + 10:

-> 0x601816fa: jb 0x60182f5c ; .cerror

0x60181700: ret

0x60181701: nop

0x60181702: nop

(lldb) bt

* thread #1: tid = 0, 0x00000000601816fa qemu-mips`_kill + 10, name = ‘qemu-mips’, stop reason = signal SIGILL

* frame #0: 0x00000000601816fa qemu-mips`_kill + 10

frame #1: 0x000000006003753b qemu-mips`force_sig(target_sig=<unavailable>) + 283 at signal.c:352

frame #2: 0x00000000600376dc qemu-mips`queue_signal(env=<unavailable>, sig=4, info=0x00007ffffffe8878) + 380 at signal.c:395

frame #3: 0x0000000060035566 qemu-mips`cpu_loop [inlined] target_cpu_loop(env=<unavailable>) + 1266 at target_arch_cpu.h:239

frame #4: 0x0000000060035074 qemu-mips`cpu_loop(env=<unavailable>) + 20 at main.c:201

frame #5: 0x00000000600362ae qemu-mips`main(argc=1623883776, argv=0x00007fffffffd898) + 2542 at main.c:588

frame #6: 0x000000006000030f qemu-mips`_start + 367

 

How I killed 13 500 000 pages in the Google search engine

Talk about a loaded title, en par with the quality (or lack there of) of the various click bait titles on the postings I see on Facebook and friends...

I was told by my hosting provider that my index to the FreeBSD mailinglists at http://www.mavetju.org/mail/ was using more bandwidth alone than all of their public websites together. Now this is not much of a record, since they have only low-bandwidth websites, but still...

Looking through the logs, I saw that the Googlebot and the Bingbot and some bot from China were happily fighting over CPU and bandwidth to index all of the files. Going at it on a speed of about 50 requests per seconds for 24 hours per day.

So what could I do? Checking in Google for site:mavetju.org/mail/, I saw that there were about 13 500 000 pages indexed. For what goal? Not much anymore, I have stopped following all except the FreeBSD Announcement mailinglists a couple of years ago. I still use it on my laptops, but that is all.

So... That mailinglist archive has been shut down. You can still find the cached version of it in Google by using the above search terms, but that will disappear too.

And that is the story on how I killed 13 500 000 pages in Google. I wonder how much many computers in their data center that frees up for other things. Probably none...

Sometimes you have to sit down and write #FreeBSD documentation

When working on new projects or hacks, sometimes you just have to stop, think and start writing down your processes and discoveries. While working on bootstrapping the DLink DIR-825C1, I realized that I had accumulated a lot of new (to me) knowledge from the FreeBSD Community (namely, Adrian Chadd and Warner Losh).

There is a less than clear way of constructing images for these embedded devices that has an analogue in the Linux community under the OpenWRT project. Many of the processes are the same, but enough are different that I thought it wise to write down some of the processes into the beginning of a hacker’s guide to doing stuff and/or things in this space.

The first document I came up with was based on the idea that we can netboot these little devices without touching the on-board flash device. This is what you should use to get the machine bootstrapped and figure out where all the calibration data for the wireless adapters exist. This is crucial to not destroying your device. The wireless calibration data (ART) is unique to each device, destroying it will mean you have to RMA this device.

The second document I’ve created is a description of how to construct the flash device hints entries in the kernel hints file for FreeBSD. I found the kernel hints file to be cumbersome in comparison to the linux kernel way of using device specific C files for unique characteristics.

Its interesting stuff if you have the hankering to dig a bit deeper into systems that aren’t PC class machines.

Meraki Sparky boards, and constant resetting

There's a Mesh internet project at Sudo Room and they've been doing some great work getting a platform up and running. However, like a lot of volunteer projects, they're working with whatever time and equipment they've been donated.

A few months ago they were donated a few hundred Meraki Sparky boards. They're an Atheros AR2317 SoC based device with an integrated 2GHz 802.11bg radio, 10/100 ethernet and.. well, a hardware watchdog that resets the board after five minutes.

Now, annoyingly, this reset occurs inside of Redboot too - which precludes them from being (fully) flashed before the unit reboots. Once the unit was flashed with OpenWRT, the unit still reboots every five minutes.

So, I started down the path of trying to debug this.

What did I know?

Firstly, the AR2317 watchdog doesn't have a way of resetting things itself - instead, all it can do is post an interrupt. The AR7161 and later SoCs do indeed have a way to do a full hardware reset if the watchdog is tickled.

Secondly, redboot has a few tricksy ways to manipulate the hardware:

  • 'x' can examine registers. Since we need them in KSEG1 (unmapped, uncached) then the reset registers (0x11000xxx becomes 0xb1000xxx.) Since its hardware access, we should do them as DWORDS and not bytes.
  • 'mfill' can be used to write to registers.
Thirdly, there's an Atheros specific command - bdshow - which is surprisingly informative:

RedBoot> bdshow
name:     Meraki Outdoor 1.0
magic:    35333131
cksum:    2a1b
rev:      10
major:    1
minor:    0
pciid:    0013
wlan0:    yes 00:18:0a:50:7b:ae
wlan1:    no  00:00:00:00:00:00
enet0:    yes 00:18:0a:50:7b:ae
enet1:    no  00:00:00:00:00:00
uart0:    yes
sysled:   no, gpio 0
factory:  no, gpio 0
serclk:   internal
cpufreq:  calculated 184000000 Hz
sysfreq:  calculated 92000000 Hz
memcap:   disabled
watchdg:  disabled (WARNING: for debugging only!)

serialNo: Q2AJYS5XMYZ8
Watchdog Gpio pin: 6
secret number: e2f019a200ee517e30ded15cdbd27ba72f9e30c8


.. hm. Watchdog GPIO pin 6? What's that?

Next, I tried manually manipulating the watchdog registers but nothing actually happened.

Then I wondered - what about manipulating the GPIO registers? Maybe there's a hardware reset circuit hooked up to GPIO 6 that needs to be toggled to keep the board from resetting.

Board: ap61
RAM: 0x80000000-0x82000000, [0x8003ddd0-0x80fe1000] available
FLASH: 0xa8000000 - 0xa87e0000, 128 blocks of 0x00010000 bytes each.
== Executing boot script in 2.000 seconds - enter ^C to abort
^C
RedBoot> # set direction of gpio6 to out
RedBoot> mfill -b 0xb1000098 -l 4 -p 0x00000043
RedBoot> x -b 0xb1000098
B1000098: 00 00 00 43 00 00 00 00  00 00 00 00 00 00 00 03  |...C............|
B10000A8: FF EF F7 B9 7D DF 5F FF  00 00 00 00 00 00 00 00  |....}._.........|

RedBoot> # pat gpio6 - set it high, then low.
RedBoot> mfill -b 0xb1000090 -l 4 -p 0x00000042
RedBoot> mfill -b 0xb1000090 -l 4 -p 0x00000002

.. then I manually did this every minute or so.

RedBoot>
RedBoot> mfill -b 0xb1000090 -l 4 -p 0x00000042
RedBoot> mfill -b 0xb1000090 -l 4 -p 0x00000002
RedBoot> mfill -b 0xb1000090 -l 4 -p 0x00000042
RedBoot> mfill -b 0xb1000090 -l 4 -p 0x00000002

.. so, the solution here seems to be to "set gpio6 to be output", then "pat it every 60 seconds."

I hope this helps people bring OpenWRT up on this board finally. There seems to be a few of them out there!

The Short List #6: Make the CD drive do something useful on #FreeBSD

Noted today that while grip could access the CD drive on my machine, clemetine-player and xfburn could not.

Figure out which device node your CD drive is with camcontrol:

camcontrol devlist | grep cd
at scbus4 target 0 lun 0 (cd0,pass2)

Simply add the following to /etc/devfs.conf and restart devfs to get access to the CD device:

perm /dev/cd0 0666
perm /dev/xpt0 0666
perm /dev/pass2 0666

Now bear in mind, that this means any user of your machine has access to the device now. Hopefully, on a desktop computer, you know all the users of your machine.

Using Jenkins libvirt-slave-plugin with bhyve

I've played with libvirt-slave-plugin today to make it work with libvirt/bhyve and decided to document my steps in case it would be useful for somebody.

libvirt-slave-plugin

Assuming that you already have Jenkins up and running, installation of libvirt-slave-plugin is as follows. As we need a slightly modified version, we need to build it ourselves. I've made a fork which contains a required modification which could be cloned like that:

git clone -b bhyve [email protected]:jenkinsci/libvirt-slave-plugin.git

The only change I made is adding a single line with 'BHYVE' hypervisor type, you could find the pull request here. When that would be merged, this step will be not required.

So, getting back to the build. You'll need maven that could be installed from ports:

cd /usr/ports/devel/maven2 && make install clean

When it's installed, go back to the plugin we cloned and do:

mvn package -DskipTests=true

When done, login to the Jenkins web interface, go to Manage Jenkins -> Manage Plugins -> Advanced -> Upload Pluging. It'll ask to provide a path to the plugin. It would be target/libvirt-slave.hpi in our plugin directory.

After plugin is installed, please follow to Manage Jenkins -> Configure System -> Add new cloud. Then you'll need to specify hypervisor type BHYVE and configure credentials so Jenkins could reach your libvirtd using SSH. There's a handy 'Test Connection' you could use your configuration.

Once done with that, we can go to Manage Jenkins -> Manage Nodes -> New Node and choose 'libvirt' node type. Then you'll need to choose a libvirt domain to use for the node. From now on, node configuration is pretty straightforward, expect, probably an IP address of the slave. To find out an IP address, you'd need to find out its MAC address (just run virsh dumpxml and you'll find it there) and then find the corresponding file in dnsmasq/default.leases file.

Guest Preparation

The only thing guest OS needs is to have jdk installed. I preferred to download a package with java/openjdk7, but I had to configure network first. My VMs use bridged networking on virbr0, so NAT config looks like that in /etc/pf.conf:


ext_if="re0"
int_if="virbr0"

virt_net="192.168.122.0/24"

scrub all

nat on $ext_if from $virt_net to any -> ($ext_if)

Now openjdk could be installed from the guest using:

pkg install java/openjdk7

Finally, find the nodes in node management menu and press 'Launch slave agent' button. It should be ready for the builds now.

PS It might be useful to sync clock on both guest and host systems using ntpdate.

PPS libvirt version should be at least 1.2.2.


Getting to know your portmgr-lurker — Frederic Culot

Name

Frederic Culot

Committer name

culot

Inspiration for your IRC nick

lack of inspiration actually…

TLD of origin

.fr

Current TLD (if different from above)

.lu

Occupation

IT consultant in the banking sector in Luxembourg, but I don’t always do IT.
I am also interested in business and management and my wife and I are working
on starting our own business.

When did you join portmgr@

Joined FreeBSD as a committer in October 2010 and the portmgr-lurkers program in
March 2014, but never been part of portmgr@.

Blog

http://people.freebsd.org/~culot is the closest thing I have to a blog

Inspiration for using FreeBSD

I was a longtime OpenBSD user until I worked in the same company as clement@
(former portmgr) who successfully managed to convert me to FreeBSD. I did not
feel the need to look into another system since then.

Who was your first contact in FreeBSD

clement@. But when I really started to get involved in FreeBSD it was jadawin@
who first contacted me. He is one of the kindest person I ever worked with and
while we’ve known each others for about 4 years now I’ve never been able to
meet him in person. But that’s the way it is with projects such as FreeBSD:
teams are virtual and gathering together might be difficult unfortunately.

Who was your mentor(s)

My mentors were sahil@ and wen@. Thanks to them I believe my mentorship at
FreeBSD was the best induction program I ever experienced. I was also amused to
realize that whereas companies spend huge amounts to design reward systems, it
is sometimes when nothing is to be expected in return that people are the most
caring and helpful.

What was your most embarrassing moment in FreeBSD

My first pointy hat: a bit after my first 700 commits when I started to feel
confident I finally managed to break INDEX :’(

Boxers / Briefs / other

Any 15-year old single material does it.

What is your role in your circle of friends

uncork the bottles usually…

vi(m) / emacs / other

vim

What keeps you motivated in FreeBSD

The people behind it. There are lots of great guys behind this project, and a
day when I could not meet with other developers on irc is a sad day for me :’(

But FreeBSD is also one of my sources of inspiration when it comes to how
organizations behave and innovate (which is a topic of interest I got into
during my MBA studies) and I find it very interesting to compare FreeBSD with
the for-profit companies I work for. I even wrote an article for BSDmag in case
some would also be interested in those aspects:

http://people.freebsd.org/~culot/BSDmag.pdf

> Favourite musician/band

I don’t listen to much music. The cause might be that I work in a very noisy
environment (large open-space), so I more and more enjoy silence and calm when
I’m back home. But recently when I listen to music I enjoy Moby’s “wait for me”
album (ambient edition), Erik Mongrain, or a bit of merengue to remind me of my
holidays.

What book do you have on your bedside table

Nietzsche’s Thus spoke Zarathoustra.

I even extracted my favorite quotes and created the
french/fortune-mod-zarathoustra port.

coffee / tea / other

Both, depends on the time of day

Do you have a guilty pleasure

To enjoy a 7-course meal with my wife at a 3-star michelin restaurant and
finish relaxing in a club chair in front of the fireplace with a 40 years old
armagnac.

How would you describe yourself

Sober, clever, and motivated in the morning.
Drunk, stupid, and depressed in the evening.
Or is it the opposite?

sendmail / postfix / other

sendmail as it’s in base, but not for long apparently so I could have to make a
more reasoned choice soon

Do you have a hobby outside of FreeBSD

Sports (I go to the gym almost everyday day, did quite some scuba diving and
snowboarding when I was younger), but I enjoy good food and wine when I’m done
training. I also enjoy traveling. My last trips were to India, Dominican
Republic, and Lapland: so many nice places to visit!

What is your favourite TV show

My favorites to day are twin peaks, the prisoner, and battlestar galactica

Claim to Fame

I spent one night at the pub with bapt@, and survived.

What did you have for breakfast today

oat flakes with water

What sports team do you support

If you want to torture me, just fasten me in front of the TV with a soccer game
on. I could even confess I enjoy tabthorpe’s jokes just to shorten the ordeal.

<Editors note: I have know idea what he means by this :>

What else do you do in the world of FreeBSD

Apart from my work on ports I also did some French translations (translated
the contributing-ports, linux-users, and building-products articles). I also
try to participate in IT exhibits and promote FreeBSD by managing booths such
as at Solutions Linux Paris for which I designed a poster to attract visitors:

http://people.freebsd.org/~culot/SolutionsLinux.png

But most importantly I offer beers and whisky to other FreeBSD developers when
I meet them :)

What can you tell us about yourself that most people don’t know

I actually enjoy tabthorpe’s jokes. Sometimes.

<Editor’s note: again, no clue what he is talking about :>

Any parting words you want to share

I repeat it but my main motivation to work on this project is to get in touch
with other FreeBSD enthusiasts, so do not hesitate to ping me on irc if you
feel like sharing some of your thoughts with me. I would be most pleased.

What is your .sig at the moment

Regards,
Frederic Culot

Adding chipset powersave support to FreeBSD’s Atheros driver

I've started adding some basic powersave support to the FreeBSD Atheros ath(4) driver. The NICs support putting parts of the device to sleep to conserve power but.. well, it's tricky.

In order to make things consistent, I either need to not do things when the NIC is asleep (for example, doing calibration when the NIC isn't running), but I also need to ensure that I force the NIC awake when the NIC may be asleep. During normal running, the NIC may have put itself into temporary sleep whilst waiting for some packets from the AP to signal that it needs to wake up. So I will also need to force the NIC awake before programming it.

So, before I start down the path of handling the whole dynamic power management stuff, I figured I'd tackle the initial bits - handling powering on the NIC at startup and powering it off when it's not in use. This includes powering it down during device detach and suspend, as well as when all of the VAPs are down.

This is turning out to be slightly more complicated than I'd like it to be.

The first really stupid thing I found was that during the interface down process, the VAP state change from RUN -> INIT would reset the BSS, which included re-programming the slot time. So, I have to wake up the hardware when programming that. It can then go back to sleep when I'm done with it.

Now there's some issues in the suspend path with the NIC being marked as asleep when it is being reset, which is confusing - the NIC should be woken up when ath_reset() is called. So, I'll have to debug these.

The really annoying bit is that if I read a register whilst the silicon is asleep, the reads return 0xDEADBEEF. So if I am storing the register contents anywhere, I'll end up storing and programming a potentially totally invalid value.

There's also some real problems with race conditions. I can put the power state changes behind a lock, but imagine something like this:

* ATH_LOCK; force awake; do something; ATH_UNLOCK .. ATH LOCK; do some more; put back to sleep; ATH_UNLOCK

Now, if a second thread puts the NIC back to sleep in between those two lock sections, the second "do some more" work may occur once the NIC was put to sleep by said second thread. So I have to correctly track if the NIC is being forced awake by refcounting how many times its being forced awake, then when the refcount hits zero and we can put it to sleep, put it back to sleep.

Once this is all done, I can start down the path of supporting proper network sleep - where the NIC stays asleep and wakes up to listen for beacons and received frames from the AP. I then choose to force the NIC awake and do more work. I have to make absolute sure that I don't queue things like transmitted frames or add more frames to the receive queue if it may fall asleep. There's also some mechanisms to have a transmit frame put the NIC to sleep - there's a bit that says "when this frame is transmitted, transition the NIC back to sleep." I have to go and figure out how that works and implement that.

But for now, let's keep it simple and debug just putting the NIC to sleep when it's not in use.