Huge Announcement for PC-BSD

Hey PC-BSDers!

Some of you may have heard the speculation surrounding the last couple weeks of development here at the PC-BSD project, and I’d like to go ahead and address those rumors in today’s post.  We know there has been a large amount of misinformation circulating so we recommend that all of our users please read the following explanation so the information is clear.

For the last couple of months we have been beating our heads against the wall trying to find a way to put us on par with the best Linux distros out there such as Ubuntu.  During a phone meeting on Monday between Kris, Dru, Ken and myself (Josh), we began tossing around ideas on how we can make that happen.  One of the ideas that was presented was making Lumina DE the only supported desktop environment in PC-BSD and by doing so focus all of our development time exclusively on it.  Ken Moore argued that if we could create a desktop environment that everyone loved, and “Unify” the user experience, that no one would have any reason ever to use anything else because it was the best.  Although we did toss the idea around for a while we thought that might be a little bit of an issue with a few of our users that like to use other crappy desktop environments.

After an hour of spirited back and forth debate, many good ideas were presented.  As we all took a moment during our video call to think about the ideas that were just discussed, I interrupted the awkward silence and told the rest of the group about an idea I had been thinking about for a few days.  “What if we could do something even better than Unify the user experience…What if we could make the system boot extremely fast”. I explained that if we could hard code a “boot shim” so to speak into the kernel, that we could be the fastest booting unix-like distribution out there.  We all looked at each other and started to realize this was the “holy grail” so to speak that we had been looking for.  After perusing the internet for about 3 and a half minutes we found out there was already a Linux solution available.

We hammered out the details and now want to present you with our new “Everything Manager” the new SystemBS-D.  Most of you are familiar with how we can port different types of Linux software and run them through our emulation layer.  Using the same emulation magic we have taken what one kid that lives on my street called “the best piece of software ever. The end.”, and ported it over to PC-BSD.  SystemBS-D not only makes your system boot faster, but it can basically manage everything on your system as well.  Sure it still crashes a lot and it has trouble displaying log files, but we’re working on that.  In a discussion with Kris Moore he stated “SystemBS-D may be unstable, but it DID make the system boot super fast.  I feel like that’s a pretty good trade.”

Taking cues from other popular software companies we have also thought about integrating a really cool storefront into PC-BSD and making people look at it before they can go to their desktop.  We could also consider locking down the user’s system with “grub-lock” so no other operating systems could be installed on it… oops I meant “secure” the user’s system.

We look forward to hearing your feedback on these new developments.

P.S. if you thought any of this was real look at the date on your computer.  Happy 1st!

Best Regards,

Josh

 

Lumina Desktop 0.8.3 Released!

The next version of the Lumina Desktop Environment has just been released!

This is mainly a bugfix release to correct an urgent issue with the system tray on FreeBSD 11, but there are a number of other slight improvements/updates included as well. The full list of changes is included at the bottom of this announcement, but the notable changes are as follows:

  • New Panel Plugin: “Application Launcher“
    • This allows the user to pin the shortcut for an application directly to a panel.
  • New Utility: “lumina-xconfig“
    • This utility allows the user to easily enable/disable additional monitors/screens within the desktop session.
  • Fix the issue with transparent system tray icons on FreeBSD 11
  • Add support for the XDG autostart specifications.

 

The FreeBSD port has already been updated and this version will be included in the next set of PC-BSD package updates (“Edge” packages being created now) as well as included in the next PC-BSD 11 image for April (coming soon).

 

Reminder: The Lumina desktop environment is still considered to be “beta-quality”, so if you find things that either don’t work or don’t work well, please report them on the PC-BSD bug tracker so that they can get fixed as soon as possible. Feel free to also post tickets for any feature requests or improvements that you think might be useful!

 

Lumina-0.8.3

 

Changes Since 0.8.2

FreeDesktop Standards Compliance:

  •  A number of bugs related to detecting/using XDG mimetypes were fixed.
  • Support for the XDG autostart specifications was added (usage only  — more work is still necessary to convert the current Lumina autostart spec over)
  •  Add some additional fallback routines to account for possible errors in *.desktop files.

 

New Utility: “lumina-xconfig”

  •  This utility is a graphical front-end to xrandr, and allows the user to easily add/remove screens from the current X session.
  • Shortcuts to this utility are available in the user button plugin, and the settings menu plugin

 

Insight File Manager:

  •  Add support for creating new (empty) files.
  •  Add an option for enabling/disabling the use of image thumbnails (useful if you have massive directories of pictures — just be sure you disable thumbnails *before* loading the directory).
  •  Add initial drag-and-drop support for moving files/dirs within a directory.
  •  Load the specific icon for any application shortcuts.
  •  Add the ability to view checksums of files.
  •  Add some additional checks/excludes for copy/move operations in the background to prevent the user from performing illegal operations (such as moving a directory into itself).
  •  Add support for listing statistics about the current directory on the window (number of files, total size of files, percent of the filesystem which is used).
  •  Streamline the frequency of the background directory checker — now it runs much less often.

 

Desktop Changes:

  •  Disable the shutdown/restart options on PC-BSD if the system is in the middle of performing updates. The system may still be shutdown/restart from within PCDM — this just adds an extra layer of safety for users.
  •  Have the shutdown/restart options use the “-o” option on FreeBSD/PC-BSD so that the system performs the action much faster.
  •  Add support for thumbnails, increasing/decreasing icon sizes, removing files, and  cut/copy files to the “desktopview” desktop plugin (the plugin which provides traditional desktop icons).
  •  Add support for increasing/decreasing the icon size for the application launcher desktop plugin.
  •  Update the icon used for the “favorites” system in the user button and the file manager.
  •  Add the ability to display alternate timezones in the system clock. This does *not* change the system time at the moment, it is just a setting for the visual clocks/plugins.
  •  Add a new panel plugin for pinning application shortcuts directly to the panel. (just like the “applauncher” desktop plugin, but on the panel).
  •  Perform the initial search for applications on the system within the session initialization. This ensure that buttons/plugins are responsive as soon as the desktop becomes visible.
  •  Fix an issue with transparent system tray icons on FreeBSD 11, and convert the system tray embed/unembed routines to use the XCB library instead of XLib.

EuroBSDCon 2015

EuroBSDCon 2015 (http://2015.eurobsdcon.org/), Stockholm University, Stockholm, Sweden 1 - 4 October, 2015. EuroBSDcon is the premier European conference on the open source BSD operating systems attracting about 250 highly skilled engineering professionals, software developers, computer science students and professors, and users from all over Europe and other parts of the world. The goal of EuroBSDcon is to exchange knowledge about the BSD operating systems, facilitate coordination and cooperation among users and developers. The dates for EuroBSDCon 2015 in Stockholm have been set to October 1-2nd for tutorials and October 3-4th for the main conference.

Using the arswitch ethernet switch on FreeBSD

I sat down a few weeks ago to make the AR8327 ethernet switch work and in doing so I wanted to add per-port and 802.1q VLAN support. It turned out that I .. didn't know as much I thought I did about the etherswitch support. So, after a whole bunch of trial-and-error, I wrapped my head around things. This post is mostly a braindump so if I do forget I have something written down about it - at least until I turn it into a FreeBSD manpage.

There's three modes:
  • default - all ports are in the same VLAN;
  • per-port - each port can be in a VLAN 'group';
  • dot1q - each port can be in multiple VLAN groups, with 802.1q tagging going on.
The per-port VLAN group is for switches that don't have an arbitrary VLAN table - you just assign each port an ID from some low set of values (say, 16), and then the VLAN tag can either be added or not added. I think the RTL8366 switch is like this, but I'd have to check.

The dot1q VLAN is for switches that support multiple VLANs, each can have an arbitrary VLAN ID (0..4095) with optional other VLAN options (like tag-in-tag support.)

The etherswitch configuration side has a few options and they're supported by different hardware:
  • Each port has a port VLAN ID - this is the "native port" for dot1q support. I don't think it has any particular meaning in the per-port VLAN code in arswitch but I could be terribly wrong. I thought it did when I initially did the port, but the documentation is .. lacking.
  • Then there's a set of per-port flags - eg q-in-q, 802.1q tagging, etc.
  • Then there's the vlangroup - each vlangroup has a vlan ID, and then a set of port members. Each port member can be tagged or untagged.
This is where things get odd.

Firstly - the AR934x SoC switch support doesn't include VLANs. I need to add that. I'm not sure which side of the wall this falls.

The switches previous to the AR8327 support per-port and VLAN configuration, but they don't support per-port-per-VLAN tagging. Ie, you can configure 802.1q VLANs, and you can enable tagging on the port - but it tags all packets that aren't the port 'VLAN ID'.

The per-port VLAN ID seems ignored by the arswitch code - it's only used by the dot1q support.

So I think (and it hasn't yet been tested) that on the earlier switches, I can use per-port VLANs with tagging by:
  • Configuring per port vlans - "etherswitch config vlan_mode port"
  • Adding vlangroups as appropriate with membership - tag/untag doesn't matter
  • Set the CPU port up to have tagging - "etherswitch port0 addtag"
When configuring dot1q VLANs, the mode is "config vlan_mode dot1q" and the 802.1q VLAN IDs are used, but the above still holds - the port is tagged or untagged.

But on the AR8327, the VLAN map hardware actually supports enabling/disabling tagging on a per-port-per-VLAN basis. Ie, when the VLAN table is programmed with the port membership, it takes a list of both the ports and whether the ports are tagged/untagged/open/filtered. So, I don't think per-port VLAN tagging works - only dot1q tagging. Maybe I can make it work, but I haven't really sat down for long enough with the documentation to see what combinations are required.
  • Configure the hardware - "etherswitch config vlan_mode dot1q"
  • Add vlangroups as appropriate, set pvid as appropriate
  • For each vlangroup membership, the port can be tagged or untagged - eg to tag the cpu port 0, you'd use '0t' as the port member. That says "port0 is a member, and it's tagged."
I still have a whole lot more to add - the ingress/egress filters aren't configurable, the per-port vlan stuff needs to be made much more sensible and consistent - and the AR934x SoC switch needs to support VLANs. Oh, and much more documentation. But, hey, I can get the thing spitting out VLAN tags, so when it's time to setup my home network with some VLANs, i'll be sure to document what I did and share it with everyone.

bsdtalk252 – devio.us with Brian Callahan

This episode has been brought to you by jot, the utility for printing sequential or random data.  Jot first appeared in 4.2 BSD.

An interview with devio.us admin Brian Callahan.  http://devio.us is a free shell provider that runs on OpenBSD.

File Info: 18Min, 8MB.

Ogg Link: https://archive.org/download/bsdtalk252/bsdtalk252.ogg

Cache Line Aliasing #2, or "What happens when you page align everything"

After a little more digging into the Intel performance side of things, I discovered one of the big reasons for the performance drop on this particular workload: how Intel CPUs do memory reordering.

The TL;DR is this - there's some hardware inside the Intel CPUs that tracks memory ordering and cache contents - but they don't use all the address bits.

The relevant chapter in the intel optimisation guide is 3.6.8 - Capacity Limits and Aliasing in Caches. The specific thing I was hitting was in 3.6.8.2 - Store Forwarding Aliasing.

Assembly/Compiler Coding Rule 56. (H impact, M generality) Avoid having a store followed by a non-dependent load with addresses that differ by a multiple of 4 KBytes. Also, lay out data or order computation to avoid having cache lines that have linear addresses that are a multiple of 64 KBytes apart in the same working set. Avoid having more than 4 cache lines that are some multiple of 2 KBytes apart in the same first-level cache working set, and avoid having more than 8 cache lines that are some multiple of 4 KBytes apart in the same first-level cache working set.

So, given this, what can be done? In this workload, a bunch of large matrices were allocated via jemalloc, which page aligns large allocations. In the default invocation of the benchmark (where the allocation padding size is 0), the memory access patterns showed a very large number of counter events on "LD_BLOCKS_PARTIAL.ADDRESS_ALIAS" - which is the number of 64k address aliases on the Sandy Bridge Xeon processors I've been testing on. (The same occurs on Westmere, Ivy Bridge and Haswell.) As I vary the padding size, the address aliasing value drops, the memory access counters increase, and the general performance increases.

On the test boxes I have (running pmcstat -w 120 -C -p LD_BLOCKS_PARTIAL.ADDRESS_ALIAS ./himenobmtxpa M )

0 217799413 830.995025
64 18138386 1624.296713
96 8876469 1662.486298
128 19281984 1645.370750
192 18247069 1643.119908
256 18511952 1661.426341
320 19636951 1674.154119
352 19716236 1686.694053
384 19684863 1681.110499
448 18189029 1683.163673
512 19380987 1691.937818

So there's still plenty of aliasing going on at different padding offsets, however it's a very marked drop between 0 and, well, anything.

It turns out that someone's gone and done a bunch more digging into the effects of various CPU magic under the hood. The last paper in the list (Analysing Contextual Bias..) looks at Aliasing and Cache Effects and the effect of memory layout. There's some cute (and sobering!) analysis of the performance changes due to something as simple as the length of your login name in the UNIX environment. It's worth reading.

The summary? Maybe page alignment of all of your memory accesses isn't the way to go.

For further reading:

Mesa 10.x and i915 on FreeBSD 9.x

Currently, the i915 driver on FreeBSD 9.3 has no support for hardware context. This means we are stuck with Mesa 9.1 and xserver 1.14.

To solve this, the plan is to provide the i915 kernel driver as a port. xf86-video-intel is modified to automatically load this driver, instead of the i915kms.ko module shipped with FreeBSD 9.3-RELEASE.

A patch is available: we would like users running FreeBSD 9.3 on an i915-based desktop to test it with their daily workload.

When this works, we can get rid of Mesa 9.1 and move forward with Mesa 10.x and xserver 1.17+.

Unifying Mesa ports’ configure

To minimize the time to build each Mesa port and limit the dependencies, we always configured Mesa with different flags, depending on the port being compiled. For instance, graphics/libEGL was built without Gallium, so it didn’t pull LLVM.

We discovered this strategy produces a non-working Mesa:

  • libEGL is built without the “drm” platform because it is enabled only if Gallium drivers are enabled too. This prevented us from activating glamor support in the X.Org server. We thought it was something too Linux-centric and it made the server crash on FreeBSD but we were wrong: once libEGL is built with the same flags as graphics/dri, everything works out-of-the-box.
  • Likewise with Clover, the OpenCL implementation. Without Gallium drivers, pipeloader modules are not compiled. Therefore, libOpenCL.so is installed but it is non-functional.

Therefore, we are unifying all Mesa ports so they all use the same configure flags and dependencies. This means:

  • The GALLIUM option will be removed and the support for Gallium will turned on for everybody. We do this because building half of Mesa with Gallium and not the other half leads to hard to debug situations.
  • All Mesa ports will depend on LLVM, at least on x86. Fortunately graphics/dri already depends on it with default options and Mesa is useless without it.

This greatly simplifies Mesa ports. Furthermore, as said above, this will allow us to enable glamor in the X.Org server. Glamor is the only 2D acceleration API supported by recent Radeon cards where EXA was not implemented.

DRM 3.8 update committed!

The DRM device-independent code was updated to match Linux 3.8 in r280183 in FreeBSD 11-CURRENT! The update will be merged to FreeBSD 10-STABLE after the dust settled.

I want to thank everyone who tested the patch in the past year and reported problems.

What’s new?

Changes were covered by previous articles:

And now?

In the kernel, the next step is to take care of the i915 driver: we badly need Haswell support.

15th Anniversary and Spring Fundraising Kickoff

I'm so excited to announce our spring fundraising campaign. I know it's not officially spring yet, but it sure feels like it here at Foundation headquarters in Boulder, Colorado. We're kicking off our fundraising campaign in conjunction with some other exciting events. There's so much to celebrate. First, we are proud to be a Platinum sponsor of AsiaBSDCon. This is the tenth AsiaBSDCon, with over 140 attendees planned, and 31 talks, providing a venue for all things BSD in Asia. People from around the world attend this conference to learn about the BSD operating systems, share their knowledge and experience, and work together to develop, hack, fix, improve, and document the various BSD operating systems.

The most exciting news we have is that we are celebrating our 15th anniversary supporting the FreeBSD Project and community worldwide! We have grown from our president and founder, Justin Gibbs, creating a non-profit to support FreeBSD, to an eight member board with 7 staff members. In case you missed it earlier, check out Justin's interview about the history of the Foundation on BSDNow

As the first employee, 9 years ago, I've witnessed incredible growth in our ability to support the Project and community. The year we were founded we raised a whopping $7,000. My first year with the Foundation, in 2006, we raised a little over $100,000. And, last year we raised $2,436,194, spending $877,412 on the project.

When we first started out, we focused on funding project development, conference sponsorships, and travel grants. Fifteen years later, we have increased support in those areas and have now grown to providing legal support for the Project; purchasing and helping manage hardware for FreeBSD infrastructure; providing release engineering support for consistent and timely releases; creating marketing literature and presentations that not only inform people of what FreeBSD is, but also provides detailed information on what's in new releases; attending more conferences to promote FreeBSD; and publishing a professional online FreeBSD magazine, The FreeBSD Journal.

To celebrate our anniversary, we are kicking off a fundraising campaign to help broaden the reach of our mission, by adding 500 new community investors in the next four weeks. What's a new community investor? An individual or organization that makes their first 2015 donation during this spring campaign. 

Why donate to the Foundation? Your donations will help us continue and increase our support in the following areas:
  • Funding improvement and development projects, including: Native ISCSI kernel Stack, Updated video console (Newcons), UEFI system boot support, Capsicum component framework, IPv6 support in FreeBSD, Auditdistd improvements for FreeBSD cluster, and adding modern AES modes to OpenCrypto (to support IP/SEC).
  • Helping to provide consistent and on-time releases.
  • Educating the public and promoting FreeBSD with tools like our high-quality FreeBSD 10X Brochure and company visits to help
  • facilitate collaboration efforts with the Project.
  • Sponsoring BSD conferences and summits in Europe, Japan, Canada, and the US.
  • Protecting FreeBSD IP and providing legal support to the Project.
  • Purchasing hardware to build and improve FreeBSD project infrastructure.
For the last 15 years, you as a community have allowed us to make an major impact on the FreeBSD Project and Community. Please help us continue and increase our support by making a donation today.


Deb Goodkin, Executive Director

FreeBSD From the Trenches: Using autofs(5) to Mount Removable Media

This next FreeBSD From the Trenches story come to us from Edward Tomasz Napierała who shares his work on the new FreeBSD automounter.

My big project for 2014 was the new FreeBSD automounter.  Like any proper FreeBSD Foundation sponsored project, it included the usual kind of documentation - man pages and the Handbook chapter.  But there is no document that shows how it works inside, from the advanced system administrator or a power user point of view.

So, here it is.  The article demonstrates how modular the automounter is, and how easy it is to adopt to any mount-related situation you might have, using recently added removable media support as an example.  (And it shows some related mechanisms as a bonus.)

autofs(5) Basics

The purpose of autofs(5) is to mount filesystems on access, in a way that's transparent to the application. In other words, filesystems get mounted when they are first accessed, and then unmounted after some time passes. The application trying to access the filesystem doesn't even notice this event, apart from a slight delay on first access.  It's a mechanism similar to ones available in other systems, in particular to OS X.  It's a completely independent implementation, it's just that OS X is the other operating system I use.

Automounting requires cooperation of four things: the kernel filesystem, autofs.ko, which is responsible, among other things, for "pausing" the application until the filesystem is actually there; the automountd(8) daemon, which is the component that retrieves configuration information from maps (this includes fetching it from remote sources, such as LDAP) and actually mounts the filesystems; the automount(8) utility for various administrative purposes; and then the autounmountd(8) daemon to, well, unmount the filesystems mounted by automountd(8) after a timeout.

Setting it up is fairly simple: you obviously need to have autofs(5) enabled in /etc/rc.conf:
autofs_enable="YES"
And you need to have the autofs(5) daemons running - just like other deamons in FreeBSD those will get started at system bootup if autofs_enable was set; otherwise you need to start them by hand:
# /etc/rc.d/automount start
# /etc/rc.d/automountd start
# /etc/rc.d/autounmountd start
The kernel driver will get loaded automatically, you can see it in kldstat(8) output.

autofs(5) and Removable Media

Note that at the time of this writing, this is only available in FreeBSD 11-CURRENT. This will change soon.

The main configuration file for autofs(5) is /etc/auto_master; you need to uncomment this line:
/media -media -nosuid
This basically says that there is a /media directory, and automount will mount the "-media" map there, and everything that gets mounted there will have the "nosuid" mount option, for security reasons.

If you already had autofs(5) running before uncommenting the line, you must refresh its configuration by running automount(8) as root; run it as "automount -v" for a detailed explanation of what it does.  It looks like this:
# automount -v
automount: parsing auto_master file at "/etc/auto_master"
automount: done parsing "/etc/auto_master"
automount: unmounting stale autofs mounts
automount: skipping /, filesystem type is not autofs
automount: skipping /dev, filesystem type is not autofs
automount: leaving autofs mounted on /net
automount: mounting new autofs mounts
automount: autofs already mounted on /net
automount: nothing mounted on /media; mounting
automount: mounting map -media on /media, prefix "/media", options "nosuid"
If you run mount(8), you will see so called "trigger nodes" of type autofs(5):
# mount
/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)

Basic usage

With all that done, plug a drive into USB, and here is what happens in a real-world case:
[trasz@brick:~]% ll /media
total 9
drwxr-xr-x 3 root wheel 512 Feb 24 12:54 .
drwxr-xr-x 30 root wheel 1024 Feb 24 12:28 ..
drwxr-xr-x 1 root wheel 4096 Jan 1 1980 ADATA UFD
drwxr-xr-x 3 root wheel 512 Feb 24 12:54 md0
[trasz@brick:~]% cd /media/ADATA UFD
[trasz@brick:/media/ADATA UFD]% ll
total 10117
drwxr-xr-x 1 root wheel 4096 Jan 1 1980 .
drwxr-xr-x 3 root wheel 512 Feb 24 12:54 ..
drwxr-xr-x 1 root wheel 4096 Nov 24 00:03 .Spotlight-V100
drwxr-xr-x 1 root wheel 4096 Nov 24 00:03 .Trashes
-rwxr-xr-x 1 root wheel 4096 Nov 24 00:03 ._.Trashes
drwxr-xr-x 1 root wheel 4096 Jan 13 11:24 .fseventsd
drwxr-xr-x 1 root wheel 4096 Nov 22 22:44 Bonus
-rwxr-xr-x 1 root wheel 3309568 Nov 24 14:50 DSC05996.JPG
-rwxr-xr-x 1 root wheel 4063232 Nov 24 14:50 DSC05997.JPG
-rwxr-xr-x 1 root wheel 2953199 Nov 25 21:40 DSC05998.JPG
drwxr-xr-x 1 root wheel 4096 Nov 22 18:24 Meshuggah
drwxr-xr-x 1 root wheel 4096 Nov 22 21:06 System Volume Information
[trasz@brick:/media/ADATA UFD]% mount
/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
/dev/da0s1 on /media/ADATA UFD (msdosfs, local, nosuid, automounted)
[trasz@brick:/media/ADATA UFD]% cd /
[trasz@brick:/media/ADATA UFD]% sudo automount -u
[trasz@brick:/media/ADATA UFD]% mount
/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
map -hosts on /net (autofs)
map -media on /media (autofs)
Two things to notice here: first, the "ADATA UFD" is a factory default
filesystem label on the flash drive.  If there was no filesystem label,
autofs(5) would use device name instead - in this case, that would
be "da0s1".  Second - if you don't want to wait for autounmountd(8)
to unmount the automounted volume, you can use "automount -u".  Or
"automount -fu", if you want to force unmount.

Not So Basic Usage

Take a close look at the directory listing for /media in previous example. Did you notice the "md0" there?  It looks like a device node for memory disk (md(4)), but is a directory.  That's a leftover from my earlier experimentation, and shows an interesting feature of autofs(5)-based automounter: it's not limited to removable media, it can mount everything that's available for mounting.  In this case it's a memory disk (kind of ramdisk, see "man mdconfig").  It can also be an iSCSI lun.  And, of course, a removable media.  How does that work?

GEOM

In FreeBSD, GEOM is a name of what could otherwise be called a block device layer.  It's a piece of code that manages all the "disk-like devices", both physical and virtual: SATA/SAS/FC/NVME/USB drives, memory disks, iSCSI LUNs, partitions, encrypted GELI volumes etc.

GEOM has another meaning: an instance of GEOM class.  The "class" here means the "kind" of device, and the instance is an actual device of that kind. It's easiest to explain it with an example:
# geom disk list
Geom name: cd0
Providers:
1. Name: cd0
   Mediasize: 0 (0B)
   Sectorsize: 2048
   Mode: r0w0e0
   descr: MATSHITA DVD/CDRW UJDA775
   ident: (null)
   fwsectors: 0
   fwheads: 0

Geom name: ada0
Providers:
1. Name: ada0
   Mediasize: 250059350016 (233G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r2w2e3
   descr: Samsung SSD 850 EVO 250GB
   lunid: 5002538da000f602
   ident: S21PNSAFC02149R
   fwsectors: 63
   fwheads: 16

Geom name: da0
Providers:
1. Name: da0
   Mediasize: 7654604800 (7.1G)
   Sectorsize: 512
   Mode: r0w0e0
   descr: ADATA USB Flash Drive
   lunname: USB MEMORY BAR
   lunid: 2020030102060804
   ident: 14A0711312300023
   fwsectors: 63
   fwheads: 255

# geom part list
Geom name: ada0
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 488397127
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada0p1
   Mediasize: 65536 (64K)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 1024
   Mode: r0w0e0
   rawuuid: 42dc1b8b-c49b-11e3-8066-001c257ac65f
   rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f
   label: (null)
   length: 65536
   offset: 17408
   type: freebsd-boot
   index: 1
   end: 161
   start: 34
2. Name: ada0p2
   Mediasize: 236223201280 (220G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 1024
   Mode: r1w1e1
   rawuuid: 42dc921f-c49b-11e3-8066-001c257ac65f
   rawtype: 516e7cb6-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 236223201280
   offset: 82944
   type: freebsd-ufs
   index: 2
   end: 461373601
   start: 162
3. Name: ada0p3
   Mediasize: 13836045312 (13G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 1024
   Mode: r1w1e0
   rawuuid: 21a8eef9-a0d4-11e4-ab80-001c257ac65f
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 13836045312
   offset: 236223284224
   type: freebsd-swap
   index: 3
   end: 488397127
   start: 461373602
Consumers:
1. Name: ada0
   Mediasize: 250059350016 (233G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r2w2e3

Geom name: da0
modified: false
state: OK
fwheads: 255
fwsectors: 63
last: 14950399
first: 1
entries: 4
scheme: MBR
Providers:
1. Name: da0s1
   Mediasize: 7654576128 (7.1G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 28672
   Mode: r0w0e0
   rawtype: 12
   length: 7654576128
   offset: 28672
   type: !12
   index: 1
   end: 14950399
   start: 56
Consumers:
1. Name: da0
   Mediasize: 7654604800 (7.1G)
   Sectorsize: 512
   Mode: r0w0e0
See?  I've used the geom(8) command to get the information about two GEOM classes: "disk", and "part".  The first one returned information about three instances of the disk class: the DVD drive, the SSD, and the flash.  The second one returned information on the partitions known to the system. Everything that is potentially mountable - a physical disk, a partition, encrypted ELI volume, multipath device, RAID3 volume, memory disks, even volume labels - it all has its GEOM class and can be queried in a similar way.  To see all the GEOM instances in the running system, use:
# sysctl kern.geom.conftxt
Now, notice the "Mode" lines.  Like the one for ada0: "r2w2e3".  Those are three usage counters for ada0 GEOM: read, write, and exclusive.  They are non-zero, because ada0 is used: there are three partitions on it; three instances of PART GEOM class hold it opened.  The partitions, just like any other GEOM nodes, have their own counters.  Take a look at the first one, ada0p1: the mode there is "r0w0e0".  This means it's not open by anything.  It's, in other words, available for mounting.  If you check the MD geom class:
# geom md list 
Geom name: md0
Providers:
1. Name: md0
   Mediasize: 1073741824 (1.0G)
   Sectorsize: 512
   Mode: r0w0e0
   type: swap
   access: read-write
   compression: off
   length: 1073741824
   fwsectors: 0
   fwheads: 0
   unit: 0
You will see the same thing: it's not opened.  That's the first thing the autofs(5) "-media" map checks for: zero access counts; if the counts are not zero, it means the node is used by something: it's either mounted (like ada0p2, mounted on /), or there is something "on top of it" - like ada0.

But why there is no /media/ada0p1?  Because it's not mountable; there is no filesystem there.  It's a boot loader partition.  How does autofs(5) figure it out?

fstyp(8)

Before we can do anything with a filesystem, we need to determine what kind of filesystem it is - and whether it actually is a supported filesystem in the first place.  That means we need a piece of code that can take a look at it and determine if it has a format it recognizes.

It is possible to use file(1) for this, eg:
# file -s /dev/md0
Vermaden's sysutils/automount port uses this approach.  There are a few problems with doing it this way, though.  First, the output, for a typical FAT filesystem, looks like this:
/dev/md0: DOS/MBR boot sector, code offset 0x3c+2, 
OEM-ID "BSD4.4  ", sectors/cluster 32, root entries
512, sectors/FAT 256, sectors/track 63, heads 255,
sectors 2097144 (volumes > 32 MB) , serial number
0x668a120e, unlabeled, FAT (16 bit)
It's not particularly easy to parse.  It's even harder to extract the volume label.

Second, file(1) can recognize all kinds of file types, from JPEG to 6502 assembly.  This means that if there are some strange data on the removable media, instead of a filesystem we expect, the file(1) will output something our script wasn't tested against, making the first problem even harder.

Third, file(1) had its share of security bugs, eg CVE-2014-1943, CVE-2014-9620, or CVE-2014-3710.

For this reason I've decided the proper fix would be to just write a new utility. The strange name - "fstyp" - comes from the utility of the same name, installed by default on Solaris, IRIX, OS X, and perhaps most other UNIX systems.

The fstyp(8) addresses the file(1) issues: the output is easily parsable (just a filesystem name, one word), it only recognizes filesystems supported by FreeBSD, and uses Capsicum sandboxing to make sure that even if there is a vulnerability, its impact is limited to incorrectly reporting the filesystem type.  It's a good topic for another article, but in short - in FreeBSD, every process can enter what's called a "capability mode". It's one way - a process can enter it, but there is no way to exit it.  Child processes inherit the mode. In capability mode, kernel will deny all attempts to open new files, create sockets, attach the shared memory segments etc, but the process is pretty much free to do anything it likes with the file descriptors it already had opened before entering the capability mode - and it can receive other file descriptors over a UNIX socket.  So, the fstyp(8) utility opens the device file, then calls cap_enter(2), which switches it into capability mode, and then continues execution, reading from the device to determine what's there.  Should it be compromised, it won't be able to execute /bin/sh, it won't be able to open a socket to transmit the data to some external host, etc.

The "-media" Map

Those are the components underneath the autofs(5), but how does it fit together? Let's start with the actual map.  In FreeBSD, special maps (the ones with names starting with "-") are just executables in /etc/autofs/:
# ls -al /etc/autofs
total 36
drwxr-xr-x 2 root wheel 512 Feb 14 21:18 .
drwxr-xr-x 25 root wheel 3072 Feb 24 11:22 ..
-rwxr-xr-x 1 root wheel 1010 Oct 17 11:26 include_ldap
-rwxr-xr-x 1 trasz wheel 43 Aug 17 2014 include_nis
-rwxr-xr-x 1 root wheel 367 Oct 17 11:26 special_hosts
-rwxr-xr-x 1 root wheel 2294 Dec 6 10:15 special_media
-rwxr-xr-x 1 root wheel 355 Feb 14 21:17 special_noauto
-rwxr-xr-x 1 root wheel 97 Oct 17 11:26 special_null
-rwxr-xr-x 1 root wheel 357 Aug 22 2014 special_smb
See the special_media?  That's the one.  It's a shell script.  The reason it's in /etc is that the system administrator can modify it if required, or add new special maps.

Now, let's try to run it by hand, as root:
# /etc/autofs/special_media
ADATA UFD
md0

# /etc/autofs/special_media md0
-fstype=msdosfs,nosuid  :/dev/md0
That's exactly how automountd(8) uses it, after the kernel component notifies it that it needs the /media directory taken care of.  It's described in more detail in the auto_master(5) manual page.  The shell script is pretty well commented, and I don't think there is any point in explaining it here.

Bottom line:
the core autofs itself doesn't know anything about removable devices; the special map "-media" does: it queries GEOM for the list of all disk-like nodes that are not in use, and then uses fstyp(8) to determine whether they contain a useful filesystem.  UNIX.  Modularity.  Plain text.  ;-)

Cache

Now, let's create a second memory disk, 1GB in size (the "1g" below) to see if it all works as intended:
# mdconfig -s1g
md1
# newfs_msdos /dev/md1
newfs_msdos: cannot get number of sectors per track:
Operation not supported
newfs_msdos: cannot get number of heads:
Operation not supported
newfs_msdos: trim 8 sectors to adjust to a multiple of 63
/dev/md1: 2096576 sectors in 65518 FAT16 clusters
(16384 bytes/cluster)
BytesPerSec=512 SecPerClust=32 ResSectors=1 FATs=2
RootDirEnts=512 Media=0xf0 FATsecs=256 SecPerTrack=63
Heads=255 HiddenSecs=0 HugeSectors=2097144

# ll /media
total 5
drwxr-xr-x 3 root wheel 512 Feb 24 12:25 .
drwxr-xr-x 30 root wheel 1024 Feb 23 09:04 ..
drwxr-xr-x 1 root wheel 4096 Jan 1 1980 ADATA UFD
drwxr-xr-x 3 root wheel 512 Feb 24 12:25 md0
Whoops.  Where is /media/md1?

There is one more mechanism for the whole thing to work correctly: the autofs(5) cache needs to be dealt with.

The first paragraph mentioned that it's automountd(8) that does all the map parsing - including running the /etc/autofs/special_media - and actual mounting. Doing it every time someone accesses the /media directory - or any directory, for that matter - would kill performance.  For this reason, after the kernel component asks the automound(8) to do its magic, it doesn't do that again until some time later.  In most cases it doesn't matter - the list of NFS exports for a given host doesn't change too often - but in case of removable media it's not acceptable.  The cache needs to be flushed, using "automount -c".  After that, the subsequent lookup in /media will trigger automountd(8), which will query the devices list and refresh the directory contents.

This obviously needs to happen automatically.  And if you actually went and opened /etc/auto_master in a text editor, you would have noticed this:
# When using the -media special map, make sure to edit devd.conf(5)
# to move the call to "automount -c" out of the comments section.
The devd(8) is a daemon responsible for listening for notifications from the kernel and running whatever is configured in its config, /etc/devd.conf. There are all kinds of things there, from running utilities to upload firmware for various USB devices, to launch moused(8) when a mouse gets connected, to switching power profiles, to... discarding autofs(5) caches.  It looks like this:
notify 100 {
match "system" "GEOM";
match "subsystem" "DEV";
action "/usr/sbin/automount -c";
}
If you do "man devd.conf", you will see the description of those events. Note that, just like the "-media" map works the same way for flash drives and encrypted volumes over multipath over iSCSI, this mechanism does not care about any specific hardware either.

Caveats

Two, really.  First: you need to run 11-CURRENT.  Second: the nodes in /media never disappear.  I expect to merge this support to 10-STABLE after the second issue is addressed.

cache line aliasing effects, or "why is freebsd slower than linux?"

There was some threads on FreeBSD/DragonflyBSD mailing lists a few years ago (2012?) which talked about some math benchmarks being much slower on FreeBSD/DragonflyBSD versus Linux.

When the same benchmark is run on FreeBSD/DragonflyBSD using the Linux layer (ie, a linux binary compiled for linux, but run on BSD) it gives the same or better behaviour.

Some digging was done, and it turned out it was due to memory allocation patterns and memory layout. The jemalloc library allocates large chunks at page aligned boundaries, whereas the allocator in glibc under Linux does not.

I've put the code online in the hope that others can test and verify this:

https://github.com/erikarn/himenobmtxpa

The branch 'local/freebsd' has my local change to allow the allocator offset to be specified. The offset compounds on each allocation - so with an 'n' byte offset, the first allocation is 0 bytes offset from the page boundary, the next is 'n' bytes offset from the page boundary, the next is '2n' bytes offset, etc.

You can experiment with different values and get completely different behavioural results. It's non-trivial: there's a 100% speedup by using a 127 byte offset for each allocation, versus a 0 byte offset.

I'd like to investigate cache line aliasing effects further. There was work done a few years ago to offset mbuf headers in the FreeBSD kernel so they weren't all page-aligned or 256/512/1024 byte aligned - and apparently this gave a significant performance improvement. But it wasn't folded into FreeBSD. What I'd like to do is come up with some better strategies / profiling guides for identifying when this is actually happening so the underlying objects being accessed can be adjusted.

So - if anyone out there has any tips, hints or suggestions on how to do this, please let me know. I'd like to document and automate this testing.

A look at the upcoming features for 10.1.2

If you’ve been an EDGE user in the past few weeks, or following our Roadmap items for the upcoming 10.1.2 release, you may have noticed a number of new security and privacy related items. I wanted to take a moment to clarify what some of these new features are and what they will do.

 

– PersonaCrypt –

The first of the new features is a new CLI utility called personacrypt. This command will allow the creation and usage of a GELI backed encrypted external media for your users $HOME directory. We are using it internally to keep our user profiles on USB 3.0 — 256GB hybrid SSD / flash memory stick (Coarsair flash Voyager GTX specifically). This is tied into the PCDM login manager, and user manager, so when you create a new user account, you can opt to keep all your personal data on any external device. The device is formatted with GPT / GELI / ZFS, and is decrypted at login via the GUI, after entering your encryption key, along with the normal user password.

Additionally, the personacrypt command uses GELI’s ability to split the key into two parts. One being your passphrase, and the other being a key stored on disk. Without both of these parts, the media cannot be decrypted. This means if somebody steals the key and manages to get your password, it is still worthless without the system it was “paired” with. PersonaCrypt will also allow exporting / importing this key data, so you can “pair” the key with other systems.

– Tor Mode –

We’ve added a new ability to the System Updater Tray, so you can with a single-click, toggle between running in Tor mode, and regular “Open” mode. This switch to Tor mode, will do the following:

1. Launch the Tor daemon, and connect to the Tor network
2. Re-write all the IPFW rules, blocking all outgoing / incoming traffic, except for traffic to and from the Tor daemon
3. Re-route all DNS / TCP requests through Tor using its transparent proxy support

This allows applications on the system to now connect to the internet through Tor, without needing explicit SOCKS proxy support.

Obviously this alone isn’t enough to keep your identity safe on the Internet. We highly recommend that you read through their excellent FAQ and wiki articles on the subject.

https://www.torproject.org/docs/faq.html.en#AnonymityAndSecurity

– Stealth Mode –

One of the features we just added to personacrypt is something we are calling “stealth” mode. It is integrated into PCDM, and does the following:

During the login, if stealth mode is selected, the users $HOME directory will be mounted with a GELI backed ZVOL with GELI’s onetime key encryption. This $HOME directory is setup with the default /usr/share/skel data, and is pretty much a “blank” slate, allowing you to login, and run apps as if on a fresh system each time. At logout the dataset is destroyed, or should the system be rebooted, the onetime key is lost, rendering the data useless. Think of it as a web browser’s “private” mode, except for your entire desktop session.

– LibreSSL –

We’ve made the switchover to convert our ports to use LibreSSL by default instead of the base systems OpenSSL. (Thanks to Bernard Spil for his work on this). Our hope is that LibreSSL will help make the system security better, and reduce the number of OpenSSL exploits that our packages may be vulnerable to.

– Encrypted Backups –

The Life-Preserver utility has had the ability for a while now to replicate your system to another box running FreeBSD, such as FreeNAS. This backup is done via ZFS send/recv using SSH, but the data on the remote end was stored un-encrypted and could be read by whomever was administrating the remote box. To provide an extra measure of security to backups, we are in the process of adding support for fully-encrypted backups, using GELI backed iSCSI volumes. This allows us to use ZFS send/recv over the wire, with all the data leaving the box already being encrypted via GELI. Your data on the remote side is fully-encrypted, and only accessibly with the key file you have stored on the client side. This is still in active development and should show up in the EDGE repo in the upcoming weeks, along with some additional details on usage.

 

 

We hope you’ve enjoyed this sneak-peek of whats happening with PC-BSD development right now. As always, we love people to test these features in our EDGE repo, and let us know of issues via our bug tracker:

https://bugs.pcbsd.org