Category Archives: Problem(s)

A phoronix benchmark creates a huge benchmarking discussion

The recent Phoronix benchmark which compared a release candidate of FreeBSD 9 with Oracle Linux Server 6.1 created a huge discussion on the FreeBSD mailing lists. The reason is that some people think the numbers presented there give a wrong picture of FreeBSD. One complaint is that not all benchmark numbers are presented on the most prominent page (as linked above); some are only available in a different place. This gives the impression that FreeBSD is inferior in this benchmark, while it just puts the focus (for a reason, according to some people) on a different part of the benchmark. To be more specific: blogbench does disk reads and writes in parallel, and FreeBSD gives higher priority to writes than to reads. FreeBSD 9 outperforms OLS 6.1 in the writes while OLS 6.1 shines with the reads, and only the reads are presented on the first page. Another complaint is that the article claims the default install was used (which would mean UFS as the FS), when it was not (ZFS was used as the FS).

The author of the Phoronix article participated in parts of the discussion and asked for specific improvement suggestions. A FreeBSD committer seems to be working already on getting some issues resolved. What I personally do not like is that the article has not been updated with a remark that some of the things presented do not reflect reality and that a retest is necessary.

As there was much talk in the thread but not much obvious activity from our side to resolve some issues, I started to improve the FreeBSD wiki page about benchmarking, so that we are able to point to it in case someone wants to benchmark FreeBSD. Others already chimed in and improved some things too. It is far from perfect; some more eyes, and more importantly some more fingers which add content, are needed. Please go to the wiki page and try to help out (if you are afraid to write something in the wiki, please at least send your suggestions to a FreeBSD mailing list so that others can improve the wiki page).

We also need a wiki page about FreeBSD tuning. A first step would be to take the man page and convert it into a wiki page, then to improve it, and then to feed the changes back to the man page, while keeping the wiki page so that parts of it can be cross-referenced from the benchmarking page.

I already said this in the thread about the Phoronix benchmark: everyone is welcome to improve the situation. Do not talk, write something. It does not matter whether it is an improvement to the benchmarking page, tuning advice, or a tool which inspects the system and suggests some tuning. If you want to help in the wiki, create a FirstnameLastname account and ask a FreeBSD committer for write access.

A while ago (IIRC we have to think in months or even years) there was a framework for automatic FreeBSD benchmarking. Unfortunately the author ran out of time. The framework was able to install a FreeBSD system on a machine, run a specified benchmark (not many benchmarks were integrated), and then install another FreeBSD version to run the same benchmark, or reinstall the same version to run another benchmark. IIRC there was also a DB behind it which collected the results, and maybe there was even some way to compare them. It would be nice if someone could find some time to talk with the author to get the framework and set it up somewhere, so that we have a controlled environment where we can do our own benchmarks in an automatic and repeatable fashion with several FreeBSD versions.


Are USB memory sticks really that bad?

Last week my ZFS cache device, a USB memory stick, showed xxxM write errors. I got this stick for free as a promo, so I do not expect it to be of high quality (or to have wear-leveling or similar life-saving things). The stick survived about 9 months, during which it provided a nice speed-up for access to the corresponding ZFS storage pool. I replaced it with another stick which I also got for free as a promo. This new stick survived… one long weekend. It now has 8xxM write errors and the USB subsystem is not able to talk to it anymore. 30 minutes ago I issued a “usbconfig reset” to this device, which still has not finished. This leads me to the question whether such sticks are really that bad, or whether some problem crept into the USB subsystem.

If this is a problem with the memory stick itself, I should be able to reproduce it on a different machine with a different OS. I could test this with FreeBSD 8.1, Solaris 10u9, or Windows XP. What I need is an automated test. This rules out the Windows XP machine for me; I do not want to spend time searching for a suitable test which is available for free and can be run in an automated way. For FreeBSD and Solaris it probably comes down to using some disk-I/O benchmark (I think there are enough to choose from in the FreeBSD Ports Collection) and running it in a shell loop.
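
A minimal sketch of such a shell loop, assuming the stick is mounted as a normal file system and a POSIX shell is available. The function name stick_stress and the file name stress.dat are made up for illustration, and it uses plain dd, cp and cmp instead of a real benchmark from the ports:

```shell
#!/bin/sh
# Hypothetical helper: write a random file to the stick's mount point and
# compare it with the reference copy, several times in a row.
# usage: stick_stress TARGET_DIR PASSES SIZE_MB
stick_stress() {
    target_dir=$1
    passes=$2
    size_mb=$3
    ref=$(mktemp "${TMPDIR:-/tmp}/stickref.XXXXXX") || return 2
    failures=0
    pass=1
    while [ "$pass" -le "$passes" ]; do
        # Fresh random data for every pass (blocks of 1048576 bytes = 1 MB).
        dd if=/dev/urandom of="$ref" bs=1048576 count="$size_mb" 2>/dev/null
        if cp "$ref" "$target_dir/stress.dat" && sync &&
           cmp -s "$ref" "$target_dir/stress.dat"; then
            echo "pass $pass: ok"
        else
            echo "pass $pass: FAILED"
            failures=$((failures + 1))
        fi
        pass=$((pass + 1))
    done
    rm -f "$ref" "$target_dir/stress.dat"
    echo "failures: $failures"
    [ "$failures" -eq 0 ]
}
```

Something like "stick_stress /mnt/stick 1000 64" (the mount point is only an example) would then run 1000 passes of 64 MB each. The comparison may be served from the file system cache instead of the stick, so this mostly exercises the write path; for a real read-back check the file would have to be larger than RAM or the stick remounted between write and compare.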


Debugging lang/mono — 2nd round

Today I again had some energy to look at why mono fails to build on FreeBSD-current.

I decided to do a debug-build of mono. This did not work initially, I had to produce some patches. :(

Does this mean nobody is doing debug builds of mono on FreeBSD?

I have to say, this experience with lang/mono is completely unsatisfying.

OK, bottom line: the debug build seems to avoid a race condition in most cases (I had far fewer lockups in each of the two builds I did), or something else is different.

Whatever it is, I do not care at the moment. (If the configure machinery looks at the architecture of the system, it may be that i386-portbld-freebsdX does not enable something important which would be enabled with i486-portbld-freebsdX or better.) Here are the patches I used, in case someone is interested. Warning: copy&paste converted tabs to spaces. You also have to apply the map.c part when the build fails, else there is a parser error in mono (map.c is a generated file; maybe touching the right file would allow this patch to be applied in the normal patch stage):

--- mcs/class/Mono.Posix/Mono.Unix/UnixProcess.cs.orig       2010-01-29 11:34:00.592323482 +0100
+++ mcs/class/Mono.Posix/Mono.Unix/UnixProcess.cs    2010-01-29 11:34:18.540607357 +0100
@@ -57,7 +57,7 @@ namespace Mono.Unix {
 int r = Native.Syscall.waitpid (pid, out status,
 Native.WaitOptions.WNOHANG | Native.WaitOptions.WUNTRACED);
 UnixMarshal.ThrowExceptionForLastErrorIf (r);
-                       return r;
+                       return status;
 }

 public int ExitCode {

--- mono/io-layer/processes.c.orig    2010-01-29 11:36:08.904331535 +0100
+++ mono/io-layer/processes.c 2010-01-29 11:42:21.819159544 +0100
@@ -160,7 +160,7 @@ static gboolean waitfor_pid (gpointer te
 ret = waitpid (process->id, &status, WNOHANG);
 } while (errno == EINTR);

-       if (ret <= 0) {
+       if (ret == 0 || (ret < 0 && errno != ECHILD)) {
 /* Process not ready for wait */
 #ifdef DEBUG
 g_message ("%s: Process %d not ready for waiting for: %s",
@@ -169,6 +169,17 @@ static gboolean waitfor_pid (gpointer te

 return (FALSE);
 }
+
+       if (ret < 0 && errno == ECHILD) {
+#ifdef DEBUG
+               g_message ("%s: Process %d does not exist (anymore)", __func__,
+                          process->id);
+#endif
+               /* Faking the return status. I do not know if it is correct
+                * to assume a successful exit.
+                */
+               status = 0;
+       }

 #ifdef DEBUG
 g_message ("%s: Process %d finished", __func__, ret);

--- mono/metadata/mempool.c.orig      2010-01-29 11:58:16.871052861 +0100
+++ mono/metadata/mempool.c   2010-01-29 12:30:45.143367454 +0100
@@ -212,12 +212,14 @@ mono_backtrace (int size)

         EnterCriticalSection (&mempool_tracing_lock);
         g_print ("Allocating %d bytes\n", size);
+#if defined(HAVE_BACKTRACE_SYMBOLS)
         symbols = backtrace (array, BACKTRACE_DEPTH);
         names = backtrace_symbols (array, symbols);
         for (i = 1; i < symbols; ++i) {
                 g_print ("\t%s\n", names [i]);
         }
         free (names);
+#endif
         LeaveCriticalSection (&mempool_tracing_lock);
 }

--- mono/metadata/metadata.c.orig     2010-01-29 11:59:38.552316989 +0100
+++ mono/metadata/metadata.c  2010-01-29 12:00:43.957337476 +0100
@@ -3673,12 +3673,16 @@ mono_backtrace (int limit)
         void *array[limit];
         char **names;
         int i;
+#if defined(HAVE_BACKTRACE_SYMBOLS)
         backtrace (array, limit);
         names = backtrace_symbols (array, limit);
         for (i =0; i < limit; ++i) {
                 g_print ("\t%s\n", names [i]);
         }
         g_free (names);
+#else
+       g_print ("No backtrace available.\n");
+#endif
 }
 #endif

--- support/map.c.orig        2010-01-29 12:05:22.374653708 +0100
+++ support/map.c 2010-01-29 12:10:29.024412452 +0100
@@ -216,7 +216,7 @@
 #define _cnm_dump(to_t, from) do {} while (0)
 #endif /* def _CNM_DUMP */

-#ifdef DEBUG
+#if defined(DEBUG) && !defined(__FreeBSD__)
 #define _cnm_return_val_if_overflow(to_t,from,val)  G_STMT_START {   \
         int     uns = _cnm_integral_type_is_unsigned (to_t);             \
         gint64  min = (gint64)  _cnm_integral_type_min (to_t);           \

Mono build problems on FreeBSD-current

I am trying to build mono on FreeBSD-current (it is a dependency of some GNOME program). Unfortunately this does not work correctly.

What I see are hangs of the build. If I stop the build when it hangs and restart it, it continues and gets a little bit further through the build steps, but then it hangs again.

If I ktrace the hanging process, I see that there is a call to wait returning with the error message that the child does not exist. Then there is a call to nanosleep.

It looks to me like this process missed a SIGCHLD (or is waiting for something which did not exist at all), and a loop is waiting for a child to exit. This loop probably has no proper exit condition for the case that there is no such child (anymore). As such it will stay in this loop forever.

So I grepped around a little bit in mono and found the following code in <mono-src-dir>/mcs/class/Mono.Posix/Mono.Unix/UnixProcess.cs:

public void WaitForExit ()
{
    int status;
    int r;
    do {
        r = Native.Syscall.waitpid (pid, out status, (Native.WaitOptions) 0);
    } while (UnixMarshal.ShouldRetrySyscall (r));
    UnixMarshal.ThrowExceptionForLastErrorIf (r);
}

This looks a little bit as if it could be related to the problem I see, but ShouldRetrySyscall only returns true if errno is EINTR. So this looks correct. :-(

I looked a little bit more at this file, and it looks like either I do not understand the semantics of this language, or GetProcessStatus returns the return value of the waitpid call instead of the status (which, to my understanding, is not what it should return). If I am correct, it cannot really detect the status of a process. It would be very bad if such a fundamental thing went unnoticed in mono… which does not put the unit tests (if any) or the general testing of mono in a good light. For this reason I hope I am wrong.

I did not stop there, as this part does not look like it is the problem. I found the following in mono/io-layer/processes.c:

static gboolean waitfor_pid (gpointer test, gpointer user_data)
{
...
    do {
        ret = waitpid (process->id, &status, WNOHANG);
    } while (errno == EINTR);

    if (ret <= 0) {
        /* Process not ready for wait */
#ifdef DEBUG
        g_message ("%s: Process %d not ready for waiting for: %s",
                   __func__, process->id, g_strerror (errno));
#endif

        return (FALSE);
    }

#ifdef DEBUG
    g_message ("%s: Process %d finished", __func__, ret);
#endif

    process->waited = TRUE;
...
}

And here we have the problem, I think. I changed the (ret <= 0) to (ret == 0 || (ret < 0 && errno != ECHILD)). This will not really give the correct status, but at least it should not block anymore, and I should be able to see the difference during the build.

And now after testing, I see a difference, but the problem is still there. The wait with ECHILD is gone in the loop, but there is still some loop with a semaphore operation:

62960 mono     CALL  clock_gettime(0xd,0xbf9feef8)
62960 mono     RET   clock_gettime 0
62960 mono     CALL  semop(0x20c0000,0xbf9feef6,0x1)
62960 mono     RET   semop 0
62960 mono     CALL  semop(0x20c0000,0xbf9feef6,0x1)
62960 mono     RET   semop 0
62960 mono     CALL  semop(0x20c0000,0xbf9feef6,0x1)
62960 mono     RET   semop 0
62960 mono     CALL  semop(0x20c0000,0xbf9feef6,0x1)
62960 mono     RET   semop 0
62960 mono     CALL  nanosleep(0xbf9fef84,0)
62960 mono     RET   nanosleep 0
62960 mono     CALL  clock_gettime(0xd,0xbf9feef8)
62960 mono     RET   clock_gettime 0
62960 mono     CALL  semop(0x20c0000,0xbf9feef6,0x1)
62960 mono     RET   semop 0
62960 mono     CALL  semop(0x20c0000,0xbf9feef6,0x1)
62960 mono     RET   semop 0
62960 mono     CALL  semop(0x20c0000,0xbf9feef6,0x1)
62960 mono     RET   semop 0
62960 mono     CALL  semop(0x20c0000,0xbf9feef6,0x1)
62960 mono     RET   semop 0
62960 mono     CALL  nanosleep(0xbf9fef84,0)

OK, there is more going on. I think someone with more knowledge about mono should have a look at this (do not only look at this semop thing, but also look why it loses a child).

Stability problems solved (hardware problem)

After putting the disks of the 7-stable system which exhibited stability problems into a completely different system (it is a rented root-server, not our own hardware), the system now survived more than a day (and still no trace of problems) with the UFS setup. Previously it would crash after some minutes.

The ZFS setup with the changed hardware had a problem during the night before (as always after all my ZFS-related changes on this machine). However, on this machine I had changed all locks in ZFS from shared locks to exclusive locks (this extended the uptime from 4–6 hours to “until I rebooted the morning after because of hanging processes”), so that change may be the cause. I do not know yet whether we will test the ZFS setup with the pure 7-stable source we use now (the goal was to get back a stable system, not to play around with unrelated stuff).

It looks like some kind of hardware problem was uncovered by updating from 7.1 to 7.2 (and subsequently to 7-stable). The new machine has a completely different chipset, a new CPU, RAM, PSU and so on, so I do not really know what caused this (but the fact that the previous system did not recognize the CPU after it was replaced with a bigger one, and the observation that only shared locks with a specific usage pattern were affected, make me suspect missing microcode updates…).

Stabilizing 7-stable…

The 7-stable system on which I have stability problems after an update from 7.1 to 7.2/7-stable is now semi-stable.

The watchdog reboots the machine after one minute without a reaction (currently the system is able to run 3–4 hours), and the jails come up without problems now.

The problem with the jails was that, e.g., the mysql-server startup went into the STOP state because TTY input was “requested”. I solved the problem by using /dev/null as input on jail startup. On -current I do not see this behavior (I have a 9-current system with a lot of jails which reboots every X days, and there mysql does not go into the STOP state).

I also start the jails in the background, so that one blocking jail does not block everything (as is done in -current).

To say this with code:

--- /usr/src/etc/rc.d/jail      2009-02-07 15:04:35.000000000 +0100
+++ /etc/rc.d/jail      2009-12-16 17:03:12.000000000 +0100
@@ -556,7 +556,8 @@
 fi
 _tmp_jail=${_tmp_dir}/jail.$$
 eval ${_setfib} jail ${_flags} -i ${_rootdir} ${_hostname} \
-                       \"${_addrl}\" ${_exec_start} > ${_tmp_jail} 2>&1
+                       \"${_addrl}\" ${_exec_start} > ${_tmp_jail} 2>&1 \
+                       </dev/null

 if [ "$?" -eq 0 ] ; then
 _jail_id=$(head -1 ${_tmp_jail})
@@ -623,4 +624,4 @@
 if [ -n "$*" ]; then
 jail_list="$*"
 fi
-run_rc_command "${cmd}"
+run_rc_command "${cmd}" &
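
The same two ideas (stdin from /dev/null, start in the background) as a small standalone sketch. The helper name start_detached and the variable detached_pid are made up for illustration and are not part of rc.d/jail:

```shell
#!/bin/sh
# Hypothetical helper, not part of rc.d/jail: run a command in the
# background with stdin tied to /dev/null, so that a read from the
# terminal yields EOF instead of stopping the process.
# usage: start_detached LOGFILE COMMAND [ARGS...]
# The PID of the background job is left in $detached_pid.
start_detached() {
    log=$1
    shift
    "$@" </dev/null >"$log" 2>&1 &
    detached_pid=$!
}
```

A command which tries to read its input, for example "start_detached /tmp/out.log cat", sees end-of-file immediately and exits instead of waiting for a terminal. The caller can continue right away and use "wait $detached_pid" later if it needs the exit status.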

I also identified 57 patches for ZFS which are in 8-stable, but not in 7-stable (I do not think they could solve the deadlock, but I do not really know, and now that there is one FS on ZFS, I would like to get as much fixed as possible). Some of them should be merged, some would be nice to merge, and some I do not care much about (but if they are easy to merge, why not…). I already have all revisions and the corresponding commit logs available in an email-draft.

Now I just need to write a little bit of text and find some people willing to help (some of the changes need a review if they are applicable to 7-stable, and everything should be tested on a scratch-box).


Stability problems with 7-stable

On the machine where I host this blog, I have/had some stability problems.

Last week I updated the machine from FreeBSD 7.1-pX to 7.2-p5 (GENERIC kernel in both cases). 5–10 minutes after the reboot into the new version the machine had a deadlock. After some roadblocks (ordering a KVM switch from the hoster, the KVM switch not working through a proxy (during lunchtime at work), a broken video capture of the KVM switch, and a replacement on Monday morning to avoid paying the weekend fees), I spent a big part of the night trying to get it stable. I tried disabling SMP, enabling INVARIANTS and WITNESS, changing the scheduler, cutting the software mirror (to rule out a mismatch between the content of the disks after all the hard reboots) and updating to 7-stable.

Unfortunately nothing helped. :(

Googling around a little bit (it is an AMD dual-core system with an NVidia MCP61 chipset) led me to a post on the mailing lists from 2008 which talks about an issue with the buffer cache. I do not know if this is still an issue (I have sent an email to kib@ to ask about it), and my scenario is not the same as the one described in the mail, but because of this I decided to switch one of the two UFS mirrors to ZFS.

The first boot into ZFS again caused a reboot after some minutes (I do not know whether it was because of a memory-exhaustion panic or because of a deadlock), but as I had not tuned the kernel for ZFS I am tempted to believe that I should not count that. Now, after tuning the kernel (increasing the kmem_size to 700M, no prefetching, limiting the ARC to 40M), it has been up for nearly 2h (as of this writing… crossing fingers). Before, it was not able to survive more than some minutes with just the jail for the mails up. Now I not only have the mail jail up, but also the jail for the blog (one jail is still disabled, but I will take care of that after this post).
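
As a sketch, the three tuning changes would look roughly like this in /boot/loader.conf. The values are the ones mentioned above; the mapping to exactly these tunable names is an assumption for a 7.x system and should be checked against the release in use:

```
# /boot/loader.conf (sketch, tunable names assumed for 7.x)
vm.kmem_size="700M"            # increase the kernel memory size
vfs.zfs.prefetch_disable="1"   # no prefetching
vfs.zfs.arc_max="40M"          # limit the ARC
```

Loader tunables are only read at boot, so a reboot is needed after changing them.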

I do not know if only increasing the kmem_size would have helped with the problem, but as I was testing a GENERIC kernel + gmirror module in the beginning, I expected that the auto-tuning of this value should have been enough for such a simple setup (2GB RAM, 2 disks with 3 partitions each, one partition pair for root, one for swap, one for the jails).

I hope that I have stabilized the system now. It may be the case that I will test some patches in case someone comes up with something, so do not be surprised if the blog and email to me are a little bit flaky.


More problems with CUPS (passing env variables)

Saturday I had to print a PostScript file. The file was generated from a template which I wrote by hand several years ago. It uses a non-standard PS font which cannot be changed. The font is not embedded, and I can print the file via ghostscript by telling it where the font files are located (export GS_LIB=/path/to/dir1:/path/to/dir2). Now that I have switched to CUPS as my print server software, I had to teach it to set this for the call to gs (via foomatic). Unfortunately I failed to get it working via the CUPS config.

I added “SetEnv GS_LIB /path/to/dir1:/path/to/dir2” to cups.conf and restarted CUPS. This did not work. I added “PassEnv GS_LIB” to cups.conf, added an appropriate export of GS_LIB to /etc/rc.conf (just to make sure… I still had the SetEnv in cups.conf) and restarted CUPS. This did not work either.

As I just wanted to print out something and did not want to spend my time debugging this, I put a workaround into place: I moved gsc to gsc.bin and created a little shell script as gsc which sets the variable and starts gsc.bin.
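
The workaround can be sketched as a small shell function. The function name wrap_with_gs_lib is made up for illustration, and the location of gsc is passed in because it depends on the installation:

```shell
#!/bin/sh
# Hypothetical helper that automates the workaround: rename BIN to BIN.bin
# and put a wrapper in its place which exports GS_LIB before calling it.
# usage: wrap_with_gs_lib BIN GS_LIB_VALUE
# Assumes that neither argument contains a single quote.
wrap_with_gs_lib() {
    bin=$1
    gs_lib=$2
    # Refuse to wrap twice, otherwise the real binary would be overwritten.
    if [ -e "$bin.bin" ]; then
        echo "$bin.bin already exists, not touching anything" >&2
        return 1
    fi
    mv "$bin" "$bin.bin" || return 1
    cat > "$bin" <<EOF
#!/bin/sh
GS_LIB='$gs_lib'
export GS_LIB
exec '$bin.bin' "\$@"
EOF
    chmod 755 "$bin"
}
```

Called as "wrap_with_gs_lib /path/to/gsc /path/to/dir1:/path/to/dir2", this leaves a four-line wrapper in place of gsc which passes all arguments through unchanged. The check for an existing gsc.bin is there so that running it a second time does not replace the real binary with the wrapper.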

At the next update of ghostscript this will break my printing (if I forget that I have this workaround in place), so I should try to get some time to fix this. Maybe I can fix this by adding “env GS_LIB=…” to the call of gs in the ppd, but this seems more like another workaround to me, than a real fix.
