In the previous post, I discussed Western Digital's “Advanced Format” drives and the problems caused by their misreporting their real, physical sector size.
I wrote a benchmark utility to demonstrate the performance penalty of unaligned accesses and to uncover a drive's physical sector size. It writes blocks of zeroes of varying size at regular intervals. For each block size, it writes a total of 128 MB at intervals of four times the block size, first at offset zero and then at offsets that double from 512 bytes up to half of the block size.
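As a rough illustration of what a single pass does, here is a minimal Python sketch. It is not the utility's actual source: the function name `write_pass` and its parameters are made up for this example, and it makes no attempt to bypass operating system caching, which a real measurement against a raw device would need to take care of. It overwrites data in whatever file it is pointed at, so only try it on a scratch file.

```python
import time


def write_pass(path, size, offset, step, count):
    """Write `count` zero-filled blocks of `size` bytes each.

    Block n is written at byte position offset + n * step.
    Returns the elapsed wall-clock time in milliseconds.
    WARNING: overwrites existing data in `path`.
    """
    block = bytes(size)  # `size` zero bytes
    start = time.monotonic()
    # buffering=0 gives a raw, unbuffered file object, so every
    # write() call is handed straight to the operating system.
    with open(path, "r+b", buffering=0) as f:
        for n in range(count):
            f.seek(offset + n * step)
            f.write(block)
    return int((time.monotonic() - start) * 1000)
```

For example, `write_pass("scratch.img", 4096, 512, 16384, 32768)` would write 128 MB in 4,096-byte blocks, each starting 512 bytes past a 16,384-byte boundary (`scratch.img` being a placeholder for an existing scratch file).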
With the default settings, the first pass will write 131,072 1,024-byte blocks at n × 4,096, and the second pass will do the same at n × 4,096 + 512. The third, fourth and fifth passes will write 65,536 2,048-byte blocks each at n × 8,192, n × 8,192 + 512 and n × 8,192 + 1,024. It will make four more passes with 4,096-byte blocks and five with 8,192-byte blocks.
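The pass schedule described above can be reproduced with a few lines of Python. This is only a sketch: the names `passes`, `is_aligned`, `min_size` and `max_size` are invented for the example, and the offset-doubling rule is inferred from the passes listed in the text rather than taken from the utility's source.

```python
TOTAL = 128 * 1024 * 1024  # bytes written per pass (128 MB)


def passes(min_size=1024, max_size=8192, total=TOTAL):
    """Yield (pass_no, count, size, offset, step) for every pass."""
    pass_no = 0
    size = min_size
    while size <= max_size:
        # Offset 0 first, then 512, 1024, ... up to half the block size.
        offsets = [0]
        off = 512
        while off <= size // 2:
            offsets.append(off)
            off *= 2
        for offset in offsets:
            pass_no += 1
            yield pass_no, total // size, size, offset, 4 * size
        size *= 2


def is_aligned(size, offset, sector):
    """True if every block in the pass starts and ends on a sector boundary."""
    # The step is 4 * size, so it is a multiple of `sector` whenever size is.
    return size % sector == 0 and offset % sector == 0


for pass_no, count, size, offset, step in passes():
    note = "aligned for 4,096-byte sectors" if is_aligned(size, offset, 4096) else ""
    print(pass_no, count, size, offset, step, note)
```

With these defaults the generator yields 14 passes, and for a drive with 4,096-byte physical sectors only passes 6, 10 and 14 write whole sectors.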
Here's the idea: most passes will be very slow (up to half an hour per pass), but when we hit the right block size and alignment, performance will skyrocket. On, let's say, a WD20EARS with factory settings, passes 6 (4,096 bytes at offset 0), 10 (8,192 bytes at offset 0) and 14 (8,192 bytes at offset 4,096) should stand out from the crowd. In fact, here are the results for passes 6 through 9:
count  size  offset   step     msec  tps   kBps
32768  4096       0  16384    19503  138   6720
32768  4096     512  16384  1216537    2    107
32768  4096    1024  16384  1213479    2    108
32768  4096    2048  16384  1214623    2    107
Pass 6 takes 20 seconds, while passes 7, 8 and 9 take 20 minutes.
Let me rephrase that: properly aligned non-sequential writes are faster than misaligned ones by a factor of sixty.
Sixty. Six zero.
We really, really need to get that fixed somehow.
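For anyone who wants to check the arithmetic, the throughput figures and the slowdown factor can be recomputed from the count, size and msec columns of the WD20EARS results. The sketch below is an after-the-fact calculation with made-up names (`kbps`, `rows`). The integer division happens to reproduce the kBps column, but whether the utility computes it this way is an assumption.

```python
# (count, size, offset, msec) copied from the WD20EARS results above
rows = [
    (32768, 4096, 0, 19503),
    (32768, 4096, 512, 1216537),
    (32768, 4096, 1024, 1213479),
    (32768, 4096, 2048, 1214623),
]


def kbps(count, size, msec):
    """Throughput in kB/s (1 kB = 1,024 bytes), truncated to an integer."""
    return count * size * 1000 // (msec * 1024)


aligned_msec = rows[0][3]  # the offset-0 pass is the aligned baseline
for count, size, offset, msec in rows:
    slowdown = msec / aligned_msec
    print(f"offset {offset:4d}: {kbps(count, size, msec):5d} kB/s, "
          f"{slowdown:.1f}x the aligned time")
```

The misaligned passes come out at a little over 62 times the aligned time, which is where the "factor of sixty" comes from.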
That's not the whole story, though. Let's see how it compares to a 7,200 rpm, 2 TB Hitachi Deskstar (HDS722020ALA330) with 512-byte physical sectors:
count  size  offset   step     msec  tps   kBps
32768  4096       0  16384     8803  307  14889
32768  4096     512  16384     8701  310  15063
32768  4096    1024  16384     8735  309  15004
32768  4096    2048  16384     8705  310  15056
The Hitachi blows through the test so fast you don't even have time to make yourself a cup of coffee, let alone drink it.
This is a 7,200 rpm, 400 GB Caviar SE16 (WD4000AAKS)—more than three years old, so don't expect too much:
count  size  offset   step     msec  tps   kBps
32768  4096       0  16384    21348  126   6139
32768  4096     512  16384    21674  124   6047
32768  4096    1024  16384    20799  129   6301
32768  4096    2048  16384    21031  128   6232
So, about the same as we get from the WD20EARS with aligned writes.
Now, here's the kicker. The last drive in my test lineup is a WD20EADS—almost the same as the WD20EARS, but with 512-byte sectors and only 32 MB cache (although cache doesn't mean anything here—I made sure my test program writes enough data to blow through the cache on every pass).
count  size  offset   step     msec  tps   kBps
32768  4096       0  16384    22811  118   5745
32768  4096     512  16384    19552  138   6703
32768  4096    1024  16384    36945   73   3547
32768  4096    2048  16384    50102   53   2616
Ouch. It's not just slow, it's also very inconsistent. I have no idea what to make of that.
Note 1: I did not mention rotational speed for the WD Green disks, because Western Digital themselves do not specify one; the spec sheet just says “IntelliPower”. Not sure what to make of that, either. Tom's Hardware contradict themselves, saying in one review that it means 5,400, and in another that it means it varies. Meanwhile, my supplier claim the WD20EARS rotates at 7,200 rpm. Go figure.
Note 2: I also have a 1 TB WD10EARS, but I haven't tested it yet. I expect it to perform pretty much as well (or as poorly, depending on your perspective) as the WD20EARS.
Update: the results for the WD10EARS are in. Strangely, it is much faster at unaligned writes than the WD20EARS, although it's a little slower at aligned writes.
count  size  offset   step     msec  tps   kBps
32768  4096       0  16384    23105  116   5672
32768  4096     512  16384    79285   34   1653
32768  4096    1024  16384    75814   35   1728
32768  4096    2048  16384    79920   33   1640
A naïve sequential-write benchmark (diskinfo -t) suggests that the WD10EARS is about 20% slower overall than the WD20EARS. It is possible that both disks use a striped layout internally, so the WD20EARS gets better results because it has more platters. If that is the case, it should be possible to modify phybs, my benchmark utility, to detect the stripe size.