ivoras’ FreeBSD blog

September 29, 2007

How slow is VMWare (Server)?

Filed under: FreeBSD — ivoras @ 9:47 pm

VMWare is slow. But how slow is it? Here are two runs of benchmarks/unixbench on the same machine: the first in a guest under VMWare Server 1.0 on Windows XP, the second under the native OS.

Here are the results on VMWare:


INDEX VALUES
TEST                                      BASELINE     RESULT   INDEX

Dhrystone 2 using register variables      116700.0  6330202.6   542.4
Double-Precision Whetstone                    55.0     1606.8   292.1
Execl Throughput                              43.0      468.4   108.9
File Copy 1024 bufsize 2000 maxblocks       3960.0    36722.0    92.7
File Copy 256 bufsize 500 maxblocks         1655.0    11696.0    70.7
File Copy 4096 bufsize 8000 maxblocks       5800.0    49643.0    85.6
Pipe Throughput                            12440.0    95945.5    77.1
Pipe-based Context Switching                4000.0    21320.3    53.3
Process Creation                             126.0     1209.9    96.0
Shell Scripts (8 concurrent)                   6.0        1.0     1.7
System Call Overhead                       15000.0    47093.0    31.4
                                                            =========
FINAL SCORE                                                      70.1
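For readers wondering how the INDEX and FINAL SCORE columns relate to the raw results, here is a minimal Python sketch. It rests on two assumptions inferred from the numbers in the table rather than from the unixbench source: that each index is result / baseline * 10, and that the final score is the geometric mean of the indexes. The function names index and final_score are made up for this illustration.

```python
from math import exp, log

# (baseline, result) pairs copied from the VMWare Server table above.
vmware_server = {
    "Dhrystone 2 using register variables": (116700.0, 6330202.6),
    "Double-Precision Whetstone": (55.0, 1606.8),
    "Execl Throughput": (43.0, 468.4),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 36722.0),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 11696.0),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 49643.0),
    "Pipe Throughput": (12440.0, 95945.5),
    "Pipe-based Context Switching": (4000.0, 21320.3),
    "Process Creation": (126.0, 1209.9),
    "Shell Scripts (8 concurrent)": (6.0, 1.0),
    "System Call Overhead": (15000.0, 47093.0),
}


def index(baseline, result):
    """Assumed formula: the result relative to the baseline, scaled by 10."""
    return result / baseline * 10


def final_score(indexes):
    """Assumed formula: geometric mean of the per-test indexes."""
    values = list(indexes)
    return exp(sum(log(v) for v in values) / len(values))


indexes = {name: index(b, r) for name, (b, r) in vmware_server.items()}
for name, value in indexes.items():
    print(f"{name:<40}{value:8.1f}")
print(f"{'FINAL SCORE':<40}{final_score(indexes.values()):8.1f}")
```

If both assumptions hold, the printed values should land within rounding of the table. Because the score is a geometric mean, a single very low index (here the shell scripts test) pulls the final score down much harder than it would in a plain average.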

And here on the raw hardware:

INDEX VALUES
TEST                                      BASELINE     RESULT   INDEX

Dhrystone 2 using register variables      116700.0  6467105.1   554.2
Double-Precision Whetstone                    55.0     1633.7   297.0
Execl Throughput                              43.0     2030.9   472.3
File Copy 1024 bufsize 2000 maxblocks       3960.0    63783.0   161.1
File Copy 256 bufsize 500 maxblocks         1655.0    57489.0   347.4
File Copy 4096 bufsize 8000 maxblocks       5800.0    53476.0    92.2
Pipe Throughput                            12440.0   930715.9   748.2
Pipe-based Context Switching                4000.0   204248.8   510.6
Process Creation                             126.0     5373.3   426.5
Shell Scripts (8 concurrent)                   6.0      563.7   939.5
System Call Overhead                       15000.0   720641.0   480.4
                                                            =========
FINAL SCORE                                                     387.4

Both systems run FreeBSD 7-CURRENT with debugging disabled. The results are not 100% comparable since the VMWare guest ran without SMP, but in this benchmark SMP benefits only the “shell scripts” results (parallel execution); the other results are either comparable or negatively influenced by SMP (the CPU is a dual-core Athlon 64 in i386 mode).

Draw your own conclusions, but I consider the IO and context switch performance so bad that they make the whole system unusable in production (at least where performance is important).
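To put numbers on that, here is a small Python sketch that divides each native index by the corresponding VMWare Server index and ranks the tests by slowdown. The index values are copied from the two tables above; the slowdown helper and the ranking are only an illustration, not part of unixbench.

```python
# INDEX columns copied from the two tables above:
# test name -> (VMWare Server guest, native hardware)
indexes = {
    "Dhrystone 2 using register variables": (542.4, 554.2),
    "Double-Precision Whetstone": (292.1, 297.0),
    "Execl Throughput": (108.9, 472.3),
    "File Copy 1024 bufsize 2000 maxblocks": (92.7, 161.1),
    "File Copy 256 bufsize 500 maxblocks": (70.7, 347.4),
    "File Copy 4096 bufsize 8000 maxblocks": (85.6, 92.2),
    "Pipe Throughput": (77.1, 748.2),
    "Pipe-based Context Switching": (53.3, 510.6),
    "Process Creation": (96.0, 426.5),
    "Shell Scripts (8 concurrent)": (1.7, 939.5),
    "System Call Overhead": (31.4, 480.4),
}


def slowdown(guest, native):
    """How many times slower the guest is than native (1.0 means no loss)."""
    return native / guest


# Worst-hit tests first.
ranked = sorted(indexes.items(), key=lambda kv: slowdown(*kv[1]), reverse=True)
for name, (guest, native) in ranked:
    print(f"{name:<40}{slowdown(guest, native):8.1f}x")
```

The pure CPU tests (Dhrystone, Whetstone) come out close to 1x, while system calls, pipes and context switching are roughly an order of magnitude slower. The shell scripts figure is exaggerated by the SMP difference noted above, so it should not be read as pure virtualization overhead.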

Update:

In defense of VMWare, I’ve run unixbench on a VMWare ESX3 server (though on a system not at all comparable to the one in the above benchmarks: a 3 GHz Xeon from the NetBurst era, running FreeBSD 6.2-RELEASE as a guest) and the results are better:


INDEX VALUES
TEST                                      BASELINE     RESULT   INDEX

Dhrystone 2 using register variables      116700.0  5113310.0   438.2
Double-Precision Whetstone                    55.0      935.0   170.0
Execl Throughput                              43.0      555.5   129.2
File Copy 1024 bufsize 2000 maxblocks       3960.0    55662.0   140.6
File Copy 256 bufsize 500 maxblocks         1655.0    17818.0   107.7
File Copy 4096 bufsize 8000 maxblocks       5800.0    66604.0   114.8
Pipe Throughput                            12440.0   132556.6   106.6
Pipe-based Context Switching                4000.0    18074.1    45.2
Process Creation                             126.0     1414.9   112.3
Shell Scripts (8 concurrent)                   6.0      130.7   217.8
System Call Overhead                       15000.0    62919.9    41.9
                                                            =========
FINAL SCORE                                                     121.2

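I still wouldn’t use it where performance is important, but at least these results look half-usable. The major improvement is in parallel execution (shell scripts), with smaller gains in file copy and pipe throughput; pipe-based context switching is actually slightly lower here (45.2 vs. 53.3).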

Second update:

Here’s the same setup as in the first VMWare Server benchmark (same machine, Windows XP host, 7-CURRENT), with QEmu+kqemu (kernel+user code acceleration):


INDEX VALUES
TEST                                      BASELINE     RESULT   INDEX

Dhrystone 2 using register variables      116700.0  5456588.4   467.6
Double-Precision Whetstone                    55.0     1492.1   271.3
Execl Throughput                              43.0      166.5    38.7
File Copy 1024 bufsize 2000 maxblocks       3960.0    13744.0    34.7
File Copy 256 bufsize 500 maxblocks         1655.0     4426.0    26.7
File Copy 4096 bufsize 8000 maxblocks       5800.0    23832.0    41.1
Pipe Throughput                            12440.0    23079.7    18.6
Pipe-based Context Switching                4000.0     2159.5     5.4
Process Creation                             126.0      409.8    32.5
Shell Scripts (8 concurrent)                   6.0        8.6    14.3
System Call Overhead                       15000.0     9728.4     6.5
                                                            =========
FINAL SCORE                                                      33.3

Compared to this, VMWare Server doesn’t look bad at all :(
