is OpenBSD 10x faster than Linux?
Here’s a little benchmark, compliments of Jann Horn. It’s unexpectedly slow on Linux.
OpenBSD is so fast that I had to modify the program slightly to time itself, as the time utility lacks the precision to record anything but zero.
All it does is create one extra thread, then both threads create 256 sockets each. What’s so hard about that?
#include <pthread.h>
#include <unistd.h>
#include <err.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/socket.h>

/* Open 256 sockets and leave them open. */
static void open_sockets(void) {
    for (int i=0; i<256; i++) {
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock == -1)
            err(1, "socket");
    }
}

static void *thread_fn(void *dummy) {
    open_sockets();
    return NULL;
}

int main(int argc) {
    struct timeval one, two;
    gettimeofday(&one, NULL);
    if (argc > 1)
        dup2(0, 666);
    pthread_t thread;
    if (pthread_create(&thread, NULL, thread_fn, NULL))
        errx(1, "pthread_create");
    open_sockets();
    if (pthread_join(thread, NULL))
        errx(1, "pthread_join");
    gettimeofday(&two, NULL);
    timersub(&two, &one, &one);
    /* casts: the widths of tv_sec and tv_usec differ between platforms */
    printf("elapsed: %lld.%06lds\n", (long long)one.tv_sec, (long)one.tv_usec);
    return 0;
}
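The program times itself with gettimeofday and timersub. As an alternative sketch, not what the benchmark itself uses, the same kind of measurement can be taken with clock_gettime and CLOCK_MONOTONIC, which is not affected if the wall clock gets adjusted mid-run. The helper names now_monotonic and elapsed_seconds are made up for this example.

```c
/* Feature-test macro so clock_gettime is declared under strict -std= modes. */
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Read the monotonic clock. The return value of clock_gettime is
 * ignored here for brevity; a real program should check it. */
struct timespec now_monotonic(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts;
}

/* Return (end - start) in seconds as a double. */
double elapsed_seconds(struct timespec start, struct timespec end) {
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

To use it, call now_monotonic() before and after the work being measured and print elapsed_seconds(before, after) with something like "%.6f".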
On Linux, I get results roughly like this:
tedu@penguin:~$ ./a.out
elapsed: 0.017770s
tedu@penguin:~$ ./a.out
elapsed: 0.026309s
tedu@penguin:~$ ./a.out
elapsed: 0.018414s
On OpenBSD, here we go, choo choo:
ox$ ./a.out
a.out: a.out: socketsocket: : Too many open files
Too many open files
ox$ ulimit -n 1024
ox$ ./a.out
elapsed: 0.006096s
ox$ ./a.out
elapsed: 0.002508s
ox$ ./a.out
elapsed: 0.002326s
These aren’t identical machines, but roughly comparable.
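The first OpenBSD run failed because the program needs more descriptors than the shell's soft limit allowed: two threads times 256 sockets, plus the usual three for stdin, stdout, and stderr, comes to 515. Running ulimit -n 1024 fixed that from the shell. A rough sketch of doing the same from inside a program, using getrlimit and setrlimit, could look like this. The function names fds_needed and ensure_fd_limit are hypothetical, and the count of three pre-opened descriptors is an assumption about how the program was started.

```c
#include <sys/resource.h>

/* Descriptors needed: every thread's sockets plus an assumed three
 * already-open standard streams. */
long fds_needed(long threads, long sockets_per_thread) {
    return threads * sockets_per_thread + 3;
}

/* Make sure the soft RLIMIT_NOFILE is at least `wanted`.
 * Returns 0 on success, -1 if the limit cannot be read or raised. */
int ensure_fd_limit(rlim_t wanted) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;                       /* already high enough */
    if (rl.rlim_max != RLIM_INFINITY && rl.rlim_max < wanted)
        return -1;                      /* hard limit is in the way */
    rl.rlim_cur = wanted;               /* raise only the soft limit */
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

Calling ensure_fd_limit(1024) at the top of main would stand in for the ulimit -n 1024 step, as long as the hard limit permits it.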
There’s a hint in the code (nothing to do with networking code, if that was your first guess), with more explanation in the linked thread, which is worth a read and some thought. I’d love to see the system and benchmark where Linux outperforms here.
Really, I just found it a little funny. Usually it’s the weirdo benchmark that shows OpenBSD being 10x slower, so this one is definitely going in the collection.
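For anyone who wants to poke at the hint: the dup2(0, 666) line only runs when the program is given an argument, and it runs before the second thread exists. Here is a minimal sketch of that idea as a standalone helper. The name reserve_high_fd is hypothetical, and the premise that a kernel can grow a process's descriptor table more cheaply while it is still single-threaded is an assumption to check against the linked thread, not something established here.

```c
#include <unistd.h>

/* Duplicate `src` onto descriptor number `high`, so that the descriptor
 * table already covers `high` afterwards. Meant to be called before any
 * other thread is created. Returns `high` on success, -1 on error.
 * The duplicate is left open, as in the benchmark. */
int reserve_high_fd(int src, int high) {
    return dup2(src, high);
}
```

Comparing ./a.out against ./a.out x on a Linux machine is the quickest way to see whether the dup2 call makes a difference there.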
Jann Horn (@jann@infosec.exchange)
Linux kernel quiz: Why is this program so slow and takes around 50ms to run? What line do you have to add to make it run in ~3ms instead without interfering with what this program does?