
Arjan Schaaf

Linux, AWS, Azure, Docker & Java rock my world. But far more important: I'm a husband and father of two. Love to BBQ and I'm famous in my world for my burgers & ribs :-)


Network performance

Working on the Amdatu RTI project, I’ve been busy with technologies like Docker, Azure, CoreOS and Weave lately. All very cool stuff, often bleeding edge :-) Which is all fine and dandy, but I often get the question from my colleagues: and how does it all perform, Arjan?

Running our first performance tests, we noticed both in the test results and in our New Relic monitoring that network performance seemed to be a limiting factor. So the first question we asked ourselves was: what kind of network performance does Azure offer with the different VM instance types?

Information on that subject is hard to find; not much more than this presentation came up: http://www.slideshare.net/KristofRennen/windows-azure-virtual-machines-and-virtual-networks

The second question we asked ourselves: what about Weave? How much does it affect performance compared to the native network interface? Weave has gotten a bad rep on the interwebs lately concerning raw performance, but we wanted to find out for ourselves what the impact is when using Weave for communication between containers on multiple CoreOS hosts in our Azure cluster.

A number of other posts on the web use the qperf tool to test bandwidth and network latency. http://blog.weave.works/2015/06/12/weave-fast-datapath/

http://www.admon.org/networking/qperf-measure-ip-networking-performance/

The most helpful article I could find was this post on Weave performance: http://www.generictestdomain.net/docker/weave/networking/stupidity/2015/04/05/weave-is-kinda-slow/

But I couldn’t get the weave / docker run command to work, so I created my own basic container image to run qperf on our CoreOS servers running on Azure:

https://registry.hub.docker.com/u/arjanschaaf/centos-qperf/
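
I haven't included the exact Dockerfile here, but a minimal sketch of such an image could look like this (treat it as an assumption on my part, not necessarily the Dockerfile behind the image above; it assumes qperf is available via the EPEL repository on CentOS 7):

FROM centos:7
# qperf is packaged in EPEL, so enable that repository first
RUN yum install -y epel-release && yum install -y qperf && yum clean all
# the control port (4000) and data port (4001) used in the examples below
EXPOSE 4000 4001
# pass all docker run arguments straight through to qperf
ENTRYPOINT ["qperf"]

With qperf as the entrypoint, everything after the image name in the docker run commands below is passed directly to qperf.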

Run a qperf performance test over the native CoreOS network stack

To run the qperf test you need two hosts: one acting as the server and one as the client. We are running 2 CoreOS nodes in one Azure Affinity Group to maximize performance.

Start a server on the first node:

sudo docker run -dti -p 4000:4000 -p 4001:4001 arjanschaaf/centos-qperf -lp 4000

Then connect the client to the host IP address of the first node, which runs the server:

sudo docker run -ti --rm arjanschaaf/centos-qperf <server host ip address> -lp 4000 -ip 4001 tcp_bw tcp_lat conf
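
For example, if the server node's private IP address were 10.0.0.4 (a purely hypothetical address), the client invocation would be:

sudo docker run -ti --rm arjanschaaf/centos-qperf 10.0.0.4 -lp 4000 -ip 4001 tcp_bw tcp_lat conf

Here -lp is the qperf control (listen) port, -ip is the port used for the actual socket tests, tcp_bw and tcp_lat select the TCP bandwidth and latency tests, and conf prints the configuration of both endpoints.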

Run a qperf performance test over the Weave network stack

Running the qperf server/client combo on Weave is very similar, but make sure your qperf server container gets a Weave IP address assigned and connect the client to that Weave IP address instead of to the IP address of the server host. I'm using the Weave proxy and set the DOCKER_HOST environment variable:

export DOCKER_HOST=tcp://127.0.0.1:12375
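
This assumes the Weave router and proxy are already running on both nodes; with a Weave 1.0.x installation that typically boils down to something like the following (the exact invocation depends on your setup, so treat this as a sketch):

weave launch <ip address of the other CoreOS node>
weave launch-proxy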

Start a server on the first node:

sudo docker run -dti -p 4000:4000 -p 4001:4001 arjanschaaf/centos-qperf -lp 4000
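
To find out which Weave IP address was assigned to that container, you can for instance use the weave script, or inspect the ethwe interface inside the container (both assume a standard Weave setup and that the ip tool is available inside the container):

weave ps
sudo docker exec <qperf server container id> ip addr show ethwe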

Then connect the client to the Weave IP address of the qperf server container on the first node:

sudo docker run -ti --rm arjanschaaf/centos-qperf <server container weave ip address> -lp 4000 -ip 4001 tcp_bw tcp_lat conf

Results

The results below are the average of 3 separate runs.

Azure VM type | network interface | native qperf bandwidth | native qperf latency | weave qperf bandwidth | weave qperf latency | weave bandwidth compared to native
------------- | ----------------- | ---------------------- | -------------------- | --------------------- | ------------------- | ----------------------------------
A0 | 5 Mbps | 590 KB/sec | 289 us | 557 KB/sec | 419 us | -16%
A1 | 100 Mbps | 12.5 MB/sec | 483 us | 9.11 MB/sec | 483 us | -27%
A2 | 200 Mbps | 24.8 MB/sec | 426 us | 10.3 MB/sec | 557 us | -58%
A3 | 400 Mbps | 48 MB/sec | 465 us | 12 MB/sec | 483 us | -75%
A8 | not documented | 310 MB/sec | 94.2 us | 51 MB/sec | 188 us | -83%
D1 | not documented | 56.3 MB/sec | 276 us | 17 MB/sec | 456 us | -69%
D2 | not documented | 117 MB/sec | 178 us | 24.6 MB/sec | 446 us | -79%

Conclusion

Azure

Ignore the A0 instance: its performance (not just the network) is not really usable in production environments. It's good to see that the A1 - A3 instances scale linearly and stay close to their specification. The D-series instances look very promising: a D1 instance is about as expensive as an A2 instance, but offers superior network performance. The A8 (and A9) instances with their 40 Gbit/s InfiniBand network are in a league of their own, but that performance comes at a hefty price.

Weave

Boy, does the performance of Weave (version 1.0.1) disappoint. I wasn't expecting near-native performance; an overlay network like Weave comes with a certain overhead. But the current performance is so poor that Weave in its current state is not really usable for us. As explained at http://www.generictestdomain.net/docker/weave/networking/stupidity/2015/04/05/weave-is-kinda-slow/, the poor performance of Weave is mainly due to its UDP-based implementation.

VXLAN-based encapsulation, like the Flannel VXLAN implementation and the beta implementation of Weave VXLAN encapsulation (http://blog.weave.works/2015/06/12/weave-fast-datapath/), should offer superior performance. We are planning to include both the Weave VXLAN implementation and Flannel in our tests. Keep an eye on this blog and the blog of Paul Bakker for any updates on our journey!

What’s next

  • Flannel UDP and VXLAN performance tests on Azure
  • Weave VXLAN performance tests
  • Comparison between Azure and Amazon VM network performance: where to get the biggest bang for the buck