Cloudy Performance in Cloud

I like GoCD and use it quite a lot. Yesterday I decided to migrate one of my build machines from home to DigitalOcean. While doing so, I pulled a newer version of the gocd-server docker image and ran into issues: it took up to 20 minutes to start a vanilla, all-clean Go server. I got in contact with the GoCD developers, and while startup usually took more like 5-6 minutes, that was still very slow. My laptop spun the service up in about a minute.

And so I went on a quick “benchmark” run, based on this simple example:

How fast can a machine start the gocd/gocd-server docker image, and have the UI show up in a browser?
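A minimal sketch of how such a measurement could be scripted (an illustration, not the procedure behind the results here). It assumes the server answers on GoCD's default port 8153 under /go/ once the UI is up. The helper name wait_until is hypothetical, and the 1200-second limit is simply the 20 minutes mentioned above.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND about once per second until it
# succeeds, then prints the elapsed whole seconds. Returns 1 if
# TIMEOUT_SECONDS pass without a success.
wait_until() {
    wu_timeout=$1
    shift
    wu_start=$(date +%s)
    while ! "$@" >/dev/null 2>&1; do
        if [ $(( $(date +%s) - $wu_start )) -ge "$wu_timeout" ]; then
            return 1
        fi
        sleep 1
    done
    echo $(( $(date +%s) - $wu_start ))
}

# Example use (assumes Docker is installed and port 8153 is free):
#   docker run -d -p 8153:8153 gocd/gocd-server
#   wait_until 1200 curl -fsS -o /dev/null http://localhost:8153/go/
```

Pulling the image before starting the clock keeps download time out of the measurement, so the number reflects startup only.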

I tried this:

  • DigitalOcean: 2 GB, 4 GB and 8 GB droplet sizes, and they all performed about equally “good”. “Good” in this case means bad: they have the worst disk IO performance I’ve ever seen. Too bad, since they so expressly advertise that all their machines use SSDs …
  • Bare metal: my three-year-old ThinkPad T410 with a fairly decent SSD. Booted well in under 2 minutes.
  • Amazon EC2 (Xen), tried t2.medium and m4.xlarge. Both outperformed my laptop.
  • GleSYS with VMware, 4 GB RAM, 2 cores. This performed about as well as my laptop. Just under a minute here too.

Lesson learned: never use DigitalOcean to demo anything that touches disk.

UPDATE: The GoCD staff helped me realize that entropy starvation might be the issue, so I added the following to my docker-compose.yml file:

GO_SERVER_SYSTEM_PROPERTIES: -Djava.security.egd=file:/dev/./urandom

This makes the JVM seed its random number generator from the non-blocking /dev/urandom. With this change in place, I spun up 20 machines for a demo/workshop with Foss Gbg without any issues whatsoever. Disk IO seems to be fairly good after all (which it should be)!
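To check whether a machine really is short on entropy, the kernel's own estimate can be read from /proc. A minimal sketch, assuming a Linux host. The helper name entropy_is_low is hypothetical, and the threshold is an arbitrary value you pass in, not a documented limit. How meaningful the number is depends on the kernel version.

```shell
#!/bin/sh
# entropy_is_low THRESHOLD [FILE]
# Hypothetical helper: succeeds (exit 0) when the kernel's entropy estimate
# is below THRESHOLD. FILE defaults to the Linux proc entry and is only a
# parameter so the function can be tried against a plain file.
entropy_is_low() {
    el_threshold=$1
    el_file=${2:-/proc/sys/kernel/random/entropy_avail}
    el_value=$(cat "$el_file") || return 2
    [ "$el_value" -lt "$el_threshold" ]
}

# Example use on the Docker host, with an arbitrary threshold of 200:
#   entropy_is_low 200 && echo "entropy pool looks starved"
```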

I also tried OpenVZ with GleSYS, but docker wouldn’t run on it - the installation step failed saying “we don’t support this kernel”. I didn’t dig into details.

Network

One thing that also became obvious was how differently the providers’ networks were set up. They all host mirrors of Ubuntu’s packages (I used 14.04 since that was available at every site, but also tried 15.04 at DO to see if it was an OS-related thing), so I couldn’t perceive any big differences when installing packages. But pulling the docker image from Docker Hub, on the other hand, really showed what “100 Mbit/s” downstream meant. Without any numbers, these are the results:

  • Amazon - so much faster, even for the t2.medium
  • DigitalOcean - decent, about the same as I get at work and at home
  • GleSYS - I used the London site and I really had to wait a long time for things to get pulled

Conclusion

This is in no way scientific, but it is based on repeating the same process a couple of times at each site.

This is what I used:

curl -sSL http://bootstrap.wendt.io/gocd.wendt.io.sh | sh

Findings

  • GoCD is really disk intensive when starting up the first time
  • DigitalOcean has really poor disk IO (at least in Amsterdam and Frankfurt)
  • GleSYS network (in London) performed badly, and it’s a pity that you can’t use SSH keys for authentication after booting a new machine
  • The Ubuntu image at AWS includes curl from the get-go; none of the others do
  • The people over at ThoughtWorks working on GoCD are really helpful and nice to work with :-)
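One rough way to look at raw disk write speed separately from GoCD's startup is a synced dd write. This is only a sketch, assuming GNU dd (as shipped with Ubuntu). The helper name disk_write_test is hypothetical, and a single sequential write says little about the small, scattered IO a server does on first start.

```shell
#!/bin/sh
# disk_write_test [DIR] [MEGABYTES]
# Hypothetical helper: writes MEGABYTES of zeros to a temporary file in DIR,
# flushing to disk before dd exits (conv=fsync), and prints dd's summary
# line, which includes the throughput on GNU dd.
disk_write_test() {
    dw_dir=${1:-.}
    dw_mb=${2:-256}
    dw_file="$dw_dir/disk_write_test.$$"
    dd if=/dev/zero of="$dw_file" bs=1048576 count="$dw_mb" conv=fsync 2>&1 | tail -n 1
    rm -f "$dw_file"
}

# Example use: write 256 MiB into the current directory
#   disk_write_test . 256
```

Running it a few times on each machine, alongside the entropy fix from the update, helps tell slow disks apart from a starved random pool.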

Some final notes for myself:

  • I mostly used 8 GB droplets in AMS2. The really slow startups came from the 2 GB droplets in FRA1.

This work by Fredrik Wendt is licensed under CC by-sa.