A Comparison of Ruby Application Servers, Part 2 – Performance (engineyard.com)
83 points by jaustinhughey on June 26, 2014 | 17 comments


Phusion Passenger author here. I'd like to say: if anybody is still on Passenger 3, you should upgrade to Passenger 4. Passenger 3 is very old, over 2 years now. Passenger 4 has received tons of improvements, as documented at "Recent Technical Advances in Phusion Passenger": https://vimeo.com/85970704

Honestly, I'm surprised anybody still talks about Passenger 3 nowadays. The improvements in 4 are so huge that 3 shouldn't even be worth looking at.

Also, the tests don't seem to say how the servers are configured. Passenger's out-of-the-box configuration is designed to conserve resources, not maximize performance, because a lot of our users are on low-memory VPSes. It needs a few minor tweaks to be tuned for performance the way Unicorn is. With settings like these, it behaves almost exactly like Unicorn:

    # 'x' is whatever your Unicorn worker count should be
    passenger_max_pool_size x;
    passenger_min_instances x;
    passenger_pre_start http://yourapp.com/;

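For comparison, the equivalent Unicorn tuning lives in its config file. A minimal sketch of a `unicorn.rb` (the worker count, socket path, and timeout here are illustrative assumptions, not values from the article):

```ruby
# unicorn.rb -- minimal sketch; worker count and paths are illustrative.
worker_processes 4            # match this to passenger_max_pool_size above
preload_app true              # load the app once, then fork workers (copy-on-write)
listen "/tmp/unicorn.sock", backlog: 1024
timeout 30                    # kill workers stuck longer than 30 seconds
```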

-1 no visualization

-1 seems like the author didn't use EM-friendly sleeps & http clients when appropriate.

-1 not clear if the author used a separate, in-DC machine for load testing or his laptop, or what CPU usage looked like on the test machines during those tests. If they're sitting idle at 20% usage because of the RTT between your desk and EC2, that's not a great test. Additionally, it takes some sysctl tuning to make OSX able to sustain the request rate necessary for load testing of any kind.
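For reference, the kind of OSX tuning the parent is alluding to looks roughly like this (the specific values are illustrative; what you actually need depends on the target request rate):

```shell
# Raise open-file limits so the load generator can hold many sockets at once.
sudo sysctl -w kern.maxfiles=200000
sudo sysctl -w kern.maxfilesperproc=100000
ulimit -n 65536   # per-shell file descriptor limit for the siege process
```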


I'm not sure it's even possible to get OSX into a state where it's a reliable and useful tool for IO-based load testing. It's not 100% clear that the author used OSX for any part of this beyond the mention of installing siege through brew, but if they did, that pretty much renders the results meaningless.

There needs to be a mantra: Benchmarks are hard.


- Didn't test database apps because reasons.

- Tested Twitter instead.


Just wanted to chime in and say that even if it wasn't perfect (as already discussed ad nauseum in the comments here), I thought it was a really interesting comparison and I learned a lot from reading it. Thanks for the writeup!


It's not just that it might not be perfect: if the experiment was run with incorrect parameters, the results might be plain wrong.


Benchmarking on VMs (especially on EC2) is a no-no. It introduces a lot of other factors that can skew results drastically. There are entire companies built around the idea of variable performance in the cloud. If you want to do a decent benchmark, do it on a real piece of hardware.


Aren't most applications hosted in the cloud anyway? I understand how VMs are not ideal, but aren't they more "real-life" than "real hardware" nowadays (not a rhetorical question, genuinely curious)?


The problem with benchmarking on a VM is that VM effects may totally dominate code effects. It's sort of like using laptop speakers to mix a dubstep album.


The issue is that EC2 instances don't all deliver the same, stable performance, so each benchmark run will produce different results.

One of the key things to remember when running an experiment or benchmark is to control your variables, and EC2 performance is a variable you can't control.
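One way to make that variability visible is to compute the coefficient of variation across repeated runs of the identical benchmark. A sketch in Ruby (the latency samples are made up for illustration):

```ruby
# Hypothetical latency samples (ms) from five runs of the same benchmark.
samples = [102.0, 98.0, 240.0, 101.0, 99.0]

mean = samples.sum / samples.size
variance = samples.sum { |s| (s - mean)**2 } / samples.size
cv = Math.sqrt(variance) / mean  # coefficient of variation

puts format("mean=%.1fms cv=%.2f", mean, cv)
```

A high CV across identical runs points at the environment, not the code.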


Server configuration (for example, how many Unicorn workers was the app running?) isn't given. Without this (and given the many other testing-methodology issues pointed out in other comments), the entire article is meaningless.


I think I skimmed the first one. When I heard about Torquebox (or the upcoming Torqbox), I thought it would be interesting to try JRuby for beefier web apps to scale out better. Unfortunately, no one really seems to focus on web app server performance for JRuby in particular (skimming part 2 shows mention of JRuby, but none of this is JRuby specific).

Does anyone know where to get reliable information on using different web servers on the JRuby stack, specifically the beefier Java options?


Try the JRuby mailing lists[0] or IRC. JRuby is actually quite impressive these days, and people are using it in production with servers like Torquebox. However, benchmarking JRuby is even more skewed and complex because you have to wait for the JVM's JIT to warm up! :)
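The standard workaround is to run a warm-up phase before taking any measurements, so the JIT has already compiled the hot paths. A minimal sketch in plain Ruby (the workload and iteration counts are placeholders, not anything from the article):

```ruby
require 'benchmark'

# Placeholder workload standing in for a request handler.
def handle_request
  (1..100).reduce(:+)
end

WARMUP_ITERATIONS = 50_000  # let the JIT (on JRuby) compile hot paths first
WARMUP_ITERATIONS.times { handle_request }

# Only now take timing samples; these reflect steady-state performance.
3.times do |i|
  t = Benchmark.realtime { 50_000.times { handle_request } }
  puts format("sample %d: %.4fs", i + 1, t)
end
```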

0 - https://github.com/jruby/jruby/wiki/MailingLists


Thanks for the note. I really appreciate it.


I'd like to see Chef/Puppet/Ansible build/deploy scripts for the test boxes, and the same for the box that siege was running from, assuming it's not a personal laptop (and if it is, it shouldn't be).

Benchmarks are hard, and it's a lot easier to be confident about results if others can easily repeat the experiments.
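Even without full Chef/Puppet scripts, checking in the exact load-generator invocation goes a long way. A sketch (the URL, concurrency, and duration are placeholder values):

```shell
#!/bin/sh
# run-benchmark.sh -- pin the exact siege parameters so runs are repeatable.
# -b: benchmark mode (no delay between requests)
# -c: concurrent users; -t: test duration
siege -b -c 50 -t 60S http://test-box.example.com/sleep
```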


Where are the graphs?!?


> It didn't even make it a quarter of the way through /sleep.

My understanding of Ruby is limited to a one-hour JRuby [on Rails] demonstration I saw 10 years ago, and somehow the result above doesn't surprise me; it's very consistent with my absolutely uneducated opinion of Ruby and its ecosystem.



