
Pypy uses a crapton more memory.


Hi mike, the answer to this really depends. You should qualify such blanket statements a bit more. To be precise:

* startup memory is much higher (30M vs 5M roughly). If you have lots of very small processes, PyPy is not a good fit.

* objects are smaller (by a somewhat unspecified amount, up to 50%)

* there is a GC overhead which means peak memory will be ~30% of your total heap

* JIT occupies some memory. This is a function of the size of your code.
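One rough way to check these points on your own workload is to record the process's peak resident memory before and after building some objects, and run the same script under both interpreters. This is only a sketch: `peak_rss_mib` and `measure` are hypothetical helper names, it relies on the Unix-only `resource` module, and peak RSS includes interpreter startup and GC overhead as well as your objects.

```python
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure(workload):
    """Run workload() and return (peak MiB before, peak MiB after)."""
    before = peak_rss_mib()
    result = workload()  # keep a reference so the objects stay alive
    after = peak_rss_mib()
    del result
    return before, after


if __name__ == "__main__":
    # Hypothetical workload: many small tuples and strings.
    before, after = measure(lambda: [(i, str(i)) for i in range(500000)])
    print("startup peak: %.1f MiB, after workload: %.1f MiB" % (before, after))
```

The first number approximates the startup cost, and the difference between the two approximates what the workload added.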


You're right. I reran the tests I've been looking at, and it's probably CPU where I'm having problems with Pypy. I have a test suite that completes in about 17 minutes on cPython, and with Pypy I can't even get it to complete within two hours.

This is running the SQLAlchemy unit tests against SQLite, on an Amazon EC2 small instance via our jenkins suite at http://jenkins.sqlalchemy.org. So yes, we are dealing with more limited resources than usual. Usually when something slows to a crawl on EC2 it's because it started swapping, so I had assumed that was the issue here, but apparently it's not. SQLAlchemy is a large library with a lot of tests - 155 test_* modules. So I'd imagine pypy has lots of work to do running the JIT on all those source files, and I guess because running tests means a continuous stream of new codepaths, that means all new JIT activity for each one.

In this particular case, the two tests ran on the same server, and resource contention seems likely. Swap space remained 100% free; the two jobs shared the CPU 50/50 and once the cPython job was done, pypy's went right out to 99% and stayed there. For startup time, the cPython suite started running tests within 3 seconds, and pypy didn't get to the test suite for about one minute 40 seconds. Pypy didn't actually start running real tests, save for a series of "skipped" tests in the beginning, until the cPython job was totally finished at 17 minutes. Pypy then took all 99% of the CPU for the rest of its duration, and about an hour into it, it's just about halfway through the suite.

I'd welcome any help in debugging why the test suite here appears to be excessively slow (is it the slow sqlite module?) Otherwise, if this is just how things are with the JIT + large number of codepaths, that would be a significant caveat to pypy's speed advantage. But you're right, it wasn't memory.

update: the build on pypy took a total of 2 hours 17 minutes.
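To test the "is it the slow sqlite module?" guess in isolation, a small timing script run under both interpreters could help. This is a sketch only: `time_sqlite` is a hypothetical helper, the table and row count are made up, and it uses the standard-library `sqlite3` module directly rather than going through SQLAlchemy, so it says nothing about the library's own code paths. It also assumes a Python 3 interpreter, since it uses `time.perf_counter`.

```python
import sqlite3
import time


def time_sqlite(rows=10000):
    """Time inserting and reading back `rows` rows in an in-memory database.

    Returns (elapsed seconds, number of rows fetched).
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
        start = time.perf_counter()
        for i in range(rows):
            # One statement per row, similar to what many small tests do.
            conn.execute("INSERT INTO t (name) VALUES (?)", ("row%d" % i,))
        conn.commit()
        fetched = conn.execute("SELECT id, name FROM t").fetchall()
        elapsed = time.perf_counter() - start
    finally:
        conn.close()
    return elapsed, len(fetched)


if __name__ == "__main__":
    elapsed, count = time_sqlite()
    print("%d rows in %.3f s" % (count, elapsed))
```

If the two interpreters differ by a large factor here, the sqlite binding is a plausible suspect; if not, the slowdown more likely lies elsewhere.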


There are quite a few problems with pypy and test suites. This sounds like it's an extreme case, but sqlite is definitely very slow. How about you post this on a bugtracker, so we have a point of reference to start with?

For the record, I don't believe "this is how things with the JIT are" to start with. 17 minutes is far more than enough to warm up the JIT. It might be sqlite, it might be some code in sqlalchemy, it might be something unbelievably silly. Please start with a bug report and we can take it from there. SQLAlchemy is an important package and it would be cool to have it run fast on PyPy.



