Monday, May 23, 2011

PyPy Genova-Pegli Post-EuroPython Sprint June 27 - July 2 2011

The next PyPy sprint will be in Genova-Pegli, Italy, the week after EuroPython (which is in Florence, about 3h away by train). This is a fully public sprint: newcomers and topics other than those proposed below are welcome.

Goals and topics of the sprint

  • Now that we have released 1.5, the sprint will mainly focus on fixing issues reported by users. Possible topics include, but are not limited to:
    • fixing issues in the bug tracker
    • improve cpyext, the C-API compatibility layer, to support more extension modules
    • finish/improve/merge jitypes2, the branch which makes ctypes JIT friendly
    • general JIT improvements
    • improve our tools, like the jitviewer or the buildbot infrastructure
    • make your favorite module/application working on PyPy, if it doesn't yet
  • Of course this does not prevent people from showing up with a more precise interest in mind. If there are newcomers, we will gladly give introductory talks.
  • Since we are almost on the beach, we can take one day off for summer relaxation and/or tourist visits nearby :-).

Exact times

The work days should be 27 June - 2 July 2011. People may arrive on the 26th already and/or leave on the 3rd.

Location & Accommodation

Both the sprint venue and the lodging will be at Albergo Puppo in Genova-Pegli, Italy. Pegli is a nice and peaceful little quarter of Genova, and the hotel is directly on the beach, making it a perfect place for those who want to enjoy the sea in the middle of the Italian summer, as a quick search on Google Images shows :-)

The place has a good ADSL Internet connection with wireless installed. You can of course arrange your own lodging anywhere, but I definitely recommend lodging there too.
Please confirm that you are coming so that we can adjust the reservations as appropriate. The prices are as follows, and they include breakfast and a parking place for the car, in case you need it:
  • single room: 70 €
  • double room: 95 €
  • triple room: 105 €
Please register by hg:
http://bitbucket.org/pypy/extradoc/src/extradoc/sprintinfo/genova-pegli-2011/people.txt
or on the pypy-dev mailing list if you do not yet have check-in rights:
http://mail.python.org/mailman/listinfo/pypy-dev
In case you want to share a room with someone else but you don't know who, please let us know (either by writing it directly in people.txt or by writing on the mailing list) and we will try to arrange it.

Monday, May 16, 2011

PyPy Usage Survey

We've been working on PyPy for a long time. But readers of this blog will know that in the past year something has changed: we think PyPy is production ready. And it's not just us: this week LWN.net wrote an article about how PyPy sped up one of their scripts by a factor of three, noting that "plans are to run gitdm under PyPy from here on out". All in all we think PyPy is pretty great, but not everyone is using it yet, and we want to know why. We want your feedback on why PyPy isn't ready to be your only Python yet, and how we can improve it to make that happen.

Therefore, we've put together a quick survey. Whether you're using PyPy or not, we'd really appreciate it if you could take a few minutes to fill it out and let us know how we're doing. You can find the form here.

Thanks, The PyPy team

Sunday, May 15, 2011

Server migration in progress

Hi all,

We are in the process of migrating the hosting machine for PyPy, moving away from codespeak.net and towards a mixture of custom servers (e.g. for buildbot.pypy.org) and wide-scale services (e.g. for the docs, now at readthedocs.org).

When this is done, a proper announcement will be posted here. In the meantime, we have already moved the mailing lists, which are now hosted on python.org. The subscriber lists have been copied, so if you haven't noticed anything special in the past week, then everything works fine :-) This concerns pypy-dev, pypy-issue and pypy-commit. Two notes:

  • Some settings have not been copied, notably if you used to disable mail delivery. Sorry about that; you have to re-enter such settings.
  • Following the move, about 50 addresses have been dropped for being invalid. I'm unsure why they were not dropped earlier, but in case sending mail to you from python.org instead of codespeak.net fails, then you have been dropped from the mailing lists, and you need to subscribe again.

Wednesday, May 11, 2011

Playing with Linear Programming on PyPy

Fancy high-level interfaces often come with a high runtime overhead, making them slow. Here is an experiment with building such an interface using constructions that PyPy should be good at optimizing. The idea is to allow the JIT in PyPy to remove the overhead introduced by using a fancy high-level Python interface on top of a low-level C interface. The application considered is linear programming, a tool used to solve linear optimization problems. It can, for example, be used to find the nonnegative values x, y and z that give the maximum value of

    10x + 6y + 4z

without violating the constraints

    x + y + z <= 100
    10x + 4y + 5z <= 600
    2x + 2y + 6z <= 300

There exist general-purpose solvers for these kinds of problems that are very fast and can literally handle millions of variables. To use them, however, the problem has to be transformed into some specific matrix form, and the coefficients of all the matrices have to be passed to the solver using some API. This transformation is a tedious and error-prone step that forces you to work with matrix indexes instead of readable variable names. It also makes maintaining an implementation hard, since any modification has to be transformed too.

The example above comes from the manual of the glpk library. That manual continues by describing how to convert this problem into the standard form of glpk (which involves introducing three new variables) and then gives the C code needed to call the library. Relating that C code to the problem above without the intermediate explanation of the manual is not easy. A common solution here is to build a high-level interface that allows a more natural way of defining the matrices and/or allows the equations to be entered symbolically. Unfortunately, such interfaces often become slow. For the benchmark below, for example, cvxopt requires 20 minutes to set up a problem that takes 9.43 seconds to solve (this seems a bit extreme, am I doing something wrong?).
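
To make the "matrix form" concrete, here is a minimal sketch of the example problem written as plain coefficient lists. This is not the standard form that glpk expects (which, as noted, introduces extra variables); the names c, A and b and the two helper functions are assumptions made purely for illustration. The candidate point is simply the one where the first two constraints are tight.

```python
from fractions import Fraction

# maximize c.x  subject to  A.x <= b  and  x >= 0
# (illustrative layout only, not glpk's own input format)
c = [10, 6, 4]
A = [
    [1, 1, 1],
    [10, 4, 5],
    [2, 2, 6],
]
b = [100, 600, 300]


def dot(row, point):
    """Inner product of one coefficient row with a point (x, y, z)."""
    return sum(coeff * value for coeff, value in zip(row, point))


def objective(point):
    return dot(c, point)


def is_feasible(point):
    """True if the point is nonnegative and satisfies every row of A.x <= b."""
    if any(value < 0 for value in point):
        return False
    return all(dot(row, point) <= bound for row, bound in zip(A, b))


# Exact fractions avoid rounding noise when a constraint is exactly tight.
candidate = [Fraction(100, 3), Fraction(200, 3), Fraction(0)]
print(is_feasible(candidate))
print(float(objective(candidate)))
```

Note how the rows of A carry no variable names at all: adding a variable or reordering constraints means renumbering indexes by hand, which is exactly the maintenance burden described above.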

The high-level interface I constructed on top of the glpk library is pplp and it allows the equations to be entered symbolically. The above problem can be solved using

    lp = LinearProgram()
    x, y, z = lp.IntVar(), lp.IntVar(), lp.IntVar()
    lp.objective = 10*x + 6*y + 4*z
    lp.add_constraint( x + y + z <= 100 )
    lp.add_constraint( 10*x + 4*y + 5*z <= 600 )
    lp.add_constraint( 2*x + 2*y + 6*z <= 300 )
    lp.add_constraint( x >= 0 )
    lp.add_constraint( y >= 0 )
    lp.add_constraint( z >= 0 )

    maxval = lp.maximize()
    print maxval
    print x.value, y.value, z.value

To benchmark the API I used it to solve a minimum-cost flow problem with 154072 nodes and 390334 arcs. The C library needs 9.43 s to solve this and the pplp interface adds another 5.89 s under PyPy and 28.17 s under CPython. A large amount of time is still spent setting up the problem, but it's a significant improvement over the 20 minutes required on CPython by cvxopt, which is probably not designed to be fast on this kind of benchmark. I have not been able to get cvxopt to work under PyPy. The benchmark used is available here.
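
For readers wondering how equations can be "entered symbolically" at all, the following is a minimal sketch of the general operator-overloading technique. It is not pplp's actual implementation: the Expr and Constraint classes and the var helper are hypothetical names invented for this illustration, and a real interface would also have to hand the collected coefficients to the solver.

```python
class Expr(object):
    """A linear expression: sum(coeff * variable) + constant."""

    def __init__(self, coeffs=None, const=0):
        self.coeffs = dict(coeffs or {})   # variable name -> coefficient
        self.const = const

    @staticmethod
    def _wrap(other):
        # Plain numbers become constant expressions.
        return other if isinstance(other, Expr) else Expr(const=other)

    def __add__(self, other):
        other = Expr._wrap(other)
        merged = dict(self.coeffs)
        for name, coeff in other.coeffs.items():
            merged[name] = merged.get(name, 0) + coeff
        return Expr(merged, self.const + other.const)

    __radd__ = __add__

    def __mul__(self, factor):
        # Only scaling by a plain number keeps the expression linear.
        if isinstance(factor, Expr):
            raise TypeError("product of two expressions is not linear")
        scaled = dict((name, coeff * factor)
                      for name, coeff in self.coeffs.items())
        return Expr(scaled, self.const * factor)

    __rmul__ = __mul__

    def __le__(self, bound):
        return Constraint(self, "<=", bound)

    def __ge__(self, bound):
        return Constraint(self, ">=", bound)


class Constraint(object):
    """Records an expression, a comparison operator and a bound."""

    def __init__(self, expr, op, bound):
        self.expr, self.op, self.bound = expr, op, bound


def var(name):
    return Expr({name: 1})


x, y, z = var("x"), var("y"), var("z")
constraint = 10*x + 4*y + 5*z <= 600

# Recover one matrix row from the symbolic constraint.
row = [constraint.expr.coeffs.get(name, 0) for name in ("x", "y", "z")]
print(row)
```

Every arithmetic operator allocates a small temporary object here, which is the kind of overhead a tracing JIT has a chance of removing and an interpreter does not.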

Thursday, May 5, 2011

NumPy Follow up

Hi everyone. Since yesterday's blog post we got a ton of feedback, so we want to clarify a few things, as well as share some of the progress we've made, in only the 24 hours since the post.

Reusing the original NumPy

First, a lot of people have asked why we cannot just reuse the original NumPy through cpyext, our CPython C-API compatibility layer. We believe this is not the best approach, for a few reasons:

  1. cpyext is slow, and always will be slow. It has to emulate far too many details of the CPython object model that don't exist on PyPy (e.g., reference counting). Since people are using NumPy primarily for speed this would mean that even if we could have a working NumPy, no one would want to use it. Also, as soon as the execution crosses the cpyext boundary, it becomes invisible to the JIT, which means the JIT has to assume the worst and deoptimize stuff away.
  2. NumPy uses many obscure documented and undocumented details of the CPython C-API. Emulating these is often difficult or impossible (e.g. we can't fix accessing a struct field, as there's no function call for us to intercept).
  3. It's not much fun. Frankly, working on cpyext, debugging the crashes, and everything else that goes with it is not terribly fun, especially when you know that the end result will be slow. We've demonstrated we can build a much faster NumPy, in a way that's more fun, and given that the people working on this are volunteers, it's important to keep us motivated.

Finally, we are not proposing to rewrite the entirety of NumPy or, god forbid, BLAS, or any of the low-level stuff that operates on C-level arrays, only the parts that interface with Python code directly.

C bindings vs. CPython C-API

There are two issues with C code: one has a very nice story, and the other not so much. First is the case of arbitrary C code that isn't Python related: things like libsqlite, libbz2, or any random C shared library on your system. PyPy will quite happily call into these, and bindings can be developed either at the RPython level (using rffi) or in pure Python, using ctypes. Writing bindings with ctypes has the advantage that they can run on every alternative Python implementation, such as Jython and IronPython. Moreover, once we merge the jitypes2 branch, ctypes calls will even be smoking fast.
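
As a rough illustration of the pure-Python route, here is a minimal ctypes sketch that calls strlen from the C standard library. It assumes a POSIX-like system where ctypes.util.find_library can locate libc (or, failing that, where the symbol is already loaded in the process); a real binding would wrap a library such as libbz2 in the same way.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded in the process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"PyPy")
print(length)
```

Nothing in this snippet touches the CPython C-API, which is why the same code can in principle run on any implementation that provides ctypes.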

On the other hand there is the CPython C-extension API. This is a very specific API which CPython exposes, and PyPy tries to emulate. It will never be fast, because there is far too much overhead in all the emulation that needs to be done.

One of the reasons people write C extensions is for speed. Often, with PyPy you can just forget about C, write everything in pure Python and let the JIT do its magic.

In case the PyPy JIT alone isn't fast enough, or you just want to use existing C code, then it might make sense to split your C extension into two parts: one which doesn't touch the CPython C-API and thus can be loaded with ctypes and called from PyPy, and another which does the interfacing with Python for CPython (where it will be faster).

There are also libraries written in C to interface with existing C codebases, for which performance is not the largest goal. For these, the right solution is to try using cpyext. If it works, that's great; if it fails, the solution will be to rewrite using ctypes, where it will work on all Python VMs, not just CPython.

And finally, there are rare cases where rewriting in RPython makes more sense. NumPy is one of the few examples of these, because we need to be able to give the JIT hints on how to appropriately vectorize all of the operations on an array. In general, writing in RPython is not necessary for almost any library. NumPy is something of a special case because it is so ubiquitous that every ounce of speed is valuable, and because the way people use it leads to code structure where the JIT benefits enormously from extra hints and the ability to manipulate memory directly, which is not possible from Python.

Progress

On a more positive note, after we published the last post, several new people came and contributed improvements to the numpy-exp branch. We would like to thank all of them:

  • nightless_night contributed: An implementation of __len__, fixed bounds checks on __getitem__ and __setitem__.
  • brentp contributed: Subtraction and division on NumPy arrays.
  • MostAwesomeDude contributed: Multiplication on NumPy arrays.
  • hodgestar contributed: Binary operations between floats and NumPy arrays.

Those last two were technically an outstanding branch we finally merged, but hopefully you get the picture. In addition there was some exciting work done by regular PyPy contributors. I hope it's clear that there's a place to jump in for people with any level of PyPy familiarity. If you're interested in contributing please stop by #pypy on irc.freenode.net, the pypy-dev mailing list, or send us pull requests on bitbucket.

Alex

Wednesday, May 4, 2011

Numpy in PyPy - status and roadmap

Hello.

NumPy integration is one of the single most requested features for PyPy. This post tries to describe where we are, what we plan (or what we don't plan), and how you can help.

Short version for the impatient: we are doing experiments, which show that PyPy+numpy can be faster and better than CPython+numpy. We have a plan on how to move forward, but at the moment there is a lack of dedicated people or money to tackle it.

The slightly longer version

Integrating numpy in PyPy has been my pet project on an on-and-off (mostly off) basis over the past two years. There were some experiments, then a long pause, and then some more experiments which are documented below.

The general idea is not to use the existing CPython module, but to reimplement numpy in RPython (i.e. the language PyPy is implemented in), thus letting our JIT achieve extra speedups. The really cool thing about this part is that numpy will automatically benefit from any general JIT improvements, without any need for extra tweaking.

At the moment, there is a branch called numpy-exp which contains a translatable, very minimal version of numpy in a module called micronumpy. Example benchmarks show the following:

                                  add              iterate
CPython 2.6.5 with numpy 1.3.0    0.260s (1x)      4.2s (1x)
PyPy numpy-exp @ 3a9d77b789e1     0.120s (2.2x)    0.087s (48x)

The add benchmark spends most of the time inside the + operator on arrays (doing a + a + a + a + a), which in CPython is implemented in C. As you can see from the table above, the PyPy version is already ~2 times faster. (numexpr is still faster than PyPy, but we're working on it.)

The exact way array addition is implemented is worth another blog post, but in short it lazily evaluates the expression and computes it at the end, avoiding intermediate results. This approach scales much better than numexpr and can lead to speeding up all the operations that you can perform on matrices.
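
The idea of lazy evaluation can be sketched in a few lines of plain Python. This is only an illustration of the general technique and not the code in micronumpy: the Node, Array and Add classes and the force method are hypothetical names chosen for this example.

```python
class Node(object):
    """Base class: anything that can produce its i-th element on demand."""

    def __add__(self, other):
        # Build a tree node; no arithmetic happens yet.
        return Add(self, other)

    def force(self):
        # One pass over the indices; no intermediate arrays are created.
        return [self.item(i) for i in range(self.size)]


class Array(Node):
    """A concrete array holding its values."""

    def __init__(self, values):
        self.values = list(values)
        self.size = len(self.values)

    def item(self, i):
        return self.values[i]


class Add(Node):
    """A pending element-wise addition of two nodes."""

    def __init__(self, left, right):
        if left.size != right.size:
            raise ValueError("size mismatch")
        self.left, self.right = left, right
        self.size = left.size

    def item(self, i):
        return self.left.item(i) + self.right.item(i)


a = Array([1.0, 2.0, 3.0])
expr = a + a + a + a + a      # builds a tree of four Add nodes
result = expr.force()         # a single loop computes every element
print(result)
```

An eager implementation would allocate four temporary arrays for this expression; here the only list allocated is the final result. In an interpreter the recursive item calls are expensive, which is why the approach depends on a JIT to compile the whole tree down to one tight loop.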

The next obvious step to get even more speedups would be to extend the JIT to use SSE operations on x86 CPUs, which should speed it up by about an additional 2x, as well as to use multiple threads to do operations.

iterate is also interesting, but for entirely different reasons. On CPython it spends most of the time inside a Python loop; the PyPy version is ~48 times faster, because the JIT can optimize across the Python/numpy boundary. This shows the potential of this approach: users are not grossly penalized for writing their loops in Python.

The drawback of this approach is that we need to reimplement numpy in RPython, which takes time. A very rough estimate is that it would be possible to implement a useful subset of it (for some definition of useful) in one to three man-months.

It also seems that the result will be faster for most cases and the same speed as the original numpy for other cases. The only problem is finding dedicated people willing to spend quite some time on this. However, I am willing to both mentor and encourage such a person.

A good starting point for helping would be to look at what's already implemented in the micronumpy module and try extending it. Adding a - operator or adding integers would be an interesting start. Drop by #pypy on irc.freenode.net or get in contact with developers via some other channel (such as the pypy-dev mailing list) if you want to help.

Another option would be to sponsor NumPy development. In case you're interested, please get in touch with us or leave your email in comments.

Cheers,
fijal