
Python Performance - have you ever had to rewrite in something else?

Has anyone ever had Python code that turned out not to perform fast enough?

I mean, were you forced to switch to another language because of it?

We are investigating using Python for a couple of larger projects, and my feeling is that Python is plenty fast for most scenarios (compared to, say, Java) because it relies on optimized C routines.

I wanted to see if people had instances where they started out in Python, but ended up having to go with something else because of performance.

Thanks.

Dutch Masters asked Dec 22 '08 16:12




2 Answers

Yes, I have. I once wrote a row-count program for a binary (length-prefixed rather than delimited) bcp output file and ended up having to redo it in C because the Python version was too slow. The program was quite small (it only took a couple of days to rewrite in C), so I didn't bother trying to build a hybrid application (Python glue with the central routines written in C), but that would also have been a viable route.

A larger application with performance-critical bits can be written in a combination of C and a higher-level language. You can write the performance-critical parts in C with an interface to Python for the rest of the system. SWIG, Pyrex or Boost.Python (if you're using C++) all provide good mechanisms to do the plumbing for your Python interface. The C API for Python is more complex than that for Tcl or Lua, but a hand-written binding is still feasible. For an example of a hand-built Python/C API, check out cx_Oracle.
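To give a feel for the "Python glue around C routines" idea, here is a minimal sketch. Note that it uses the standard library's `ctypes` module rather than SWIG, Pyrex or Boost.Python (a swapped-in technique, chosen only because it needs no build step), and it assumes a POSIX system where the C library can be loaded at run time. The wrapper name `c_strlen` is made up for illustration.

```python
import ctypes
import ctypes.util

# Load the C standard library (assumption: a POSIX system such as Linux or macOS).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def c_strlen(data: bytes) -> int:
    """Hypothetical thin Python wrapper around the C routine."""
    return libc.strlen(data)

length = c_strlen(b"hello")
print(length)
```

In a real project the C side would be your own performance-critical routine compiled into a shared library, and the Python side would stay a thin wrapper like this one.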

This approach has been used on quite a number of successful applications going back as far as the 1970s (that I am aware of). Mozilla was substantially written in JavaScript around a core engine written in C++. Several CAD packages, Interleaf (a technical document publishing system) and of course Emacs are substantially written in Lisp with a central C, assembly language or other core. Quite a few commercial and open-source applications (e.g. Chandler or Sungard Front Arena) use embedded Python interpreters and implement substantial parts of the application in Python.

EDIT: In response to Dutch Masters' comment, keeping someone with C or C++ programming skills on the team for a Python project gives you the option of rewriting parts of the application in C for speed. The areas where you can expect a significant performance gain are those where the application does something highly iterative over a large data structure or a large volume of data. In the case of the row-counter above, it had to inhale a series of files totalling several gigabytes, reading a variable-length prefix for each field and using it to determine the length of the data that followed. Most of the fields were short (just a few bytes long). This was somewhat bit-twiddly, very low level and iterative, which made it a natural fit for C.
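For illustration, the kind of loop described above might look roughly like this in pure Python. This is only a sketch under loud assumptions: the real bcp layout is not given here, so the format below (a 2-byte little-endian length before each field, and a fixed number of fields per row) is hypothetical, as are the names `count_rows` and `make_sample`.

```python
import io
import struct

# Hypothetical layout (NOT the real bcp format): each field is a 2-byte
# little-endian unsigned length followed by that many bytes of data.
PREFIX = struct.Struct("<H")

def count_rows(stream, fields_per_row):
    """Count complete rows in a stream of length-prefixed fields."""
    fields = 0
    while True:
        header = stream.read(PREFIX.size)
        if len(header) < PREFIX.size:
            break  # end of input, or a truncated prefix
        (length,) = PREFIX.unpack(header)
        data = stream.read(length)
        if len(data) < length:
            break  # truncated field, so it is not counted
        fields += 1
    return fields // fields_per_row

def make_sample(rows):
    """Build an in-memory sample in the hypothetical layout."""
    out = bytearray()
    for row in rows:
        for field in row:
            out += PREFIX.pack(len(field)) + field
    return bytes(out)

sample = make_sample([(b"1", b"alice"), (b"2", b"bob"), (b"3", b"")])
row_count = count_rows(io.BytesIO(sample), fields_per_row=2)
print(row_count)
```

Every field costs several interpreted calls (`read`, `unpack`, `read` again), so with billions of tiny fields the per-iteration overhead dominates, which is exactly the situation where a C loop pays off.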

Many Python libraries, such as numpy, cElementTree or cStringIO, make use of an optimised C or FORTRAN core with a Python API that lets you work with data in aggregate. For example, numpy has matrix data structures and operations written in C which do all the hard work, and a Python API that provides services at the aggregate level.
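The same effect can be seen without installing anything. The sketch below is not numpy; it assumes CPython and uses the built-in `bytes.count`, which runs in C, as a stand-in for "let one aggregate call do the iteration". The function names and the sample data are made up for illustration, and the timings will vary by machine, so none are quoted.

```python
import timeit

# About 1 MB of sample data: the byte values 0..255 repeated 4000 times.
data = bytes(range(256)) * 4000

def count_zero_loop(buf):
    """Count zero bytes one at a time in an interpreted Python loop."""
    n = 0
    for b in buf:
        if b == 0:
            n += 1
    return n

def count_zero_builtin(buf):
    """Count zero bytes with a single call that iterates in C (CPython)."""
    return buf.count(b"\x00")

loop_result = count_zero_loop(data)
builtin_result = count_zero_builtin(data)

loop_time = timeit.timeit(lambda: count_zero_loop(data), number=3)
builtin_time = timeit.timeit(lambda: count_zero_builtin(data), number=3)
print(f"loop: {loop_time:.4f}s  builtin: {builtin_time:.4f}s")
```

Both functions return the same count; the difference is only where the iteration happens.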

ConcernedOfTunbridgeWells answered Sep 28 '22 17:09



This is a much more difficult question to answer than people are willing to admit.

For example, it may be that I am able to write a program that performs better in Python than it does in C. The fallacious conclusion from that statement is "Python is therefore faster than C". In reality, it may be because I have much more recent experience in Python and its best practices and standard libraries.

In fact, no one can really answer your question unless they are certain that they can create an optimal solution in both languages, which is unlikely. In other words, "My C solution was faster than my Python solution" is not the same as "C is faster than Python".

I'm willing to bet that Guido van Rossum could have written Python solutions for adam and Dustin's problems that performed quite well.

My rule of thumb is that unless you are writing the sort of application that requires you to count clock cycles, you can probably achieve acceptable performance in Python.

Kenan Banks answered Sep 28 '22 15:09
