
Python vs. C++ for an application that does sparse linear algebra

I'm writing an application where quite a bit of the computational time will be devoted to performing basic linear algebra operations (add, multiply, multiply by vector, multiply by scalar, etc.) on sparse matrices and vectors. Up to this point, we've built a prototype using C++ and the Boost matrix library.

I'm considering switching to Python to make coding the application itself easier, since it seems the Boost library (the easy C++ linear algebra library) isn't particularly fast anyway. This is a research/proof-of-concept application, so some reduction in run-time speed is acceptable (I assume C++ will almost always outperform Python) so long as coding time is also significantly decreased.

Basically, I'm looking for general advice from people who have used these libraries before. But specifically:

1) I've found scipy.sparse and pySparse. Are these (or other libraries) recommended?

2) What libraries beyond Boost are recommended for C++? I've seen a variety of libraries with C interfaces, but again I'm looking to do something with low complexity, if I can get relatively good performance.

3) Ultimately, will Python be somewhat comparable to C++ in terms of run time speed for the linear algebra operations? I will need to do many, many linear algebra operations and if the slowdown is significant then I probably shouldn't even try to make this switch.

Thank you in advance for any help and previous experience you can relate.

RandomGuy asked Dec 02 '22 04:12


1 Answer

My advice is to fully test the algorithm in Python before translating it into any other language (otherwise you run the risk of prematurely optimizing a bad algorithm). Once you have clearly defined the best interface for your problem, you can factor the expensive parts out into external code.

Let me explain.

Suppose your final algorithm consists of taking a bunch of numbers in (row, column, value) format and, say, computing the SVD of the corresponding sparse matrix. Then you can leave the entire interface to Python:

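class Problem:
    """Holds the input data; the numerical work is delegated to a back-end."""

    def __init__(self, values):
        # values: the matrix entries as (row, column, value) triples
        self.values = values

    def solve(self):
        # external_svd is a placeholder for the wrapped Fortran/C/C++ routine
        return external_svd(self.values)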

where external_svd is the Python wrapper around a Fortran/C/C++ subroutine which efficiently computes the SVD given a matrix in (row, column, value) format, or whatever floats your boat.

Again, first try to use numpy and scipy, and any other standard Python tool. Only then, after you've profiled your code, should you write the actual wrapper external_svd.
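As a rough sketch of what "try numpy and scipy first" could look like for the (row, column, value) case, the snippet below builds a sparse matrix with scipy.sparse, runs the basic operations mentioned in the question, and asks scipy.sparse.linalg.svds for the largest singular values. The triples and the 4x4 shape are made up purely for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import svds

# Hypothetical (row, column, value) triples for a 4x4 sparse matrix.
rows = np.array([0, 1, 2, 3, 0])
cols = np.array([0, 1, 2, 3, 3])
vals = np.array([4.0, 3.0, 2.0, 1.0, 0.5])

# Build in COO format (natural for triples), then convert to CSR,
# which is the usual choice for arithmetic and matrix-vector products.
A = coo_matrix((vals, (rows, cols)), shape=(4, 4)).tocsr()

x = np.ones(4)
B = A + A      # sparse + sparse
C = A @ A      # sparse matrix product
y = A @ x      # sparse matrix times dense vector
D = 2.5 * A    # scalar multiple

# The two largest singular values; svds does not promise an ordering,
# so sort explicitly before using them.
singular_values = np.sort(
    svds(A, k=2, return_singular_vectors=False)
)[::-1]
print(singular_values)
```

If profiling shows that nearly all of the time is spent inside calls like these, the work is already happening in compiled code and a hand-written wrapper may not buy much.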

If you go this route, you will have a module which is user friendly (the user interacts with Python, not with Fortran/C/C++) and, most importantly, you will be able to use different back-ends: external_svd_lapack, external_svd_pardiso, external_svd_gsl, etc. (one for each back-end you choose).
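One way the swappable back-end idea might be sketched is a small registry of functions that share one signature. Everything here is hypothetical: the names svd_numpy, svd_gram and BACKENDS, the extra shape argument, and the two NumPy-only back-ends merely stand in for real wrappers such as a LAPACK or GSL binding.

```python
import numpy as np

def to_dense(triples, shape):
    """Assemble a dense array from (row, column, value) triples."""
    A = np.zeros(shape)
    for i, j, v in triples:
        A[i, j] += v
    return A

def svd_numpy(triples, shape):
    # Stand-in back-end 1: dense LAPACK SVD via NumPy (descending order).
    return np.linalg.svd(to_dense(triples, shape), compute_uv=False)

def svd_gram(triples, shape):
    # Stand-in back-end 2: square roots of the eigenvalues of A^T A.
    # Assumes a square matrix; less accurate for ill-conditioned input.
    A = to_dense(triples, shape)
    eig = np.linalg.eigvalsh(A.T @ A)  # ascending order
    return np.sqrt(np.clip(eig, 0.0, None))[::-1]

# Hypothetical registry: one entry per back-end, all with the same signature.
BACKENDS = {"numpy": svd_numpy, "gram": svd_gram}

class Problem:
    def __init__(self, values, shape, backend="numpy"):
        self.values = values
        self.shape = shape
        self.solve_with = BACKENDS[backend]

    def solve(self):
        return self.solve_with(self.values, self.shape)

triples = [(0, 0, 3.0), (1, 1, 4.0)]
for name in BACKENDS:
    print(name, Problem(triples, (2, 2), backend=name).solve())
```

Because the caller only ever touches Problem, a compiled back-end can be added later without changing user code.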

As for sparse linear algebra libraries, check the Intel Math Kernel Library, the PARDISO sparse solver, and the MA27 routine from the Harwell Subroutine Library (HSL). I've used them successfully to solve very sparse, very large problems (check the page of the nonlinear optimization solver IPOPT to see what I mean).

Escualo answered May 06 '23 12:05