I'm working on a web application related to genome searching. This application makes use of this suffix tree library through Cython bindings. Objects of this type are large (hundreds of MB up to ~10GB) and take as long to load from disk as it takes to process them in response to a page request. I'm looking for a way to load several of these objects once on server boot and then use them for all page requests.
I have tried using a remote manager / client setup using the multiprocessing module, modeled after this demo, but it fails when the client connects with an error message that says the object is not picklable.
I would suggest writing a small Flask (or even raw WSGI… But it's probably simpler to use Flask, as it will be easier to get up and running quickly) application which loads the genome database then exposes a simple API. Something like this:
app = Flask(__name__)
database = load_database()
@app.route('/get_genomes')
def get_genomes():
return database.all_genomes()
app.run(debug=True)
Or, you know, something a bit more sensible.
Also, if you need to be handling more than one request at a time (I believe that app.run
will only handle one at a time), start by threading… And if that's too slow, you can os.fork()
after the database is loaded and run multiple request handlers from there (that way they will all share the same database in memory).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With