
Comparing persistent storage solutions in python

I'm starting on a new scientific project which has a lot of data (millions of entries) I'd like to store in an easily and quickly accessible format. I've come across a number of different potential options, but I'm not sure how to pick amongst them. My data can probably just be stored as a dictionary, or potentially a dictionary of dictionaries. Some potential considerations:

  • Speed. I can't load all the data off disk every time I start a new script, and I'd like as quick access to random entries as possible.
  • Ease-of-use. This is Python. The storage should feel like Python.
  • Stability/maturity. I'd like something that's currently supported, although something that works well but is still in development would be fine.
  • Ease of installation. My sysadmin should be able to get this running on our cluster.

I don't really care that much about the size of the storage, but it could be a consideration if an option is really terrible on this front. Also, if it matters, I'll most likely be creating the database once, and thereafter only reading from it.

Some potential options that I've started looking at (see this post):

  • PyTables
  • ZODB (the Zope Object Database)
  • shove
  • shelve
  • redis
  • durus

Any suggestions on which of these might be better for my purposes? Any better ideas? Some of these have a back-end; any suggestions on which file-system back-end would be best?

Noah asked Aug 05 '09 20:08



3 Answers

You might want to give MongoDB a shot: the PyMongo library works with dictionaries and supports most Python types. It is easy to install, performs well, and scales. MongoDB (and PyMongo) is also used in production by some well-known companies.

mdirolf answered Oct 07 '22 07:10


An RDBMS.

Nothing is more reliable than using tables in a well-known RDBMS. PostgreSQL comes to mind.

That automatically gives you some choices for the future like clustering. Also you automatically have a lot of tools to administer your database, and you can use it from other software written in virtually any language.

It is really fast.

On the "feel like Python" point, I might add that you can use an ORM. A strong name is SQLAlchemy, maybe with the Elixir "extension".

Using SQLAlchemy, you can let your user/sysadmin choose which database backend they want to use. Maybe they already have MySQL installed: no problem.

RDBMSs are still the best choice for data storage.

nosklo answered Oct 07 '22 08:10


I'm working on such a project and I'm using SQLite.

SQLite stores everything in one file, and the sqlite3 module is part of Python's standard library. Hence, installation and configuration are virtually free (ease of installation).
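
A minimal sketch of the "create once, then only read" pattern from the question, using only the standard library. The table name `entries`, the `SQLiteDict` wrapper, the `build_store` helper, and the JSON encoding of nested values are all assumptions made for illustration; nothing here is prescribed by SQLite or by this answer.

```python
import json
import os
import sqlite3
import tempfile
from collections.abc import Mapping


class SQLiteDict(Mapping):
    """Read-only, dict-like view over one SQLite table (hypothetical helper)."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)

    def __getitem__(self, key):
        # The PRIMARY KEY index makes this a fast random lookup.
        row = self._conn.execute(
            "SELECT value FROM entries WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])

    def __iter__(self):
        for (key,) in self._conn.execute("SELECT key FROM entries"):
            yield key

    def __len__(self):
        return self._conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]

    def close(self):
        self._conn.close()


def build_store(path, data):
    """Write a dict of JSON-serializable values to `path` (done once)."""
    conn = sqlite3.connect(path)
    with conn:  # commits the transaction on success
        conn.execute(
            "CREATE TABLE entries (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO entries VALUES (?, ?)",
            ((k, json.dumps(v)) for k, v in data.items()),
        )
    conn.close()


# Example data: a dictionary of dictionaries, as described in the question.
db_path = os.path.join(tempfile.mkdtemp(), "entries.sqlite")
build_store(db_path, {
    "sample_a": {"mass": 1.5, "tags": ["x"]},
    "sample_b": {"mass": 2.0, "tags": []},
})

# Later scripts only open the file and look up single entries.
store = SQLiteDict(db_path)
entry = store["sample_b"]
n_entries = len(store)
store.close()
```

Only the requested row is read from disk on each lookup, so a script does not have to load the whole data set at startup. Storing values as JSON text is the simplest choice here; it gives up the ability to query individual fields with SQL.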

You can easily manage the database file with small Python scripts or via various tools. There is also a Firefox plugin (ease of installation / ease-of-use).

I find it very convenient to use SQL to filter/sort/manipulate/... the data, although I'm not an SQL expert (ease-of-use).
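
To give an idea of what filtering and sorting in SQL looks like from Python, here is a small sketch. The `measurements` table, its columns, and the values are invented for the example, and it uses an in-memory database so that it runs without touching the disk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
conn.execute(
    "CREATE TABLE measurements (name TEXT, temperature REAL, energy REAL)"
)
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [
        ("run1", 300.0, -1.2),
        ("run2", 350.0, -0.7),
        ("run3", 300.0, -1.9),
        ("run4", 400.0, -0.1),
    ],
)

# Filter on one column and sort on another; SQLite does the work.
rows = conn.execute(
    "SELECT name, energy FROM measurements "
    "WHERE temperature <= ? ORDER BY energy",
    (350.0,),
).fetchall()

# Aggregate per group without loading the whole table into Python.
counts = conn.execute(
    "SELECT temperature, COUNT(*) FROM measurements "
    "GROUP BY temperature ORDER BY temperature"
).fetchall()
conn.close()
```

Values are passed with `?` placeholders rather than string formatting, which keeps the queries safe and lets SQLite handle type conversion.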

I'm not sure if SQLite is the fastest DB system for this work, and it lacks some features you might need, e.g. stored procedures.

Anyway, SQLite works for me.

wierob answered Oct 07 '22 08:10