Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When is it appropriate to use a database , in Python

I am making a little add-on for a game, and it needs to store information on a player:

  • username
  • ip-address
  • location in game
  • a list of alternate user names that have came from that ip or alternate ip addresses that come from that user name

I read an article a while ago that said that unless I am storing a large amount of information that can not be held in ram, that I should not use a database. So I tried using the shelve module in python, but I'm not sure if that is a good idea.

When do you guys think it is a good idea to use a database, and when it better to store information in another way , also what are some other ways to store information besides databases and flat file databases.

like image 516
user1155844 Avatar asked Jun 09 '12 02:06

user1155844


2 Answers

Most importantly, unless you specifically need performance or high reliability, do whatever will make your code simplest/easiest to write.


If your data is extremely structured (and you know SQL or are willing to learn) then using a database like sqlite3 might be appropriate. (You should ignore the comment about database size and RAM: there are times when databases are appropriate for even very small data sets, because of how the data is structured.)

If the data is relatively simple and you don't need the reliability that a database (normally) has then storing it in one of the builtin datatypes while the program is running is probably fine.

If you'd like the data stored on disk to be human readable (and editable, with a bit of effort), then a format like JSON (there is builtin json module) is nice, since the basic Python objects serialise without any effort. If the data not so simple then YAML is essentially an extended version of JSON (PyYAML is very good.). Similarly, you could use CSV files (the csv modules), although this is not nearly as good as JSON or YAML, or just a custom text format (but this is quite a lot of effort to get error handling and so on implemented neatly).

Finally, if your data contains more advanced objects (e.g. recursive dictionaries, or complicated custom datatypes) then using one of the builtin binary serialisation techniques (pickle, shelve etc.) might be appropriate, however, YAML can handle many of these things (including recursive data structures).

Some general points:

  • Plain text formats are nice, as they allow values to be tweaked easily and debugging/testing is easy
  • Binary formats are nice, as they mean that values can't be tweaked without a little bit of extra effort (this is not saying they can't be adjusted though), and the file size is smaller (probably not relevant)
like image 199
huon Avatar answered Sep 20 '22 07:09

huon


Assuming by 'database' you mean 'relational database' - even the embedded databases like SQLite come with some overhead compared to a plain text file. But, sometimes that overhead is worth it compared to rolling your own.

The biggest question you need to ask is whether you are storing relational data - whether things like normalisation and SQL queries make any sense at all. If you need to lookup data across multiple tables using joins, you should certainly use a relational database - that's what they're for. On the other hand, if all you need to do is lookup into one table based on its primary key, you probably want a CSV file. Pickle and shelve are useful if what you're persisting is the objects you use in your program - if you can just add the relevant magic methods to your existing classes and expect it all to make sense.

Certainly "you shouldn't use databases unless you have a lot of data" isn't the best advice - the amount of data goes more to what database you might use if you are using one. SQLite, for example, wouldn't be suitable for something the size of Stackoverflow - but, MySQL or Postgres would almost certainly be overkill for something with five users.

like image 27
lvc Avatar answered Sep 21 '22 07:09

lvc