
Optimal format for simple data storage in python

Tags:

python

As a relatively new programmer, I have several times run into situations where it would be better to read and assemble program data from an external source than to write it directly in the code. This mostly happens when there are a large number of objects of the same type. In such cases, object definitions quickly take up a lot of space in the code and hurt readability.

As an example, I've been working on a text-based RPG, which has a large number of rooms and items to keep track of. Even a few items and rooms lead to massive blocks of object creation code.

I think it would be easier in this case to use some form of external data storage and read from a file. In such a file, items and rooms would be stored by name and attributes, so that they could be parsed into objects with relative ease.

What formats would be best for this? I feel a full-blown SQL database would add unnecessary bloat to a fairly light script. On the other hand, an easy way to edit this data is important, either through an external application or through another Python script. On the lighter end of things, the formats I have heard mentioned most often are XML, JSON, and YAML.

From what I've seen, XML does not seem like the best option, as many seem to find it complex and difficult to work with effectively.

Either JSON or YAML seems like it might work, but I don't know how easy it would be to edit either externally. Speed is not a primary concern in this case. While faster implementations are of course desirable, speed does not limit what I can use.

I've looked around both here and via Google, and while I've seen quite a bit on the topic, I have not been able to find anything specifically helpful to me. Will formats like JSON or YAML be sufficient for this, or would I be better served with a full-blown database?

George Osterweil asked Jun 20 '12 23:06



1 Answer

Though there are good answers here already, I would simply recommend JSON for your purposes. Since you're a new programmer, it will be the most straightforward format to read and translate, because it has the most direct mapping to native Python data types (lists [] and dictionaries {}). Readability goes a long way and is one of the tenets of Python programming.
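
To illustrate that mapping, here is a minimal sketch of how room data might be loaded from JSON into objects and written back out. The `Room` class, its fields, the helper names `load_rooms` and `dump_rooms`, and the sample rooms are all hypothetical, made up for illustration rather than taken from the question. The JSON is kept in a string so the example is self-contained; in practice it would presumably live in a file read with `json.load`.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical game data. In a real project this text would likely sit in
# its own file (for example rooms.json) and be edited by hand or by a tool.
ROOMS_JSON = """
{
  "cellar": {
    "description": "A damp stone cellar.",
    "exits": {"up": "kitchen"},
    "items": ["lantern"]
  },
  "kitchen": {
    "description": "A cramped kitchen.",
    "exits": {"down": "cellar"},
    "items": []
  }
}
"""


@dataclass
class Room:
    """Assumed shape of a room; adjust the fields to match your game."""
    name: str
    description: str
    exits: dict = field(default_factory=dict)
    items: list = field(default_factory=list)


def load_rooms(text):
    """Parse JSON text into a dict mapping room name -> Room object."""
    raw = json.loads(text)  # JSON objects become dicts, arrays become lists
    return {name: Room(name=name, **attrs) for name, attrs in raw.items()}


def dump_rooms(rooms):
    """Serialize Room objects back into the same JSON layout."""
    data = {}
    for name, room in rooms.items():
        attrs = asdict(room)
        del attrs["name"]  # the name is already used as the key
        data[name] = attrs
    return json.dumps(data, indent=2)


rooms = load_rooms(ROOMS_JSON)

# Editing from another script is just ordinary Python on the loaded objects.
rooms["kitchen"].items.append("knife")
round_tripped = load_rooms(dump_rooms(rooms))
```

Because the file is only nested dictionaries and lists, it can also be edited in any text editor, and adding a new attribute to every room is a matter of adding a field with a default value to the class.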

mVChr answered Sep 16 '22 18:09
