What is the best way to store large amounts of data in Python, given one (or two) dictionaries of 500,000+ items used for undirected graph searching?
I've been considering a few options such as storing the data as XML:
<key name="a">
    <value data="1" />
    <value data="2" />
</key>
<key name="b">
...
or in a python file for direct access:
db = {"a": [1, 2], "b": ...}
or in a SQL database? I'm leaning toward this as the best solution, but would I have to rely more on SQL than on Python itself to do the computation?
The Python source technique absolutely rules.
XML is slow to parse and relatively hard for people to read. That's why companies like Altova are in business -- XML isn't pleasant to edit by hand.
Python source

db = {"a": [1, 2], "b": ...}

is both fast to parse and easy for people to read.
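For example, loading the data is just an import; the interpreter does all the parsing (the module name graph_db.py is a hypothetical stand-in):

# graph_db.py -- the data file is itself valid Python
db = {"a": [1, 2], "b": [2, 3]}

# elsewhere -- no parser needed beyond the interpreter
from graph_db import db
print(db["a"])  # [1, 2]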
If you have programs that read and write giant dictionaries, use pprint for the writing so that you get nicely formatted, easier-to-read output.
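A minimal sketch of that write path, assuming the output file graph_db.py (a hypothetical name) should stay importable:

import pprint

db = {"a": [1, 2], "b": [2, 3]}

# pprint.pformat produces a readable literal for the dictionary
with open("graph_db.py", "w") as f:
    f.write("db = " + pprint.pformat(db) + "\n")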
If you're worried about portability, consider YAML (or JSON) for serializing the object. They also parse quickly, and are much, much easier to read than XML.
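As an illustration of the JSON route, a minimal round trip using only the standard library (the file name graph.json is an assumption):

import json

db = {"a": [1, 2], "b": [2, 3]}

# serialize
with open("graph.json", "w") as f:
    json.dump(db, f, indent=2)

# parse it back
with open("graph.json") as f:
    restored = json.load(f)
assert restored == db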
I would also consider using one of the many graph libraries available for Python (e.g. python-graph); a sketch with a comparable library follows.
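To illustrate, here is a minimal sketch using networkx rather than python-graph (a deliberate swap: networkx is a widely used alternative whose API I can vouch for), building an undirected graph straight from the adjacency dictionary:

import networkx as nx

db = {"a": [1, 2], "b": [2, 3]}

# build an undirected graph from the adjacency dict
G = nx.Graph()
for node, neighbours in db.items():
    for neighbour in neighbours:
        G.add_edge(node, neighbour)

# the library then handles the searching
print(nx.shortest_path(G, "a", "b"))  # ['a', 2, 'b']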