Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python data structure: SQL, XML, or .py file

What is the best way to store large amounts of data in python, given one (or two) 500,000 item+ dictionary used for undirected graph searching?

I've been considering a few options such as storing the data as XML:

<key name="a">
    <value data="1" />
    <value data="2" />
</key>
<key name="b">
...

or in a python file for direct access:

db = {"a": [1, 2], "b": ...}

or in a SQL database? I'm thinking this would be the best solution, but would I have to rely more on SQL to do the computation than python itself?

like image 639
lastlee Avatar asked Jan 13 '09 07:01

lastlee


2 Answers

The Python source technique absolutely rules.

XML is slow to parse, and relatively hard to read by people. That's why companies like Altova are in business -- XML isn't pleasant to edit.

Python source db = {"a": [1, 2], "b": ...} is

  1. Fast to parse.

  2. Easy to read by people.

If you have programs that read and write giant dictionaries, use pprint for the writing so that you get a nicely formatted output. Something easier to read.

If you're worried about portability, consider YAML (or JSON) for serializing the object. They parse quickly also, and are much, much easier to read than XML.

like image 91
S.Lott Avatar answered Sep 25 '22 22:09

S.Lott


I would consider using one of the many graph libraries available for python (e.g. python-graph)

like image 34
mac Avatar answered Sep 22 '22 22:09

mac