I want to convert a JSON file I created to a SQLite database.
My intention is to decide later which data container and entry point is best: JSON (data entry via a text editor) or SQLite (data entry via spreadsheet-like GUIs such as SQLiteStudio).
My JSON file looks like this (it contains traffic data from some crossroads in my city):
    ...
    "2011-12-17 16:00": {
        "local": "Av. Protásio Alves; esquina Ramiro Barcelos",
        "coord": "-30.036916,-51.208093",
        "sentido": "bairro-centro",
        "veiculos": "automotores",
        "modalidade": "semaforo 50-15",
        "regime": "típico",
        "pistas": "2+c",
        "medicoes": [
            [32, 50],
            [40, 50],
            [29, 50],
            [32, 50],
            [35, 50]
        ]
    },
    "2011-12-19 08:38": {
        "local": "R. Fernandes Vieira; esquina Protásio Alves",
        "coord": "-30.035535,-51.211079",
        "sentido": "único",
        "veiculos": "automotores",
        "modalidade": "semáforo 30-70",
        "regime": "típico",
        "pistas": "3",
        "medicoes": [
            [23, 30],
            [32, 30],
            [33, 30],
            [32, 30]
        ]
    }
    ...
And I have created a nice database with a one-to-many relation using these lines of Python code:
    import sqlite3

    db = sqlite3.connect("fluxos.sqlite")
    c = db.cursor()
    c.execute('''create table medicoes
                 (timestamp text primary key,
                  local text,
                  coord text,
                  sentido text,
                  veiculos text,
                  modalidade text,
                  pistas text)''')
    c.execute('''create table valores
                 (id integer primary key,
                  quantidade integer,
                  tempo integer,
                  foreign key (id) references medicoes(timestamp))''')
BUT the problem is that, when I was preparing to insert the rows with actual data using something like

    c.execute("insert into medicoes values(?,?,?,?,?,?,?)" % keys)

I realized that, since the dict loaded from the JSON file has no special order, it does not map properly to the column order of the database.
So I ask: which strategy or method should I use to programmatically read the keys from each "block" in the JSON file (in this case "local", "coord", "sentido", "veiculos", "modalidade", "regime", "pistas" and "medicoes"), create the database with the columns in that same order, and then insert the rows with the proper values?
I have fair experience with Python but am just beginning with SQL, so I would like some advice about good practices, and not necessarily a ready-made recipe.
You have this Python code:

    c.execute("insert into medicoes values(?,?,?,?,?,?,?)" % keys)

which I think should be

    c.execute("insert into medicoes values (?,?,?,?,?,?,?)", keys)

since the % operator expects the string to its left to contain formatting codes, whereas the ? placeholders are filled in by sqlite3 from the second argument to execute().
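To make the difference concrete, here is a minimal Python 3 sketch. It uses an in-memory database and a made-up two-column table `t`; both are assumptions for illustration and not part of your schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # throwaway in-memory database
db.execute("create table t (a text, b text)")  # hypothetical table

values = ("x", "y")
query = "insert into t values (?,?)"

# String formatting: the query contains no % codes, so Python
# refuses to consume the tuple and raises TypeError.
try:
    query % values
    formatting_failed = False
except TypeError:
    formatting_failed = True

# Parameter binding: sqlite3 fills each ? with the matching value.
db.execute(query, values)
rows = db.execute("select a, b from t").fetchall()
print(formatting_failed, rows)
```

Binding also means the values are never spliced into the SQL text, so quotes or semicolons inside a value cannot change the statement.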
Now all you need to make this work is for keys to be a tuple (or list) containing the values for the new row of the medicoes table in the correct order. Consider the following Python code:
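    import json

    # Python 2 syntax (iteritems, print statement)
    traffic = json.load(open('xxx.json'))

    # Listed in the same order as the columns of the medicoes table
    columns = ['local', 'coord', 'sentido', 'veiculos', 'modalidade', 'pistas']

    for timestamp, data in traffic.iteritems():
        # The timestamp key comes first, then the values in column order
        keys = (timestamp,) + tuple(data[c] for c in columns)
        print str(keys)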
When I run this with your sample data, I get:
    (u'2011-12-19 08:38', u'R. Fernandes Vieira; esquina Prot\xe1sio Alves', u'-30.035535,-51.211079', u'\xfanico', u'automotores', u'sem\xe1foro 30-70', u'3')
    (u'2011-12-17 16:00', u'Av. Prot\xe1sio Alves; esquina Ramiro Barcelos', u'-30.036916,-51.208093', u'bairro-centro', u'automotores', u'semaforo 50-15', u'2+c')
which would seem to be the tuples you require.
You could add the necessary sqlite code with something like this:
    import json
    import sqlite3

    # Python 2 syntax (iteritems)
    traffic = json.load(open('xxx.json'))
    db = sqlite3.connect("fluxos.sqlite")

    query = "insert into medicoes values (?,?,?,?,?,?,?)"
    columns = ['local', 'coord', 'sentido', 'veiculos', 'modalidade', 'pistas']

    cur = db.cursor()
    for timestamp, data in traffic.iteritems():
        keys = (timestamp,) + tuple(data[c] for c in columns)
        cur.execute(query, keys)
    cur.close()

    # Without a commit the inserted rows are discarded when the connection closes
    db.commit()
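The "medicoes" lists in the JSON still have to go into the child table. The following is a rough Python 3 sketch of one way to do that, not a definitive design. It assumes a slightly different `valores` table with an extra column, here called `medicao` (a hypothetical name), that holds the parent's timestamp, because an integer primary key cannot also serve as the link to a text key. It also assumes each pair is `[quantidade, tempo]`. The trimmed `traffic` dict stands in for the loaded JSON file.

```python
import sqlite3

# Stand-in for json.load(...): a trimmed copy of the sample data.
traffic = {
    "2011-12-17 16:00": {
        "local": "Av. Protásio Alves; esquina Ramiro Barcelos",
        "coord": "-30.036916,-51.208093",
        "sentido": "bairro-centro",
        "veiculos": "automotores",
        "modalidade": "semaforo 50-15",
        "regime": "típico",
        "pistas": "2+c",
        "medicoes": [[32, 50], [40, 50], [29, 50]],
    },
    "2011-12-19 08:38": {
        "local": "R. Fernandes Vieira; esquina Protásio Alves",
        "coord": "-30.035535,-51.211079",
        "sentido": "único",
        "veiculos": "automotores",
        "modalidade": "semáforo 30-70",
        "regime": "típico",
        "pistas": "3",
        "medicoes": [[23, 30], [32, 30]],
    },
}
columns = ["local", "coord", "sentido", "veiculos", "modalidade", "pistas"]

db = sqlite3.connect(":memory:")
db.execute("pragma foreign_keys = on")  # SQLite does not enforce FKs by default
db.execute("""create table medicoes
              (timestamp text primary key, local text, coord text,
               sentido text, veiculos text, modalidade text, pistas text)""")
db.execute("""create table valores
              (id integer primary key,
               medicao text references medicoes(timestamp),  -- assumed link column
               quantidade integer,
               tempo integer)""")

with db:  # commits on success, rolls back if an insert raises
    for timestamp, data in traffic.items():
        parent = (timestamp,) + tuple(data[c] for c in columns)
        db.execute("insert into medicoes values (?,?,?,?,?,?,?)", parent)
        # One child row per [quantidade, tempo] pair, tagged with the parent key
        children = [(timestamp, q, t) for q, t in data["medicoes"]]
        db.executemany(
            "insert into valores (medicao, quantidade, tempo) values (?,?,?)",
            children,
        )

count = db.execute("select count(*) from valores").fetchone()[0]
print(count)
```

Leaving `id` out of the insert lets SQLite assign it automatically, so the only thing tying a measurement to its crossroad record is the `medicao` value.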
Edit: if you don't want to hard-code the list of columns, you could do something like this:
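    import json

    # Python 2 syntax (itervalues, print statement)
    traffic = json.load(open('xxx.json'))

    # Take the keys of an arbitrary entry as the column names
    someitem = traffic.itervalues().next()
    columns = list(someitem.keys())
    print columns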
When I run this it prints:
[u'medicoes', u'veiculos', u'coord', u'modalidade', u'sentido', u'local', u'pistas', u'regime']
You could use it with something like this:
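    import json
    import sqlite3

    # Python 2 syntax (itervalues, iteritems, print statement)
    db = sqlite3.connect('fluxos.sqlite')
    traffic = json.load(open('xxx.json'))

    # Take the column names from one entry, dropping the keys
    # that have no matching column in the medicoes table
    someitem = traffic.itervalues().next()
    columns = list(someitem.keys())
    columns.remove('medicoes')
    columns.remove('regime')

    # Name the columns explicitly so the dict's key order does not matter
    query = "insert into medicoes (timestamp,{0}) values (?{1})"
    query = query.format(",".join(columns), ",?" * len(columns))
    print query

    cur = db.cursor()
    for timestamp, data in traffic.iteritems():
        keys = (timestamp,) + tuple(data[c] for c in columns)
        cur.execute(query, keys)  # the values must be passed along with the query
    cur.close()
    db.commit()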
The query this code prints when I try it with your sample data is something like this:
insert into medicoes (timestamp,veiculos,coord,modalidade,sentido,local,pistas) values (?,?,?,?,?,?,?)
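Another option that sidesteps dict ordering altogether is sqlite3's named placeholders (`:name`), which are bound from a mapping by key rather than by position. Below is a minimal Python 3 sketch of that idea. The single trimmed entry stands in for the loaded JSON, and the in-memory database is only there to keep the example self-contained.

```python
import sqlite3

# Stand-in for json.load(...): one trimmed entry from the sample data.
traffic = {
    "2011-12-17 16:00": {
        "local": "Av. Protásio Alves; esquina Ramiro Barcelos",
        "coord": "-30.036916,-51.208093",
        "sentido": "bairro-centro",
        "veiculos": "automotores",
        "modalidade": "semaforo 50-15",
        "regime": "típico",
        "pistas": "2+c",
        "medicoes": [[32, 50], [40, 50]],
    },
}

# Fixed list of the columns that exist in the medicoes table.
columns = ["timestamp", "local", "coord", "sentido",
           "veiculos", "modalidade", "pistas"]

db = sqlite3.connect(":memory:")
db.execute("""create table medicoes
              (timestamp text primary key, local text, coord text,
               sentido text, veiculos text, modalidade text, pistas text)""")

# Builds: insert into medicoes (timestamp, local, ...)
#         values (:timestamp, :local, ...)
query = "insert into medicoes ({0}) values ({1})".format(
    ", ".join(columns), ", ".join(":" + c for c in columns)
)

with db:  # commits on success
    for timestamp, data in traffic.items():
        # Keep only the keys that have a column, then add the timestamp
        row = {c: data[c] for c in columns if c != "timestamp"}
        row["timestamp"] = timestamp
        db.execute(query, row)

stored = db.execute("select timestamp, pistas from medicoes").fetchone()
print(stored)
```

Only values can be bound this way. Column names are still pasted into the SQL text, so if they ever come from the JSON keys rather than a fixed list, it is worth checking them against the columns the table actually has.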