
How to convert a JSON file to an SQLite database

Tags:

json

sqlite

sqlite-json1

If I have some sample data, how do I put it into SQLite (preferably fully automated)?

{"uri":"/","user_agent":"example1"}
{"uri":"/foobar","user_agent":"example1"}
{"uri":"/","user_agent":"example2"}
{"uri":"/foobar","user_agent":"example3"}


asked Sep 25 '17 14:09

benaryorg

3 Answers

I found the easiest way to do this is by using jq and CSV as an intermediary format.

Edit: As pointed out (thanks @Leo), the original question showed newline-delimited JSON objects, each of which conforms to RFC 4627 on its own, although the file as a whole does not. jq can handle a single JSON array of objects in much the same way if you preprocess the file with jq -c '.[]' <input.json >preprocessed.json (the -c flag keeps each object on one line, which the head -1 step below relies on). If you happen to be dealing with JSON text sequences (RFC 7464), jq has got your back too with the --seq parameter.

Edit 2: Both newline-separated JSON and JSON text sequences have one important advantage: memory use no longer grows with the number of records, because it depends only on your longest line of input. Putting the entire input into a single array leaves the parser with three options. It can hand out elements as it reads them, which means it has to cope with late errors (say, a syntax error after the first 100k elements); to my knowledge few parsers do this, jq --stream being one that works along those lines. It can parse the entire file twice, first validating the syntax and then parsing while discarding previous elements, which to my knowledge is also rare. Or it can parse the whole input at once and return the result in one step (think of receiving a Python dict that contains the entirety of your, say, 50G of input data plus overhead), which is usually memory backed and so raises your memory footprint by roughly your total data size.
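To make the memory argument concrete, here is a minimal Python sketch (not part of the jq approach itself) that reads newline-delimited JSON one line at a time, so only the current record is held in memory. The function name iter_records and the inline sample are assumptions made for illustration.

```python
import io
import json
from typing import Iterator, TextIO


def iter_records(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of newline-delimited JSON."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # A late syntax error only affects this line; earlier records
            # have already been handed to the caller.
            raise ValueError(f"line {line_number}: {exc}") from exc


# Hypothetical sample standing in for data.json.
sample = io.StringIO(
    '{"uri":"/","user_agent":"example1"}\n'
    '{"uri":"/foobar","user_agent":"example1"}\n'
)
records = list(iter_records(sample))
```

Collecting into a list is only done here to inspect the result; in a real import you would consume the generator record by record.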

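Edit 3: If the header and the data columns end up mismatched, use keys_unsorted instead of keys. keys returns the key names sorted, while .[] yields the values in their original order, so the two only line up when the keys in the input are already in sorted order (as mine apparently were). I haven't tested keys_unsorted myself, but @Kyle Barron reports that it was needed.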
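The mismatch is easy to reproduce outside jq. The following Python sketch uses sorted() as a stand-in for keys and plain dict order as a stand-in for keys_unsorted and .[]; the record with user_agent listed first is a made-up example.

```python
import json

# Hypothetical record whose keys are NOT in alphabetical order.
record = json.loads('{"user_agent":"example1","uri":"/"}')

sorted_header = sorted(record)    # analogous to jq's `keys`
unsorted_header = list(record)    # analogous to jq's `keys_unsorted`
values = list(record.values())    # analogous to jq's `[.[]]`

# Pair each header variant with the values to see which one lines up.
with_sorted = dict(zip(sorted_header, values))
with_unsorted = dict(zip(unsorted_header, values))
```

Here with_unsorted equals the original record, while with_sorted assigns "example1" to uri, which is exactly the kind of silently shifted column the edit warns about.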

Getting the CSV

First write your data to a file. I will assume data.json here.

Then construct the header using jq:

% head -1 data.json | jq -r 'keys | @csv'
"uri","user_agent"

The head -1 is there because we only want one line. jq's -r makes the output a plain string instead of a JSON string wrapping the CSV. We then call the built-in function keys to get the keys of the input as an array, and send that to the @csv formatter, which outputs a single string with the headers in quoted CSV format.

We then need to construct the data.

% jq -r '[.[]] | @csv' < data.json
"/","example1"
"/foobar","example1"
"/","example2"
"/foobar","example3"

We now take the whole input, deconstruct each associative array (map) using .[] and put the result back into a simple array […]. This basically converts each dictionary to an array of its values. Sent to the @csv formatter, we again get some CSV.

Putting it all together we get a single one-liner in the form of:

% (head -1 data.json | jq -r 'keys | @csv' && jq -r '[.[]] | @csv' < data.json) > data.csv

If you need to convert the data on the fly, i.e. without a file, try this (the <<< here-string needs a shell such as bash or zsh):

% cat data.json | (read -r first && jq -r '(keys | @csv),( [.[]] | @csv)' <<<"${first}" && jq -r '[.[]] | @csv')
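For comparison, the same header-then-rows conversion can be sketched in Python with only the standard library. This is an illustrative alternative, not the jq pipeline above; the function name ndjson_to_csv is hypothetical, and it assumes every record has the same keys as the first one.

```python
import csv
import io
import json


def ndjson_to_csv(src, dst):
    """Write a quoted CSV with a header taken from the first record."""
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
    header = None
    for line in src:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if header is None:
            # Keep the original key order so header and values line up.
            header = list(record)
            writer.writerow(header)
        writer.writerow([record.get(key) for key in header])


source = io.StringIO(
    '{"uri":"/","user_agent":"example1"}\n'
    '{"uri":"/foobar","user_agent":"example1"}\n'
    '{"uri":"/","user_agent":"example2"}\n'
    '{"uri":"/foobar","user_agent":"example3"}\n'
)
target = io.StringIO()
ndjson_to_csv(source, target)
csv_text = target.getvalue()
```

Unlike jq's @csv, QUOTE_ALL quotes every field, including numbers; for the string-only sample data the output is the same.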

Loading it into SQLite

Open an SQLite database:

sqlite3 somedb.sqlite

Now in the interactive shell do the following (assuming you wrote the CSV to data.csv and want it in a table called my_table):

.mode csv
.import data.csv my_table

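Because my_table does not exist yet, .import creates it and uses the first CSV row as the column names. Now close the shell and open it again for a clean environment. You can now easily SELECT from the database and do whatever you want to.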
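If you would rather script the import than type dot-commands, a rough Python equivalent is sketched below. It is an assumption-laden stand-in for .import: the helper name import_csv is hypothetical, every column is declared as TEXT, and an in-memory database replaces somedb.sqlite.

```python
import csv
import io
import sqlite3


def quote_identifier(name: str) -> str:
    """Quote a table or column name for use in SQL text."""
    return '"' + name.replace('"', '""') + '"'


def import_csv(conn: sqlite3.Connection, table: str, src) -> None:
    """Create the table from the CSV header row and insert all data rows."""
    reader = csv.reader(src)
    header = next(reader)
    columns = ", ".join(f"{quote_identifier(c)} TEXT" for c in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {quote_identifier(table)} ({columns})")
    conn.executemany(
        f"INSERT INTO {quote_identifier(table)} VALUES ({placeholders})",
        (row for row in reader if row),  # ignore empty lines
    )
    conn.commit()


# Hypothetical CSV standing in for data.csv.
data_csv = io.StringIO(
    '"uri","user_agent"\n'
    '"/","example1"\n'
    '"/foobar","example1"\n'
    '"/","example2"\n'
    '"/foobar","example3"\n'
)
conn = sqlite3.connect(":memory:")
import_csv(conn, "my_table", data_csv)
row_count = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
agents = [r[0] for r in conn.execute(
    "SELECT DISTINCT user_agent FROM my_table ORDER BY user_agent")]
```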

Putting it all together

There is an asciinema recording of the whole process: https://asciinema.org/a/139449

answered Oct 08 '22 21:10

benaryorg

A way to do this without CSV or a third-party tool is to use the JSON1 extension of SQLite combined with the readfile function that is provided by the sqlite3 CLI tool. As well as being a more direct solution overall, this has the advantage of handling JSON null values more consistently than CSV, which would otherwise import them as empty strings.
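The loss of null values in a CSV round trip can be seen in a few lines of Python. This is only a small illustration using the standard csv module, not the jq and .import pipeline itself.

```python
import csv
import io

buffer = io.StringIO()
# None plays the role of a JSON null in the second column.
csv.writer(buffer).writerow(["/", None])

buffer.seek(0)
row = next(csv.reader(buffer))
# The null comes back as an empty string, indistinguishable from "".
```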

If the input file is a well-formed JSON file, e.g. the example given as an array:

[
{"uri":"/","user_agent":"example1"},
{"uri":"/foobar","user_agent":"example1"},
{"uri":"/","user_agent":"example2"},
{"uri":"/foobar","user_agent":"example3"}
]

Then this can be read into the corresponding my_table table as follows. Open the SQLite database file my_db.db using the sqlite3 CLI:

sqlite3 my_db.db

then create my_table using:

CREATE TABLE my_table(uri TEXT, user_agent TEXT);

Finally, the JSON data in my_data.json can be inserted into the table by running the following statement in the CLI:

INSERT INTO my_table SELECT 
  json_extract(value, '$.uri'), 
  json_extract(value, '$.user_agent')
FROM json_each(readfile('my_data.json'));
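The same statement can be driven from Python as a quick check. Note the assumptions: readfile exists only in the sqlite3 CLI, so this sketch passes the JSON text as a bound parameter instead, and it relies on the SQLite library behind Python's sqlite3 module having the JSON functions compiled in, which is common but not guaranteed. The second record with a null user_agent is made up to show the NULL handling.

```python
import json
import sqlite3

# Hypothetical payload standing in for the contents of my_data.json.
payload = json.dumps([
    {"uri": "/", "user_agent": "example1"},
    {"uri": "/foobar", "user_agent": None},
])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table(uri TEXT, user_agent TEXT)")
conn.execute(
    """
    INSERT INTO my_table
    SELECT json_extract(value, '$.uri'),
           json_extract(value, '$.user_agent')
    FROM json_each(?)
    """,
    (payload,),  # bound parameter replaces readfile('my_data.json')
)
rows = conn.execute(
    "SELECT uri, user_agent FROM my_table ORDER BY rowid"
).fetchall()
null_count = conn.execute(
    "SELECT COUNT(*) FROM my_table WHERE user_agent IS NULL"
).fetchone()[0]
```

The JSON null arrives as a real SQL NULL rather than an empty string, which is the consistency advantage mentioned at the start of this answer.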

If the initial JSON file contains newline-separated JSON elements, it can first be converted into a single array with jq's -s (slurp) option:

jq -s '.' <my_data_raw.json >my_data.json

It's likely there is a way to do this directly in SQLite using JSON1, but I didn't pursue that given that I was already using jq to massage the data prior to import to SQLite.
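One way to skip the conversion step entirely, sketched here in Python rather than in pure SQLite, is to parse each line and insert it directly. The function name load_ndjson is hypothetical and the table layout is the my_table from above; this is an illustration, not a tested replacement for the jq step.

```python
import io
import json
import sqlite3


def load_ndjson(conn: sqlite3.Connection, src) -> None:
    """Insert newline-delimited JSON objects into my_table, one row per line."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS my_table(uri TEXT, user_agent TEXT)"
    )
    records = (json.loads(line) for line in src if line.strip())
    rows = ((r.get("uri"), r.get("user_agent")) for r in records)
    # executemany consumes the generator lazily, one record at a time.
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    conn.commit()


# Hypothetical sample standing in for my_data_raw.json.
raw = io.StringIO(
    '{"uri":"/","user_agent":"example1"}\n'
    '{"uri":"/foobar","user_agent":"example1"}\n'
    '{"uri":"/","user_agent":"example2"}\n'
    '{"uri":"/foobar","user_agent":"example3"}\n'
)
conn = sqlite3.connect(":memory:")
load_ndjson(conn, raw)
inserted = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
```

Missing keys become NULL through dict.get, so records with differing fields do not abort the import.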

answered Oct 08 '22 21:10

mm2001

sqlitebiter appears to provide a Python solution:

A CLI tool to convert CSV/Excel/HTML/JSON/LTSV/Markdown/SQLite/TSV/Google-Sheets to a SQLite database file. http://sqlitebiter.rtfd.io/

docs: http://sqlitebiter.readthedocs.io/en/latest/

project: https://github.com/thombashi/sqlitebiter

last update approximately 3 months ago
last issue closed approximately 1 month ago, none open
noted today, 2018-03-14

answered Oct 08 '22 21:10

jimmont
