I am trying to upload a data table for testing that contains multiple levels of nesting, but I cannot seem to get the syntax right for specifying the schema.
Here is my current schema file:
{
"name":"city", "type":"RECORD",
[
{"name":"id", "type":"INTEGER"},
{"name":"name", "type":"STRING"},
{"name":"country", "type":"STRING"},
{"name":"coord", "type":"RECORD"},
[
{"name":"lon", "type":"FLOAT"},
{"name":"lat", "type":"FLOAT"}
],
{"name":"time", "type":"TIMESTAMP"}
]
}
Here is a sample of the data:
{"city":{"id":1283240,"name":"Kathmandu","country":"NP","coord":{"lon":85.316666,"lat":27.716667}},"time":1394865171,"data":[{"dt":1394852400,"main":{"temp":296.15,"temp_min":293.866,"temp_max":296.15}},{"dt":1394863200,"main":{"temp":301.51,"temp_min":299.345,"temp_max":301.51}}]}
In the full file I have multiple cities, each with multiple "data" points per day.
Thanks
Mark
When you have a RECORD type, you need to name the nested schema JSON array "fields". As in:
{
"name":"city", "type":"RECORD",
"fields": [
{"name":"id", "type":"INTEGER"},
{"name":"name", "type":"STRING"},
{"name":"country", "type":"STRING"},
{"name":"coord", "type":"RECORD",
"fields": [
{"name":"lon", "type":"FLOAT"},
{"name":"lat", "type":"FLOAT"}
]},
{"name":"time", "type":"TIMESTAMP"}
]
}
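There was also an issue that you had the } in the wrong place to close the inner schema: your original closed the "coord" record before its nested array, so the array ended up outside the record. The } belongs after the closing ].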
One trick that I like to use is to use Python's json.loads() function to verify that I've actually created a valid JSON object, since sometimes it can be hard to figure out if you've got all of the commas you need and closed all of your quotes correctly. For example:
$ python
>>> import json
>>> schema = """
... <paste your initial schema>
... """
>>> json.loads(schema)
ValueError: Expecting property name: line 4 column 5 (char 41)
(It is complaining that an array appears where a property name is expected... you need "fields" here.)
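If you want to run the same check as a script, here is a rough sketch of the idea. The helper name check_schema and the rules it applies (every field has "name" and "type", every RECORD has a non-empty "fields" array) are my own assumptions for illustration and are not an official BigQuery validator; it also assumes Python 3.6 or later. The embedded schema is a shortened copy of the corrected one above.

```python
import json


def check_schema(field, path=""):
    """Return a list of structural problems in one field definition.

    Hypothetical helper: it only checks the shape discussed above.
    """
    if not isinstance(field, dict):
        return [f"{path or '<root>'}: expected a JSON object"]

    problems = []
    name = field.get("name", "<unnamed>")
    here = f"{path}.{name}" if path else name

    if "name" not in field or "type" not in field:
        problems.append(f"{here}: missing 'name' or 'type'")

    if field.get("type") == "RECORD":
        children = field.get("fields")
        if not isinstance(children, list) or not children:
            problems.append(f"{here}: RECORD needs a non-empty 'fields' array")
        else:
            # Recurse so that nested records such as "coord" are checked too.
            for child in children:
                problems.extend(check_schema(child, here))
    return problems


schema_text = """
{
  "name": "city", "type": "RECORD",
  "fields": [
    {"name": "id", "type": "INTEGER"},
    {"name": "coord", "type": "RECORD",
     "fields": [
       {"name": "lon", "type": "FLOAT"},
       {"name": "lat", "type": "FLOAT"}
     ]}
  ]
}
"""

try:
    schema = json.loads(schema_text)
except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
    print("Not valid JSON:", err)
else:
    problems = check_schema(schema)
    print(problems or "schema looks structurally fine")
```

A syntax error (a missing comma, or an array with no property name in front of it) is reported by json.loads itself; the helper only catches a schema that parses but leaves out "fields" on a RECORD.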