I am reading this StackOverFlow discussion converting JSON into CSV and it seems great, but I cant get basic jq to work.. I am not sure what I am doing wrong. I have tried the basic thing and I cant crack whats wrong. Here is my ES query in a Shell Script
curl -XGET 'http://es-1:9200/data_latest/customer/_search?pretty' -H 'Content-Type: application/json' -d'
{
"_source": ["customer_app_version", "customer_num_apps", "customer_name","app_disk_size_bytes","app_memory_capacity_bytes"],
"query": {
"bool": {
"must": [{
"term": {
"is_app_customer": {
"value": "true"
}
}
}]
}
},
"aggs": {
"Customer_UUID": {
"terms": {
"field": "customer_uuid",
"size": 100
}
}
}
}
' Shell Script Output
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 6171,
"max_score": 1.8510876,
"hits": [
{
"_index": "data_latest_v1",
"_type": "customer",
"_id": "0003245-4844-9015-1z2e-d4ae5234rd56",
"_score": 1.8510876,
"_source": {
"customer_app_version": "el7.20150513",
"customer_num_apps": 3,
"app_memory_capacity_bytes": 405248409600,
"customer_name": "Timbuktu Inc",
"app_disk_size_bytes": 25117047875604
}
},
{
"_index": "data_latest_v1",
"_type": "customer",
"_id": "0003245-4844-9015-1z2e-d4ae5234rd56",
"_score": 1.8510876,
"_source": {
"customer_app_version": "el4.20150513",
"customer_num_apps": 34,
"app_memory_capacity_bytes": 58923439600,
"customer_name": "Bunnies Inc",
"app_disk_size_bytes": 36517984275604
}
}
]
}
}
(truncated, but the subset above is syntactically valid)
For example, I added after the ' (end of the above script) I added | jq -r '."customer_name"'
and also tried
| jq -r '.customer_name'
For both I get output like this.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
103 13566 100 13566 0 346 507k 13248 --:--:-- --:--:-- --:--:-- 537k
null
What am i doing wrong? What do I need to do? would be super helpful if someone can guide me here.
To describe in your jq
query how to navigate in the document to the data you want to extract might look like the following:
jq -r '.hits.hits[]._source.customer_name'
In this case, the output is:
Timbuktu Inc
Bunnies Inc
To generate a key/value CSV, one might use:
jq -r '.hits.hits[]._source | to_entries | .[] | [.key, .value] | @csv'
...with output:
"customer_app_version","el7.20150513"
"customer_num_apps",3
"app_memory_capacity_bytes",405248409600
"customer_name","Timbuktu Inc"
"app_disk_size_bytes",25117047875604
"customer_app_version","el4.20150513"
"customer_num_apps",34
"app_memory_capacity_bytes",58923439600
"customer_name","Bunnies Inc"
"app_disk_size_bytes",36517984275604
If you want customer name to be a column of its own, this might instead be:
jq -r '.hits.hits[]._source | .customer_name as $name | del(.customer_name) | to_entries | .[] | [$name, .key, .value] | @csv'
...with output:
"Timbuktu Inc","customer_app_version","el7.20150513"
"Timbuktu Inc","customer_num_apps",3
"Timbuktu Inc","app_memory_capacity_bytes",405248409600
"Timbuktu Inc","app_disk_size_bytes",25117047875604
"Bunnies Inc","customer_app_version","el4.20150513"
"Bunnies Inc","customer_num_apps",34
"Bunnies Inc","app_memory_capacity_bytes",58923439600
"Bunnies Inc","app_disk_size_bytes",36517984275604
If you're willing to hardcode the column names, consider instead:
jq -r '.hits.hits[]._source | [.customer_name, .customer_app_version, .customer_num_apps, .app_memory_capacity_bytes, .app_disk_size_bytes] | @csv'
with output:
"Timbuktu Inc","el7.20150513",3,405248409600,25117047875604
"Bunnies Inc","el4.20150513",34,58923439600,36517984275604
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With