So since mysql 5.7 i am now able to use the json datatype which is nice when your working with frameworks like angularjs.
Now since im fairly new to this (and this datatype is also fairly new) i was wondering if i could make some simple operations on that dataset.
For instance i have saved a json in my database where each object contains the following fields:
name
country
phone_num
What i wish to do is i wish to sum how many that lives in each country normally a select like this would work
select country, count(*) as num_residents from mytable
However im not quite sure how to use this sort of select statement on a table that has a json datatype.
Can anyone help me out?
Update
Okay so my table looks like this:
id int(11) AI PK
date_created timestamp
data json
schema_id int(11)
The data looks like this:
[{"A": 1, "B": "Debra", "C": "Peters", "D": "[email protected]", "E": "Female", "F": "Tsagaan-Ovoo"}, {"A": 2, "B": "Marc", "C": "Rasmussen", "D": "[email protected]", "E": "Male", "F": "Copenhagen"}]
And i am trying the following sql statement:
SELECT
data,
JSON_EXTRACT(data, "$.F") AS country,
count(JSON_EXTRACT(data, "$.F")) AS num_residents
FROM kpi_data
WHERE schema_id = 39
GROUP BY country, data
But sadly i get this result:
Firstly you are storing all the JSON in one row, if you want to query the data like this each resident should have it's own row of his/her own JSON:
*other_fields*, {"A": 1, "B": "Debra", "C": "Peters", "D": "[email protected]", "E": "Female", "F": "Tsagaan-Ovoo"}
*other_fields*, {"A": 2, "B": "Marc", "C": "Rasmussen", "D": "[email protected]", "E": "Male", "F": "Copenhagen"}
Secondly you don't need to count the extracted data, you just need to count the rows for each country extracted. This will do:
SELECT JSON_EXTRACT(data, "$.F") AS country,
COUNT(*) AS num_residents
Thirdly, you don't want to GROUP BY data
at all as this is presumably unique per resident. This will leave you with a group for each row. I think you just want:
GROUP BY country
Putting it all together:
SELECT JSON_EXTRACT(data, "$.F") AS country,
COUNT(*) AS num_residents
FROM kpi_data
WHERE schema_id = 39
GROUP BY country
For decent performance, you may consider placing an secondary index on the extracted country.. see DOCS
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With