I've got some JSON within Google Refine - http://mapit.mysociety.org/point/4326/0.1293497,51.5464828 for the full version, but abbreviated it's like this:
{1234: {'name': 'Barking', 'type': 'WMC'},
5678: {'name': 'England', 'type': 'EUR'} }
I only want to extract the name for the object with the (presumed unique) type WMC
.
Parse JSON in Google Refine doesn't help, that's working with arrays, not dicts.
Any suggestions what I should be looking at to fix this?
Edit: I don't know what the initial keys are: I believe they're unique identifiers which I can't predict ahead of time.
Refine doesn't currently know how to iterate through the keys of a dict where they keys are unknown (although I'm about to implement that functionality).
The trick to getting this working with the current implementation is to convert the JSON object to a JSON array. The following GREL expression will do that, parse the result as JSON, iterate through all elements of the array and give you the first name of type 'WMC'.
filter(('['+(value.replace(/"[0-9]+":/,""))[1,-1]+']').parseJson(),v,v['type']=='WMC')[0]['name']
Use that expression with the "Add column based on this column" command to create a new WMC name column. If there's a chance that there'll be more than one name of this type and you want them all, you can add in a forEach loop and join along the lines of
forEach(filter(('['+(value.replace(/"[0-9]+":/,""))[1,-1]+']').parseJson(),v,v['type']=='WMC'),x,x['name']).join('|')
This will give you a pipe separated list of names that you can split apart using "Split multi-valued cells."
It'll be easier in the next release hopefully!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With