Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Refine: iterate over a JSON dictionary

I've got some JSON within Google Refine - http://mapit.mysociety.org/point/4326/0.1293497,51.5464828 for the full version, but abbreviated it's like this:

{1234: {'name': 'Barking', 'type': 'WMC'},
 5678: {'name': 'England', 'type': 'EUR'} }

I only want to extract the name for the object with the (presumed unique) type WMC.

Parse JSON in Google Refine doesn't help, that's working with arrays, not dicts.

Any suggestions what I should be looking at to fix this?


Edit: I don't know what the initial keys are: I believe they're unique identifiers which I can't predict ahead of time.

like image 233
Dragon Avatar asked Oct 08 '22 17:10

Dragon


1 Answers

Refine doesn't currently know how to iterate through the keys of a dict where they keys are unknown (although I'm about to implement that functionality).

The trick to getting this working with the current implementation is to convert the JSON object to a JSON array. The following GREL expression will do that, parse the result as JSON, iterate through all elements of the array and give you the first name of type 'WMC'.

filter(('['+(value.replace(/"[0-9]+":/,""))[1,-1]+']').parseJson(),v,v['type']=='WMC')[0]['name']

Use that expression with the "Add column based on this column" command to create a new WMC name column. If there's a chance that there'll be more than one name of this type and you want them all, you can add in a forEach loop and join along the lines of

forEach(filter(('['+(value.replace(/"[0-9]+":/,""))[1,-1]+']').parseJson(),v,v['type']=='WMC'),x,x['name']).join('|')

This will give you a pipe separated list of names that you can split apart using "Split multi-valued cells."

It'll be easier in the next release hopefully!

like image 200
Tom Morris Avatar answered Oct 12 '22 11:10

Tom Morris