Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select entries based on multiple values in jq

Tags:

json

jq

whitelist

I'm working with JQ and I absolutely love it so far. I'm running into an issue I've yet to find a solution to anywhere else, though, and wanted to see if the community had a way to do this.

Let's presume we have a JSON file that looks like so:

{"author": "Gary", "text": "Blah"}
{"author": "Larry", "text": "More Blah"}
{"author": "Jerry", "text": "Yet more Blah"}
{"author": "Barry", "text": "Even more Blah"}
{"author": "Teri", "text": "Text on text on text"}
{"author": "Bob", "text": "Another thing to say"}

Now, we want to select rows where the value of author is equal to either "Gary" OR "Larry", but no other case. In reality, I have several thousand names I'm checking against, so simply stating the direct or conditional (e.g. cat blah.json | jq -r 'select(.author == "Gary" or .author == "Larry")') isn't sufficient. I'm trying to do this via the inside function like so but get an error dialog:

cat blah.json | jq -r 'select(.author | inside(["Gary", "Larry"]))'
jq: error (at <stdin>:1): array (["Gary","La...) and string ("Gary") cannot have their containment checked

What would be the best method for doing something like this?

like image 395
DGaff Avatar asked Jun 22 '17 16:06

DGaff


1 Answers

inside and contains are a bit weird. Here are some more straightforward solutions:

index/1

select( .author as $a | ["Gary", "Larry"] | index($a) )

any/2

["Gary", "Larry"] as $whitelist
| select( .author as $a | any( $whitelist[]; . == $a) )

Using a dictionary

If performance is an issue and if "author" is always a string, then a solution along the lines suggested by @JeffMercado should be considered. Here is a variant (to be used with the -n command-line option):

["Gary", "Larry"] as $whitelist
| ($whitelist | map( {(.): true} ) | add) as $dictionary
| inputs
| select($dictionary[.author])
like image 73
peak Avatar answered Sep 28 '22 04:09

peak