Get word between quotes

Question

I have x lines like this:

Unable to find latest released revision of 'CONTRIB_046578'.

I need to extract the word between the single quotes after "revision of" (CONTRIB_046578 in this example) and, if possible, count the number of occurrences of that word. Can this be done with grep, sed, or any other command?

Chris Seymour · Accepted Answer

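The cleanest solution is GNU grep with lookarounds: grep -Po "(?<=')[^']+(?=')". Here -P enables Perl-compatible regular expressions and -o prints only the matched part of each line.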

$ cat file
Unable to find latest released revision of 'CONTRIB_046578'
Unable to find latest released revision of 'foo'
Unable to find latest released revision of 'bar'
Unable to find latest released revision of 'CONTRIB_046578'

# Print occurrences
$ grep -Po "(?<=')[^']+(?=')" file
CONTRIB_046578
foo
bar
CONTRIB_046578

# Count matching lines (equal to the number of occurrences here, since each line has one quoted word)
$ grep -Pc "(?<=')[^']+(?=')" file
4

# Count occurrences of each unique word
$ grep -Po "(?<=')[^']+(?=')" file | sort | uniq -c 
2 CONTRIB_046578
1 bar
1 foo
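
Since the question also mentions sed, and -P is specific to GNU grep, here is a minimal sed sketch of the same extraction. It assumes each line holds at most one quoted word of interest and takes the first one; the temporary file and the variable names file and words are only for illustration.

```shell
# Sample input copied from the example above, written to a temporary file.
file=$(mktemp)
cat > "$file" <<'EOF'
Unable to find latest released revision of 'CONTRIB_046578'
Unable to find latest released revision of 'foo'
Unable to find latest released revision of 'bar'
Unable to find latest released revision of 'CONTRIB_046578'
EOF

# Keep only the text between the first pair of single quotes on each line.
# -n suppresses the default output; the p flag prints only lines where
# the substitution succeeded, so lines without quotes are skipped.
words=$(sed -n "s/^[^']*'\([^']*\)'.*/\1/p" "$file")
printf '%s\n' "$words"

# Count how often each extracted word occurs.
printf '%s\n' "$words" | sort | uniq -c

rm -f "$file"
```

The sort before uniq -c is needed because uniq only collapses adjacent duplicate lines.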

Ed Morton · Answer

All you need is a very simple awk script to count the occurrences of what's between the quotes:

awk -F\' '{c[$2]++} END{for (w in c) print w,c[w]}' file

Using @anubhava's test input file:

$ cat file
Unable to find latest released revision of 'CONTRIB_046572'
Unable to find latest released revision of 'CONTRIB_046578'
Unable to find latest released revision of 'CONTRIB_046579'
Unable to find latest released revision of 'CONTRIB_046570'
Unable to find latest released revision of 'CONTRIB_046579'
Unable to find latest released revision of 'CONTRIB_046572'
Unable to find latest released revision of 'CONTRIB_046579'
$
$ awk -F\' '{c[$2]++} END{for (w in c) print w,c[w]}' file
CONTRIB_046578 1
CONTRIB_046579 3
CONTRIB_046570 1
CONTRIB_046572 2
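
Two details of the awk one-liner may matter on real logs: a line without any quotes would be counted under an empty key, and the order of "for (w in c)" is unspecified, which is why the output above is not sorted. A possible variant, as a sketch only, guards on the field count and pipes through sort; the unquoted sample line and the variable names file and counts are made up for illustration.

```shell
# Hypothetical sample input, including one line without quotes.
file=$(mktemp)
cat > "$file" <<'EOF'
Unable to find latest released revision of 'CONTRIB_046572'
Some unrelated line without quotes
Unable to find latest released revision of 'CONTRIB_046579'
Unable to find latest released revision of 'CONTRIB_046572'
EOF

# With ' as the field separator, NF >= 3 means the line contains at least
# two single quotes, so $2 is the text between the first pair.
# sort makes the output order deterministic.
counts=$(awk -F\' 'NF >= 3 {c[$2]++} END {for (w in c) print w, c[w]}' "$file" | sort)
printf '%s\n' "$counts"

rm -f "$file"
```

With this input the unquoted line is ignored and the two words are reported with counts 2 and 1.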

Tags: linux, grep, unix, sed, awk

user1921608
