
awk/gsub - print everything between double quotes in multiple occurrences per line

Tags: bash, sed, awk

I am attempting to print all data between double quotes (sampleField="sampleValue"), but I am having trouble getting awk and/or sub/gsub to return every instance of data between the double quotes. I'd then like to print all instances on the line where they were found, to keep the data together.

Here is a sample of the input.txt file:

deviceId="1300", deviceName="router 13", deviceLocation="Corp"
deviceId="2000", deviceName="router 20", deviceLocation="DC1"

The output I'm looking for is:

"1300", "router 13", "Corp"
"2000", "router 20", "DC1"

I'm having trouble using gsub to remove all of the data between a , and =. Every approach I've tried returns only the first field and then moves on to the next line.

UPDATE:

I forgot to mention that I won't know how many double quote encapsulated fields will be on each line. It could be 1, 3, or 5,000. Not sure if this affects the solution, but wanted to make sure it was out there.

Travis Crooks asked Jan 23 '13 19:01


3 Answers

A sed solution:

sed -r 's/[^\"]*([\"][^\"]*[\"][,]?)[^\"]*/\1 /g' \
    <<< 'deviceId="1300", deviceName="router 13", deviceLocation="Corp"'

Output:

"1300", "router 13", "Corp"

Or for a file:

sed -r 's/[^\"]*([\"][^\"]*[\"][,]?)[^\"]*/\1 /g' input.txt
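
For readers unpicking the expression above, here is a minimal sketch of the same idea with the pieces commented. It assumes a sed that accepts -E (GNU and BSD sed both do). The function name extract_quoted is made up for illustration, and the second expression, which trims the trailing space left by the replacement, is an addition that is not part of the answer.

```shell
# extract_quoted is a hypothetical helper name, not part of the original answer.
extract_quoted() {
    # [^"]*         skip the text before the next quoted value (e.g. deviceId=)
    # ("[^"]*",?)   capture the quoted value plus an optional trailing comma
    # [^"]*         skip the text up to the next opening quote
    # The second expression removes the trailing space that "\1 " leaves behind.
    sed -E -e 's/[^"]*("[^"]*",?)[^"]*/\1 /g' -e 's/ $//' "$@"
}

printf '%s\n' 'deviceId="1300", deviceName="router 13", deviceLocation="Corp"' | extract_quoted
```

With no arguments the function reads standard input; passing a file name such as input.txt processes that file instead. Lines without any double quotes pass through unchanged.
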
Rubens answered Oct 20 '22 20:10


awk -F '"' '{ printf("%c%s%c, %c%s%c, %c%s%c\n", 34, $2, 34, 34, $4, 34, 34, $6, 34) }' \
    input.txt > newfile

This is another, simpler approach that uses the double quote as the field separator. It assumes exactly three quoted fields per line.

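awk 'BEGIN { t = sprintf("%c", 34); FS = t }   # t holds a double quote, also used as the separator
     { line = ""
       for (i = 2; i <= NF; i += 2)            # even-numbered fields are the quoted values
           line = line (i > 2 ? ", " : "") t $i t
       print line }' infile > outfile

A more general awk approach that handles any number of quoted fields per line.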
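
Since the question asks about gsub specifically, the same output can also be sketched by deleting the name= prefixes rather than splitting on quotes. This is only an illustration under one assumption: no quoted value itself contains an = sign, because a value such as "a=b" would be reduced to "b". The function name strip_names is hypothetical.

```shell
# strip_names is a hypothetical helper name used only for this sketch.
strip_names() {
    # gsub() replaces every match on the line; sub() would stop after the first one.
    # [^", ]+= matches a run of characters other than quote, comma or space,
    # followed by "=", which covers deviceId=, deviceName= and so on.
    awk '{ gsub(/[^", ]+=/, ""); print }' "$@"
}

printf '%s\n' 'deviceId="1300", deviceName="router 13", deviceLocation="Corp"' | strip_names
# prints: "1300", "router 13", "Corp"
```

Because gsub works on the whole record, the number of fields per line does not matter.
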

jim mcnamara answered Oct 20 '22 20:10


awk -F \" '
    {
        sep=""
        for (i=2; i<=NF; i+=2) {
            printf "%s\"%s\"", sep, $i
            sep=", "
        }
        print ""
    }
' << END
deviceId="1300", deviceName="router 13", deviceLocation="Corp", foo="bar"
deviceId="2000", deviceName="router 20", deviceLocation="DC1"
END

outputs

"1300", "router 13", "Corp", "bar"
"2000", "router 20", "DC1"
glenn jackman answered Oct 20 '22 21:10
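
In glenn jackman's answer, the loop starts at field 2 and steps by 2 because, with the double quote as the field separator, the text outside the quotes lands in the odd-numbered fields and the quoted values land in the even-numbered ones. The small sketch below prints each field with its index to make that visible; show_fields is a hypothetical name used only here.

```shell
# show_fields is a hypothetical helper name used only for this sketch.
show_fields() {
    # Split each line on the double quote and print every field with its index.
    awk -F '"' '{ for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i }' "$@"
}

printf '%s\n' 'deviceId="1300", deviceName="router 13"' | show_fields
# prints:
# 1:[deviceId=]
# 2:[1300]
# 3:[, deviceName=]
# 4:[router 13]
# 5:[]
```

The empty fifth field is the text after the final closing quote, which is why stepping through only the even indices picks up exactly the quoted values.
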