How do I get the value present in the first double quotes?

Tags: bash, shell, sed, awk

I'm currently writing a bash script to get the first value among several comma-separated strings. I have a file that looks like this:

name
things: "water bottle","40","new phone cover",10
place

I just need to return the value in the first double quotes:

water bottle

The value in the first double quotes can be one word or two; for example, water bottle may sometimes be replaced with pen. I tried:

awk '/:/ {print $2}'

But this just gives

water

I wanted to split the line on commas, but there is a colon (:) after things, so I'm not sure how to separate the fields. How do I get the value in the first double quotes?

EDIT:

SOLUTION: I used the code below, since I particularly wanted to use awk:

awk '/:/' test.txt | cut -d\" -f2


asked Dec 10 '21 11:12

Sourabrt

4 Answers

Using GNU awk, you could use a capture group (the three-argument form of match is a gawk extension) together with negated character classes, so the match does not cross a comma, as that is the field delimiter.

awk 'match($0, /^[^",:]*:[^",]*"([^"]*)"/, a) {print a[1]}' file

Output

water bottle

The pattern matches:

^ Start of string
[^",:]*: Optionally match any characters except " and , and :, then match :
[^",]* Optionally match any characters except " and ,
"([^"]*)" Capture in group 1 the value between double quotes

If the value is always between double quotes, a shorter option is to set the field separator to " and check that field 1 contains a colon. Note that this also prints water bottle when the line has only an opening double quote and no closing one.

awk -F'"' '$1 ~ /:/ {print $2}' file
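The caveat about a missing closing quote can be checked directly. The following is only a sketch, not part of the original answer: first_quoted and first_quoted_strict are hypothetical helper names, and the stricter variant assumes that requiring at least three fields (NF >= 3, i.e. at least two " characters on the line) is an acceptable test for "the quote was closed".

```shell
#!/bin/bash
# Hypothetical helpers for illustration; both read lines from stdin.

# Lenient: prints field 2 whenever the text before the first quote has a colon.
first_quoted() {
  awk -F'"' '$1 ~ /:/ {print $2}'
}

# Stricter (assumption): also require a closing quote, i.e. at least 3 fields.
first_quoted_strict() {
  awk -F'"' '$1 ~ /:/ && NF >= 3 {print $2}'
}

# A line with an opening quote but no closing quote:
printf '%s\n' 'things: "water bottle' | first_quoted         # prints: water bottle
printf '%s\n' 'things: "water bottle' | first_quoted_strict  # prints nothing
```

On well-formed input such as the sample file, both helpers print the same value.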

answered Nov 15 '22 06:11

The fourth bird

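A solution using the cut utility could be

cut -s -d\" -f2 infile > outfile

The -s option makes cut skip lines that contain no " delimiter (here name and place), which it would otherwise print unchanged.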
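A quick check on the sample input, as a sketch rather than part of the original answer: without -s, cut passes lines that lack the delimiter through unchanged, so name and place also appear in the output; with -s they are suppressed. The variable names sample, all_lines and only_quoted are illustrative.

```shell
# Sample input from the question, held in a variable for the demo.
sample='name
things: "water bottle","40","new phone cover",10
place'

# Without -s: lines that contain no " are printed whole.
all_lines=$(printf '%s\n' "$sample" | cut -d\" -f2)

# With -s: only lines that contain the delimiter produce output.
only_quoted=$(printf '%s\n' "$sample" | cut -s -d\" -f2)

printf 'without -s:\n%s\n\nwith -s:\n%s\n' "$all_lines" "$only_quoted"
```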


answered Nov 15 '22 07:11

M. Nejat Aydin

With the samples shown, please try the following awk code.

awk '/^things:/ && match($0,/"[^"]*/){print substr($0,RSTART+1,RLENGTH-1)}' Input_file

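Explanation: the awk program checks whether the line starts with things:, then uses the match function to match the first " and everything up to (but not including) the second ". It prints that text without the leading quote, using the RSTART and RLENGTH values that match sets.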
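To make the RSTART/RLENGTH arithmetic concrete, here is a small sketch on the sample line that also prints the two values. The variable names line and value are illustrative only.

```shell
line='things: "water bottle","40","new phone cover",10'

# match() sets RSTART to the position of the first double quote and RLENGTH
# to the length of the match (the quote plus the characters after it, up to
# the next quote). Starting one character later (RSTART+1) and taking one
# character fewer (RLENGTH-1) drops the leading quote.
value=$(printf '%s\n' "$line" |
  awk 'match($0, /"[^"]*/) {print RSTART, RLENGTH, substr($0, RSTART+1, RLENGTH-1)}')

printf '%s\n' "$value"   # prints: 9 13 water bottle
```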

answered Nov 15 '22 06:11

RavinderSingh13

Solution 1: awk

You can use a single awk command:

awk -F\" 'index($1, ":"){print $2}' test.txt > outfile

The -F\" option sets the field separator to a " character, the index($1, ":") condition makes sure field 1 contains a : character (no regex needed), and {print $2} then prints the second field.
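Since the question is about a bash script, the value is often wanted in a variable rather than in a file. A minimal sketch, assuming the input sits in a temporary file created just for this demo; tmpfile and first_value are illustrative names, and the added exit (stop after the first matching line) is an assumption that is not in the original command.

```shell
#!/bin/bash
# Recreate the sample file from the question in a temporary location.
tmpfile=$(mktemp)
printf '%s\n' 'name' 'things: "water bottle","40","new phone cover",10' 'place' > "$tmpfile"

# index($1, ":") is non-zero (true) only when the text before the first
# double quote contains a colon; exit stops after the first such line.
first_value=$(awk -F\" 'index($1, ":") {print $2; exit}' "$tmpfile")

rm -f "$tmpfile"
printf '%s\n' "$first_value"   # prints: water bottle
```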

Solution 2: awk + cut

You can use awk + cut:

awk '/:/' test.txt | cut -d\" -f2 > outfile

With awk '/:/' test.txt, you extract the line(s) containing a : character; the piped cut -d\" -f2 command then splits each line on " and returns the second item.

Solution 3: sed

Alternatively, you can use sed:

sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' file > outfile

A demo script:

#!/bin/bash
s='name
things: "water bottle","40","new phone cover",10
place'
 
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' <<< "$s"
# => water bottle

The command means:

-n - the option suppresses the default line output
^[^"]*"\([^"]*\)".* - a POSIX BRE regex pattern that matches
- ^ - start of string
- [^"]* - zero or more chars other than "
- " - a " char
- \([^"]*\) - Group 1 (\1 refers to this value): any zero or more chars other than "
- ".* - a " char and the rest of the string.
\1 - replaces the match with the Group 1 value
p - prints only the result of a successful substitution.
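The same substitution can also be written with extended regular expressions, where grouping parentheses need no backslashes. This is a variant sketch, not the answer's original command, and it assumes a sed that supports the -E option (GNU sed and BSD sed both do). The variable names s, bre and ere are illustrative.

```shell
s='name
things: "water bottle","40","new phone cover",10
place'

# BRE form: the capture group is written as \( ... \)
bre=$(printf '%s\n' "$s" | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p')

# ERE form (needs -E): the capture group is written as ( ... )
ere=$(printf '%s\n' "$s" | sed -E -n 's/^[^"]*"([^"]*)".*/\1/p')

printf '%s\n%s\n' "$bre" "$ere"
```

Both forms print water bottle for the sample input; lines without a quoted value do not match and are not printed.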

answered Nov 15 '22 06:11

Wiktor Stribiżew
