I'm currently writing a bash script to get the first value from a line of comma-separated strings. I have a file that looks like this -
name
things: "water bottle","40","new phone cover",10
place
I just need to return the value in the first pair of double quotes:
water bottle
The value in the first double quotes can be one word or two words. That is, water bottle can sometimes be replaced with pen.
I tried -
awk '/:/ {print $2}'
But this just gives
water
I wanted to split the line on commas, but there's a colon (:) after things, so I'm not sure how to separate it.
How do I get the value in the first double quotes?
EDIT:
SOLUTION: I used the below code since I particularly wanted to use awk -
awk '/:/' test.txt | cut -d\" -f2
Using GNU awk, you could make use of a capture group, and use a negated character class to not cross the , as that is the field delimiter.
awk 'match($0, /^[^",:]*:[^",]*"([^"]*)"/, a) {print a[1]}' file
Output
water bottle
The pattern matches
^
- start of string
[^",:]*:
- optionally match any characters except " and , and : then match :
[^",]*
- optionally match any characters except " and ,
"([^"]*)"
- capture in group 1 the value between double quotes
If the value is always between double quotes, a shorter option is to set the field separator to " and check whether field 1 contains a colon. Note that, technically, this also prints water bottle if there is only a leading double quote and no closing one.
awk -F'"' '$1 ~ /:/ {print $2}' file
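The caveat about a missing closing quote can be illustrated with a small sketch. The sample input below (including the hypothetical other: line with an unterminated quote) and the stricter NF >= 3 condition are assumptions added for illustration, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical input: the "other:" line has an opening quote but no closing one.
input='name
things: "water bottle","40","new phone cover",10
other: "unterminated
place'

# Loose check from the answer: prints field 2 whenever field 1 contains a colon.
loose=$(printf '%s\n' "$input" | awk -F'"' '$1 ~ /:/ {print $2}')

# Stricter variant (assumption): NF >= 3 means the line holds at least two
# double quotes, so field 2 is really enclosed by a pair of quotes.
strict=$(printf '%s\n' "$input" | awk -F'"' '$1 ~ /:/ && NF >= 3 {print $2}')

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

The loose form also reports unterminated, while the stricter form should only report water bottle.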
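A solution using the cut utility could be the following. The -s option skips lines that contain no " character; without it, cut passes such lines (name, place) through unchanged.
cut -s -d\" -f2 infile > outfile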
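To see how cut treats lines that contain no delimiter, here is a minimal sketch using the sample data from the question. The variable names are made up for illustration; -s is the standard cut option that suppresses lines without the delimiter.

```shell
#!/bin/sh
# Sample data from the question.
input='name
things: "water bottle","40","new phone cover",10
place'

# Without -s, lines lacking the delimiter are printed unchanged.
without_s=$(printf '%s\n' "$input" | cut -d'"' -f2)

# With -s, lines lacking the delimiter are dropped.
with_s=$(printf '%s\n' "$input" | cut -s -d'"' -f2)

printf 'without -s:\n%s\n' "$without_s"
printf 'with -s:\n%s\n' "$with_s"
```

If only the quoted value is wanted, either add -s or filter the lines first, as the awk + cut pipeline does.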
With your shown samples, please try the following awk code.
awk '/^things:/ && match($0,/"[^"]*/){print substr($0,RSTART+1,RLENGTH-1)}' Input_file
Explanation: The awk program checks whether the line starts with things: and uses the match function to match everything between the first and second " characters, then prints that text. RSTART+1 and RLENGTH-1 skip the opening quote itself.
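Since the question is about a bash script, the result will usually need to end up in a variable. A minimal sketch, assuming a temporary file stands in for the real input and that only the first matching line matters (the exit and the variable names are additions for illustration):

```shell
#!/bin/sh
# Create a throwaway file holding the sample data from the question.
tmpfile=$(mktemp)
printf '%s\n' 'name' 'things: "water bottle","40","new phone cover",10' 'place' > "$tmpfile"

# Capture the first quoted value of the first "things:" line, then stop reading.
first=$(awk '/^things:/ && match($0, /"[^"]*/) {
    print substr($0, RSTART + 1, RLENGTH - 1)
    exit
}' "$tmpfile")
rm -f "$tmpfile"

# An empty result means no line matched.
if [ -n "$first" ]; then
    printf 'first quoted value: %s\n' "$first"
else
    printf 'no quoted value found\n' >&2
fi
```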
Solution 1: awk
You can use a single awk command:
awk -F\" 'index($1, ":"){print $2}' test.txt > outfile
See the online demo.
The -F\" option sets the field separator to a " char, the index($1, ":") condition makes sure Field 1 contains a : char (no regex needed), and then {print $2} prints the second field value.
Solution 2: awk + cut
You can use awk + cut:
awk '/:/' test.txt | cut -d\" -f2 > outfile
With awk '/:/' test.txt, you will extract the line(s) containing a : char, and then the piped cut -d\" -f2 command will split the string with " as a separator and return the second item. See the online demo.
Solution 3: sed
Alternatively, you can use sed:
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' file > outfile
See the online demo:
#!/bin/bash
s='name
things: "water bottle","40","new phone cover",10
place'
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' <<< "$s"
# => water bottle
The command means
-n
- the option suppresses the default line output
^[^"]*"\([^"]*\)".*
- a POSIX BRE regex pattern that matches
^
- start of string
[^"]*
- zero or more chars other than "
"
- a " char
\([^"]*\)
- Group 1 (\1 refers to this value): any zero or more chars other than "
".*
- a " char and the rest of the string.
\1
- replaces the match with Group 1 value
p
- only prints the result of a successful substitution.
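For comparison, the same substitution can be written with extended regular expressions, where the capture group needs no backslashes. This is a sketch that assumes a sed supporting the -E option (GNU and BSD sed both do); the variable names are illustrative only.

```shell
#!/bin/sh
# Sample data from the question.
input='name
things: "water bottle","40","new phone cover",10
place'

# POSIX BRE form, as in the answer: the group is written \( ... \).
bre=$(printf '%s\n' "$input" | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p')

# ERE form (assumes sed -E is available): the group is written ( ... ).
ere=$(printf '%s\n' "$input" | sed -nE 's/^[^"]*"([^"]*)".*/\1/p')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands should print water bottle, since lines without a double quote never match and are suppressed by -n.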