
Using a single sed call to split and grep

Tags: sed

This is mostly out of curiosity: I am trying to get the same behavior as:

echo -e "test1:test2:test3"| sed 's/:/\n/g' | grep 1

in a single sed command.

I already tried

echo -e "test1:test2:test3"| sed -e "s/:/\n/g" -n "/1/p"

But I get the following error:

sed: can't read /1/p: No such file or directory

Any idea on how to fix this and combine different types of commands into a single sed call?

Of course this is overly simplified compared to the real use case, and I know I can get around it by using multiple calls; again, this is just out of curiosity.

EDIT: I am mostly interested in the sed tool, I already know how to do it using other tools, or even combinations of those.
EDIT2: Here is a more realistic script, closer to what I am trying to achieve:

arch=linux64
base=https://chromedriver.storage.googleapis.com
split="<Contents>"
curl $base \
    | sed -e 's/<Contents>/<Contents>\n/g' \
    | grep $arch \
    | sed -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
    | sort -V > out

What I would like to simplify is the curl line, turning it into something like:

curl $base \
 | sed 's/<Contents>/<Contents>\n/g' -n '/1/p' -e 's/^<Key>\(.*\)\/chromedriver.*/\1/' \
 | sort -V > out

Soulthym asked Jun 26 '19 11:06


3 Answers

Here are some alternatives, awk and sed based:

sed -E "s/(.*:)?([^:]*1[^:]*).*/\2/" <<< "test1:test2:test3"
awk -v RS=":" '/1/' <<< "test1:test2:test3"
# or also 
awk 'BEGIN{RS=":"} /1/' <<< "test1:test2:test3"

Or, using your logic, you would need to pipe a second sed command:

sed "s/:/\n/g" <<< "test1:test2:test3" | sed -n "/1/p"

See this online demo. The awk solution looks cleanest.
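
As a side note on the error in the question: -n takes no argument, so "/1/p" is read as a file name, which is what the message says. The sketch below (assuming GNU sed; combined is a helper name made up for this example) shows the syntactically valid single call, and why it still does not behave like grep: s/:/\n/g does not create new input lines, so /1/ is tested against the whole pattern space and p prints all of it.

```shell
# Assumes GNU sed (\n in the replacement is a GNU extension).
# combined: hypothetical helper, one sed call with two -e scripts.
combined() {
    sed -n -e 's/:/\n/g' -e '/1/p'
}

# Prints all three fields, not just test1: the pattern space
# "test1\ntest2\ntest3" contains a 1, so p prints it in full.
printf 'test1:test2:test3\n' | combined
```

This is why the single-call answers rely on P and D, or on rewriting the line in one substitution, rather than on a plain /1/p.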

Details

In the sed solution, the pattern (.*:)?([^:]*1[^:]*).* matches an optional run of any characters ending in :, then captures into Group 2 zero or more characters other than :, a 1, and again zero or more characters other than :, and finally matches the rest of the line. The replacement keeps only the contents of Group 2.

In the awk solution, the record separator is set to :, and the /1/ regex prints only the records that contain 1.
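
A small sketch of both one-liners on the sample input, plus one edge case worth knowing about. The helper names keep_field and fields_with_1 are made up for this example. Because the sed variant is a plain substitution without -n, a line with no field containing 1 is not filtered out; it is printed unchanged.

```shell
# keep_field: hypothetical wrapper around the sed substitution.
keep_field() {
    sed -E 's/(.*:)?([^:]*1[^:]*).*/\2/'
}

# fields_with_1: hypothetical wrapper around the awk variant.
fields_with_1() {
    awk -v RS=':' '/1/'
}

printf 'test1:test2:test3\n' | keep_field      # the field containing 1
printf 'no-match-here\n' | keep_field          # s does not match, line passes through as-is
printf 'test1:test2:test3\n' | fields_with_1   # the record containing 1
```

If lines without a match should be dropped, running the substitution with sed -n and a trailing p flag is one option.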

Wiktor Stribiżew answered Nov 18 '22 11:11


This might work for you (GNU sed):

sed 's/:/\n/;/^[^\n]*1/P;D' file

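Replace the first : with a newline. If the first line of the pattern space contains 1, print it. Then delete that first line and repeat on what remains, without reading new input until the pattern space is empty.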
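
The same loop written as one command per -e, as a sketch assuming GNU sed (needed for \n in the replacement and inside the bracket expression). split_grep is a helper name invented for this example.

```shell
# split_grep: hypothetical helper, reads stdin and prints each
# colon-separated field that contains a 1, one per line.
#   s/:/\n/       turn the first remaining colon into a newline
#   /^[^\n]*1/P   if the first line of the pattern space has a 1, print it
#   D             drop that first line and rerun the script on the rest
split_grep() {
    sed -e 's/:/\n/' \
        -e '/^[^\n]*1/P' \
        -e 'D'
}

printf 'test1:test2:test3\nfoo:bar1:baz\n' | split_grep
```

No -n is needed here: D ends every pass before the automatic print at the end of the script can happen.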

An alternative:

sed -Ez 's/:/\n/g;s/^[^1]*$//mg;s/\n+/\n/g;s/^\n//' file

This slurps the whole file into memory and replaces all colons with newlines. All lines that do not contain 1 are then emptied, and the surplus newlines are deleted.

potong answered Nov 18 '22 11:11


echo -e "test1:test2:test3" | sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D'

  • sed -n doesn't print unless told to
  • sed -E allows using parens to match (\n|$) which is newline or the end of the pattern space
  • P prints the pattern buffer up to the first newline.
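  • D deletes the pattern buffer up to and including the first newline and, if anything is left, restarts the script on the remainder without reading new input
  • [^\n] is a character class that matches anything except a newline
  • // is sed shorthand for reusing the last regular expression
  • //!D then runs D only when that regular expression did not match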

So, after you split into newlines, you want to make sure the 2 character is between the start of the pattern buffer ^ and the first newline.

And if the character you are looking for is not there, you want D to delete up to the first newline.

At that point, it works for one line of input, with one string containing the character you're looking for.
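
A quick way to see that limit, sketched under the assumption of GNU sed (first_match is a made-up helper name): the input below has two fields containing 2, but only the first is printed, because once P has fired nothing restarts the script on the remainder.

```shell
# first_match: hypothetical helper wrapping the command above.
first_match() {
    sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;//!D'
}

# test2 is printed; best2 is never examined.
printf 'test1:test2:test3:best2\n' | first_match
```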

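To extend this to several matches within a line, run D unconditionally, so that after every field sed restarts the script on whatever is left in the pattern buffer. The command below also carries a label :a and a conditional branch ta, but D always restarts the cycle before ta is reached, so the looping comes from D alone: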

$ printf "test1:test2:test3\nbob3:bob2:fred2\n"  | \
    sed -En ':a s/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D;ta'
test2
bob2
fred2
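
Since D never lets control reach a later command, the label and the branch can be left out. A minimal sketch of that shorter form, assuming GNU sed; all_matches is a helper name invented for this example.

```shell
# all_matches: hypothetical helper, prints every colon-separated
# field containing a 2, across all input lines.
all_matches() {
    sed -En 's/:/\n/g;/^[^\n]*2[^\n]*(\n|$)/P;D'
}

printf 'test1:test2:test3\nbob3:bob2:fred2\n' | all_matches
```

The last field of a line is handled by the $ alternative in (\n|$): when only fred2 is left, there is no newline to match, but the end of the pattern buffer still counts.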

stevesliva answered Nov 18 '22 10:11