
sed - how to remove everything but a defined pattern?

Tags:

sed

I have to remove everything but 1, 2, or 3 digits (0-9, or 10-99, or 100) preceding % (I don't want to see the %, though) from another command's output and pipe it forward to another command. I know that

sed -n '/%/p'

will show only the line(s) containing %, but that's not what I want. How can I get rid of the rest of the unwanted text and leave only these numbers to then pipe them to another command?

octosquidopus asked Aug 20 '11 02:08


2 Answers

If you're not completely tied to sed, this is exactly what grep -o does:

grep -o '[0-9]\{1,3\}%'
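
Note that grep -o prints the matched text itself, so each number still carries its trailing %. A minimal sketch of one way to drop it, assuming tr is available; the helper name extract_percent and the sample input are made up for illustration:

```shell
# extract_percent is a hypothetical helper name, not part of grep.
extract_percent() {
  grep -o '[0-9]\{1,3\}%' |  # print only the matched text, e.g. 87%
    tr -d '%'                # then delete the percent sign
}

# printf stands in for "another command's output" (made-up sample text).
printf 'battery at 87%% now\nno percent here\nload 5%%\n' | extract_percent
```

Unlike the sed answer further down, this pattern does not check what comes before the digits, so an input such as 1234% would yield 234.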
glenn jackman answered Sep 29 '22 20:09


EDIT: I misunderstood the OP and originally posted an invalid answer. I have changed it to one that, I believe, solves the problem in the more general scenario.

For a file such as the one below:

$ cat input
abc
123%
123
abc%
this is 456% and nothing more
456

Use sed -n -E 's/(^|.*[^0-9])([0-9]{1,3})%.*/\2/p' input:

$ sed -n -E 's/(^|.*[^0-9])([0-9]{1,3})%.*/\2/p' input
123
456

The -n flag makes sed suppress its automatic printing of each line. The -E flag enables extended regular expressions. (GNU sed has long spelled this flag -r; recent GNU sed versions accept -E as well.)

Now comes the s/// command. The group (^|.*[^0-9]) matches either the beginning of the line (^) or a run of zero or more characters (.*) ending in a non-digit character ([^0-9]). Next, ([0-9]{1,3}) matches one to three digits and captures them as the second group; because it must be preceded by (^|.*[^0-9]) and followed by %, a longer run of digits such as 1234% does not match. The trailing .* matches everything after the %, so the match covers the whole line. We then replace all of it with the second group using the backreference \2. Since we passed -n to sed, nothing would be printed by default, but we also passed the p flag to the s/// command, so the resulting line is printed whenever a substitution is made. Note that this p is a flag of s///, not the separate p command, because it comes right after the final /.
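
Since the question is about filtering another command's output, the same expression can read from a pipe instead of a file. A minimal sketch, assuming a sed that accepts -E; the wrapper name percent_digits and the sample lines are hypothetical:

```shell
# percent_digits is a hypothetical wrapper around the sed command above.
percent_digits() {
  sed -n -E 's/(^|.*[^0-9])([0-9]{1,3})%.*/\2/p'
}

# Made-up sample lines standing in for the upstream command's output.
printf 'abc\n123%%\nthis is 456%% and nothing more\n1234%%\n' | percent_digits
```

This prints 123 and 456. The 1234% line produces nothing, because no run of one to three digits there is both followed by % and preceded by a non-digit or the start of the line. Because the substitution replaces the whole line, each input line yields at most one number.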

brandizzi answered Sep 29 '22 21:09