 

How to operate on part of line only

Tags: sed

How do I make sed operate only on specific parts of a line? And, conversely, how do I make sed leave specific parts of a line untouched?

Examples:

"A a A a ( A a ) A ( a A ) a"

How do I, for instance, replace all the As with Ts only between the ( and ) to obtain:

"A a A a ( T a ) A ( a T ) a"

And given the following input:

"F f F f ( F f ) F ( f F ) f"

How do I, for instance, replace all the Fs with Xs but not between the ( and ) to obtain:

"X f X f ( F f ) X ( f F ) f"

I searched Google but found nothing usable. I guess it's a general question about sed. The problem is reducible to general sed "templates", I hope.

  1. Given FROM and TO, operate only between them (on all occurrences on a given line).
  2. Given FROM and TO, operate anywhere except between them.
  3. The special case where FROM and TO are the same (between " and ", or "FOO" and "FOO", etc.), for both 1. and 2.

It should work with any operation, not just substitution, but also printing etc., such as printing everything between the strings "FOO" and "BAR" in this string:

"1 2 3 BAR a b c FOO d e f BAR g a h FOO i j k BAR l m n FOO o p q"

The result will be

" d e f  i j k "

So, general examples of how to do it would be highly appreciated. This question seems to be quite common, yet I could not find a good how-to through Google. I also guess it will be quite challenging to answer. Please do not give any hints on how to do it using Perl, AWK or anything other than sed. This question is really a sed-only question.

mjf asked Dec 14 '10 14:12


2 Answers

Divide and conquer.

Insert newlines to separate the segments, then loop, using the newlines, the line beginning (^), the line ending ($) and the delimiter characters (parentheses in this case) as anchors. The added newlines are removed at the end.

$ echo "A a A a ( A a ) A ( a A ) a" |
    sed 's/([^)]*)/\n&/g; 
         :a; 
           s/\(\n([^)]*\)A\([^)]*)\)/\1T\2/;
         ta; 
         s/\n//g'
A a A a ( T a ) A ( a T ) a
$ echo "F f F f ( F f ) F ( f F ) f" | 
    sed 's/(/\n(/g; 
         s/)/)\n/g; 
         :a; 
           s/\([^(]*\)F\([^)]*\(\n\|$\)\)/\1X\2/g; 
         ta; 
         s/\n//g'
X f X f ( F f ) X ( f F ) f
$ echo "1 2 3 BAR a b c FOO d e f BAR g a h FOO i j k BAR l m n FOO o p q" | 
    sed 's/^/BAR/;
         s/$/FOO/;
         s/FOO/&\n/g;
         s/BAR/\n&/g;
         s/BAR[^\n]*\n//g;
         s/[^\n]*FOO\n//g;
         s/\n//g'
 d e f  i j k
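
To see what the inserted newlines do, it can help to stop after the first substitution and look at the pattern space. This is only a sketch, assuming GNU sed (which accepts \n in the replacement text); the function name show_segments is made up for illustration.

```shell
# Hypothetical helper: run only the marking step of the first command,
# so every parenthesised group starts on its own output line.
show_segments() {
    printf '%s\n' "$1" | sed 's/([^)]*)/\n&/g'
}

show_segments "A a A a ( A a ) A ( a A ) a"
```

The input comes out as three lines, the second and third each beginning with "(". That is what lets the loop anchor on \n( and never reach an A that sits outside the parentheses.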
Dennis Williamson answered Nov 13 '22 17:11


This might work for you (GNU sed):

sed ':a;s/\(([^)]*\)A/\1T/;ta' file # for case 1

sed ':a;s/\(([^)]*\)F/\1\n/;ta;y/F\n/XF/' file  # for case 2

For case 1, use a loop to change each A inside the parentheses to a T.

For case 2, use the same loop to change each F inside the parentheses to a newline, then translate the remaining F's and the newlines to X's and F's respectively.
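
The placeholder idea for case 2 can be watched in two stages. The sketch below assumes GNU sed and that the input line contains no newline of its own (true for ordinary line-by-line processing); the function names hide_inner and replace_outer are illustrative only.

```shell
# Stage 1 (hypothetical helper): loop until every F inside ( ... ) has been
# turned into a newline, which serves as a placeholder.
hide_inner() {
    printf '%s\n' "$1" | sed -e ':a' -e 's/\(([^)]*\)F/\1\n/' -e 'ta'
}

# Stage 1 + 2 (hypothetical helper): after the loop, y/// turns the remaining
# (outer) F's into X and, in the same pass, each placeholder back into F.
replace_outer() {
    printf '%s\n' "$1" | sed -e ':a' -e 's/\(([^)]*\)F/\1\n/' -e 'ta' -e 'y/F\n/XF/'
}

replace_outer "F f F f ( F f ) F ( f F ) f"
# prints: X f X f ( F f ) X ( f F ) f
```

A newline is a convenient placeholder because sed strips the trailing newline before the script runs, so the pattern space of a single input line cannot already contain one.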

Case 3 (printing only the text between FOO and BAR) is a little more involved but can be done in two substitute commands:

sed -r 's/FOO|BAR/\n&/g;s/[^\n]*(\nBAR[^\n]*)*(\nFOO([^\n]*)\nBAR)?(\nFOO[^\n]*$)?/\3/g' file

First, prefix each FOO and BAR string with a newline. Then look for all combinations of FOO and BAR and keep only the strings between FOO and BAR. The newlines allow the use of the negated class [^\n] to simplify the procedure.
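
The question's third template, where FROM and TO are the same string, is not covered by the commands shown so far. For a single-character delimiter such as ", one possible approach is to count delimiters from the start of the line: a character is inside a pair when an odd number of quotes precedes it. The following is only a sketch under the assumptions that the quotes are balanced and never escaped; the function names are hypothetical.

```shell
# Hypothetical helper: replace A with T only inside "..." pairs.
# \([^"]*"[^"]*"\)* skips any number of complete quoted pairs, and
# [^"]*"[^"]* then steps past one more opening quote, so the A that
# follows has an odd number of quotes before it (it is inside a pair).
inside_quotes() {
    printf '%s\n' "$1" |
        sed -e ':a' -e 's/^\(\([^"]*"[^"]*"\)*[^"]*"[^"]*\)A/\1T/' -e 'ta'
}

# Hypothetical helper: replace F with X only outside "..." pairs.
# Here only complete pairs are skipped, so the F that follows has an
# even number of quotes before it (it is outside every pair).
outside_quotes() {
    printf '%s\n' "$1" |
        sed -e ':a' -e 's/^\(\([^"]*"[^"]*"\)*[^"]*\)F/\1X/' -e 'ta'
}

inside_quotes 'A "A a" A "a A" A'
# prints: A "T a" A "a T" A
outside_quotes 'F "F f" F "f F" F'
# prints: X "F f" X "f F" X
```

Each pass through the loop changes one occurrence, and the loop ends once no A (or F) with the required quote count remains. A multi-character delimiter such as FOO would need a different pattern, for example first marking each delimiter with a newline as in the command above.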

potong answered Nov 13 '22 15:11