Replace a string, but only when it is the last occurrence of that string before a specific line

Tags:

sed

I am converting a custom markup language to TeX.

I have a file like this:

\macro{stuff}
{more stuff}
{yet other stuff}
{some stuff}
these are extra lines
another extra line
there can be any number of extra lines
e

\macro{yet more stuff stuff}
{even more stuff}
{yet other stuff}
{some stuff}
this is extra
this is extra too
e

I need the result to be this:

\macro{stuff}
{more stuff}
{yet other stuff}
{some stuff}{
these are extra lines
another extra line
there can be any number of extra lines
}

\macro{yet more stuff stuff}
{even more stuff}
{yet other stuff}
{some stuff}{
this is extra
this is extra too
}

Notice that e sits by itself on a line, marking the end of a set of data; it simply gets replaced with a closing bracket.

I could simply use this:

sed -i 's/^e$/}/g' file.tex

Resulting in this:

\macro{stuff}
{more stuff}
{yet other stuff}
{some stuff}
these are extra lines
another extra line
there can be any number of extra lines
}

\macro{yet more stuff stuff}
{even more stuff}
{yet other stuff}
{some stuff}
this is extra
this is extra too
}

The problem is, I also need a matching starting bracket to surround this extra text before the e.

One way to think of it is:

  1. Replace every occurrence of } with }{.
  2. But only if that occurrence is at the end of the line.
  3. And only if it is the last such occurrence before a line consisting solely of e.

This is the closest I have come; I am not sure how to match across any number of lines that contain no further matches of }$:

sed -i 's/}$\n.*\n.*\n.*\n^e$/}{&}/g' file.tex

How can I wrap that final extra text inside { and }?

Village asked Dec 22 '22 15:12


2 Answers

This is easier in awk with an empty RS (paragraph mode), which makes each blank-line-separated block a single record. Here is a GNU awk solution:

awk -v RS= '{sub(/.*}/, "&{"); sub(/\ne$/, "\n}"); ORS=RT} 1' file

\macro{stuff}
{more stuff}
{yet other stuff}
{some stuff}{
these are extra lines
another extra line
there can be any number of extra lines
}

\macro{yet more stuff stuff}
{even more stuff}
{yet other stuff}
{some stuff}{
this is extra
this is extra too
}

Or in any version of awk:

awk -v RS= '{sub(/.*}/, "&{"); sub(/\ne$/, "\n}\n")} 1' file
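
To see why this works, here is the same portable command spread out with comments. Treat it as a sketch: wrap_blocks is a made-up function name, and the here-document is a shortened copy of the sample from the question. With RS set to the empty string, awk reads each blank-line-separated block as one record, so .* can run across newlines and the greedy match stops at the last } in the block.

```shell
#!/bin/sh
# wrap_blocks is a hypothetical helper name; it reads the markup on stdin.
wrap_blocks() {
  awk -v RS= '
  {
    sub(/.*}/, "&{")        # greedy .* spans newlines, so this ends at the last } in the record
    sub(/\ne$/, "\n}\n")    # the lone e that closes the record becomes }, plus a blank line
    print                   # print the edited record followed by the default ORS
  }'
}

# Shortened sample data, assumed to follow the layout shown in the question.
wrap_blocks <<'EOF'
\macro{stuff}
{more stuff}
{some stuff}
these are extra lines
another extra line
e

\macro{yet more stuff stuff}
{some stuff}
this is extra
e
EOF
```

Like the one-liner, this assumes the extra lines contain no } of their own; if they do, the { lands after that one instead.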

anubhava answered Jun 09 '23 06:06


Not as short or elegant as @anubhava's awk solution, but as an exercise I implemented the same in GNU sed.

sed -n '/^e$/{ z; x; s/.*}/&{/; s/$/\n}/; p; d; }; /^$/d; H; ${ x; p; }' file

Breaking it out by component:
/^e$/{ z; x; s/.*}/&{/; s/$/\n}/; p; d; } performs this list of actions on a line containing just e:
z is a GNU extension that empties the pattern space, so that the hold space starts out empty again after the exchange.
x exchanges the pattern and hold spaces.
s/.*}/&{/ adds an open brace after the last close brace in the block.
s/$/\n}/ adds a newline and } where the e was.
p prints the pattern space, and d deletes it and moves on to the next line.

/^$/d removes the empty lines between blocks. The separator comes back because H puts a newline in front of every line it appends, including the first, so the output also starts with one blank line.
H says Hold: append the current line to the hold space, so that the block accumulates until the next e terminator line, or the end of input.

${ x; p; } just makes sure to print any lines after the last e. H has already appended the last line by this point, so a second H inside the braces would print that line twice.
Skip that if you don't care, or if you know there shouldn't be any.

If you aren't using GNU sed, it would look a bit different, lol
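
If GNU sed is not available, one option is to keep the same accumulate-and-flush logic but write it in plain awk, reading line by line. The sketch below is an illustration under assumptions, not a drop-in for the sed script: wrap_extra is a hypothetical function name, and unlike the sed version it keeps blank lines where they were and only counts a } that sits at the end of a line.

```shell
#!/bin/sh
# wrap_extra is a hypothetical helper name; it reads the markup on stdin.
wrap_extra() {
  awk '
  /^e$/ {                              # terminator line: flush the buffered block
    for (i = 1; i <= n; i++) {
      line = buf[i]
      if (i == last) line = line "{"   # open the group after the last line ending in }
      print line
    }
    print "}"                          # the lone e becomes the closing brace
    n = 0; last = 0                    # start a fresh block
    next
  }
  {
    buf[++n] = $0                      # accumulate the line, much like H in sed
    if ($0 ~ /}$/) last = n            # remember the most recent line that ends in }
  }
  END {                                # lines after the final e are printed unchanged
    for (i = 1; i <= n; i++) print buf[i]
  }'
}

# Small sample, assumed to follow the layout shown in the question.
wrap_extra <<'EOF'
\macro{stuff}
{some stuff}
this is extra
e
EOF
```

If a block has no line ending in }, this sketch adds no { but still turns the e into }, which may or may not be what you want.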


Paul Hodges answered Jun 09 '23 06:06