I want to replace every pair of square brackets in a file, e.g., [some text], with \macro{some text}. For example:
This is some [text].
This [line] has [some more] text.
This becomes:
This is some \macro{text}.
This \macro{line} has \macro{some more} text.
How can I replace each pair of brackets with this macro?
To find and replace text within a file under Linux/Unix, use the stream editor (sed): sed -i 's/old-text/new-text/g' input.txt. The s is sed's substitute command, and the trailing g replaces every occurrence on a line rather than only the first.
It took a little doing, but here:
sed -i.bkup 's/\[\([^]]*\)\]/\\macro{\1}/g' test.txt
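To try the expression without touching a file, the same substitution can be run on the sample lines from the question through a pipe. This is only a quick sketch: the variable names input and output are mine, and -i is dropped so sed prints the result instead of editing in place.

```shell
# Sample input taken from the question.
input='This is some [text].
This [line] has [some more] text.'

# Same expression as above, but reading stdin and writing to stdout (no -i).
output=$(printf '%s\n' "$input" | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')

printf '%s\n' "$output"
```

This should print the two lines from the question with each bracket pair turned into \macro{...}.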
Let's see if I can explain this regular expression:
\[ matches an opening square bracket. Since [ is a special ("magic") regular expression character, the backslash means to match the literal character.
\(...\) is a capture group. It captures the part of the match I want to keep. I can have many capture groups, and in sed I can reference them as \1, \2, etc.
Inside the capture group \(...\), I have [^]]*. The [^...] syntax means any character except the ones listed, so [^]] means any character except a closing square bracket. The * means zero or more of the preceding. That means I am capturing zero or more characters that are not closing square brackets.
\] matches the closing square bracket.
Let's look at the line this is [some] more [text]:
The \[ matches the opening square bracket in front of some. It sits outside the capture group, so it is matched but not captured.
The capture group then starts at the s in some and takes as many characters as possible that are not closing square brackets. This means I am matching [some, but only capturing some.
The capture group has now ended. So far I have matched [some, and now I'm matching on the closing square bracket. That means I'm matching [some]. Note that regular expressions are normally greedy. I'll explain below why this is important.
The replacement string is \\macro{\1}. The \1 is replaced by my capture group. The \\ is just a backslash. Thus, I'll replace [some] with \macro{some}.
It would be much easier if I could be guaranteed a single set of square brackets in each line. Then I could have done this:
sed -i.bkup 's/\[\(.*\)\]/\\macro{\1}/g' test.txt
The capture group is now saying anything between two square brackets. However, the problem is that regular expressions are greedy, which means I would have matched from the s in some all the way to the final t in text. The x's below show the capture group. The [ and ] show the square brackets I'm matching on:
this is [some] more [text]
        [xxxxxxxxxxxxxxxx]
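The difference is easy to see by running both patterns on that line. A small sketch, with variable names (line, greedy, bounded) chosen only for illustration and braces used in the replacement for both:

```shell
line='this is [some] more [text]'

# Greedy .* runs to the last ] on the line, so both pairs collapse into one match.
greedy=$(printf '%s\n' "$line" | sed 's/\[\(.*\)\]/\\macro{\1}/g')

# [^]]* cannot cross a ], so each pair is matched separately.
bounded=$(printf '%s\n' "$line" | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')

printf '%s\n' "$greedy"    # this is \macro{some] more [text}
printf '%s\n' "$bounded"   # this is \macro{some} more \macro{text}
```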
This became more complex because I had to match characters that have special meaning in regular expressions, so we see a lot of backslashing. Plus, I had to account for regular expression greediness, which led to the nice-looking negated character class [^]]* to match anything that is not a closing bracket. Add the square brackets before and after, \[[^]]*\], and don't forget the \(...\) capture group, \[\([^]]*\)\], and you get one big mess of a regular expression.
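If the backslashes are the main pain point, one possible variant (my own suggestion, not part of the answer above) is to switch to extended regular expressions, which GNU and BSD sed enable with -E. There, plain ( ) form the capture group, so only the literal brackets and the backslash in the replacement still need escaping. The variable names are again just for illustration.

```shell
line='this is [some] more [text]'

# -E switches sed to extended regular expressions: ( ) capture without backslashes.
result=$(printf '%s\n' "$line" | sed -E 's/\[([^]]*)\]/\\macro{\1}/g')

printf '%s\n' "$result"
```

The negated class [^]]* is still needed here; -E changes the grouping syntax, not the greediness.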