How to restrict sed to replace only data appearing after the first closing square bracket?

Question

I have a CSV file which uses a highly customized format. Here, each number represents the data in one of the 4 columns:

1 2 [3] 4

I need to restrict sed to only search and modify data appearing in the fourth column. Essentially, it must ignore all data on the line appearing before the first occurrence of a closing square bracket followed by a space ("] "), and only modify data appearing after it. E.g., file1.txt might contain this:

penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.

The replacement might be sed 's/penguin/animal/g' file1.txt. After running the script, the output would look like this:

penguin bird [lives in Antarctica] The animal lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat animals.

In this case, all appearances of penguin before the first ] were ignored, and only those appearing after it were changed.

Additional closing brackets might appear later in the line, but only the first should be regarded as the division.

How can I have sed ignore the first three columns of this custom CSV format while it finds and replaces text?

I have GNU sed version 4.2.1.

shellter · Accepted Answer

You tell sed to search for the '] ' combination followed by .* (anything), and then as part of your replacement, you put back the ] chars.

The only problem is that sed usually "thinks" that a ] char is part of a character-class definition, so you have to escape it. Try

echo "a b [c] d" | sed 's/\] .*$/\] XYZ/'
a b [c] XYZ

Note that because there is no opening [ char to start a char-class definition, you can get away with

echo "a b [c] d" | sed 's/] .*$/] XYZ/'
a b [c] XYZ

Edit

To fix just the 4th word,

echo "a b [c] d e" | sed 's/\] [^ ][^ ]*/\] XYZ/'
a b [c] XYZ e

The addition from above, [^ ][^ ]*, says "any-char-that-is-not-a-space" followed by any number of "any-char-that-is-not-a-space", so when the matcher finds the next space it stops matching.
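
The same "one or more non-space characters" match can also be spelled with a POSIX interval, \{1,\}, which some people find easier to read than repeating the bracket expression. A minimal sketch (replace_fourth_word is only a hypothetical wrapper name used for illustration):

```shell
# Hypothetical wrapper: replace only the first word after the first "] ".
# [^ ]\{1,\} means "one or more non-space characters", same as [^ ][^ ]*.
replace_fourth_word() {
    sed 's/\] [^ ]\{1,\}/] XYZ/'
}

echo "a b [c] d e" | replace_fourth_word
```

This should print a b [c] XYZ e, the same result as the [^ ][^ ]* form.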

final edit

echo "penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins." \
| sed 's/\] The penguin \(.*$\)/] The animal \1/'

and as you're using GNU sed, you don't need to escape the (...) capturing parens if you turn on extended regular expressions with -r.

echo "penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins." \
| sed -r 's/\] The penguin (.*$)/] The animal \1/'

output

penguin bird [lives in Antarctica] The animal lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.

Whether the unescaped form works depends on the version of sed you are using. There is a pretty large difference between the sed on AIX, the one on Solaris, and the GNU sed usually found on Linux.

If you have other questions about using sed, it is usually helpful to include the output of sed --version, or sed -V. If you get no response from those commands, try what sed. Otherwise, include the OS name from uname.
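
The commands above only rewrite the first word, or one fixed phrase, after the bracket. For the general case in the question (every penguin after the first "] ", none before it), one possible approach is a small loop: anchor the match at the start of the line, capture everything up to the first "] " plus whatever follows, replace one occurrence, and repeat with t until nothing is left to replace. This is a sketch, not a tested-everywhere recipe. It assumes GNU sed, assumes the first ] on the line is the one followed by a space, and replace_after_bracket is a hypothetical name.

```shell
# Hypothetical helper: replace "penguin" with "animal" only in the text that
# follows the first "] " on each line. Reads stdin, writes stdout.
#   :a   label for the loop
#   s    [^]]* cannot cross a "]", so the captured prefix always ends at the
#        first "] "; the greedy .* then picks the last remaining "penguin"
#   ta   jump back to :a if the substitution changed anything
# Caveat: if the replacement text contained the search text, this would loop forever.
replace_after_bracket() {
    sed -e ':a' -e 's/^\([^]]*\] .*\)penguin/\1animal/' -e 'ta'
}

printf '%s\n' \
    'penguin bird [lives in Antarctica] The penguin lives in cold places.' \
    'wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.' \
    | replace_after_bracket
```

Lines without a "] " pass through unchanged, and later closing brackets on the line are treated as ordinary text.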

IHTH
