I have a CSV file which uses a highly customized format. Here, each number represents the data in one of the 4 columns:
1 2 [3] 4
I need to restrict sed to only search and modify data appearing in the fourth column. Essentially, it must ignore all data on the line appearing before the first occurrence of a closing square bracket followed by a space (] ), and only modify data appearing after it. E.g., file1.txt might contain this:
penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.
The replacement might be sed 's/penguin/animal/g' file1.txt. After running the script, the output would look like this:
penguin bird [lives in Antarctica] The animal lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat animals.
In this case, all appearances of penguin prior to the first ] were ignored, and only those appearing after it were changed.
How can I have sed ignore the first three columns of this custom CSV format while it finds and replaces text?
I have GNU sed version 4.2.1.
You tell sed to search for the '] ' combination followed by .* (anything), and then as part of your replacement, you put back the ] chars.
The only problem is that sed usually "thinks" that a ] char is part of a character-class definition, so you have to escape it. Try
echo "a b [c] d" | sed 's/\] .*$/\] XYZ/'
a b [c] XYZ
Note that, because there is no opening [ char to indicate a char-class def, you can get away with
echo "a b [c] d" | sed 's/] .*$/] XYZ/'
a b [c] XYZ
Edit
To fix just the 4th word,
echo "a b [c] d e" | sed 's/\] [^ ][^ ]*/\] XYZ/'
a b [c] XYZ e
The addition from above, [^ ][^ ]*, says "any-char-that-is-not-a-space" followed by any number of "any-char-that-is-not-a-space", so when the matcher finds the next space it stops matching.
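To make the "stops at the next space" behaviour easier to see, here is a small sketch with a multi-character word after the bracket. The input string and the variable name result are just made up for illustration.

```shell
# Hypothetical input: the word after "] " is "delta", followed by " e".
# "[^ ][^ ]*" consumes "delta" and stops at the space before "e".
result=$(echo "a b [c] delta e" | sed 's/] [^ ][^ ]*/] XYZ/')
echo "$result"
```

Only the first run of non-space characters after the "] " is replaced; everything after the following space is left alone.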
final edit
echo "penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins." \
| sed 's/\] The penguin \(.*$\)/] The animal \1/'
and as you're using GNU sed, you can skip escaping the (...) capturing parens if you turn on extended regular expressions with -r.
echo "penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins." \
| sed -r 's/\] The penguin (.*$)/] The animal \1/'
output
penguin bird [lives in Antarctica] The animal lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.
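The commands above only touch a penguin that directly follows "] The ". If the goal is to replace every penguin after the first "] ", one possible approach is a small loop using a label and the t command. This is only a sketch: the variable names input and result are invented here, and it assumes the replacement text does not itself contain the search text (otherwise the loop would never end).

```shell
# Sample data copied from the question.
input='penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.'

# :a     defines a label named "a"
# s/.../ matches from the first "] " up to the LAST "penguin" after it
#        (.* is greedy) and replaces that one occurrence
# ta     jumps back to :a if the substitution changed anything,
#        so the loop runs until no "penguin" is left after the first "] "
result=$(printf '%s\n' "$input" \
  | sed -e ':a' -e 's/\(] .*\)penguin/\1animal/' -e 'ta')

printf '%s\n' "$result"
```

Anything before the first "] " is never part of the match, so the penguin in column 1 and the penguins inside the brackets stay as they are, while the trailing penguins on the second line becomes animals.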
How well this works depends on the version of sed you are using. There is a pretty large difference between the sed for AIX, the one for Solaris, and the GNU seds usually found on Linux.
If you have other questions about using sed, it is usually helpful to include the output of sed --version, or sed -V. If you get no response from those commands, try what sed. Otherwise, include the OS name from uname.
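A rough way to collect that information in one go might look like the following. The variable name sed_info is made up, and the fallback branch simply assumes that a sed which rejects --version is not GNU sed.

```shell
# Try the GNU-style version flag first; fall back to the OS name from uname.
if sed --version >/dev/null 2>&1; then
  sed_info=$(sed --version 2>/dev/null | head -n 1)
else
  sed_info="sed without --version support on $(uname -s)"
fi
echo "$sed_info"
```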
IHTH