I have a CSV file which uses a highly customized format. Here, each number represents the data in each of the 4 columns:
1 2 [3] 4
I need to restrict sed to only search and modify data appearing in the fourth column. Essentially, it must ignore all data on the line appearing before the first occurrence of a closing square bracket and space, '] ', and only modify data appearing after. E.g., file1.txt might contain this:
penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.
The replacement might be sed 's/penguin/animal/g' file1.txt. After running the script, the output would look like this:
penguin bird [lives in Antarctica] The animal lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat animals.
In this case, all appearances of penguin prior to the first '] ' were ignored, and only those appearing after it were changed.
How can I have sed ignore the first three columns of this custom CSV format while it finds and replaces text?
I have GNU sed version 4.2.1.
You tell sed to search for the '] ' combination followed by .* (anything), and then as part of your replacement, you put back the '] ' chars.
The only problem is that sed usually "thinks" that a ] char is part of a character-class definition, so you have to escape it. Try
echo "a b [c] d" | sed 's/\] .*$/\] XYZ/'
a b [c] XYZ
Note that because there was no opening [ char to indicate a char-class def, you can get away with
echo "a b [c] d" | sed 's/] .*$/] XYZ/'
a b [c] XYZ
Edit
To fix just the 4th word,
echo "a b [c] d e" | sed 's/\] [^ ][^ ]*/\] XYZ/'
a b [c] XYZ e
The addition from above, [^ ][^ ]*, says "any-char-that-is-not-a-space" followed by any number of "any-char-that-is-not-a-space", so when the matcher finds the next space it stops matching.
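As a small sketch of the same idea, the "one or more non-space chars" part can also be written with a POSIX interval expression, \{1,\}, instead of repeating the bracket expression. The function name first_word_after_bracket is just a hypothetical wrapper for illustration; it is not part of sed.

```shell
# Same substitution as above; [^ ]\{1,\} means "one or more non-space chars",
# which is equivalent to [^ ][^ ]* in basic regular expressions.
first_word_after_bracket() {   # hypothetical helper name, for illustration only
  sed 's/\] [^ ]\{1,\}/] XYZ/'
}

echo "a b [c] d e" | first_word_after_bracket
```

This should print a b [c] XYZ e, the same result as the [^ ][^ ]* version.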
final edit
echo "penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins." \
| sed 's/\] The penguin \(.*$\)/] The animal \1/'
and as you're using GNU sed, you can add the -r option (extended regular expressions) so you don't need to escape the (...) capturing parens.
echo "penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins." \
| sed -r 's/\] The penguin (.*$)/] The animal \1/'
output
penguin bird [lives in Antarctica] The animal lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.
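The commands above only handle the literal text "The penguin". If you want every penguin after the first '] ' changed, as the question describes, one possible sketch is a small loop: capture everything up to and including the first '] ' plus whatever follows, replace one penguin after it, and repeat until nothing is left to replace. This assumes every line has a '] ' after its first ] and that the replacement text does not itself contain the search text (otherwise the loop would never finish). The function name replace_after_bracket is hypothetical.

```shell
# Replace every "penguin" that appears after the first "] " on each line.
# [^]]* matches any run of chars that are not "]", so \] is the FIRST "]".
# Each pass replaces the last remaining match; "ta" jumps back to label "a"
# for as long as a substitution was made.
replace_after_bracket() {   # hypothetical helper name, reads stdin
  sed -e ':a' -e 's/^\([^]]*\] .*\)penguin/\1animal/' -e 'ta'
}

replace_after_bracket <<'EOF'
penguin bird [lives in Antarctica] The penguin lives in cold places.
wolf dog [lives in Antarctica with penguins] The wolf likes to eat penguins.
EOF
```

With the sample data, the first three columns should come through unchanged, while "The penguin" becomes "The animal" and "eat penguins" becomes "eat animals".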
The exact behavior depends on the version of sed you are using. There is a pretty large difference between the sed for AIX vs. Solaris vs. the GNU seds usually found on Linux.
If you have other questions about using sed, it is usually helpful to include the output of sed --version, or sed -V. If there is no response from those commands, try what sed. Else include the OS name from uname.
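A rough sketch of gathering that information in one go: it tries sed --version first and, if that fails (as it typically does on non-GNU seds), falls back to the OS name and release from uname. The function name describe_sed is hypothetical.

```shell
# Print something useful to paste into a question about sed.
describe_sed() {   # hypothetical helper name, for illustration only
  # </dev/null keeps sed from waiting on the terminal if it ignores the option.
  if sed --version </dev/null >/dev/null 2>&1; then
    sed --version </dev/null | head -n 1
  else
    uname -sr
  fi
}

describe_sed
```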
IHTH