I have a text file with a unicode line separator (hex code 2028).
I want to remove it using bash (I see implementations for Python, but not for this language). What command could I use to transform the text file (output4.txt) to lose the unicode line separator?
See in vim below:
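A tr command looks tempting, but tr translates one character at a time (GNU tr works on single bytes), so it cannot treat the three-byte UTF-8 sequence as a unit and will not do what you want here:
tr '\xE2\x80\xA8' ' ' < inFile > outFile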
Working solution: Thanks to OP for finding this:
sed -i.old $'s/\xE2\x80\xA8/ /g' inFile
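If the script might run under a shell that lacks the $'...' quoting used above, the same bytes can be produced with printf and octal escapes. This is only a minimal sketch, assuming the file is UTF-8 encoded; the function name strip_ls and the variable ls_char are made up for illustration and are not part of the original answer.

```shell
# U+2028 encoded as UTF-8 is the bytes E2 80 A8 (octal 342 200 250).
# printf with octal escapes is portable across POSIX shells.
ls_char=$(printf '\342\200\250')

# Hypothetical helper: print file $1 to stdout with every U+2028
# replaced by a single space. LC_ALL=C makes sed match raw bytes.
strip_ls() {
  LC_ALL=C sed "s/${ls_char}/ /g" "$1"
}
```

You would then call it as strip_ls output4.txt > cleaned.txt, which leaves the original file untouched.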
I noticed from your screenshot that you already have the file open in vim, so why not just do the substitution there?
In vim you could run
:%s/(seebelow)//g
For the (seebelow) part, press Ctrl-V and then type:
u2028
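Assuming the file is UTF-8 encoded, where U+2028 is stored as the bytes E2 80 A8, you can probably use sed (the \x escapes need GNU sed):
sed 's/\xE2\x80\xA8//g' <file_in.txt >file_out.txt
To overwrite the original file:
sed -i 's/\xE2\x80\xA8//g' file.txt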
Edit (see chepner's comment): Make sure you have the correct bytes for the file's encoding, and then use sed to delete them. You could use e.g. od -t x1 to look at the hex dump and figure out the encoding.
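To make the od check concrete, here is a rough sketch that flattens the hex dump onto one line so the byte sequence is easy to search for. The function name hex_bytes is hypothetical, and the pattern e2 80 a8 assumes the file is UTF-8; other encodings store U+2028 as different bytes.

```shell
# Hypothetical helper: print the bytes of file $1 as a single line of
# space-separated lowercase hex pairs.
hex_bytes() {
  # -An drops the offset column, -v disables "*" repeat suppression,
  # -t x1 prints one byte per hex pair.
  od -An -v -t x1 "$1" | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# Hypothetical helper: succeed if file $1 contains U+2028 as UTF-8.
has_utf8_ls() {
  hex_bytes "$1" | grep -q 'e2 80 a8'
}
```

For example, has_utf8_ls output4.txt && echo "found U+2028" tells you whether the sed pattern with E2 80 A8 is the right one for that file.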