I'd like to modify a file by adding line numbers to the beginning of each line. I've found that the following command does this:
cat file | perl -pe '$_ = "$. $_"' > file_with_line_numbers
This seems to work; however, when I open the output in vim it's full of ^@ and ^M characters. Further investigation shows that the encoding has changed:
> file -bi file
text/plain; charset=utf-16le
> file -bi file_with_line_numbers
application/octet-stream; charset=binary
What am I missing here?
Because you're not decoding your input and you're not encoding your output. By concatenating $. with $_, you're mixing data in two different encodings (more precisely, you're mixing a byte string with a character string; Perl implicitly converts the byte string to a character string, and does so in a way that is very wrong for your data).
One fix would be:
perl -pe 'BEGIN { binmode STDIN, ":encoding(utf16le)"; binmode STDOUT, ":encoding(utf16le)" } $_ = "$. $_";' < input > output
You need to decode your program's input and encode your program's output.
As ysth points out, this will do the trick (except on Windows, though it will likely work under Cygwin):
perl -Mopen=:std,':encoding(utf-16le)' -pe'$_="$. $_";' file.in >file.out
Rest of original answer:
This is easiest if you have UTF-8, since you can then use -CSDA (UTF-8 on the standard handles, the default I/O layers, and @ARGV).
<file.in iconv -f UTF-16le -t UTF-8 \
| perl -CSDA -pe'$_="$. $_";' \
| iconv -f UTF-8 -t UTF-16le \
>file.out
Because ASCII bytes never appear inside a multi-byte UTF-8 sequence, you can skip the decoding/encoding entirely in this case and use either of the following:
<file.in iconv -f UTF-16le -t UTF-8 \
| perl -pe'$_="$. $_";' \
| iconv -f UTF-8 -t UTF-16le \
>file.out
or
<file.in iconv -f UTF-16le -t UTF-8 \
| nl \
| iconv -f UTF-8 -t UTF-16le \
>file.out