I am a newbie. However, I managed to extract some lines from a text file (saved as Unicode) and write them to another file.
lines = InFile.readlines()
OutFile.writelines(lines[3:])
It works, but (I believe) due to an encoding issue a space is added between every character in the output file. Example of the result:
2 0 1 3 - 1 2 - 2 3 ; ; 3 6 0 . 3 7
2 0 1 3 - 1 2 - 2 4 ; ; 0 . 0 0
Lines in the source file:
2013-12-23;;360.37
2013-12-24;;0.00
If I save the source text file as ANSI before running the script, I get the correct result. However, the source file is delivered automatically as Unicode by another program, so changing it manually every time is not practical. I have read through a lot of other encoding/decoding questions, but I am completely lost and don't know how to fix this. Which is the correct command, and where in the script does it go? Or am I completely wrong and this has nothing to do with encoding?
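I'm fairly certain that your input file is UTF-16 encoded, and the spaces you're seeing are actually null bytes. UTF-16 stores each of these characters as two bytes, and for plain ASCII text one of the two is always zero. If the file is read without decoding it as UTF-16, those zero bytes end up between the characters and are displayed as spaces.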
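To illustrate where the null bytes come from, here is a small sketch that encodes one of the lines from the question as UTF-16. It assumes little-endian byte order ("utf-16-le"), which is what Windows tools usually write when they say "Unicode"; the variable names are only for illustration.

```python
# One of the lines from the question, as ordinary text.
line = "2013-12-23;;360.37"

# Encode it as UTF-16 little-endian (assumed byte order, no BOM).
raw = line.encode("utf-16-le")

# Every character now takes two bytes, and the second one is a null byte.
print(raw[:8])               # the bytes for "2013"
print(len(line), len(raw))   # the byte count is twice the character count

# Decoding with the right codec gives the original text back.
print(raw.decode("utf-16-le"))
```

Reading those bytes with a single-byte encoding instead leaves a null character after every real character, which many editors display as a space.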
Try
with open("myfile.txt", "r", encoding="utf-16") as infile:
    lines = infile.readlines()
and see if the problem persists. The encoding argument to open() requires Python 3; on Python 2, io.open accepts the same argument.
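Putting this together with the slicing from the question, a complete version might look like the sketch below. The function name copy_skipping_header, its parameters, and the choice of UTF-8 for the output file are my assumptions, not something from the question; pick whatever output encoding the next program in the chain expects.

```python
def copy_skipping_header(src_path, dst_path, skip=3,
                         src_encoding="utf-16", dst_encoding="utf-8"):
    """Copy a text file, dropping the first `skip` lines.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # Decode the source as UTF-16; the codec uses the BOM to pick the byte order.
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as infile:
        lines = infile.readlines()

    kept = lines[skip:]

    # Write the remaining lines in the (assumed) output encoding.
    with open(dst_path, "w", encoding=dst_encoding, newline="") as outfile:
        outfile.writelines(kept)

    return len(kept)
```

Calling copy_skipping_header("myfile.txt", "out.txt") would then do the same as the two lines in the question (lines[3:]), but with the input decoded correctly. The file names here are placeholders.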