Newbie question. In Python 2.7.2, I have a problem reading text files that apparently contain some stray control characters. Specifically, the loop
for line in f:
stops without any warning or error as soon as it reaches a line containing the SUB character (ASCII hex code 1a). Using f.readlines() gives the same result. Essentially, as far as Python is concerned, the file ends at the first SUB character, and the last value assigned to line is the text of that line up to that character.
Is there a way to read beyond such a character and/or to issue a warning when encountering one?
On Windows, 0x1a (Ctrl-Z) is treated as the end-of-file character when a file is opened in text mode. You'll need to open the file in binary mode to get past it:
f = open(filename, 'rb')
The downside is that binary mode performs no newline translation, so you have to handle the line endings yourself, for example by splitting the lines manually:
lines = f.read().split('\r\n') # assuming Windows line endings
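To also get a warning when a SUB character turns up, something along these lines might work. This is only a sketch: the function name read_lines_past_sub and the warning text are made up for illustration, and it assumes the file is small enough to read into memory in one go.

```python
import warnings

SUB = b'\x1a'  # the SUB (Ctrl-Z) byte that ends text-mode reads on Windows


def read_lines_past_sub(path):
    """Return the lines of `path` as byte strings, warning about SUB bytes.

    Hypothetical helper: opens the file in binary mode so that 0x1a is
    read as ordinary data instead of being treated as end-of-file.
    """
    with open(path, 'rb') as f:
        data = f.read()

    # splitlines() on a byte string splits on \n, \r\n and \r and drops
    # the line endings, so no trailing '\r' is left on each line.
    lines = data.splitlines()

    for number, line in enumerate(lines, 1):
        if SUB in line:
            warnings.warn('SUB character (0x1a) found on line %d of %s'
                          % (number, path))
    return lines
```

Calling lines = read_lines_past_sub(filename) returns every line, including those after the first SUB character. If the SUB characters are just noise, they can be dropped afterwards with line.replace(SUB, b'').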
Try opening the file in binary mode:
f = open(filename, 'rb')