I am writing a Python program that reads the output of the DOS tree command, which I have redirected into a text file. When I reach the 533rd iteration of the loop, Eclipse gives an error:
Traceback (most recent call last):
  File "E:\Peter\Documents\Eclipse Workspace\MusicManagement\InputTest.py", line 24, in <module>
    input = myfile.readline()
  File "C:\Python33\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3551: character maps to <undefined>
I have read other posts, and setting the encoding to latin-1 does not resolve the issue: it raises a UnicodeDecodeError on another character, and the same happens when I try utf-8.
The following is the code:
import os
from Album import *
os.system("tree F:\\Music > tree.txt")
myfile = open('tree.txt')
myfile.readline()
myfile.readline()
myfile.readline()
albums = []
x = 0
while x < 533:
    if not input: break
    input = myfile.readline()
    if len(input) < 14:
        artist = input[4:-1]
    elif input[13] != '-':
        artist = input[4:-1]
    else:
        albums.append(Album(artist, input[15:-1], input[8:12]))
    x += 1
for x in albums:
    print(x.artist + ' - ' + x.title + ' (' + str(x.year) + ')')
You need to figure out what encoding tree.com used; according to this post, that could be any of the MS-DOS codepages.
You could go through each of the MS-DOS encodings; most of those have a codec in the Python standard library. I'd try cp437 and cp850 first; the latter is the MS-DOS predecessor of cp1252, I think.
Pass the encoding to open():
myfile = open('tree.txt', encoding='cp437')
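If you are unsure which code page applies, one way to narrow it down is to read the file as raw bytes and attempt each candidate codec in turn. The following is only a sketch: the helper name try_decodings and the candidate list are my own assumptions, and the sample bytes are made up to include the 0x81 byte from the traceback.

```python
# Candidate code pages to try. This list is an assumption; extend it as needed.
CANDIDATES = ["cp437", "cp850", "cp1252"]

def try_decodings(raw, encodings=CANDIDATES):
    """Map each encoding name to the decoded text, or None if decoding failed."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Made-up sample: 0xC4 is a box-drawing dash in cp437, 0x81 is the byte
# that the traceback complains about.
sample = b"\xc4\xc4\xc4M\x81nchen"
for name, text in try_decodings(sample).items():
    # ascii() keeps the output printable on any console encoding.
    print(name, ascii(text))
```

For the real file you would pass in the bytes from open('tree.txt', 'rb').read() and look at which result shows the tree lines and accented names correctly. A codec that decodes without an exception is not necessarily the right one, so check the output by eye.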
You really should look into using os.walk() instead of tree.com for this task, though; at the very least it'll save you from having to deal with issues like these.
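A rough sketch of the os.walk() approach follows. It assumes the layout that the slicing in the question implies, namely root/<artist>/<year> - <title>; the function name list_albums and the tuple it returns are hypothetical stand-ins for your Album class.

```python
import os

def list_albums(root):
    """Return sorted (artist, year, title) tuples for folders laid out as
    root/<artist>/<year> - <title>. The layout is inferred from the question."""
    albums = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if dirpath == root:
            continue  # the top level only holds the artist folders
        artist = os.path.basename(dirpath)
        for name in dirnames:
            year, sep, title = name.partition(" - ")
            if sep and year.isdigit():
                albums.append((artist, int(year), title))
        dirnames[:] = []  # do not descend below the album level
    return sorted(albums)
```

Calling list_albums("F:\\Music") would then take the place of parsing tree.txt. Because os.walk() hands back directory names as Python strings, there is no text file to decode and no code page to guess.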