How to remove extended ASCII using Python?

While trying to fix up a PML (Palm Markup Language) file, I found that my test file contains non-ASCII characters, which are causing MakeBook to complain. The solution would be to strip out all the non-ASCII characters in the PML.

So in attempting to fix this in Python, I have

import unicodedata, fileinput

for line in fileinput.input():
    print unicodedata.normalize('NFKD', line).encode('ascii','ignore')

However, this results in an error saying that the argument must be "unicode, not str". Here's a file fragment.

\B1a\B \tintense, disordered and often destructive rage†.†.†.\t

I'm not quite sure how to pass line in properly so that it can be processed.

Jauder Ho


1 Answer

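In Python 2, each line that fileinput yields is a byte str, so it has to be decoded to unicode before it can be normalized or re-encoded. Try print line.decode('iso-8859-1').encode('ascii', 'ignore') -- that should be much closer to what you want.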
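
For readers on Python 3, the same decode-then-encode idea might look like the sketch below. It is only an illustration, not the asker's final script: the helper name strip_non_ascii is hypothetical, the sample bytes are made up, and the 'iso-8859-1' default is an assumption about the file's encoding (the real PML file may use something else, such as cp1252).

```python
import unicodedata


def strip_non_ascii(raw, encoding="iso-8859-1"):
    """Decode raw bytes, then drop every character with no ASCII form.

    The default encoding is an assumption; pass the file's real encoding
    if it is known.
    """
    text = raw.decode(encoding)
    # NFKD splits accented letters into a base letter plus a combining
    # mark, so the base letter survives when the mark is ignored.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Made-up sample bytes: 0x86 stands in for the dagger, 0xe9 is e-acute.
sample = b"often destructive rage\x86.\x86.\x86. caf\xe9"
print(strip_non_ascii(sample))
```

Decoding first is what avoids the "unicode, not str" error: normalization works on text, not on raw bytes. The NFKD step is optional; without it, accented letters are dropped entirely instead of being reduced to their base letter.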

Alex Martelli
