Delete ^L character in a log file [duplicate]

Question

I want to delete all the "^L" characters that I find when I read the file. I tried using this function on each line as I read it:

def cleanString(self, s):
    if isinstance(s, str):
        s = unicode(s, "iso-8859-1", "replace")
        s = unicodedata.normalize('NFD', s)
        return s.encode('ascii', 'ignore')

But it doesn't delete this character. Does someone know how to do it?

I also tried the replace function, but that did not help either:

s = line.replace("\^L","")

Thanks for your answers.

glglgl · Accepted Answer

You probably do not have the two literal characters ^ and L, but a single character that is displayed as ^L.

This would be the form feed character.

So do s = line.replace('\x0C', '').
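
As a minimal sketch of the replacement above: the helper name strip_form_feeds and the sample line are made up for illustration, and the only assumption is that the file really contains the form feed character (0x0C), which tools such as less or vim display as ^L.

```python
# "\x0c" is a single character (form feed, also written "\f").
# Editors and pagers typically display it as ^L.
FORM_FEED = "\x0c"


def strip_form_feeds(line):
    """Return line with every form feed character removed."""
    return line.replace(FORM_FEED, "")


sample = "page one\x0cpage two\n"  # hypothetical log line
cleaned = strip_form_feeds(sample)
print(repr(cleaned))
```

Searching for the two-character text ^L, as in the question, finds nothing because the file holds one control character, not a caret followed by an L.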

Tim Pietzcker · Answer

^L (codepoint 0C) is an ASCII character, so it won't be affected by an encoding to ASCII. You could filter out all control characters using a small regex (and, while you're at it, filter out everything non-ASCII as well):

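import re
import unicodedata

def cleanString(self, s):
    if isinstance(s, str):
        s = unicode(s, "iso-8859-1", "replace")  # Python 2: decode the byte string
        s = unicodedata.normalize('NFD', s)      # split accented letters into base + combining mark
        s = re.sub(r"[^\x20-\x7e]+", "", s)      # remove non-ASCII/nonprintables (also drops ^L, tabs, newlines)
        return str(s)                            # No encoding necessary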
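
If tabs and line endings need to survive, a narrower pattern may be preferable. The following is only a sketch under that assumption: CONTROL_CHARS, strip_control_chars and the sample line are hypothetical names, not part of the answer above. It also shows that an ASCII round trip leaves the form feed in place.

```python
import re

# ASCII control characters except tab (\x09), line feed (\x0a) and
# carriage return (\x0d); DEL (\x7f) is matched as well.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(text):
    """Remove control characters but keep tabs and line endings."""
    return CONTROL_CHARS.sub("", text)


raw = "header\x0c\tvalue\x07\n"  # hypothetical log line with ^L and a bell
# Encoding to ASCII keeps the form feed, because it is valid ASCII.
still_dirty = raw.encode("ascii", "ignore").decode("ascii")
clean = strip_control_chars(raw)
print(repr(still_dirty), repr(clean))
```

Unlike the regex in the function above, this variant does not touch non-ASCII text, so it would have to be combined with the normalize and encode steps if an ASCII-only result is wanted.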

Tags: python, unicode

Kvasir
