
Removing unwanted characters from a string in Python

I have some strings from which I want to remove unwanted characters. For example: Adam'sApple ----> AdamsApple (case insensitive). Can someone help? I need the fastest way to do it, because I have a couple of million records to clean up. Thanks.

Hossein asked May 06 '10 12:05


2 Answers

One simple way:

>>> s = "Adam'sApple"
>>> x = s.replace("'", "")
>>> print(x)
AdamsApple

... or take a look at regex substitutions.
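
As a rough sketch of the regex route, assuming Python 3 and assuming the unwanted characters are just apostrophes and double quotes (the pattern and the helper name `strip_unwanted` are illustrative, not part of the question):

```python
import re

# Hypothetical character class: apostrophes and double quotes.
# Compiling once avoids re-parsing the pattern for every record.
UNWANTED = re.compile(r"['\"]")

def strip_unwanted(text):
    """Return text with every character matched by UNWANTED removed."""
    return UNWANTED.sub("", text)

print(strip_unwanted("Adam'sApple"))  # prints AdamsApple
```

For a single character, plain `str.replace` is usually simpler; a compiled pattern starts to pay off when several characters have to go in one pass. With millions of records it is worth timing both on a sample of the real data.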

miku answered Oct 11 '22 20:10


Here is a function that removes all the irritating ASCII punctuation characters; the only exception is "&", which is replaced with "and". I use it to police a filesystem and ensure that all of the files adhere to the file naming scheme I insist everyone use.

def cleanString(incomingString):
    # Characters to strip; "&" is handled separately because it becomes "and".
    unwanted = "!@#$%^*()+=?'\"{}[]<>~`:;|\\/"
    newstring = incomingString.replace("&", "and")
    for ch in unwanted:
        newstring = newstring.replace(ch, "")
    return newstring
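
If speed matters, the same cleanup can probably be done in a single pass with `str.translate`. This is a sketch assuming Python 3; the names `DELETE`, `TABLE` and `clean_string_translate` are made up for illustration, and the character set is meant to mirror the one in cleanString:

```python
# Characters to delete outright (same set that cleanString removes).
DELETE = "!@#$%^*()+=?'\"{}[]<>~`:;|\\/"

# str.maketrans accepts a dict mapping single characters to a replacement
# string, or to None to delete the character.
TABLE = str.maketrans({**{ch: None for ch in DELETE}, "&": "and"})

def clean_string_translate(incoming):
    """Remove the unwanted characters and turn '&' into 'and' in one pass."""
    return incoming.translate(TABLE)

print(clean_string_translate("R&D <draft>: v1?"))  # prints RandD draft v1
```

The table is built once at import time, so each call scans the string only once instead of once per character to remove. Whether that is actually faster for a given workload is best confirmed with `timeit`.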

Pescolly answered Oct 11 '22 19:10