I want to apply this function, hideEmail, to a specific column of my CSV file (a large file) using Python.
Example of the function:
import re

def hideEmail(email):
    # replace every character except '@' and '.' with 'x'
    text = re.sub(r'[^@.]', 'x', email)
    return text
CSV file (large, > 1 GB):
id;Name;firstName;email;profession
100;toto;tata;[email protected];developer
101;titi;tete;[email protected];doctor
..
..
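Load the CSV data into a DataFrame. The sample file is semicolon-delimited, so pass sep=';':
import pandas as pd
df = pd.read_csv(r'/path/to/csv', sep=';')
Then apply pd.Series.str.replace to the email column only. Pass regex=True explicitly, because pandas 2.0 and later treat the pattern as a literal string by default:
df['email'] = df['email'].str.replace(r'[^@.]', 'x', regex=True)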
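Since the file is over 1 GB, reading it in one go may not fit in memory. Here is a minimal sketch that processes it in chunks instead, assuming the column is named email and the delimiter is ';' as in the sample. The function name mask_email_column and the chunk size are made up for illustration:

```python
import pandas as pd

def mask_email_column(src, dst, column="email", sep=";", chunksize=100_000):
    """Copy src to dst, masking one column, one chunk at a time."""
    first = True
    # dtype=str keeps every column as text so ids are written back unchanged
    for chunk in pd.read_csv(src, sep=sep, dtype=str, chunksize=chunksize):
        chunk[column] = chunk[column].str.replace(r"[^@.]", "x", regex=True)
        # write the header once, then append the remaining chunks
        chunk.to_csv(dst, sep=sep, index=False,
                     mode="w" if first else "a", header=first)
        first = False
```

You would call it as mask_email_column('/path/to/csv', '/path/to/new_csv'). Only one chunk is held in memory at a time, so the chunk size can be tuned to the machine.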
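That said, if all you want to do is change a large CSV file, pandas is probably overkill. You might have a look at sed. Here's one example (note that it replaces the word characters on either side of the @ with a fixed xxx rather than masking character by character):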
sed -E 's/(\w+)@(\w+)/xxx@xxx/' /path/to/file.csv > /path/to/new_file.csv
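If you would rather stay in Python without pandas, the standard csv module can stream the file row by row. This is only a sketch under a few assumptions: the file is UTF-8, every row has the same fields as the header, and the column is called email. mask_csv_column is a name I made up, and hideEmail is the function from the question:

```python
import csv
import re

def hideEmail(email):
    # replace every character except '@' and '.' with 'x'
    return re.sub(r"[^@.]", "x", email)

def mask_csv_column(src, dst, column="email", delimiter=";"):
    """Stream src to dst row by row, masking a single column."""
    with open(src, newline="", encoding="utf-8") as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin, delimiter=delimiter)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames,
                                delimiter=delimiter, lineterminator="\n")
        writer.writeheader()
        for row in reader:
            # an empty or missing value is written back as an empty field
            row[column] = hideEmail(row[column] or "")
            writer.writerow(row)
```

Memory use stays flat regardless of file size, because only one row is in memory at a time.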