Remove special characters from csv file using python

Tags: python, csv, python-2.7

There seems to be something on this topic already (How to replace all those Special Characters with white spaces in python?), but I can't figure this simple task out for the life of me.

I have a .CSV file with 75 columns and almost 4000 rows. I need to replace all the 'special characters' ($ # & * etc.) with '_' and write the result to a new file. Here's what I have so far:

import csv

input = open('C:/Temp/Data.csv', 'rb')
lines = csv.reader(input)
output = open('C:/Temp/Data_out1.csv', 'wb')
writer = csv.writer(output)

conversion = '-"/.$'
text =  input.read()
newtext = '_'
for c in text:
    newtext += '_' if c in conversion else c
    writer.writerow(c)

input.close()
output.close()

All this succeeds in doing is to write everything to the output file as a single column, producing over 65K rows. Additionally, the special characters are still present!

Sorry for the redundant question. Thank you in advance!

asked Apr 01 '13 19:04

Jenny

2 Answers

I might do something like

import csv

with open("special.csv", "rb") as infile, open("repaired.csv", "wb") as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    conversion = set('_"/.$')
    for row in reader:
        newrow = [''.join('_' if c in conversion else c for c in entry) for entry in row]
        writer.writerow(newrow)

which turns

$ cat special.csv
th$s,2.3/,will-be
fixed.,even.though,maybe
some,"shoul""dn't",be

(note that I have a quoted value) into

$ cat repaired.csv 
th_s,2_3_,will-be
fixed_,even_though,maybe
some,shoul_dn't,be
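
The code above targets Python 2 (both files are opened in binary mode). What follows is a minimal sketch of the same idea for Python 3, assuming text-mode files opened with newline='' as the csv module documentation recommends. The function name replace_special and the use of str.translate are illustrative choices, not part of the original answer:

```python
import csv
import io

# Same character set as in the answer above; each one maps to '_'.
CONVERSION = '_"/.$'
TABLE = str.maketrans({c: '_' for c in CONVERSION})


def replace_special(infile, outfile, table=TABLE):
    """Copy CSV rows from infile to outfile, translating every field."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        writer.writerow([entry.translate(table) for entry in row])


if __name__ == "__main__":
    # In-memory demo with the sample data. With real files, open both
    # with newline="" (for example open("special.csv", newline="")).
    src = io.StringIO(
        'th$s,2.3/,will-be\nfixed.,even.though,maybe\nsome,"shoul""dn\'t",be\n'
    )
    dst = io.StringIO()
    replace_special(src, dst)
    print(dst.getvalue(), end="")
```

Because the replacement happens per field after parsing, the quoting and the column structure are left to the csv module.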

Right now, your code is reading the entire file into one big string:

text =  input.read()

Starting from a _ character:

newtext = '_'

Looping over every single character in text:

for c in text:

Adding the corrected character to newtext (very slowly):

    newtext += '_' if c in conversion else c

And then writing the original character (?), as a one-column row, to a new csv:

    writer.writerow(c)

.. which is unlikely to be what you want. :^)
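
To see why this produces one row per character, here is a small sketch (Python 3, with an in-memory buffer standing in for the output file; the helper name is made up for illustration). csv.writer.writerow expects a sequence of fields, so a one-character string becomes a one-field row:

```python
import csv
import io


def rows_written_per_char(text):
    """Mimic the question's loop: call writerow() once per character."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for c in text:
        # c is the original character, not the replaced one in newtext.
        writer.writerow(c)
    return buf.getvalue()


if __name__ == "__main__":
    # Three characters in, three one-column rows out, '$' still present.
    print(repr(rows_written_per_char("a$b")))
```

With a file of roughly 65K characters, the same loop yields roughly 65K single-column rows, which matches what the question describes.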

answered Sep 28 '22 09:09

DSM

This doesn't seem to need any CSV-specific handling (as long as the special characters aren't your column delimiters).

conversion = '-"/.$'
newtext = '_'

# Read the file as plain text; each line keeps its trailing newline.
with open('C:/Temp/Data.csv', 'r') as infile:
    lines = infile.readlines()

outputLines = []
for line in lines:
    temp = line
    for c in conversion:
        temp = temp.replace(c, newtext)
    outputLines.append(temp)

# The lines still end in "\n", so write them back without adding another one.
with open('C:/Temp/Data_out1.csv', 'w') as outfile:
    outfile.writelines(outputLines)
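
One caveat worth checking before using the plain-text approach: the conversion string contains the double quote, which is also the default CSV quote character, so a quoted field that contains a comma would be split after the replacement. A small sketch of the difference (Python 3; the sample line and function names are made up for illustration):

```python
import csv

CONVERSION = '-"/.$'


def replace_in_text(line, conversion=CONVERSION):
    """Replace on the raw line, ignoring CSV structure."""
    for c in conversion:
        line = line.replace(c, '_')
    return line


def replace_in_fields(line, conversion=CONVERSION):
    """Parse the line as CSV first, then replace inside each field."""
    row = next(csv.reader([line]))
    return [''.join('_' if c in conversion else c for c in entry) for entry in row]


if __name__ == "__main__":
    line = 'some,"a,b",be'
    # Raw replacement removes the quotes, so the line now parses as 4 fields.
    print(next(csv.reader([replace_in_text(line)])))
    # Field-wise replacement keeps the 3 original fields.
    print(replace_in_fields(line))
```

If the data has no quoted fields with embedded commas, both approaches should give the same columns.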

answered Sep 28 '22 10:09

dckrooney
