Removing duplicate rows from a CSV file with a Python script and updating the same file

Tags: python, csv

I have a file, myfile.csv, with rows like:

first, second, third
1, 2, 3
a, b, c
1, 2, 3

and so on.

I don't understand how to remove the duplicate rows from myfile.csv.

There is one condition: we can't save a new file, we need to update myfile.csv itself.
After running the script, myfile.csv should look like:

first, second, third
a, b, c
1, 2, 3

So the new data must not be saved to a new file; myfile.csv itself needs to be updated.
Thank you very much.

Serhii asked Dec 06 '22 12:12


1 Answer

You can read all the rows, keep only the first occurrence of each one, and then write the result back to the same file:

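import csv

# Read every row first so the file is closed before it is overwritten.
with open('filename.csv', newline='') as f:
    data = list(csv.reader(f))

# Keep a row only if it has not appeared earlier in the file.
new_data = [a for i, a in enumerate(data) if a not in data[:i]]

# Write the unique rows back to the same file.
with open('filename.csv', 'w', newline='') as t:
    writer = csv.writer(t)
    writer.writerows(new_data)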
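
For larger files, the `a not in data[:i]` check gets slow because every row is compared against all the rows before it. A possible variant, offered only as a sketch, tracks rows already seen in a set instead. The helper names `dedupe_rows` and `dedupe_csv_in_place` and the `keep` parameter are made up for this illustration; it assumes the whole file fits in memory and that rows count as duplicates only when every field matches exactly. With `keep="last"` it reproduces the row order shown in the question.

```python
import csv


def dedupe_rows(rows, keep="first"):
    """Return rows with exact duplicates removed, preserving order.

    keep="first" keeps the first occurrence of each row;
    keep="last" keeps the last occurrence instead.
    (Hypothetical helper, not part of the csv module.)
    """
    if keep == "last":
        # Deduplicate from the end, then restore the original direction.
        return dedupe_rows(rows[::-1], keep="first")[::-1]
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are unhashable, tuples are hashable
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def dedupe_csv_in_place(path, keep="first"):
    """Rewrite the CSV file at `path` without duplicate rows."""
    # Read everything first so the file is closed before it is rewritten.
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(dedupe_rows(rows, keep))
```

Calling `dedupe_csv_in_place('myfile.csv', keep="last")` would then update the file without creating a second one. Fields are compared as written, so `1, 2` and `1,2` are treated as different rows unless the reader is created with `skipinitialspace=True`.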
Ajax1234 answered Dec 08 '22 00:12