Python: Compare two CSV files and print out the differences


I need to compare two CSV files and write the differences to a third CSV file. In my case, the first CSV, old.csv, is an old list of hashes, and the second CSV, new.csv, is the new list, which contains both the old and the new hashes.

Here is my code:

import csv
t1 = open('old.csv', 'r')
t2 = open('new.csv', 'r')
fileone = t1.readlines()
filetwo = t2.readlines()

outFile = open('update.csv', 'w')
x = 0
for i in fileone:
    if i != filetwo[x]:
        outFile.write(filetwo[x])
    x += 1

The third file is a copy of the old one and not the update. What's wrong? I hope you can help me. Many thanks!

PS: I don't want to use diff.

Nick Yellow asked Aug 17 '16 11:08

1 Answer

The problem is that you are comparing each line in fileone only to the line at the same position in filetwo. As soon as one file has an extra line, the positions no longer line up and the lines are never equal again. Try this:

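# Read both files completely; each list holds the lines of one file.
with open('old.csv', 'r') as t1, open('new.csv', 'r') as t2:
    fileone = t1.readlines()
    filetwo = t2.readlines()

# Write every line of the new file that does not appear anywhere in the old file.
with open('update.csv', 'w') as outFile:
    for line in filetwo:
        if line not in fileone:
            outFile.write(line)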
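
If the files are large, note that `line not in fileone` scans the whole list for every line of the new file. A possible variant is sketched below. It assumes each hash sits on its own line and that only exact line matches count as "the same". The function name `write_new_lines` and the decision to strip line endings before comparing are illustrative choices here, not part of the original code.

```python
def write_new_lines(old_path, new_path, out_path):
    """Write the lines of new_path that do not appear in old_path to out_path.

    Returns the list of lines that were written (without line endings).
    """
    with open(old_path, "r", newline="") as old_file:
        # A set gives fast membership tests. Line endings are stripped so that
        # a missing final newline or CRLF/LF differences do not cause false
        # mismatches.
        old_lines = {line.rstrip("\r\n") for line in old_file}

    added = []
    with open(new_path, "r", newline="") as new_file, \
            open(out_path, "w", newline="") as out_file:
        for line in new_file:
            content = line.rstrip("\r\n")
            if content not in old_lines:
                # Keep the order in which new lines appear in the new file.
                out_file.write(content + "\n")
                added.append(content)
    return added
```

With the file names from the question, the call would be `write_new_lines('old.csv', 'new.csv', 'update.csv')`. Like the loop above, this only reports lines that are in the new file but not in the old one; lines that were removed from the old file are not reported.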
Chris Mueller answered Sep 25 '22 20:09