
Comparing two csv files and getting difference

Tags: python, csv

I have two CSV files that I need to compare, and then output the differences:

CSV FORMAT:

 Name   Produce   Number
 Adam   Apple     5
 Tom    Orange    4
 Adam   Orange    11

I need to compare the two CSV files and report whether there is a difference between, say, Adam's apples on sheet 1 and sheet 2, and do that for every name and produce number. Both CSV files will be formatted the same.

Any pointers will be greatly appreciated.

Trying_hard asked Jun 19 '12 20:06



2 Answers

I have used csvdiff (here col1 is the key column used to match rows between the two files):

$ pip install csvdiff
$ csvdiff --style=compact col1 a.csv b.csv

Link to package on pypi

I found this link useful

Aakash Gupta answered Nov 05 '22 03:11



If your CSV files are small enough to load into memory without bringing your machine to its knees, you could try something like:

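import csv

def load_rows(path):
    # Dicts are unhashable, so turn each row into a tuple of (column, value) pairs
    with open(path, newline='') as f:
        return set(tuple(row.items()) for row in csv.DictReader(f))

set1 = load_rows('file1.csv')
set2 = load_rows('file2.csv')
print(set1 - set2)  # in 1, not in 2
print(set2 - set1)  # in 2, not in 1
print(set1 & set2)  # in both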
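
The set difference above reports whole rows, so a changed Number shows up as one row missing from each side. If what you want is "did Adam's apples change?", a comparison keyed on Name and Produce may be easier to read. Here is a minimal sketch, assuming comma-separated files with a header row of Name,Produce,Number and at most one row per Name/Produce pair; the function names load_counts and diff_counts are made up for illustration:

```python
import csv

def load_counts(path):
    """Map (Name, Produce) -> Number for one CSV file.

    Assumes a header row with the columns Name, Produce and Number.
    Numbers are kept as strings, exactly as they appear in the file.
    """
    with open(path, newline='') as f:
        return {(row['Name'], row['Produce']): row['Number']
                for row in csv.DictReader(f)}

def diff_counts(old, new):
    """Compare two mappings produced by load_counts.

    Returns three dicts:
      changed     -- keys in both files whose Number differs: key -> (old, new)
      only_in_old -- keys that appear only in the first file
      only_in_new -- keys that appear only in the second file
    """
    changed = {key: (old[key], new[key])
               for key in old.keys() & new.keys()
               if old[key] != new[key]}
    only_in_old = {key: old[key] for key in old.keys() - new.keys()}
    only_in_new = {key: new[key] for key in new.keys() - old.keys()}
    return changed, only_in_old, only_in_new
```

You would call it as diff_counts(load_counts('file1.csv'), load_counts('file2.csv')) and print whichever of the three results you care about. If a Name/Produce pair can occur more than once in a file, the later row silently wins here, so you would need to aggregate first.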

For large files, you could load them into a SQLite3 database and use SQL queries to do the same, or sort by relevant keys and then do a match-merge.
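
As a rough sketch of the SQLite route, assuming three-column comma-separated files with a header row (the helper names load_table and rows_only_in_first, and the table names you pass in, are hypothetical and must be trusted identifiers because they are interpolated into the SQL):

```python
import csv
import sqlite3

def load_table(conn, table, path):
    """Create `table` and bulk-insert every data row of the CSV at `path`."""
    conn.execute(f'CREATE TABLE {table} (name TEXT, produce TEXT, number TEXT)')
    with open(path, newline='') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if there is one
        # executemany consumes the reader lazily, one row at a time
        conn.executemany(f'INSERT INTO {table} VALUES (?, ?, ?)', reader)
    conn.commit()

def rows_only_in_first(conn, first, second):
    """Return the distinct rows present in `first` but absent from `second`."""
    query = f'SELECT * FROM {first} EXCEPT SELECT * FROM {second}'
    return conn.execute(query).fetchall()
```

Pass sqlite3.connect(':memory:') for a quick test, or a file path such as sqlite3.connect('diff.db') so the data lives on disk rather than in RAM. Swapping the two table names gives the rows that are only in the second file, and INTERSECT in place of EXCEPT gives the rows in both.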

Jon Clements answered Nov 05 '22 02:11
