Merging two CSV files using Python

Tags: python, dictionary, merge, csv, key

I have read several threads here on Stack Overflow. I thought this would be fairly easy to do, but I find that I still do not have a very good grasp of Python. I tried the example from "How to combine 2 csv files with common column value, but both files have different number of lines". It was helpful, but I still do not get the results I was hoping for.

Essentially, I have two CSV files with a common first column, and I would like to merge them. For example:

filea.csv

    title,stage,jan,feb
    darn,3.001,0.421,0.532
    ok,2.829,1.036,0.751
    three,1.115,1.146,2.921

fileb.csv

    title,mar,apr,may,jun,
    darn,0.631,1.321,0.951,1.751
    ok,1.001,0.247,2.456,0.3216
    three,0.285,1.283,0.924,956

output.csv (not the one I am getting but what I want)

    title,stage,jan,feb,mar,apr,may,jun
    darn,3.001,0.421,0.532,0.631,1.321,0.951,1.751
    ok,2.829,1.036,0.751,1.001,0.247,2.456,0.3216
    three,1.115,1.146,2.921,0.285,1.283,0.924,956

output.csv (the output that I actually got)

    title,feb,may
    ok,0.751,2.456
    three,2.921,0.924
    darn,0.532,0.951

The code I was trying:

    '''
    testing merging of 2 csv files
    '''
    import csv
    import array
    import os

    with open('Z:\\Desktop\\test\\filea.csv') as f:
        r = csv.reader(f, delimiter=',')
        dict1 = {row[0]: row[3] for row in r}

    with open('Z:\\Desktop\\test\\fileb.csv') as f:
        r = csv.reader(f, delimiter=',')
        #dict2 = {row[0]: row[3] for row in r}
        dict2 = {row[0:3] for row in r}

    print str(dict1)
    print str(dict2)

    keys = set(dict1.keys() + dict2.keys())
    with open('Z:\\Desktop\\test\\output.csv', 'wb') as f:
        w = csv.writer(f, delimiter=',')
        w.writerows([[key, dict1.get(key, "''"), dict2.get(key, "''")] for key in keys])

Any help is greatly appreciated.

441

asked Apr 28 '13 17:04

Rex

1 Answer

When I'm working with csv files, I often use the pandas library. It makes things like this very easy. For example:

    import pandas as pd

    a = pd.read_csv("filea.csv")
    b = pd.read_csv("fileb.csv")
    b = b.dropna(axis=1)
    merged = a.merge(b, on='title')
    merged.to_csv("output.csv", index=False)

Some explanation follows. First, we read in the csv files:

    >>> a = pd.read_csv("filea.csv")
    >>> b = pd.read_csv("fileb.csv")
    >>> a
       title  stage    jan    feb
    0   darn  3.001  0.421  0.532
    1     ok  2.829  1.036  0.751
    2  three  1.115  1.146  2.921
    >>> b
       title    mar    apr    may       jun  Unnamed: 5
    0   darn  0.631  1.321  0.951    1.7510         NaN
    1     ok  1.001  0.247  2.456    0.3216         NaN
    2  three  0.285  1.283  0.924  956.0000         NaN

and we see there's an extra column of data (note that the first line of fileb.csv -- title,mar,apr,may,jun, -- has an extra comma at the end). We can get rid of that easily enough:

    >>> b = b.dropna(axis=1)
    >>> b
       title    mar    apr    may       jun
    0   darn  0.631  1.321  0.951    1.7510
    1     ok  1.001  0.247  2.456    0.3216
    2  three  0.285  1.283  0.924  956.0000
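One caveat worth illustrating: `dropna(axis=1)` removes every column that contains at least one NaN, not only the empty trailing one. The sketch below assumes a hypothetical variant of fileb.csv in which one real value (the `apr` entry for `three`) is missing, and compares the default behaviour with `how="all"`, which drops only columns that are entirely NaN. The in-memory data and the variable names `strict` and `lenient` are made up for the example.

```python
import io

import pandas as pd

# Hypothetical variant of fileb.csv: same trailing comma on the header line,
# plus one genuinely missing value (apr for "three").
raw = io.StringIO(
    "title,mar,apr,may,jun,\n"
    "darn,0.631,1.321,0.951,1.751\n"
    "ok,1.001,0.247,2.456,0.3216\n"
    "three,0.285,,0.924,956\n"
)
b = pd.read_csv(raw)

# Default how="any": drops every column containing at least one NaN,
# so the partially filled "apr" column disappears too.
strict = b.dropna(axis=1)

# how="all": drops only columns that are NaN in every row,
# i.e. just the empty column created by the trailing comma.
lenient = b.dropna(axis=1, how="all")

print(list(strict.columns))
print(list(lenient.columns))
```

With the exact files from the question there are no missing values, so both calls give the same result there.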

Now we can merge a and b on the title column:

    >>> merged = a.merge(b, on='title')
    >>> merged
       title  stage    jan    feb    mar    apr    may       jun
    0   darn  3.001  0.421  0.532  0.631  1.321  0.951    1.7510
    1     ok  2.829  1.036  0.751  1.001  0.247  2.456    0.3216
    2  three  1.115  1.146  2.921  0.285  1.283  0.924  956.0000
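By default `merge` performs an inner join, so a title that appears in only one file is dropped from the result. In the question's data every title is in both files, so this makes no difference there. As a rough sketch of what happens otherwise, the example below uses small made-up frames (the `four` row and its value are hypothetical) and passes `how="outer"` to keep unmatched rows, with NaN in the columns that have no data.

```python
import pandas as pd

# Made-up data: "three" exists only in a, "four" exists only in b.
a = pd.DataFrame({"title": ["darn", "ok", "three"], "feb": [0.532, 0.751, 2.921]})
b = pd.DataFrame({"title": ["darn", "ok", "four"], "mar": [0.631, 1.001, 0.5]})

# Default how="inner": only titles present in both frames survive.
inner = a.merge(b, on="title")

# how="outer": every title from either frame is kept; gaps become NaN.
outer = a.merge(b, on="title", how="outer")

print(inner)
print(outer)
```

`how="left"` or `how="right"` would instead keep all rows from just one of the two frames.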

and finally write this out:

    >>> merged.to_csv("output.csv", index=False)

producing:

    title,stage,jan,feb,mar,apr,may,jun
    darn,3.001,0.421,0.532,0.631,1.321,0.951,1.751
    ok,2.829,1.036,0.751,1.001,0.247,2.456,0.3216
    three,1.115,1.146,2.921,0.285,1.283,0.924,956.0
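If pandas is not an option, roughly the same merge can be done with the standard `csv` module. The code in the question stores only `row[3]` from each file, which is why only the `feb` and `may` columns showed up in its output. The following is a minimal Python 3 sketch, not the only way to do it: it assumes the first column is the key, keeps the row order of the first file, keeps only keys found in both files, and strips empty trailing fields such as the one caused by the extra comma. The helper names `strip_trailing_empty` and `merge_on_first_column` are made up for the example, and the file contents are held in memory to keep it self-contained.

```python
import csv
import io


def strip_trailing_empty(row):
    """Drop empty fields at the end of a row (e.g. from a trailing comma)."""
    while row and row[-1] == "":
        row = row[:-1]
    return row


def merge_on_first_column(file_a, file_b):
    """Join rows on their first field, in file_a's order, keys in both files only."""
    rows_b = {}
    for row in csv.reader(file_b):
        row = strip_trailing_empty(row)
        if row:
            rows_b[row[0]] = row[1:]  # keep every field after the key
    merged = []
    for row in csv.reader(file_a):
        row = strip_trailing_empty(row)
        if row and row[0] in rows_b:
            merged.append(row + rows_b[row[0]])
    return merged


# In-memory stand-ins for filea.csv and fileb.csv from the question.
file_a = io.StringIO(
    "title,stage,jan,feb\n"
    "darn,3.001,0.421,0.532\n"
    "ok,2.829,1.036,0.751\n"
    "three,1.115,1.146,2.921\n"
)
file_b = io.StringIO(
    "title,mar,apr,may,jun,\n"
    "darn,0.631,1.321,0.951,1.751\n"
    "ok,1.001,0.247,2.456,0.3216\n"
    "three,0.285,1.283,0.924,956\n"
)

out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(merge_on_first_column(file_a, file_b))
print(out.getvalue())
```

Because the header row shares the key `title` in both files, it is merged like any other row. With real files, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`.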

112

answered Sep 20 '22 05:09

DSM

