Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merging multiple CSV files into separate tabs of a spreadsheet in Python

I have a code which generates multiple CSV files in a directory. I want to generate a report in excel which will consist of the CSV files as separate tabs. I have used the below code for the same:

import pandas as pd
import os
import csv
import glob    
path = "/MyScripts"
all_files = glob.glob(os.path.join(path, "*.csv"))
df_from_each_file = (pd.read_csv(f) for f in all_files)
df_from_each_file.to_excel(writer, sheet_name='ReturnData.csv')
writer.save()

But it gives below error: AttributeError: 'generator' object has no attribute 'to_excel' Not sure where i am going wrong. Do i need to import any specific library to solve the issue?

Python Version is 2.7

like image 688
Soubhik Banerjee Avatar asked Aug 22 '18 09:08

Soubhik Banerjee


People also ask

How do I combine multiple CSV files into one in Python?

To merge all CSV files, use the GLOB module. The os. path. join() method is used inside the concat() to merge the CSV files together.


Video Answer


2 Answers

There are two issues here:

  1. Your generator expression allows you to lazily iterate dataframe objects. You can't export a generator expression to an Excel file.
  2. Your sheet_name parameter is a constant. To export to multiple worksheets, you need to specify a different name for each worksheet.

You can use a simple for loop for this purpose:

writer = pd.ExcelWriter('out.xlsx', engine='xlsxwriter')
df_from_each_file = (pd.read_csv(f) for f in all_files)

for idx, df in enumerate(df_from_each_file):
    df.to_excel(writer, sheet_name='data{0}.csv'.format(idx))

writer.save()

Your worksheets will be named data0.csv, data1.csv, etc. If you need the filename as your sheet name, you can restructure your logic and use the os module to extract the filename from path:

import os

writer = pd.ExcelWriter('out.xlsx', engine='xlsxwriter')

for f in all_files:
    df = pd.read_csv(f)
    df.to_excel(writer, sheet_name=os.path.basename(f))

writer.save()
like image 74
jpp Avatar answered Oct 19 '22 07:10

jpp


Here is the complete source code from jpp solution:

import os
import pandas as pd
import glob

path = './'
all_files = glob.glob(os.path.join(path, "*.csv"))

writer = pd.ExcelWriter('out.xlsx', engine='xlsxwriter')

for f in all_files:
    df = pd.read_csv(f)
    df.to_excel(writer, sheet_name=os.path.splitext(os.path.basename(f))[0], index=False)

writer.save()
like image 39
Dan Avatar answered Oct 19 '22 05:10

Dan