I am trying to combine 2 different Excel files. (thanks to the post Import multiple excel files into python pandas and concatenate them into one dataframe)
The one I work out so far is:
import os
import pandas as pd
df = pd.DataFrame()
for f in ['c:\\file1.xls', 'c:\\ file2.xls']:
data = pd.read_excel(f, 'Sheet1')
df = df.append(data)
df.to_excel("c:\\all.xls")
Here is how they look like.
However I want to:
For example:
Is it possible? Thanks.
On the Data tab, under Tools, click Consolidate. In the Function box, click the function that you want Excel to use to consolidate the data. In each source sheet, select your data, and then click Add. The file path is entered in All references.
For num. 1, you can specify skip_footer
as explained here; or, alternatively, do
data = data.iloc[:-2]
once your read the data.
For num. 2, you may do:
from os.path import basename
data.index = [basename(f)] * len(data)
Also, perhaps would be better to put all the data-frames in a list and then concat
them at the end; something like:
df = []
for f in ['c:\\file1.xls', 'c:\\ file2.xls']:
data = pd.read_excel(f, 'Sheet1').iloc[:-2]
data.index = [os.path.basename(f)] * len(data)
df.append(data)
df = pd.concat(df)
import os
import os.path
import xlrd
import xlsxwriter
file_name = input("Decide the destination file name in DOUBLE QUOTES: ")
merged_file_name = file_name + ".xlsx"
dest_book = xlsxwriter.Workbook(merged_file_name)
dest_sheet_1 = dest_book.add_worksheet()
dest_row = 1
temp = 0
path = input("Enter the path in DOUBLE QUOTES: ")
for root,dirs,files in os.walk(path):
files = [ _ for _ in files if _.endswith('.xlsx') ]
for xlsfile in files:
print ("File in mentioned folder is: " + xlsfile)
temp_book = xlrd.open_workbook(os.path.join(root,xlsfile))
temp_sheet = temp_book.sheet_by_index(0)
if temp == 0:
for col_index in range(temp_sheet.ncols):
str = temp_sheet.cell_value(0, col_index)
dest_sheet_1.write(0, col_index, str)
temp = temp + 1
for row_index in range(1, temp_sheet.nrows):
for col_index in range(temp_sheet.ncols):
str = temp_sheet.cell_value(row_index, col_index)
dest_sheet_1.write(dest_row, col_index, str)
dest_row = dest_row + 1
dest_book.close()
book = xlrd.open_workbook(merged_file_name)
sheet = book.sheet_by_index(0)
print "number of rows in destination file are: ", sheet.nrows
print "number of columns in destination file are: ", sheet.ncols
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With