I am basically trying to copy some specific columns from a CSV file and paste those in an existing excel file[*.xlsx] using python. Say for example, you have a CSV file like this :
col_1 col_2 col_3 col_4
1 2 3 4
5 6 7 8
9 10 11 12
So, i wanted to copy the both col_3 and col_4 and paste those in col_8 and col_9 in an existing excel file [which is a .XLSX format]. I have tried this in various way to solve, but could not find out the exact way. i tried something like this :
with open( read_x_csv, 'rb') as f:
reader = csv.reader(f)
for row in reader:
list1 = row[13]
queue1.append(list1)
list2 = row[14]
queue2.append(list2)
list3 = row[15]
queue3.append(list3)
list4 = row[16]
queue4.append(list4)
and then
rb = open_workbook("Exact file path.....")
wb = copy(rb)
ws = wb.get_sheet(0)
row_no = 0
for item in queue1:
if(item != ""):
ii = int(item)
ws.write(row_no,12,ii)
row_no = row_no + 1
#ws.write(item)
print item
else:
ws.write(row_no,12,item)
row_no = row_no + 1
wb.save("Output.xls")
but problem with this solution is it does not allow me to save as *.XLSX format which is strictly required for me.
I have tried to use Openpyxl as it can handle *.XLSX format, but could not find out a way to modify the existing excel file. can anyone please help on this?
Doubt : 1) Can we really read a whole column from a CSV file and store into an array/list using python? 2) Can we modify the existing excel file which is in .XLSX format using openpyxl or any other package?
Introduction. openpyxl is a Python library to read/write Excel 2010 xlsx/xlsm/xltx/xltm files.
You can try the following implementation
from openpyxl import load_workbook
import csv
def update_xlsx(src, dest):
#Open an xlsx for reading
wb = load_workbook(filename = dest)
#Get the current Active Sheet
ws = wb.get_active_sheet()
#You can also select a particular sheet
#based on sheet name
#ws = wb.get_sheet_by_name("Sheet1")
#Open the csv file
with open(src) as fin:
#read the csv
reader = csv.reader(fin)
#enumerate the rows, so that you can
#get the row index for the xlsx
for index,row in enumerate(reader):
#Assuming space separated,
#Split the row to cells (column)
row = row[0].split()
#Access the particular cell and assign
#the value from the csv row
ws.cell(row=index,column=7).value = row[2]
ws.cell(row=index,column=8).value = row[3]
#save the csb file
wb.save(dest)
Can we really read a whole column from a CSV file and store into an array/list using python? No, because files are read sequentially, csv reader cannot read a column of data to a row. Instead you may read the whole content and use izip and islice to get a particular column. You can also use numpy.array
Can we modify the existing excel file which is in .XLSX format using openpyxl or any other package? Yes, see the example above
As it is 2021, get_sheet_by_name
is deprecated and raises an DeprecationWarning
with the following message:
Call to deprecated function get_sheet_by_name (Use wb[sheetname]).
The following snippet can be used in order to not raise the warning.
from openpyxl import load_workbook
file_path = 'test.xlsx'
wb = load_workbook(file_path)
ws = wb['SHEET_NAME'] # or wb.active
ws['G6'] = 123
wb.save(file_path)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With