I cannot figure out how to iterate through all rows in a specified column with openpyxl.
I want to print all of the cell values for all rows in column "C".
Right now I have:
    from openpyxl import workbook

    path = 'C:/workbook.xlsx'
    wb = load_workbook(filename = path)
    ws = wb.get_sheet_by_name('Sheet3')

    for row in ws.iter_rows():
        for cell in row:
            if column == 'C':
                print cell.value
The openpyxl module allows a Python program to read and modify Excel files. Approach #1: create a workbook object with openpyxl, then iterate through all rows from top to bottom.
Step 3: Load with Openpyxl. Still slow, but slightly faster than Pandas. The openpyxl documentation notes that memory use is fairly high in comparison with other libraries and applications and is approximately 50 times the original file size.
Why can't you just iterate over column 'C' directly (this works as of version 2.4.7)?

    # ws['C'] yields the cells of column C, top to bottom
    for cell in ws['C']:
        print(cell.value)
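You can specify a range to iterate over with ws.iter_rows():

    import openpyxl

    wb = openpyxl.load_workbook('C:/workbook.xlsx')
    ws = wb['Sheet3']

    # Range-string form, as accepted by openpyxl 2.4.x; newer releases
    # expect min_row/max_row/min_col/max_col keywords instead.
    for row in ws.iter_rows('C{}:C{}'.format(ws.min_row, ws.max_row)):
        for cell in row:
            print(cell.value)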
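On more recent openpyxl releases, as far as I know, iter_rows() is driven by the min_row/max_row/min_col/max_col keywords rather than a range string. Here is a minimal sketch of that form. It builds a small in-memory workbook so it runs without C:/workbook.xlsx; the sample values and the column_c list are made up for illustration.

```python
from openpyxl import Workbook

# Build a small in-memory workbook so the sketch runs without a file on disk.
# The sample values below are hypothetical.
wb = Workbook()
ws = wb.active
ws.title = 'Sheet3'
for values in [('a1', 'b1', 'c1'), ('a2', 'b2', 'c2'), ('a3', 'b3', 'c3')]:
    ws.append(values)

# Column C is column index 3, so restrict iter_rows() to that single column.
column_c = []
for row in ws.iter_rows(min_row=ws.min_row, max_row=ws.max_row,
                        min_col=3, max_col=3):
    for cell in row:
        column_c.append(cell.value)
        print(cell.value)
```

With a real file you would swap the Workbook() setup for openpyxl.load_workbook(path) and pick the sheet by name.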
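Edit: per Charlie Clark, you can alternatively use ws.get_squared_range():

    # ...
    # Available in openpyxl 2.4.x; later releases dropped this method, as far as I know.
    # Note that these example bounds select column 1 (A), rows 1 to 10.
    ws.get_squared_range(min_col=1, min_row=1, max_col=1, max_row=10)
    # ...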
Edit 2: per your comment, you want the cell values in a list:

    import openpyxl

    wb = openpyxl.load_workbook('c:/_twd/2016-06-23_xlrd_xlwt/input.xlsx')
    ws = wb['Sheet1']

    mylist = []
    # Range-string form, as accepted by openpyxl 2.4.x; newer releases
    # expect min_row/max_row/min_col/max_col keywords instead.
    for row in ws.iter_rows('A{}:A{}'.format(ws.min_row, ws.max_row)):
        for cell in row:
            mylist.append(cell.value)

    print(mylist)
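If all you need is the list, a comprehension over the column itself is probably the shortest route. This is a rough sketch that combines the ws['A'] column access from earlier with the list building above. The in-memory workbook and its three sample values are assumptions so the snippet runs without input.xlsx.

```python
from openpyxl import Workbook

# In-memory stand-in for the real input file; the values are hypothetical.
wb = Workbook()
ws = wb.active
ws.title = 'Sheet1'
for value in ['first', 'second', 'third']:
    ws.append([value])

# ws['A'] returns the cells of column A; collect their values in one pass.
mylist = [cell.value for cell in ws['A']]
print(mylist)
```

Empty cells inside the used range come back as None, so filter those out if you only want populated rows.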