XLRD/Python: Reading Excel file into dict with for-loops

Tags: python, dictionary, excel, xlrd

I'm looking to read in an Excel workbook with 15 fields and about 2000 rows, and convert each row to a dictionary in Python. I then want to append each dictionary to a list. I'd like each field in the top row of the workbook to be a key within each dictionary, with the corresponding cell value as the value.

I've already looked at examples here and here, but I'd like to do something a bit different. The second example will work, but I feel it would be more efficient to loop over the top row once to populate the dictionary keys and then iterate through each row to get the values.

My Excel file contains data from discussion forums and looks something like this (obviously with more columns):

    id    thread_id    forum_id    post_time     votes    post_text
    4     100          3           1377000566    1        'here is some text'
    5     100          4           1289003444    0        'even more text here'

So, I'd like the fields id, thread_id and so on, to be the dictionary keys. I'd like my dictionaries to look like:

{id: 4,  thread_id: 100, forum_id: 3, post_time: 1377000566, votes: 1, post_text: 'here is some text'}

Initially, I had some code like this iterating through the file, but my scope is wrong for some of the for-loops and I'm generating way too many dictionaries. Here's my initial code:

    import xlrd
    from xlrd import open_workbook, cellname

    book = open_workbook('forum.xlsx', 'r')
    sheet = book.sheet_by_index(3)

    dict_list = []

    for row_index in range(sheet.nrows):
        for col_index in range(sheet.ncols):
            d = {}

            # My intuition for the below for-loop is to take each cell in the top row of the
            # Excel sheet and add it as a key to the dictionary, and then pass the value of
            # current index in the above loops as the value to the dictionary. This isn't
            # working.

            for i in sheet.row(0):
                d[str(i)] = sheet.cell(row_index, col_index).value
                dict_list.append(d)

Any help would be greatly appreciated. Thanks in advance for reading.


asked May 09 '14 15:05

kylerthecreator

1 Answer

The idea is to first read the header row into a list. Then iterate over the sheet rows (starting from the row after the header), build a new dictionary from the header keys and the corresponding cell values, and append it to the list of dictionaries:

    from xlrd import open_workbook

    book = open_workbook('forum.xlsx')
    sheet = book.sheet_by_index(3)

    # read header values into the list
    keys = [sheet.cell(0, col_index).value for col_index in xrange(sheet.ncols)]

    dict_list = []
    for row_index in xrange(1, sheet.nrows):
        d = {keys[col_index]: sheet.cell(row_index, col_index).value
             for col_index in xrange(sheet.ncols)}
        dict_list.append(d)

    print dict_list

For a sheet containing:

    A   B   C   D
    1   2   3   4
    5   6   7   8

it prints:

    [{'A': 1.0, 'C': 3.0, 'B': 2.0, 'D': 4.0},
     {'A': 5.0, 'C': 7.0, 'B': 6.0, 'D': 8.0}]
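The values come back as 1.0 rather than 1 because xlrd returns every numeric cell as a Python float. If fields such as id or votes should be ints, as in the dictionary shown in the question, a small post-processing step is one option. The sketch below is only an illustration: the helper name `normalize_number` is made up here, and the sample row is a stand-in rather than real sheet data.

```python
def normalize_number(value):
    """Return an int for whole-number floats; leave everything else unchanged."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


# Stand-in for one dictionary produced from a sheet row (hypothetical data).
row = {'A': 1.0, 'B': 2.5, 'C': 'some text', 'D': 4.0}

# Rebuild the dictionary with whole-number floats converted to ints.
cleaned = {key: normalize_number(value) for key, value in row.items()}
print(cleaned)
```

Whether this conversion is wanted depends on the data: a column that genuinely holds floats that happen to be whole (for example 2.0 as a measurement) would also be turned into ints.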

UPD (expanding the dictionary comprehension):

    d = {}
    for col_index in xrange(sheet.ncols):
        d[keys[col_index]] = sheet.cell(row_index, col_index).value
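The code in this answer is Python 2 (`xrange`, the `print` statement). On Python 3 the same header-then-rows idea can be sketched with `zip`, which pairs each header key with the value in the same column. The function name `rows_to_dicts` and the in-memory `sample_rows` are assumptions made for this illustration, so that the example runs without a workbook on disk.

```python
def rows_to_dicts(rows):
    """Treat the first row as keys and build one dict per remaining row."""
    rows = iter(rows)
    try:
        keys = next(rows)  # header row
    except StopIteration:
        return []  # no rows at all, so no dictionaries
    # zip pairs each key with the value in the same column position
    return [dict(zip(keys, values)) for values in rows]


# Stand-in for the example sheet; values are floats, as xlrd would return them.
sample_rows = [
    ['A', 'B', 'C', 'D'],
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
]

dict_list = rows_to_dicts(sample_rows)
print(dict_list)
```

With a real xlrd sheet, the rows could be supplied as `(sheet.row_values(i) for i in range(sheet.nrows))`. Note also that xlrd 2.0 and later read only .xls files, so an .xlsx workbook such as forum.xlsx needs an older xlrd release or a different reader.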

answered Oct 01 '22 05:10

alecxe
