
Reading numeric Excel data as text using xlrd in Python

I am trying to read in an Excel file using xlrd, and I am wondering if there is a way to ignore the cell formatting used in the Excel file and just import all data as text.

Here is the code I am using so far:

import xlrd

xls_file = 'xltest.xls'
xls_workbook = xlrd.open_workbook(xls_file)
xls_sheet = xls_workbook.sheet_by_index(0)

raw_data = [['']*xls_sheet.ncols for _ in range(xls_sheet.nrows)]
raw_str = ''
field_delim = ','
text_delim = '"'

# Read every cell and store its value as a string.
for rnum in range(xls_sheet.nrows):
    for cnum in range(xls_sheet.ncols):
        raw_data[rnum][cnum] = str(xls_sheet.cell(rnum, cnum).value)

# Build the CSV text: quote each field and end each row with a newline.
for rnum in range(len(raw_data)):
    for cnum in range(len(raw_data[rnum])):
        if cnum == len(raw_data[rnum]) - 1:
            field_delim = '\n'
        else:
            field_delim = ','
        raw_str += text_delim + raw_data[rnum][cnum] + text_delim + field_delim

with open('FINAL.csv', 'w') as final_csv:
    final_csv.write(raw_str)

This code is functional, but certain fields, such as zip codes, are imported as numbers, so they carry a trailing '.0'. For example, if there is a zip code of '79854' in the Excel file, it will be imported as '79854.0'.
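For illustration, the effect can be reproduced without Excel at all, assuming only that xlrd hands numeric cells back as Python floats. The value below is a stand-in for what `cell.value` would return for the zip code:

```python
# Stand-in for xls_sheet.cell(rnum, cnum).value on a numeric cell:
# numeric cells arrive as floats even when Excel displays a whole number.
zip_code = 79854.0

as_float_text = str(zip_code)      # keeps the decimal suffix
as_int_text = str(int(zip_code))   # drops it for whole numbers

print(as_float_text)  # 79854.0
print(as_int_text)    # 79854
```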

I have tried finding a solution in this xlrd spec, but was unsuccessful.

Brian asked Apr 29 '10 18:04


People also ask

Does xlrd work with Xlsx?

Since version 2.0, xlrd no longer supports .xlsx files; it reads only the legacy .xls format. Use openpyxl to read .xlsx files.

How do I convert an Excel cell to a string in Python?

Read the cell's value attribute (Cell.value). It returns a number or a Unicode string depending on what the cell contains; pass the result to str() if you need text.

What is the use of xlrd in Python?

xlrd is a Python library for reading data and formatting information from Excel spreadsheets. It only reads; it cannot write or modify files. Typical tasks include navigating between sheets and extracting the values of particular rows and columns.


1 Answer

That's because Excel stores every number as floating point, so xlrd returns numeric cells as Python floats even when they hold whole numbers. Thus, sheet.cell(r,c).value returns a float. Convert such values to integers, but first check that they were whole numbers in Excel to begin with:

cell = sheet.cell(r, c)
cell_value = cell.value
# ctype 2 is a number (xlrd.XL_CELL_NUMBER), 3 is a date (xlrd.XL_CELL_DATE)
if cell.ctype in (2, 3) and int(cell_value) == cell_value:
    cell_value = int(cell_value)

It is all in the xlrd spec.
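As a rough sketch of how that check could be folded into the CSV export from the question: the helper names `cell_to_text` and `rows_to_csv` and the sample row are hypothetical, and the type codes are hard-coded to the values xlrd defines so the example runs without a workbook. Unlike the snippet above, this version leaves date cells untouched, which is an assumption about what is wanted. It also swaps the hand-built string for the standard `csv` module with `QUOTE_ALL`, so embedded quotes get escaped.

```python
import csv
import io

# Cell type codes as defined by xlrd (xlrd.XL_CELL_TEXT, xlrd.XL_CELL_NUMBER).
XL_CELL_TEXT = 1
XL_CELL_NUMBER = 2


def cell_to_text(ctype, value):
    """Return a cell value as text, dropping '.0' from whole numbers."""
    if ctype == XL_CELL_NUMBER and float(value).is_integer():
        return str(int(value))
    return str(value)


def rows_to_csv(rows):
    """Render rows of (ctype, value) pairs as fully quoted CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in rows:
        writer.writerow([cell_to_text(ctype, value) for ctype, value in row])
    return buffer.getvalue()


# Hypothetical data standing in for (cell.ctype, cell.value) pairs from a sheet.
sample = [[(XL_CELL_TEXT, "Smith"), (XL_CELL_NUMBER, 79854.0), (XL_CELL_NUMBER, 12.5)]]
csv_text = rows_to_csv(sample)
print(csv_text)
```

With a real workbook, each pair would come from `sheet.cell(r, c).ctype` and `sheet.cell(r, c).value`. Note that a zip code stored as a number in Excel has already lost any leading zeros before xlrd reads it, so this cannot restore them.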

ktdrv answered Oct 27 '22 17:10