Pandas: Read specific Excel cell value into a variable

Situation:

I am using pandas to parse in separate Excel (.xlsx) sheets from a workbook with the following setup: Python 3.6.0 and Anaconda 4.3.1 on Windows 7 x64.

Problem:

I have been unable to find out how to set a variable to the value of a specific Excel cell, e.g. var = Sheet['A3'].value from 'Sheet2', using pandas.

Question:

Is this possible? If so, how?

What I have tried:

I have searched through the pandas documentation on DataFrame and various forums but haven't found an answer to this.

I know I can work around this using openpyxl (where I can specify a cell coordinate) but I want:

  1. To use pandas, if possible;
  2. To read in the file only once.

I have imported numpy, as well as pandas, so was able to write:

xls = pd.ExcelFile(filenamewithpath) 

data = xls.parse('Sheet1')
dateinfo2 = str(xls.parse('Sheet2', parse_cols = "A", skiprows = 2, nrows = 1, header = None)[0:1]).split('0\n0')[1].strip()

Reading 'Sheet1' into 'data' works fine, as I have a function to collect the range I want.

I am also trying to read the value in cell "A3" from a separate sheet ('Sheet2'), and the code I have at present is clunky. It gets the value out as a string, as required, but is in no way pretty. I only want this cell value and as little additional sheet info as possible.

QHarr asked Apr 21 '17 13:04


2 Answers

Elaborating on @FLab's comment, use something along these lines:

Edit:

Updated the answer to match the updated question, which asks how to read several sheets at once. If you pass sheet_name=None to read_excel(), all the sheets are read in one go and pandas returns a dict of DataFrames whose keys are the Excel sheet names.

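In [10]:
import pandas as pd

# sheet_name=None reads every sheet; header=None keeps row 1 as data (older pandas spelled the keyword sheetname)
df = pd.read_excel('Book1.xlsx', sheet_name=None, header=None)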
df
Out[11]:
{u'Sheet1':    0
 0  1
 1  1, u'Sheet2':     0
 0   1
 1   2
 2  10}
In [13]:
data = df["Sheet1"]
secondary_data = df["Sheet2"]
secondary_data.loc[2,0]
Out[13]:
10

Alternatively, as noted in this post, if your Excel file has several sheets you can pass sheet_name a list of the sheet names to parse, e.g.

df = pd.read_excel('Book1.xlsx', sheet_name=["Sheet1", "Sheet2"], header=None)

Credits to user6241235 for digging out the last alternative
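To connect this back to the cell address in the question, here is a minimal sketch that converts an A1-style address such as "A3" into the zero-based row and column positions a DataFrame expects. The helper name cell_to_indices is hypothetical, and the in-memory dict only stands in for what read_excel(..., sheet_name=None, header=None) returned in the session above. The mapping assumes that DataFrame row 0 corresponds to sheet row 1, i.e. no header row was consumed and no rows were skipped.

```python
import re

import pandas as pd


def cell_to_indices(address):
    """Convert an A1-style address such as "A3" to zero-based (row, col).

    Hypothetical helper, not part of pandas.
    """
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not an A1-style cell address: {address!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters are a base-26 number with digits A=1 .. Z=26.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


# Stand-in for:
#   sheets = pd.read_excel("Book1.xlsx", sheet_name=None, header=None)
# using the same values as the session shown above.
sheets = {
    "Sheet1": pd.DataFrame([[1], [1]]),
    "Sheet2": pd.DataFrame([[1], [2], [10]]),
}

row, col = cell_to_indices("A3")
value = sheets["Sheet2"].iat[row, col]  # positional lookup of a single cell
print(value)
```

Since the workbook is read once into the dict, "Sheet1" remains available as sheets["Sheet1"] for the range extraction mentioned in the question.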

Yannis P. answered Oct 05 '22 04:10


Reading an Excel file with pandas returns a DataFrame by default. You don't need an entire table, just one cell. The way I do it is to make that cell a column header, for example:

# Read Excel and select a single cell (and make it a header for a column).
# header=10 is zero-based, so it points at sheet row 11; usecols="C" keeps only column C,
# and nrows=0 reads no data rows. Together they target cell C11.
data = pd.read_excel(filename, 'Sheet2', index_col=None, usecols="C", header=10, nrows=0)

This returns an empty DataFrame with a single column header and no data rows. Then isolate that header:

# Extract the value from the array of column headers
data = data.columns.values[0]
print(data)
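
The same idea can be generalised to any cell address. The following is only a sketch: header_trick_kwargs and read_cell are hypothetical names, not pandas functions, and read_cell simply forwards the computed arguments to pd.read_excel in the way the snippet above does. One assumed caveat is that an empty cell would come back as an "Unnamed: ..." label rather than as a missing value, because pandas fills in blank headers.

```python
import re

import pandas as pd


def header_trick_kwargs(address):
    """Build read_excel keyword arguments that turn one cell into a header.

    Hypothetical helper: "C11" becomes usecols="C", header=10, nrows=0.
    """
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not an A1-style cell address: {address!r}")
    letters, digits = match.groups()
    return {
        "usecols": letters.upper(),  # keep only the target column
        "header": int(digits) - 1,   # zero-based row used as the header
        "nrows": 0,                  # read no data rows below the header
    }


def read_cell(path, sheet, address):
    """Return the value of one cell via the header trick (sketch only)."""
    frame = pd.read_excel(path, sheet_name=sheet, **header_trick_kwargs(address))
    return frame.columns[0]


print(header_trick_kwargs("C11"))
```

With these helpers, a call such as read_cell(filename, 'Sheet2', 'A3') would target the cell from the question, although each call reads the file again.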

Arthur D. Howland answered Oct 05 '22 03:10