pandas read excel values not formulas

Is there a way to have pandas read only the values from an Excel file and not the formulas? It reads the formula cells in as NaN unless I open the Excel file and save it manually before running the code. I am just using the basic read_excel function of pandas:

import pandas as pd

df = pd.read_excel(filename, sheet_name="Sheet1")  # older pandas versions spelled this keyword sheetname

This reads the values if I have opened and saved the file before running the code. But after running the code to update a new sheet, if I don't open and save the file again before the next run, the formula cells are read as NaN instead of their values. Does anyone know of a workaround that will read just the values from Excel with pandas?

Colton T asked Jan 18 '17 14:01



1 Answer

That is strange. The normal behaviour of pandas is to read values, not formulas: it returns the result stored in the file for each formula cell. Most likely, the problem is in your Excel files. Probably your formulas point to other files, or they return a value that pandas sees as NaN.

In the first case, the sheet needs to be updated and there is nothing pandas can do about that (but read on).
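To check whether a workbook is in that "needs to be updated" state, you can look inside the file itself. What follows is only a rough diagnostic sketch, assuming an .xlsx file (which is a zip archive of XML parts) whose worksheets live under xl/worksheets/. In that format a formula cell stores its formula in an `<f>` element and its last computed result in a `<v>` element, so a formula cell with no `<v>` has nothing for pandas to read. The function name `formulas_without_cached_value` is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML namespace used by the worksheet parts inside an .xlsx archive.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def formulas_without_cached_value(xlsx):
    """Return (worksheet part, cell reference) pairs for formula cells that
    have no stored result. `xlsx` is a path or a binary file-like object."""
    missing = []
    with zipfile.ZipFile(xlsx) as archive:
        for part in sorted(archive.namelist()):
            # Only look at worksheet XML parts (this skips the .rels files).
            if not (part.startswith("xl/worksheets/") and part.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(part))
            for cell in root.iterfind(".//m:c", NS):
                has_formula = cell.find("m:f", NS) is not None
                value = cell.find("m:v", NS)
                # A formula with a missing or empty <v> has never been calculated
                # by a spreadsheet application since it was written.
                if has_formula and (value is None or not value.text):
                    missing.append((part, cell.get("r")))
    return missing
```

If the returned list is not empty, the file was most likely written by a library that stores formulas without computing them, and it has to be opened and saved by Excel (or another application that recalculates) before pandas can see values in those cells.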

In the second case, you could solve it by telling read_excel explicitly which values to treat as NaN:

pd.read_excel(path, sheet_name="Sheet1", na_values=na_identifiers)  # na_identifiers: list of the strings your sheet uses for missing values

As for the first case, as a workaround to make your work easier, you can automate what you are doing by hand using xlwings:

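import pandas as pd
import xlwings as xl

def df_from_excel(path):
    # Open the workbook in a hidden Excel instance, as you would by hand.
    app = xl.App(visible=False)
    try:
        book = app.books.open(path)
        book.save()   # saving stores the computed values in the file
        book.close()
    finally:
        app.kill()    # always shut Excel down, even if opening or saving fails
    return pd.read_excel(path)

df = df_from_excel("path/to/your/file.xlsx")  # placeholder path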

If you want to keep the original Excel file as it is, save the workbook to a different location instead (book.save(different_location)) and read that copy. Then you can get rid of the temporary files with shutil.
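A minimal sketch of that variant, assuming a local Excel installation for the xlwings part. The names `recalculate_with_excel` and `read_via_temp_copy` are made up for this example, and the reader and the recalculation step are passed in as arguments only so that the temporary-file handling can be tried without Excel.

```python
import shutil
import tempfile
from pathlib import Path


def recalculate_with_excel(src, dst):
    """Open src in a hidden Excel instance and save a copy to dst.

    xlwings is imported here so the rest of the code works without it.
    """
    import xlwings as xw

    app = xw.App(visible=False)
    try:
        book = app.books.open(str(src))
        book.save(str(dst))  # "save as": the original file is not overwritten
        book.close()
    finally:
        app.kill()


def read_via_temp_copy(path, reader, recalculate=recalculate_with_excel):
    """Save a recalculated copy in a temporary folder, read it, delete it."""
    tmp_dir = Path(tempfile.mkdtemp())
    try:
        tmp_file = tmp_dir / Path(path).name
        recalculate(Path(path).resolve(), tmp_file)
        return reader(tmp_file)
    finally:
        # Remove the temporary folder and the copy inside it.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

With pandas this would be called as `df = read_via_temp_copy("path/to/your/file.xlsx", pd.read_excel)`, where the path is a placeholder.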

RobatStats answered Sep 20 '22 14:09
