Whenever I have the file open in Excel and run the code, I get the following error. This is surprising, because I thought read_excel was a read-only operation that would not require the file to be unlocked.
Traceback (most recent call last):
  File "C:\Users\Public\a.py", line 53, in <module>
    main()
  File "C:\Users\Public\workspace\a.py", line 47, in main
    blend = plStream(rootDir);
  File "C:\Users\Public\workspace\a.py", line 20, in plStream
    df = pd.read_excel(fPath, sheetname="linear strategy", index_col="date", parse_dates=True)
  File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\pandas\io\excel.py", line 163, in read_excel
    io = ExcelFile(io, engine=engine)
  File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\pandas\io\excel.py", line 206, in __init__
    self.book = xlrd.open_workbook(io)
  File "C:\Users\Public\Continuum\Anaconda35\lib\site-packages\xlrd\__init__.py", line 394, in open_workbook
    f = open(filename, "rb")
PermissionError: [Errno 13] Permission denied: '<Path to File>'
Read an Excel file into a pandas DataFrame. Supports xls, xlsx, xlsm, xlsb, odf, ods and odt file extensions read from a local filesystem or URL. Supports an option to read a single sheet or a list of sheets.
The pandas.read_excel() function reads an Excel sheet into a pandas DataFrame. When a single sheet is read it returns a DataFrame; when a list of sheets is requested it returns a dict of DataFrames.
Pandas read_excel() example: the first parameter is the name of the Excel file. The sheet_name parameter defines the sheet to be read from the Excel file.
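The difference between reading one sheet and a list of sheets can be illustrated with a short sketch. It assumes a recent pandas with openpyxl installed; the file name demo.xlsx, the sheet names, and the column names are made up for the example.

```python
import os
import tempfile

import pandas as pd

# Build a small two-sheet workbook so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"a": [1, 2]}).to_excel(writer, sheet_name="first", index=False)
    pd.DataFrame({"b": [3, 4]}).to_excel(writer, sheet_name="second", index=False)

# A single sheet name gives back one DataFrame.
single = pd.read_excel(path, sheet_name="first", engine="openpyxl")

# A list of sheet names gives back a dict keyed by sheet name.
several = pd.read_excel(path, sheet_name=["first", "second"], engine="openpyxl")
```

Here single is a DataFrame, while several["second"] is the DataFrame for the sheet named "second".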
Generally, Excel has a lot of restrictions when opening files (it can't open the same file twice, can't open two different files with the same name, etc.).
I don't have Excel on my machine to test, but checking the docs for read_excel I noticed that it allows you to set the engine. From the stack trace you posted, it seems the error is thrown by xlrd, which is the default engine used by pandas. Try using one of the other ones:
Supported engines: “xlrd”, “openpyxl”, “odf”, “pyxlsb”, default “xlrd”.
So try with the rest, like
df = pd.read_excel(fPath, sheetname="linear strategy", index_col="date", parse_dates=True, engine="openpyxl")
I know this is not a real answer, but you might want to submit a bug report to the pandas or xlrd teams.
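The "try the other engines" idea could be wrapped in a small helper that goes through the engines in order. This is only a sketch: the name read_excel_with_fallback and its reader parameter are hypothetical, and since the traceback shows the failure inside a plain open(filename, "rb") call, another engine may well run into the same lock.

```python
def read_excel_with_fallback(path, engines=("xlrd", "openpyxl"), reader=None, **kwargs):
    """Try each engine in turn and return the first successful result.

    `reader` defaults to pandas.read_excel; it is a parameter only so the
    helper can be exercised without a real workbook (hypothetical design).
    """
    if reader is None:
        import pandas as pd  # imported lazily, only when no reader is supplied
        reader = pd.read_excel

    errors = {}
    for engine in engines:
        try:
            return reader(path, engine=engine, **kwargs)
        except (PermissionError, ImportError, ValueError) as exc:
            # PermissionError: file locked; ImportError: engine not installed;
            # ValueError: engine not usable for this file or pandas version.
            errors[engine] = exc

    raise RuntimeError(f"no engine could read {path!r}: {errors}")
```

It would be called like read_excel_with_fallback(fPath, index_col="date", parse_dates=True); any extra keyword arguments are passed straight through to the reader.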
I would suggest using the xlwings module instead, which allows for greater functionality.
First, you will need to load your workbook.
If the spreadsheet is in the same folder as your Python script:
import xlwings as xw
workbook = xw.Book('myfile.xls')
Alternatively:
workbook = xw.Book(r'C:\Users\...\myfile.xls')
Then, you can create your Pandas DataFrame, by specifying the sheet within your spreadsheet and the cell where your dataset begins:
import pandas as pd
df = workbook.sheets[0].range('A1').options(pd.DataFrame, header=1, index=False, expand='table').value
When specifying a sheet you can either specify a sheet by its name or by its location (i.e. first, second etc.) in the following way:
workbook.sheets[0]
or workbook.sheets['sheet_name']
Lastly, you can install the xlwings module with pip install xlwings