Pandas.read_excel: Accessing the home directory

[Solution Found]

I have encountered some unexpected behavior when trying to access my home directory using pandas.read_excel.

The file I want to access lives under my home directory,

/users/isys/orsheridanmeth

which is where cd ~/ takes me. The file I would like to access is

'~/workspace/data/example.xlsx'

The following works to read in the excel file (using import pandas as pd):

df = pd.read_excel('workspace/data/example.xlsx', 'Sheet1')

whereas

df = pd.read_excel('~/workspace/data/example.xlsx', 'Sheet1')

gives me the following error:

df = pd.read_excel('~/workspace/data/example.xlsx', 'Sheet1')
Traceback (most recent call last):
  File "/users/is/ahlpypi/egg_cache/i/ipython-3.2.0_1_ahl1-py2.7.egg/IPython/core/interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-397-4412a9e7c128>", line 1, in <module>
    df = pd.read_excel('~/workspace/data/example.xlsx', 'Sheet1')
  File "/users/is/ahlpypi/egg_cache/p/pandas-0.16.2_ahl1-py2.7-linux-x86_64.egg/pandas/io/excel.py", line 151, in read_excel
    return ExcelFile(io, engine=engine).parse(sheetname=sheetname, **kwds)
  File "/users/is/ahlpypi/egg_cache/p/pandas-0.16.2_ahl1-py2.7-linux-x86_64.egg/pandas/io/excel.py", line 188, in __init__
    self.book = xlrd.open_workbook(io)
  File "/users/is/ahlpypi/egg_cache/x/xlrd-0.9.2-py2.7.egg/xlrd/__init__.py", line 394, in open_workbook
    f = open(filename, "rb")
IOError: [Errno 2] No such file or directory: '~/workspace/data/example.xlsx'

pandas.read_csv however worked when I used pd.read_csv('~/workspace/data/example.csv').

I would like to keep using these ~-based paths to files. Is there an explanation for why this doesn't work with pandas.read_excel?

Using xlrd

When using xlrd directly I get a similar error:

import xlrd
xl = xlrd.open_workbook('~/workspace/data/example.xlsx')
Traceback (most recent call last):
  File "/users/is/ahlpypi/egg_cache/i/ipython-3.2.0_1_ahl1-py2.7.egg/IPython/core/interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-403-90af31feff4b>", line 1, in <module>
    xl = xlrd.open_workbook('~/workspace/data/example.xlsx')
  File "/users/is/ahlpypi/egg_cache/x/xlrd-0.9.2-py2.7.egg/xlrd/__init__.py", line 394, in open_workbook
    f = open(filename, "rb")
IOError: [Errno 2] No such file or directory: '~/workspace/data/example.xlsx'

[SOLUTION]

import os.path
import pandas as pd
# expand "~" ourselves; read_excel hands the path to open() unchanged
df = pd.read_excel(os.path.expanduser('~/workspace/data/example.xlsx'), 'Sheet1')

asked Jul 20 '16 08:07 by oliversm


1 Answer

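As far as I know, ~ is expanded by the shell rather than by Python's open() - so your code is literally trying to open a path starting with ~, i.e. a directory named ~ relative to the current working directory. Unsurprisingly, this doesn't work. :-) Your traceback shows read_excel handing the string straight to xlrd.open_workbook(), which calls open() on it unchanged; pd.read_csv presumably expands the ~ itself before opening the file.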
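To make that concrete, here is a minimal sketch, assuming Python 3 on a POSIX-style system (where os.path.expanduser() consults the HOME environment variable). The function name demo, the scratch directories and the file name example.txt are made up for illustration; HOME is pointed at a temporary directory so nothing in the real home directory is touched.

```python
import os
import tempfile

def demo():
    """Return (literal_open_failed, expanded_open_worked) for a '~/' path."""
    old_home, old_cwd = os.environ.get("HOME"), os.getcwd()
    with tempfile.TemporaryDirectory() as fake_home, \
            tempfile.TemporaryDirectory() as workdir:
        try:
            os.environ["HOME"] = fake_home  # expanduser() reads HOME on POSIX
            os.chdir(workdir)               # empty directory, so no "~" entry here
            with open(os.path.join(fake_home, "example.txt"), "w") as f:
                f.write("hello")

            try:
                open("~/example.txt").close()
                literal_failed = False
            except FileNotFoundError:
                # open() looked for ./~/example.txt, which does not exist
                literal_failed = True

            # After expansion the path points into the (fake) home directory.
            with open(os.path.expanduser("~/example.txt")) as f:
                expanded_worked = f.read() == "hello"
        finally:
            # Restore the working directory and HOME before cleanup runs.
            os.chdir(old_cwd)
            if old_home is None:
                os.environ.pop("HOME", None)
            else:
                os.environ["HOME"] = old_home
    return literal_failed, expanded_worked
```

On Python 2.7, as in the question, the same failure surfaces as IOError rather than FileNotFoundError.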

Try running the path through os.path.expanduser() first - that replaces a leading ~ with the actual path of your home directory.

You may also want to look into os.path.expandvars(), which does the same for environment variables such as $HOME.
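If you want both kinds of expansion in one place, something along these lines might do. It is only a sketch: resolve_path is a hypothetical helper name, not part of pandas or xlrd, and the DATA_DIR variable in the usage note is likewise invented for illustration.

```python
import os.path

def resolve_path(path):
    """Expand environment variables and a leading ~ in a path string.

    Nothing is opened or checked for existence; only the string is rewritten.
    """
    # expandvars() substitutes $NAME / ${NAME}; unknown variables are left as-is.
    # expanduser() then replaces a leading "~" with the home directory.
    return os.path.expanduser(os.path.expandvars(path))
```

With that in place the call from the question would become pd.read_excel(resolve_path('~/workspace/data/example.xlsx'), 'Sheet1'), and a path such as '~/$DATA_DIR/example.xlsx' would work too, provided DATA_DIR is set in the environment.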

Hope that helps

answered Sep 21 '22 18:09 by Paul Walker