sheet_name param on pandas. read_excel() is used to read multiple sheets from excel. This supports reading excel sheets by name or position. When you read multiple sheets, it creates a Dict of DataFrame, each key in Dictionary is represented as Sheet name and DF for Dict value.
To merge all excel files in a folder, use the Glob module and the append() method. Note − You may need to install openpyxl and xlrd packages.
We can use the pandas module read_excel() function to read the excel file data into a DataFrame object. If you look at an excel sheet, it's a two-dimensional table. The DataFrame object also represents a two-dimensional tabular data structure.
Try pd.ExcelFile
:
xls = pd.ExcelFile('path_to_file.xls')
df1 = pd.read_excel(xls, 'Sheet1')
df2 = pd.read_excel(xls, 'Sheet2')
As noted by @HaPsantran, the entire Excel file is read in during the ExcelFile()
call (there doesn't appear to be a way around this). This merely saves you from having to read the same file in each time you want to access a new sheet.
Note that the sheet_name
argument to pd.read_excel()
can be the name of the sheet (as above), an integer specifying the sheet number (eg 0, 1, etc), a list of sheet names or indices, or None
. If a list is provided, it returns a dictionary where the keys are the sheet names/indices and the values are the data frames. The default is to simply return the first sheet (ie, sheet_name=0
).
If None
is specified, all sheets are returned, as a {sheet_name:dataframe}
dictionary.
There are a few options:
Read all sheets directly into an ordered dictionary.
import pandas as pd
# for pandas version >= 0.21.0
sheet_to_df_map = pd.read_excel(file_name, sheet_name=None)
# for pandas version < 0.21.0
sheet_to_df_map = pd.read_excel(file_name, sheetname=None)
Read the first sheet directly into dataframe
df = pd.read_excel('excel_file_path.xls')
# this will read the first sheet into df
Read the excel file and get a list of sheets. Then chose and load the sheets.
xls = pd.ExcelFile('excel_file_path.xls')
# Now you can list all sheets in the file
xls.sheet_names
# ['house', 'house_extra', ...]
# to read just one sheet to dataframe:
df = pd.read_excel(file_name, sheetname="house")
Read all sheets and store it in a dictionary. Same as first but more explicit.
# to read all sheets to a map
sheet_to_df_map = {}
for sheet_name in xls.sheet_names:
sheet_to_df_map[sheet_name] = xls.parse(sheet_name)
# you can also use sheet_index [0,1,2..] instead of sheet name.
Thanks @ihightower for pointing it out way to read all sheets and @toto_tico for pointing out the version issue.
sheetname : string, int, mixed list of strings/ints, or None, default 0 Deprecated since version 0.21.0: Use sheet_name instead Source Link
You can also use the index for the sheet:
xls = pd.ExcelFile('path_to_file.xls')
sheet1 = xls.parse(0)
will give the first worksheet. for the second worksheet:
sheet2 = xls.parse(1)
You could also specify the sheet name as a parameter:
data_file = pd.read_excel('path_to_file.xls', sheet_name="sheet_name")
will upload only the sheet "sheet_name"
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With