
Pandas: Reading Excel files when the first row is NOT the column name

I am using pandas to read an Excel file. The file has no column names, but pandas keeps reading the first row as the column names.

The following is the Excel file that is being read.

data1   0.994676
data2   0.994588
data3   0.99488
data4   0.994483
data5   0.994312
data6   0.993823
data7   0.993575
data8   0.994231
data9   0.993838
data10  0.994007
data11  0.994328
data12  0.993503
data13  0.99342
data14  0.992729
data15  0.993013
data16  0.993049
data17  0.993133
data18  0.99262

I'm reading the 2nd column using the following code.

import pandas as pd
df=pd.ExcelFile('C:/Users/JohnDoe/Desktop/080718_output.xlsx', header=None, index_col=False).parse('Data_sheet')
y=df.iloc[0:17,1]

The following is the resulting y.

In[38]:y
Out[38]: 
0     0.994588
1     0.994880
2     0.994483
3     0.994312
4     0.993823
5     0.993575
6     0.994231
7     0.993838
8     0.994007
9     0.994328
10    0.993503
11    0.993420
12    0.992729
13    0.993013
14    0.993049
15    0.993133
16    0.992620
Name: 0.994676, dtype: float64

It skips the first data point because the first row is being used as the column name. Any idea how I can fix this?

Edit: changed 'header=False' to 'header=None'. Both cases give the same outcome.

user7852656 asked Aug 07 '18 18:08



2 Answers

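You can use read_excel with header=None to get default column labels (a RangeIndex: 0, 1, ...):

import pandas as pd

# header=None: no row is used as the header, so the first row stays as data
df = pd.read_excel('file.xlsx',
                   sheet_name='Data_sheet',
                   header=None,
                   index_col=False)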
jezrael answered Oct 12 '22 11:10
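
A note on why the code in the question behaves as it does, offered as a sketch rather than as part of the answer above: header is a parameter of ExcelFile.parse() (and of read_excel), so setting it on the pd.ExcelFile(...) constructor would not reach the step that actually reads the sheet. The example below keeps the ExcelFile approach and passes header=None to parse() instead. The file name example.xlsx and the three-row sheet are hypothetical stand-ins built from the first values in the question, and writing and reading .xlsx is assumed to have an Excel engine such as openpyxl installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical three-row sheet, using the first values from the question.
rows = [['data1', 0.994676], ['data2', 0.994588], ['data3', 0.99488]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.xlsx')  # hypothetical file name
    # Write the rows with no header row and no index column.
    pd.DataFrame(rows).to_excel(path, sheet_name='Data_sheet',
                                header=False, index=False)

    with pd.ExcelFile(path) as xls:
        # Default header=0: the first row is consumed as the column names.
        with_default = xls.parse('Data_sheet')
        # header=None on parse(): the first row stays in the data.
        with_none = xls.parse('Data_sheet', header=None)

# Second column, all rows.
y = with_none.iloc[:, 1]

print(len(with_default), len(with_none))
print(y.tolist())
```

With the default header, the frame has one row fewer than the sheet; with header=None on parse(), every row of the sheet is kept and the columns are labeled 0 and 1.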


Create a list of column names and pass it to read_excel through the names argument, together with header=None:

names = ['Column1', 'Column2']
df = pd.read_excel(r"/Users/JohnDoe/Desktop/080718_output.xlsx", header=None, names=names)
Bram van Hout answered Oct 12 '22 13:10
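
One follow-on detail, as a sketch and not part of either answer: once the first row is treated as data, the sheet in the question yields 18 rows, so the original positional slice df.iloc[0:17, 1] (end exclusive) would stop at data17. Selecting the whole column, by label or with iloc[:, 1], avoids hard-coding the row count. The frame below is a stand-in for what read_excel(..., header=None, names=names) would return; it is built directly so the example needs no Excel file, and it uses only the first three values from the question.

```python
import pandas as pd

# Stand-in for the result of read_excel(..., header=None, names=names).
names = ['Column1', 'Column2']
df = pd.DataFrame(
    [['data1', 0.994676], ['data2', 0.994588], ['data3', 0.99488]],
    columns=names,
)

# Whole second column, selected by the name given in `names`.
y = df['Column2']

# Same column by position; ":" takes every row, whatever the row count.
y_by_position = df.iloc[:, 1]

# A bounded slice excludes its end position: 0:2 returns rows 0 and 1 only.
first_two = df.iloc[0:2, 1]

print(len(y), len(first_two))
```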