
Is there a way to read all the rows until an empty row is encountered using Python pandas?

I have numerous rows in an Excel sheet, and the rows after an empty row are filled with garbage values. Is there a way to read only the records before the first empty row using Python pandas?

Praveen Gupta Sanka asked Mar 10 '23 01:03



1 Answer

I am not aware of a way to do this with read_excel directly. When you import an empty row from Excel, every column in that row is filled with NaN, so you can select the rows up to the first row in which all values are NaN.

I am assuming your data looks something like this, with an empty row followed by garbage (I included several empty rows, each followed by garbage):

import pandas as pd

df = pd.read_excel(r'Book1.xlsx')  # read the file

print(df)
'''
   col1 col2 col3
0     1    2    3
1     1    2    3
2     1    2    3
3     1    2    3
....
10    1    2    3
11  NaN  NaN  NaN
12    x    x    x
....
18  NaN  NaN  NaN
19  NaN  NaN  NaN
20    y    y    y
21    y    y    y
....
'''

first_row_with_all_NaN = df[df.isnull().all(axis=1)].index.tolist()[0]
# gives the index of the first row in which every value is NaN
# (raises IndexError if the sheet has no completely empty row)
'''
11
'''

print(df.loc[0:first_row_with_all_NaN - 1])

# loc slices by label and includes both ends, so this selects rows 0
# through the row just before the first all-NaN row

'''
   col1 col2 col3
0     1    2    3
1     1    2    3
2     1    2    3
3     1    2    3
4     1    2    3
5     1    2    3
6     1    2    3
7     1    2    3
8     1    2    3
9     1    2    3
10    1    2    3
'''
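
The same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the function name rows_before_first_empty is made up for illustration, the sample DataFrame is built in memory to stand in for the result of read_excel, and the helper slices by position with iloc so it does not depend on the index being 0, 1, 2, .... If no completely empty row exists, it returns the frame unchanged instead of raising an error.

```python
import numpy as np
import pandas as pd


def rows_before_first_empty(df):
    """Return the rows of df that come before the first all-NaN row.

    Hypothetical helper, not part of pandas. If df has no all-NaN row,
    df is returned unchanged.
    """
    # Boolean array: True where every column in the row is NaN
    empty = df.isnull().all(axis=1).to_numpy()
    if not empty.any():
        return df
    # argmax on a boolean array gives the position of the first True
    first_empty_pos = int(empty.argmax())
    # iloc slices by position and excludes the end, so the empty row is dropped
    return df.iloc[:first_empty_pos]


# Made-up sample data standing in for pd.read_excel(r'Book1.xlsx')
df = pd.DataFrame({
    "col1": [1, 1, np.nan, "x", np.nan, "y"],
    "col2": [2, 2, np.nan, "x", np.nan, "y"],
    "col3": [3, 3, np.nan, "x", np.nan, "y"],
})

clean = rows_before_first_empty(df)
print(clean)
```

Columns read this way may keep the object dtype because of the garbage values further down, so you may still want to convert them afterwards, for example with pd.to_numeric.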
plasmon360 answered Apr 26 '23 09:04
