My Excel sheet:
   A      B
1  first  second
2
3
4  x      y
5  z      j
Python code:
df = pd.read_excel(filename, parse_cols=1)
returns the correct output:
  first second
0   NaN    NaN
1   NaN    NaN
2     x      y
3     z      j
If I want to work with only the second column:
df = pd.read_excel(filename, parse_cols=[1])
it returns:
  second
0      y
1      j
I need to keep the information about empty Excel rows (NaN in my df) even when I work with only a specific column. If the output loses the NaN rows, that is a problem, for example when combining the result with the skiprows parameter.
Thanks
We can use the pandas read_excel() function to read Excel file data into a DataFrame object. An Excel sheet is a two-dimensional table, and the DataFrame object also represents a two-dimensional tabular data structure.
The read_excel() function reads an Excel sheet into a pandas DataFrame. Reading a single sheet returns a DataFrame object; requesting several sheets (a list of sheet names, or sheet_name=None for all sheets) returns a dict of DataFrames.
To check for null values in a pandas DataFrame, use the notnull() function. It returns a DataFrame of Boolean values that are False for NaN values and True otherwise.
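As a rough illustration of notnull() on data shaped like the sheet in the question, here is a minimal sketch. The DataFrame is built by hand as a stand-in for the result of read_excel (an assumption, so that no Excel file is needed), and the names mask and blank_positions are only illustrative.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the sheet in the question (two blank rows, then data).
df = pd.DataFrame({
    "first": [np.nan, np.nan, "x", "z"],
    "second": [np.nan, np.nan, "y", "j"],
})

# notnull() gives False where the value is NaN and True elsewhere.
mask = df["second"].notnull()

# Index labels of the rows that are blank in the "second" column.
blank_positions = df.index[~mask].tolist()

print(mask.tolist())
print(blank_positions)
```

The inverted mask is what lets you find where the blank rows sit, which is the information the question wants to keep.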
Older versions of pandas used xlrd as the default engine for reading Excel files. However, xlrd 2.0 removed support for everything except .xls files. As a result, calling read_excel on an .xlsx file with xlrd raises an error saying the xlsx file type is not supported. Installing openpyxl and passing engine='openpyxl' (or upgrading pandas) avoids it.
For me, the parameter skip_blank_lines=False works:
df = pd.read_excel('test.xlsx',
                   parse_cols=1,
                   skip_blank_lines=False)
print(df)
       A       B
0  first  second
1    NaN     NaN
2    NaN     NaN
3      x       y
4      z       j
Or, if you need to omit the first row:
df = pd.read_excel('test.xlsx',
                   parse_cols=1,
                   skiprows=1,
                   skip_blank_lines=False)
print(df)
  first second
0   NaN    NaN
1   NaN    NaN
2     x      y
3     z      j
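Another option, if the parameters above are not available in your pandas version, is to read both columns and pick the single column afterwards, so the blank rows stay in place. This is only a sketch: full_df is a hand-built stand-in for the frame printed above (an assumption, so that the example runs without an Excel file), and second_only and without_blanks are illustrative names.

```python
import numpy as np
import pandas as pd

# Stand-in for the two-column frame shown above (blank rows kept as NaN).
full_df = pd.DataFrame({
    "first": [np.nan, np.nan, "x", "z"],
    "second": [np.nan, np.nan, "y", "j"],
})

# Selecting the column after loading keeps every row, including the NaN rows.
second_only = full_df[["second"]]

# For contrast: dropping NaN explicitly is what loses the blank-row information.
without_blanks = second_only.dropna()

print(second_only)
print(len(second_only), len(without_blanks))
```

Because the selection happens after reading, the row positions in second_only still line up with the rows of the sheet.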