Identify leading and trailing NAs in pandas DataFrame

Is there a way to identify leading and trailing NAs in a pandas.DataFrame?

Currently I do the following, but it does not seem straightforward:

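import pandas as pd

df = pd.DataFrame(dict(a=[0.1, 0.2, 0.2],
                       b=[None, 0.1, None],
                       c=[0.1, None, 0.1]))
# True while no non-null value has been seen from the top (leading NAs)
lead_na = (df.isnull() == False).cumsum() == 0
# Same check on the reversed frame, flipped back, marks trailing NAs
trail_na = (df.iloc[::-1].isnull() == False).cumsum().iloc[::-1] == 0
trail_lead_nas = lead_na | trail_na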

Any ideas how this could be expressed more efficiently?

Answer:

%timeit df.ffill().isna() | df.bfill().isna()
The slowest run took 29.24 times longer than the fastest. This could mean that 
an intermediate result is being cached.
31 ms ± 25.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit ((df.isnull() == False).cumsum() == 0) | ((df.iloc[::-1].isnull() ==False).cumsum().iloc[::-1] == 0)
255 ms ± 66.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

How do you remove leading and trailing spaces in pandas DataFrame column?

To remove both leading and trailing whitespace from a string column, use the str.strip() method on that column (Series.str.strip()).
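As a minimal sketch of the above, assuming a hypothetical column called name that holds padded strings (the column name and data are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data with padding on either side of each value
names = pd.DataFrame({"name": ["  alice ", "bob  ", " carol"]})

# Series.str.strip() removes leading and trailing whitespace element-wise
names["name"] = names["name"].str.strip()
```

str.lstrip() and str.rstrip() work the same way if only one side should be trimmed.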

Which method is used in pandas to detect null values?

To check for null values in a pandas DataFrame, use isnull() (or its alias isna()). It returns a DataFrame of Boolean values that are True wherever the original value is missing (NaN or None).
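A short sketch of what that looks like in practice, using a small made-up frame (the variable names here are illustrative only):

```python
import pandas as pd

# Small illustrative frame; None becomes NaN in a float column
sample = pd.DataFrame({"a": [0.1, None], "b": [None, 0.2]})

# Boolean mask with the same shape as the frame
null_mask = sample.isnull()

# Summing the mask counts missing values per column
na_counts = null_mask.sum()
```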

How about this:

df.ffill().isna() | df.bfill().isna()

Out[769]:
       a      b      c
0  False   True  False
1  False  False  False
2  False   True  False
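The reasoning behind this one-liner, as I understand it: after ffill() the only cells still missing are those with no valid value above them (leading NAs), and after bfill() the only ones left are those with no valid value below them (trailing NAs). Interior gaps are filled by both, so they drop out of the mask. The sketch below splits the expression into its two parts and checks it against the cumsum approach from the question. It uses notna() as a shorthand for isnull() == False, and the intermediate variable names are mine.

```python
import pandas as pd

df = pd.DataFrame(dict(a=[0.1, 0.2, 0.2],
                       b=[None, 0.1, None],
                       c=[0.1, None, 0.1]))

# Still NaN after a forward fill: nothing valid above, i.e. leading NA
lead_na = df.ffill().isna()
# Still NaN after a backward fill: nothing valid below, i.e. trailing NA
trail_na = df.bfill().isna()
mask = lead_na | trail_na

# Reference result using the cumsum approach from the question
lead_ref = df.notna().cumsum() == 0
trail_ref = df.iloc[::-1].notna().cumsum().iloc[::-1] == 0
same = mask.equals(lead_ref | trail_ref)
```

Note that the interior NaN in column c is not flagged by either approach, which is the intended behaviour.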

Timings on the sample frame repeated 1000 times:

df = pd.concat([df] * 1000, ignore_index=True)

In [134]: %%timeit
     ...: lead_na = (df.isnull() == False).cumsum() == 0
     ...: trail_na = (df.iloc[::-1].isnull() == False).cumsum().iloc[::-1] == 0
     ...: trail_lead_nas = lead_na | trail_na
     ...: 
11.8 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [135]: %%timeit
     ...: df.ffill().isna() | df.bfill().isna()
     ...: 
2.1 ms ± 50 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Tags:

python

pandas

MMCM_

1 Answers

Andy L.
