I have a dataframe that looks like the following:
     s1      s2      s3      s4
0    v1      v2      v3      v4
0    v5      v6      v7      np.nan
0    v8      np.nan  v9      np.nan
0    v10     np.nan  np.nan  np.nan
Essentially, reading each column from the top down, the values are numerical until some arbitrary index, after which they are np.nan only.
I've used .apply(pd.Series.last_valid_index) to get, for each column, the index at which the values are still numerical. However, I'm not sure of the most efficient way to retrieve a series containing the actual value at each column's last valid index.
Ideally I'd be able to derive a series that looks like:
value
s1 v10
s2 v6
s3 v9
s4 v4
or as a dataframe that looks like:
     s1      s2      s3      s4
0    v10     v6      v9      v4
Many thanks!
This is one way using NumPy indexing:
import numpy as np
import pandas as pd

# ensure the index is normalised so that labels equal row positions
df = df.reset_index(drop=True)
# find the last valid index in each column
idx = df.apply(pd.Series.last_valid_index)
# pick one value per column using NumPy integer indexing
res = pd.Series(df.values[idx, np.arange(df.shape[1])],
                index=df.columns,
                name='value')
print(res)
s1 v10
s2 v6
s3 v9
s4 v4
Name: value, dtype: object
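For anyone who wants to run the NumPy indexing approach end to end, here is a minimal sketch. The numbers are hypothetical stand-ins for v1..v10 (the question does not give real values), and the duplicated index of zeros is copied from the question's frame.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric stand-ins for v1..v10 from the question.
df = pd.DataFrame(
    {
        "s1": [1.0, 5.0, 8.0, 10.0],
        "s2": [2.0, 6.0, np.nan, np.nan],
        "s3": [3.0, 7.0, 9.0, np.nan],
        "s4": [4.0, np.nan, np.nan, np.nan],
    },
    index=[0, 0, 0, 0],  # duplicated labels, as in the question
)

# After the reset, index labels coincide with row positions (0, 1, 2, 3).
df = df.reset_index(drop=True)

# Row position of the last non-NaN entry in each column.
idx = df.apply(pd.Series.last_valid_index)

# Pair each row position with its column position to pick one cell per column.
res = pd.Series(
    df.to_numpy()[idx.to_numpy(), np.arange(df.shape[1])],
    index=df.columns,
    name="value",
)
print(res)
```

The reset matters here because last_valid_index returns index labels, not positions; with the original all-zero index every column would report 0.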
Here is another way to do it, without resetting the index:
df.apply(lambda x: x[x.notnull()].values[-1])
s1 v10
s2 v6
s3 v9
s4 v4
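One caveat with the lambda above: if a column contains no valid values at all, indexing the empty array with [-1] raises an IndexError. A possible guard is sketched below, assuming you would rather get NaN back for such columns. The helper name last_valid_value, the numeric stand-ins for v1..v10, and the extra all-NaN column s5 are illustrative and not part of the original question.

```python
import numpy as np
import pandas as pd


def last_valid_value(col: pd.Series):
    """Return the last non-NaN entry of col, or NaN if there is none."""
    valid = col.dropna()
    return valid.iloc[-1] if not valid.empty else np.nan


# Hypothetical numeric stand-ins for v1..v10, keeping the duplicated index.
df = pd.DataFrame(
    {
        "s1": [1.0, 5.0, 8.0, 10.0],
        "s2": [2.0, 6.0, np.nan, np.nan],
        "s3": [3.0, 7.0, 9.0, np.nan],
        "s4": [4.0, np.nan, np.nan, np.nan],
        "s5": [np.nan] * 4,  # extra all-NaN column, not in the question
    },
    index=[0, 0, 0, 0],
)

# No index reset is needed because the lookup is purely positional.
res = df.apply(last_valid_value)
print(res)
```

Like the lambda, this works positionally within each column, so the duplicated index labels do not get in the way.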