Consider the dataframe below:
import pandas as pd
from numpy import nan
data = [
(111, nan, nan, 111),
(112, 112, nan, 115),
(113, nan, nan, nan),
(nan, nan, nan, nan),
(118, 110, 117, nan),
]
df = pd.DataFrame(data, columns=[f'num{i}' for i in range(len(data[0]))])
    num0   num1   num2   num3
0  111.0    NaN    NaN  111.0
1  112.0  112.0    NaN  115.0
2  113.0    NaN    NaN    NaN
3    NaN    NaN    NaN    NaN
4  118.0  110.0  117.0    NaN
Assuming my index is unique, I'm looking to retrieve the unique values per index row, to an output like the one below. I wish to keep the empty rows.
    num1   num2   num3
0  111.0    NaN    NaN
1  112.0  115.0    NaN
2  113.0    NaN    NaN
3    NaN    NaN    NaN
4  110.0  117.0  118.0
I have a working, albeit slow, solution (see below). The order of the output numbers is not relevant, as long as all values are placed in the leftmost columns and nulls to the right. I'm looking for best practices and ideas to speed up the code. Thank you in advance.
def arrange_row(row):
    values = list(set(row.dropna(axis=1).values[0]))
    values = [nan] if not values else values
    series = pd.Series(values, index=[f"num{i}" for i in range(1, len(values)+1)])
    return series

df.groupby(level=-1).apply(arrange_row).unstack(level=-1)
pd.__version__ == '1.2.3'
We can use stack to reshape the dataframe, group the reshaped frame on level=0, and aggregate with unique to get the unique values from each row. A new dataframe can then be created from these unique values:
s = df.stack().groupby(level=0).unique()
pd.DataFrame([*s], index=s.index).reindex(df.index)
       0      1      2
0  111.0    NaN    NaN
1  112.0  115.0    NaN
2  113.0    NaN    NaN
3    NaN    NaN    NaN
4  118.0  110.0  117.0
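To get the column labels requested in the question (num1, num2, ...), the integer columns of the result can be renamed afterwards. The following is a minimal sketch built on the same stack/groupby/unique idea, not a definitive implementation. The helper name unique_per_row is made up for illustration, and the explicit dropna() is an assumption added so the result does not depend on how a given pandas version handles NaN in stack.

```python
import pandas as pd
from numpy import nan

data = [
    (111, nan, nan, 111),
    (112, 112, nan, 115),
    (113, nan, nan, nan),
    (nan, nan, nan, nan),
    (118, 110, 117, nan),
]
df = pd.DataFrame(data, columns=[f"num{i}" for i in range(len(data[0]))])


def unique_per_row(frame):
    """Hypothetical helper: unique non-null values per row, packed to the left."""
    # Reshape to one long Series indexed by (row label, column label).
    # dropna() is explicit so NaN never reaches the groupby step.
    stacked = frame.stack().dropna()
    # Unique values per original row, in order of first appearance.
    uniques = stacked.groupby(level=0).unique()
    # Ragged lists are padded with NaN on the right.
    out = pd.DataFrame([list(v) for v in uniques], index=uniques.index)
    # Bring back rows that contained only NaN, then label columns num1, num2, ...
    out = out.reindex(frame.index)
    out.columns = [f"num{i}" for i in range(1, out.shape[1] + 1)]
    return out


result = unique_per_row(df)
print(result)
```

Values keep their order of first appearance in each row, which the question states is acceptable.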
Use df.values with a list comprehension and df.dropna:
# Create a list of rows of dataframe
In [788]: l = df.values
# Use List Comprehension to remove dups from above list of lists
In [789]: l_without_dupes = [list(dict.fromkeys(i)) for i in l]
# Create a new dataframe from above list and drop the column with all NaN's
In [795]: res_df = pd.DataFrame(l_without_dupes).dropna(axis=1, how='all')
In [796]: res_df
Out[796]:
       0      1      2
0  111.0    NaN    NaN
1  112.0    NaN  115.0
2  113.0    NaN    NaN
3    NaN    NaN    NaN
4  118.0  110.0  117.0
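In the output above, row 1 still has a NaN sitting between 112.0 and 115.0, so the values are not all pushed to the left as the question asks. One possible adjustment is to filter the NaNs out of each row before removing duplicates. This is a minimal sketch, assuming the df from the question; the names rows and res_df are illustrative.

```python
import pandas as pd
from numpy import nan

data = [
    (111, nan, nan, 111),
    (112, 112, nan, 115),
    (113, nan, nan, nan),
    (nan, nan, nan, nan),
    (118, 110, 117, nan),
]
df = pd.DataFrame(data, columns=[f"num{i}" for i in range(len(data[0]))])

# Drop NaNs from each row first, then de-duplicate while keeping
# the order in which values first appear.
rows = [
    list(dict.fromkeys(v for v in row if pd.notna(v)))
    for row in df.to_numpy()
]

# Rows of different lengths are padded with NaN on the right;
# a row that held only NaN becomes an empty list, i.e. an all-NaN row.
res_df = pd.DataFrame(rows, index=df.index)
print(res_df)
```

Because NaNs are removed before the frame is built, no trailing all-NaN column is created and the dropna step is no longer needed.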