
pd.Series.argsort() and nan-values

Tags: python, pandas

I would like to use argsort() on my pandas Series and achieve the same result as sort_values() does.

import pandas as pd

x = pd.Series([10.,40.,None,20.,None,30.])


y = x.sort_values()
# Output of y
# 0    10.0
# 3    20.0
# 5    30.0
# 1    40.0
# 2    NaN
# 4    NaN
# dtype: float64

idx = x.argsort()
# Output of idx
# 0    0
# 1    2
# 2   -1
# 3    3
# 4   -1
# 5    1
# dtype: int64

# What should f look like so that y.equals(z) is True?
z = f(x,idx)

How can I use idx to get the same result as sort_values()? Put differently: how do I apply the output of argsort() to other sequences? In particular, the NaN or None entries should go to the end of the series/list.

normanius asked Dec 28 '25 14:12


2 Answers

Something seems off between pandas argsort and NumPy argsort when NaN values are present. As a quick fix, use np.argsort on the underlying values, which sorts NaN to the end:

x.iloc[np.argsort(x.values)]
Out[219]: 
0    10.0
3    20.0
5    30.0
1    40.0
2     NaN
4     NaN
dtype: float64
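The same positional order can also be applied to other sequences, which is the second half of the question. The following is a minimal sketch, not part of the original answer: the `labels` list is a hypothetical second sequence, and `kind="stable"` is an added assumption so that the NaN entries keep their original relative order.

```python
import numpy as np
import pandas as pd

x = pd.Series([10., 40., None, 20., None, 30.])
labels = ["a", "b", "c", "d", "e", "f"]  # hypothetical second sequence

# Positional sort order of the raw values; NumPy places NaN last.
# kind="stable" keeps tied entries (including the NaNs) in original order.
order = np.argsort(x.to_numpy(), kind="stable")

# Apply the order to the Series itself ...
z = x.iloc[order]

# ... and to any other sequence of the same length.
sorted_labels = [labels[i] for i in order]
```

Because `order` holds plain integer positions, it works with `iloc`, NumPy arrays, and ordinary lists alike.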
BENY answered Dec 30 '25 04:12


I'm afraid the special treatment of NaNs is built so deeply into pd.Series.argsort that you have to handle them separately: sort the non-NaN values, then append the NaN entries:

pd.concat([x.dropna().iloc[x.dropna().argsort()], x.loc[x.isna()]])
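For the `f(x, idx)` asked about in the question, one possible sketch follows. It assumes `idx` has exactly the format shown in the question: `-1` marks a NaN position, and the remaining entries are positions into `x.dropna()`. The output of `argsort()` on data with NaN may differ between pandas versions, so `idx` is hard-coded here rather than recomputed.

```python
import pandas as pd

x = pd.Series([10., 40., None, 20., None, 30.])
# idx copied from the question; -1 marks the NaN positions.
idx = pd.Series([0, 2, -1, 3, -1, 1])

def f(x, idx):
    """Reorder x using an argsort result that flags NaN positions with -1."""
    valid = x.dropna()
    # The non-negative entries are positions into the NaN-free series.
    positions = idx[idx >= 0].to_numpy()
    # Sorted non-NaN values first, NaN entries appended in original order.
    return pd.concat([valid.iloc[positions], x[x.isna()]])

z = f(x, idx)
```

With the data from the question, `z` has the same values and index as `x.sort_values()`.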
josoler answered Dec 30 '25 05:12


