Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get values after idxmin(1)

Tags:

python

pandas

I apply idxmin() to get the index of the minimum absolute value of two columns:

rep["OffsetFrom"] = rep[["OffsetDates", "OffsetDays"]].abs().idxmin(axis=1).dropna()
    OffsetDates OffsetDays  OffsetFrom
0       0.0     0.0         OffsetDates
1       1.0     1.0         OffsetDates
2       4.0     -3.0        OffsetDays
3       4.0     -3.0        OffsetDays
4       6.0     -1.0        OffsetDays
... ... ... ...
1165    0.0     0.0         OffsetDates
1166    0.0     0.0         OffsetDates
1167    0.0     0.0         OffsetDates
1168    0.0     0.0         OffsetDates
1169    0.0     0.0         OffsetDates
1170 rows × 3 columns

How do I get a column of the actual values after that?
PS. This one wouldn't work, since I need to keep the original signs:

rep[["OffsetFrom", "Offset"]] = rep[["OffsetDates", "OffsetDays"]].abs().agg(['idxmin','min'], axis=1)

Edit: Eventually, I've simply done it with apply:

def get_abs_min_keep_sign(x):
    return min(x.min(), x.max(), key=abs)

rep["Offset"] = rep[["OffsetDates", "OffsetDays"]].apply(get_abs_min_keep_sign, axis=1)

However, there must be a more elegant approach out there.

like image 912
idav Avatar asked Oct 18 '25 12:10

idav


2 Answers

First idea is use lookup by column Offset:

rep["Offset"] = rep[["OffsetDates", "OffsetDays"]].abs().idxmin(axis=1)
    
idx, cols = pd.factorize(rep['Offset'])
rep["OffsetVal"] = rep.reindex(cols, axis=1).to_numpy()[np.arange(len(rep)), idx]
     

Another numpy solution is use numpy.argmin for positions by absolute values, then is possible use indexing by array by columns names for Offset and for values is used indexing by rep[["OffsetDates", "OffsetDays"]] converted to 2d array:

cols = ["OffsetDates", "OffsetDays"]

pos = np.argmin(rep[cols].abs().to_numpy(), axis=1)
rep["Offset2"] = np.array(cols)[pos]
rep["OffsetVal2"] = rep[cols].to_numpy()[np.arange(len(rep.index)),pos]
print (rep)
   OffsetDates  OffsetDays       Offset  OffsetVal      Offset2  OffsetVal2
0          0.0         0.0  OffsetDates        0.0  OffsetDates         0.0
1          1.0         1.0  OffsetDates        1.0  OffsetDates         1.0
2          4.0        -3.0   OffsetDays       -3.0   OffsetDays        -3.0
3          4.0        -3.0   OffsetDays       -3.0   OffsetDays        -3.0
4          6.0        -1.0   OffsetDays       -1.0   OffsetDays        -1.0
    

Your function should be simplify:

rep["OffsetVal"] = rep[["OffsetDates","OffsetDays"]].apply(lambda x: min(x,key=abs),axis=1)
like image 149
jezrael Avatar answered Oct 21 '25 02:10

jezrael


You need check adding to_numpy to remove the column impact and with lambda

df[["OffsetFrom", "Offset"]] = df[["OffsetDates", "OffsetDays"]].agg([lambda x : x.abs().idxmin(),'min'], axis=1).to_numpy()
df
Out[889]: 
   OffsetDates  OffsetDays   OffsetFrom  Offset
0          0.0         0.0  OffsetDates     0.0
1          1.0         1.0  OffsetDates     1.0
2          4.0        -3.0   OffsetDays    -3.0
3          4.0        -3.0   OffsetDays    -3.0
4          6.0        -1.0   OffsetDays    -1.0

Panda dataframe assign will match the index and column , when the column different, will return NaN

like image 23
BENY Avatar answered Oct 21 '25 01:10

BENY