
return rows in a dataframe closest to a user-defined number

I have a user-defined number that I want to compare against a certain column of a dataframe.

I would like to return the rows of the dataframe whose values in that column (say, df.num) are the 5 closest to the given number x.

Any suggestions for the best way to do this without loops would be greatly appreciated.

Michele Reilly asked Jul 20 '13 02:07




1 Answer

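I think you can use the argsort method, which returns integer positions, together with iloc:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": 1e4*np.arange(100), "num": np.random.random(100)})
>>> x = 0.75
>>> df.iloc[(df.num-x).abs().argsort()[:5]]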
         A       num
66  660000  0.748261
92  920000  0.754911
59  590000  0.764449
27  270000  0.765633
82  820000  0.732601
>>> x = 0.33
>>> df.iloc[(df.num-x).abs().argsort()[:5]]
         A       num
37  370000  0.327928
76  760000  0.327921
8    80000  0.326528
17  170000  0.334702
96  960000  0.324516
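
The same idea can be wrapped in a small helper. This is only a sketch: the function name `closest_rows` and its parameters are made up for illustration and are not part of pandas. It selects rows by position, so it should also behave when the dataframe does not have the default 0..n-1 index.

```python
import numpy as np
import pandas as pd


def closest_rows(df, column, x, n=5):
    """Return the n rows of df whose values in `column` are nearest to x.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Absolute distance of every value from the target, as a plain NumPy array.
    distance = (df[column] - x).abs().to_numpy()
    # Positions of the n smallest distances; a stable sort keeps ties in row order.
    positions = np.argsort(distance, kind="stable")[:n]
    # argsort yields positions, not index labels, so select with iloc.
    return df.iloc[positions]


rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"A": 10_000 * np.arange(100), "num": rng.random(100)},
    index=np.arange(100)[::-1],  # deliberately non-default index
)
result = closest_rows(df, "num", 0.75)
print(result)
```

The rows come back ordered from nearest to farthest, as in the output above. Sorting the whole column is O(n log n), which is fine for modest sizes; for very large frames np.argpartition might be worth a look, though it does not order the selected rows.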
DSM answered Sep 30 '22 03:09