I have a user-defined number that I want to compare against a certain column of a dataframe.
I would like to return the rows of the dataframe whose values in that column (say, df.num) are the 5 closest to the given number x.
Any suggestions for the best way to do this without loops would be greatly appreciated.
To find the row corresponding to the nearest value in an R data frame, we can apply the which.min function to the absolute difference between the value and the column, then subset that row with single square brackets.
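For comparison, a rough pandas analogue of that single-nearest-row idea might look like the sketch below. The sample data and the names `nearest_label` and `nearest_row` are made up for illustration, and `idxmin` returns an index label, so the sketch assumes the index labels are unique.

```python
import pandas as pd

# Illustrative data only; not taken from the question.
df = pd.DataFrame({"num": [10, 50, 90, 53, 48, 70]})
x = 55

# Label of the row whose "num" has the smallest absolute difference from x.
nearest_label = (df["num"] - x).abs().idxmin()

# Passing a list of labels keeps the result as a one-row DataFrame.
nearest_row = df.loc[[nearest_label]]
print(nearest_row)
```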
The shape attribute shows how many rows and columns a pandas dataframe has. It returns a tuple composed of 2 values: the number of rows first, and the number of columns second.
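As a quick illustration, here is a minimal sketch on a small made-up frame (the name `frame` and its contents are assumptions for the example):

```python
import pandas as pd

# Three rows and two columns of illustrative data.
frame = pd.DataFrame({"A": [1, 2, 3], "num": [0.1, 0.5, 0.9]})

# shape is an attribute, not a method, so there are no parentheses.
n_rows, n_cols = frame.shape
print(frame.shape)
```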
I think you can use the argsort method:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": 1e4*np.arange(100), "num": np.random.random(100)})
>>> x = 0.75
>>> # argsort returns positions, so index with iloc (df.ix was removed in pandas 1.0)
>>> df.iloc[(df.num - x).abs().argsort()[:5]]
         A       num
66  660000  0.748261
92  920000  0.754911
59  590000  0.764449
27  270000  0.765633
82  820000  0.732601
>>> x = 0.33
>>> df.iloc[(df.num - x).abs().argsort()[:5]]
         A       num
37  370000  0.327928
76  760000  0.327921
8    80000  0.326528
17  170000  0.334702
96  960000  0.324516
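If this lookup is needed in several places, it could be wrapped in a small helper. The sketch below is one possible variant, not the approach shown above: it uses Series.nsmallest instead of argsort, which avoids sorting the whole column. The function name `closest_rows` and the sample data are hypothetical, and the helper assumes the dataframe's index labels are unique.

```python
import pandas as pd


def closest_rows(df, column, x, n=5):
    """Return the n rows of df whose `column` values are closest to x.

    Hypothetical helper; assumes df has unique index labels.
    """
    # Absolute distance of every value in the column from x.
    diffs = (df[column] - x).abs()
    # nsmallest keeps the n smallest distances, ordered nearest first.
    return df.loc[diffs.nsmallest(n).index]


# Illustrative data only.
sample = pd.DataFrame({"num": [10, 50, 90, 53, 48, 70]})
result = closest_rows(sample, "num", 50, n=3)
print(result)
```

The rows come back ordered from nearest to farthest, with the original index labels preserved.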