I have a user-defined number that I want to compare against a certain column of a dataframe.
I would like to return the rows of a dataframe which contain (in a certain column of df, say, df.num) the 5 closest numbers to the given number x.
Any suggestions for the best way to do this without loops would be greatly appreciated.
To find the row corresponding to the nearest value in an R data frame, take the absolute difference between the value and the column, pass it to the which.min function, and use single square brackets to subset that row. The pandas examples below apply the same idea, ranking rows by absolute difference.
The shape attribute of a pandas dataframe returns the number of rows and columns as a tuple. The tuple is composed of 2 values: the number of rows first and the number of columns second.
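As a small illustration of the shape attribute described above, here is a sketch using a made-up three-row dataframe (the column names and values are only for demonstration):

```python
import pandas as pd

# Hypothetical toy dataframe: 3 rows, 2 columns.
df = pd.DataFrame({"A": [1, 2, 3], "num": [0.1, 0.5, 0.9]})

# shape is an attribute, not a method, so there are no parentheses.
n_rows, n_cols = df.shape
print(df.shape)  # (3, 2)
```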
I think you can use the argsort method:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": 1e4*np.arange(100), "num": np.random.random(100)})
>>> x = 0.75
>>> df.iloc[(df.num-x).abs().argsort()[:5]]
         A       num
66  660000  0.748261
92  920000  0.754911
59  590000  0.764449
27  270000  0.765633
82  820000  0.732601
>>> x = 0.33
>>> df.iloc[(df.num-x).abs().argsort()[:5]]
         A       num
37  370000  0.327928
76  760000  0.327921
8    80000  0.326528
17  170000  0.334702
96  960000  0.324516
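As a possible alternative to argsort, the same lookup can be sketched with Series.nsmallest, which avoids sorting the whole column. The helper name closest_rows and the seeded random data are assumptions for illustration, and the sketch assumes the dataframe index has unique labels (with duplicate labels, .loc could return extra rows):

```python
import numpy as np
import pandas as pd


def closest_rows(df, column, x, n=5):
    """Return the n rows of df whose `column` values are nearest to x.

    Hypothetical helper; assumes df has a unique index.
    """
    # Absolute distance of every value from x, keeping the original index labels.
    distance = (df[column] - x).abs()
    # nsmallest keeps the n smallest distances; their labels select the rows.
    return df.loc[distance.nsmallest(n).index]


# Seeded generator so the example is repeatable (illustrative data only).
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": 1e4 * np.arange(100), "num": rng.random(100)})

result = closest_rows(df, "num", 0.75)
print(result)
```

The rows come back ordered from nearest to farthest, just as with the argsort approach.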