Finding k closest numbers to a given number

Tags:

python

closest

Say I have a list [1,2,3,4,5,6,7]. I want to find the 3 closest numbers to, say, 6.5. Then the returned value would be [5,6,7].

Finding the single closest number is not that tricky in Python; it can be done using

min(myList, key=lambda x:abs(x-myNumber))

But I am trying not to put a loop around this to find k closest numbers. Is there a pythonic way to achieve the above task?

641

asked Jun 09 '14 00:06

frazman

3 Answers

The short answer

The heapq.nsmallest() function will do this neatly and efficiently:

>>> from heapq import nsmallest
>>> s = [1,2,3,4,5,6,7]
>>> nsmallest(3, s, key=lambda x: abs(x - 6.5))
[6, 7, 5]

Essentially this says, "Give me the three input values that have the smallest absolute difference from the number 6.5".
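Note that nsmallest() returns the values ordered by distance, not by value. As a minimal sketch (the function name k_closest is an assumption for illustration, not part of the original answer), the call can be wrapped and the result sorted to get the [5, 6, 7] ordering the question asks for:

```python
from heapq import nsmallest

def k_closest(values, target, k):
    """Return the k values nearest to target, in ascending order."""
    # nsmallest ranks by distance to target; ties keep their input order.
    nearest = nsmallest(k, values, key=lambda x: abs(x - target))
    # Re-sort by value so the result reads like a slice of the input.
    return sorted(nearest)

print(k_closest([1, 2, 3, 4, 5, 6, 7], 6.5, 3))  # [5, 6, 7]
```

If the distance ordering is what you want, drop the final sorted() call.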

Optimizing for repeated lookups

In the comments, @Phylliida asked how to optimize for repeated lookups with differing start points. One approach would be to pre-sort the data and then use bisect to locate the center of a small search segment:

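from bisect import bisect
from heapq import nsmallest

def k_nearest(k, center, sorted_data):
    'Return *k* members of *sorted_data* nearest to *center*'
    # Index where center would be inserted to keep the data sorted
    i = bisect(sorted_data, center)
    # The k nearest values must lie within k positions of that index
    segment = sorted_data[max(i-k, 0) : i+k]
    return nsmallest(k, segment, key=lambda x: abs(x - center))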

For example:

>>> s.sort()
>>> k_nearest(3, 6.5, s)
[6, 7, 5]
>>> k_nearest(3, 0.5, s)
[1, 2, 3]
>>> k_nearest(3, 4.5, s)
[4, 5, 3]
>>> k_nearest(3, 5.0, s)
[5, 4, 6]
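As a possible variation (not from the original answer), the same pre-sorted idea can be sketched without nsmallest by growing a window outward from the bisect position, one element at a time. The name k_nearest_window and the tie-breaking rule (prefer the smaller value) are assumptions made for this illustration, so on ties the selected set may differ from k_nearest above:

```python
from bisect import bisect_left

def k_nearest_window(k, center, sorted_data):
    """Return the k members of sorted_data nearest to center, in ascending order."""
    hi = bisect_left(sorted_data, center)  # first index whose value is >= center
    lo = hi - 1                            # last index whose value is < center
    # The chosen window is sorted_data[lo+1:hi]; extend it toward the closer side.
    for _ in range(min(k, len(sorted_data))):
        if lo < 0:                          # nothing left on the low side
            hi += 1
        elif hi >= len(sorted_data):        # nothing left on the high side
            lo -= 1
        elif center - sorted_data[lo] <= sorted_data[hi] - center:
            lo -= 1                         # low neighbor is closer (or tied)
        else:
            hi += 1                         # high neighbor is closer
    return sorted_data[lo + 1:hi]

s = [1, 2, 3, 4, 5, 6, 7]
print(k_nearest_window(3, 6.5, s))  # [5, 6, 7]
print(k_nearest_window(3, 0.5, s))  # [1, 2, 3]
```

Each lookup costs one binary search plus k comparisons, and the result comes back in ascending order rather than ordered by distance.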

154

answered Sep 28 '22 23:09

Raymond Hettinger

You could compute distances, and sort:

[n for d, n in sorted((abs(x-myNumber), x) for x in myList)[:k]]

This does the following:

1. Create a sequence of tuples (d, x) where d is the distance to your target
2. Sort that sequence by distance and select the first k elements of the sorted list
3. Extract just the number values from the result, discarding the distance
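The steps above can be sketched as an expanded function, one statement per step. The function name k_closest_sorted and the snake_case parameter names are assumptions for illustration; the logic is the same as the one-liner:

```python
def k_closest_sorted(my_list, my_number, k):
    # Step 1: pair each value with its distance to the target.
    pairs = [(abs(x - my_number), x) for x in my_list]
    # Tuples compare by distance first, then by value, which breaks ties.
    pairs.sort()
    # Step 2: keep the k pairs with the smallest distance.
    closest = pairs[:k]
    # Step 3: discard the distances and return only the values.
    return [x for d, x in closest]

print(k_closest_sorted([1, 2, 3, 4, 5, 6, 7], 6.5, 3))  # [6, 7, 5]
```

Sorting the whole list costs O(n log n), which is fine for small inputs; for a large list and a small k, a heap-based selection is likely cheaper.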

answered Sep 28 '22 23:09

Greg Hewgill

Both answers were good. Greg was right that Raymond's answer is more high-level and easier to implement, but I built upon Greg's answer because it was easier to adapt to my needs.

This is in case anyone is searching for a way to find the n closest values in a list of dicts.

My dict looks like this, where npi is just an identifier that I need along with the value:

mydict = {u'fnpi': u'1982650024',
 u'snpi': {u'npi': u'1932190360', u'value': 2672},
 u'snpis': [{u'npi': u'1831289255', u'value': 20},
  {u'npi': u'1831139799', u'value': 20},
  {u'npi': u'1386686137', u'value': 37},
  {u'npi': u'1457355257', u'value': 45},
  {u'npi': u'1427043645', u'value': 53},
  {u'npi': u'1477548675', u'value': 53},
  {u'npi': u'1851351514', u'value': 57},
  {u'npi': u'1366446171', u'value': 60},
  {u'npi': u'1568460640', u'value': 75},
  {u'npi': u'1326046673', u'value': 109},
  {u'npi': u'1548281124', u'value': 196},
  {u'npi': u'1912989989', u'value': 232},
  {u'npi': u'1336147685', u'value': 284},
  {u'npi': u'1801894142', u'value': 497},
  {u'npi': u'1538182779', u'value': 995},
  {u'npi': u'1932190360', u'value': 2672},
  {u'npi': u'1114020336', u'value': 3264}]}

value = mydict['snpi']['value']  # target value I'm working with below
npi = mydict['snpi']['npi']  # npi (identifier) of the target
snpis = mydict['snpis']  # list of dicts I'm searching below

To get an [id, value] list (not just a list of values), I use this:

[[id,val] for diff, val, id in sorted((abs(x['value']-value), x['value'], x['npi']) for x in snpis)[:6]]

Which produces this:

[[u'1932190360', 2672],
 [u'1114020336', 3264],
 [u'1538182779', 995],
 [u'1801894142', 497],
 [u'1336147685', 284],
 [u'1912989989', 232]]

EDIT

I actually found it pretty easy to manipulate Raymond's answer too, if you're dealing with a dict (or list of lists).

from heapq import nsmallest
[[i['npi'], i['value']] for i in nsmallest(6, snpis, key=lambda x: abs(x['value']-value))]

This will produce the same as the above output.

And this

nsmallest(6, snpis, key=lambda x: abs(x['value']-value))

will produce a list of dicts instead.
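For reuse, the same idea could be wrapped in a small helper. This is only a sketch: the name closest_records and its field parameter are hypothetical, and the sample below is a four-entry subset of the data above, written without the u'' prefixes:

```python
from heapq import nsmallest

def closest_records(records, target, n, field='value'):
    """Return the n dicts whose record[field] is nearest to target."""
    # The key function is applied to each dict; the dicts themselves
    # are never compared with each other.
    return nsmallest(n, records, key=lambda r: abs(r[field] - target))

sample = [
    {'npi': '1538182779', 'value': 995},
    {'npi': '1932190360', 'value': 2672},
    {'npi': '1114020336', 'value': 3264},
    {'npi': '1801894142', 'value': 497},
]

for rec in closest_records(sample, 2672, 2):
    print(rec['npi'], rec['value'])
# 1932190360 2672
# 1114020336 3264
```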

answered Sep 28 '22 23:09

tmthyjames
