Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why [20, ..., 13, 14].min(2) => [13, 20]?

Tags:

ruby

min

[20, 32, 32, 21, 30, 25, 29, 13, 14].min(2) # => [13, 20] 

Why isn't it [13, 14]? And how do I get what I want, the two smallest elements (in linear time)?

The doc's sentence "If the n argument is given, minimum n elements are returned as an array" isn't quite clear to me, but I think it says min(2) should give me the smallest two elements. I couldn't find much about it, but this thread, which might be the origin, seems to agree with me and say that it should return the same as sort.first(n), which it doesn't:

[20, 32, 32, 21, 30, 25, 29, 13, 14].sort.first(2) # => [13, 14] 

Sorry if dumb question and sorry for the "large" example, but that's already reduced - removing one more number (other than 13 or 14) does give me [13, 14].

like image 409
Stefan Pochmann Avatar asked Aug 20 '15 15:08

Stefan Pochmann


1 Answers

I just posted an explanation for the bug in the Ruby Issue Tracking System:

I suppose I found the problem. Taking the first example:

[20, 32, 32, 21, 30, 25, 29, 13, 14].min(2) 

This will call the function "nmin_run" in the file "enum.c", which sets "bufmax" to 4 times the number of minimums (n) we want (for the example, bufmax is 8), and then in the line 1327 will call the function "nmin_i" for each element of the original array.

In the function "nmin_i", when the buffer is full ("data->curlen == data->bufmax"), the function "nmin_filter" is called. In the example, that happens when curlen is 8, and so the buffer is [20, 32, 32, 21, 30, 25, 29, 13]. The "nmin_filter" will do a quicksort until the n smallest elements so far are on the leftmost part of the buffer, and will discard the rest of the elements, which leaves us with [20, 13] in the buffer.

And now starts the problem. At the end of "nmin_filter" the limit (apparently with the intention of storing the greatest value in the buffer) is set to the last value in the buffer (in the example, 13), which is not true. And then based on that value "nmin_i" will discard all remaining elements greater than that (in the example, discarding the 14). The buffer is then sorted and it returns:

[13, 20] 

So the solution is either remove all the limit-related part, or take the last pivot as the limit.

like image 71
Helder Pereira Avatar answered Oct 27 '22 04:10

Helder Pereira