Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Summing Consecutive Ranges Pythonically

Tags:

python

I have a sumranges() function, which sums all the ranges of consecutive numbers found in a tuple of tuples. To illustrate:

def sumranges(nums):
    return sum([sum([1 for j in range(len(nums[i])) if
                     nums[i][j] == 0 or
                     nums[i][j - 1] + 1 != nums[i][j]]) for
                i in range(len(nums))])

>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> print sumranges(nums)
7

As you can see, it returns the number of ranges of consecutive digits within the tuple, that is: len((1, 2, 3, 4), (1), (5, 6), (19, 20), (24), (29), (400)) = 7. The tuples are always ordered.

My problem is that my sumranges() is terrible. I hate looking at it. I'm currently just iterating through the tuple and each subtuple, assigning a 1 if the number is not (1 + previous number), and summing the total. I feel like I am missing a much easier way to accomplish my stated objective. Does anyone know a more pythonic way to do this?

Edit: I have benchmarked all the answers given thus far. Thanks to all of you for your answers.

The benchmarking code is as follows, using a sample size of 100K:

from time import time
from random import randrange
nums = [sorted(list(set(randrange(1, 10) for i in range(10)))) for
        j in range(100000)]

for func in sumranges, alex, matt, redglyph, ephemient, ferdinand:
    start = time()
    result = func(nums)
    end = time()
    print ', '.join([func.__name__, str(result), str(end - start) + ' s'])

Results are as follows. Actual answer shown to verify that all functions return the correct answer:

sumranges, 250281, 0.54171204567 s
alex, 250281, 0.531121015549 s
matt, 250281, 0.843333005905 s
redglyph, 250281, 0.366822004318 s
ephemient, 250281, 0.805964946747 s
ferdinand, 250281, 0.405596971512 s

RedGlyph does edge out in terms of speed, but the simplest answer is probably Ferdinand's, and probably wins for most pythonic.

like image 673
Brent Newey Avatar asked Nov 03 '09 16:11

Brent Newey


People also ask

What is the formula for the sum of consecutive numbers?

Using the Formula(n / 2)(first number + last number) = sum, where n is the number of integers. Let's use the example of adding the numbers 1-100 to see how the formula works.

How do you add multiple consecutive numbers?

An odd number of consecutive numbers has a whole number as an average. This average is always the middle number. So, that means that the sum of the numbers will be: Sum = average \times number of consecutive numbers.

How do you sum consecutive numbers in Python?

Consecutive Numbers Formula The formula for adding 'n' consecutive numbers = [a + (a + 1) + (a + 2) + . {a + (n-1)}]. So, the sum of 'n' consecutive numbers or sum of 'n' terms of AP (Arithmetic Progression) = (n/2) × (first number + last number). Even Consecutive Numbers Formula = 2n, 2n+2, 2n+4, 2n+6,…


3 Answers

My 2 cents:

>>> sum(len(set(x - i for i, x in enumerate(t))) for t in nums)
7

It's basically the same idea as descriped in Alex' post, but using a set instead of itertools.groupby, resulting in a shorter expression. Since sets are implemented in C and len() of a set runs in constant time, this should also be pretty fast.

like image 89
Ferdinand Beyer Avatar answered Nov 04 '22 02:11

Ferdinand Beyer


Consider:

>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> flat = [[(x - i) for i, x in enumerate(tu)] for tu in nums]
>>> print flat
[[1, 1, 1, 1], [1, 4, 4], [19, 19, 22, 26, 396]]
>>> import itertools
>>> print sum(1 for tu in flat for _ in itertools.groupby(tu))
7
>>> 

we "flatten" the "increasing ramps" of interest by subtracting the index from the value, turning them into consecutive "runs" of identical values; then we identify and could the "runs" with the precious itertools.groupby. This seems to be a pretty elegant (and speedy) solution to your problem.

like image 9
Alex Martelli Avatar answered Nov 04 '22 03:11

Alex Martelli


Just to show something closer to your original code:

def sumranges(nums):
    return sum( (1 for i in nums
                   for j, v in enumerate(i)
                   if j == 0 or v != i[j-1] + 1) )

The idea here was to:

  • avoid building intermediate lists but use a generator instead, it will save some resources
  • avoid using indices when you already have selected a subelement (i and v above).

The remaining sum() is still necessary with my example though.

like image 7
RedGlyph Avatar answered Nov 04 '22 02:11

RedGlyph