I have a list of integers which I need to parse into a string of ranges.
For example:
[0, 1, 2, 3] -> "0-3"
[0, 1, 2, 4, 8] -> "0-2,4,8"
And so on.
I'm still learning more Pythonic ways of handling lists, and this one is a bit difficult for me. My latest thought was to create a list of lists which keeps track of the start and end of each range:
[ [0, 3], [4, 4], [5, 9], [20, 20] ]
I could then iterate across this structure, printing each sub-list as either a range, or a single value.
I don't like doing this in two iterations, but I can't seem to keep track of each number within each iteration. My thought would be to do something like this:
Here's my most recent attempt. It works, but I'm not fully satisfied; I keep thinking there's a more elegant solution which completely escapes me. The string-handling iteration isn't the nicest, I know -- it's pretty early in the morning for me :)
def createRangeString(zones):
    rangeIdx = 0
    ranges = [[zones[0], zones[0]]]
    for zone in list(zones):
        if ranges[rangeIdx][1] in (zone, zone-1):
            ranges[rangeIdx][1] = zone
        else:
            ranges.append([zone, zone])
            rangeIdx += 1
    rangeStr = ""
    for range in ranges:
        if range[0] != range[1]:
            rangeStr = "%s,%d-%d" % (rangeStr, range[0], range[1])
        else:
            rangeStr = "%s,%d" % (rangeStr, range[0])
    return rangeStr[1:]
Is there a straightforward way I can merge this into a single iteration? What else could I do to make it more Pythonic?
The most Pythonic way to convert a list of integers ints to a list of strings is to use the one-liner strings = [str(x) for x in ints]. It iterates over all elements in the list ints using a list comprehension and converts each list element x to a string using the str(x) constructor.
Use the join() Function to Convert a List to a Comma-Separated String in Python
The join() function combines the elements of an iterable and returns a string. We need to specify the character that will be used as the separator for the elements in the string.
In Python an integer can be converted into a string using the built-in str() function. The str() function takes in any python data type and converts it into a string.
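The two steps described above (convert each integer with str(), then join the results) can be combined. A minimal sketch in Python 3 syntax; the names ints, strings and joined are only illustrative:

```python
ints = [0, 1, 2, 4, 8]

# join() only accepts strings, so convert each integer first.
strings = [str(x) for x in ints]

# The separator is the string that join() is called on.
joined = ",".join(strings)
print(joined)  # 0,1,2,4,8
```

Note that this only produces a plain comma-separated list; collapsing consecutive values into ranges needs a grouping step first.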
for i, x in enumerate(a):
    if i: print ',' + str(x),
    else: print str(x),
This is a first-time switch (it works for any iterable a, whether a list or otherwise), so it places the comma before each item but the first.
>>> from itertools import count, groupby
>>> L=[1, 2, 3, 4, 6, 7, 8, 9, 12, 13, 19, 20, 22, 23, 40, 44]
>>> G=(list(x) for _,x in groupby(L, lambda x,c=count(): next(c)-x))
>>> print ",".join("-".join(map(str,(g[0],g[-1])[:len(g)])) for g in G)
1-4,6-9,12-13,19-20,22-23,40,44
The idea here is to pair each element with count(). Then the difference between the value and count() is constant for consecutive values. groupby() does the rest of the work.
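For readers on Python 3, here is a hedged sketch of the same count()/groupby() idea spelled out as a function. The name range_string is made up for this illustration, and the input is assumed to be sorted and free of duplicates:

```python
from itertools import count, groupby


def range_string(values):
    """Collapse sorted, distinct integers into a string such as '0-2,4,8'."""
    counter = count()
    parts = []
    # Within a run of consecutive integers, value - position is constant,
    # so groupby() puts each run into its own group.
    for _, group in groupby(values, key=lambda v: v - next(counter)):
        run = list(group)
        if len(run) > 1:
            parts.append("%d-%d" % (run[0], run[-1]))
        else:
            parts.append(str(run[0]))
    return ",".join(parts)


print(range_string([1, 2, 3, 4, 6, 7, 8, 9, 12, 13, 19, 20, 22, 23, 40, 44]))
# 1-4,6-9,12-13,19-20,22-23,40,44
```

The key function relies on groupby() calling it exactly once per element, in order, which is what keeps the counter in step with the position.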
As Jeff suggests, an alternative to count() is to use enumerate(). This adds some extra cruft that needs to be stripped out in the print statement:
G=(list(x) for _,x in groupby(enumerate(L), lambda (i,x):i-x))
print ",".join("-".join(map(str,(g[0][1],g[-1][1])[:len(g)])) for g in G)
Update: for the sample list given here, the version with enumerate() runs about 5% slower than the version using count() on my computer.
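The enumerate() variant above uses a tuple-unpacking lambda, which is Python 2 only. A rough Python 3 sketch of the same approach follows; range_string_enumerate is a hypothetical name, and sorted, distinct input is again assumed:

```python
from itertools import groupby


def range_string_enumerate(values):
    """Same grouping idea, with enumerate() supplying the position."""
    parts = []
    # Python 3 has no tuple-unpacking lambdas, so index the (i, x) pair.
    for _, group in groupby(enumerate(values), key=lambda pair: pair[0] - pair[1]):
        run = [x for _, x in group]  # strip the enumerate() indices
        if len(run) > 1:
            parts.append("%d-%d" % (run[0], run[-1]))
        else:
            parts.append(str(run[0]))
    return ",".join(parts)


print(range_string_enumerate([0, 1, 2, 4, 8]))  # 0-2,4,8
```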
Whether this is Pythonic is up for debate. But it is very compact. The real meat is in the Rangify() function. There's still room for improvement if you want efficiency or Pythonism.
def CreateRangeString(zones):
    #assuming sorted and distinct
    deltas = [a-b for a, b in zip(zones[1:], zones[:-1])]
    deltas.append(-1)
    def Rangify((b, p), (z, d)):
        if p is not None:
            if d == 1: return (b, p)
            b.append('%d-%d'%(p,z))
            return (b, None)
        else:
            if d == 1: return (b, z)
            b.append(str(z))
            return (b, None)
    return ','.join(reduce(Rangify, zip(zones, deltas), ([], None))[0])
To describe the parameters:
deltas is the distance to the next value (inspired from an answer here on SO)
Rangify() does the reduction on these parameters
b - base or accumulator
p - previous start range
z - zone number
d - delta

To concatenate strings you should use ','.join. This removes the 2nd loop.
def createRangeString(zones):
    rangeIdx = 0
    ranges = [[zones[0], zones[0]]]
    for zone in list(zones):
        if ranges[rangeIdx][1] in (zone, zone-1):
            ranges[rangeIdx][1] = zone
        else:
            ranges.append([zone, zone])
            rangeIdx += 1
    return ','.join(
        map(
            lambda p: '%s-%s'%tuple(p) if p[0] != p[1] else str(p[0]),
            ranges
        )
    )
Although I prefer a more generic approach:
from itertools import groupby

# auxiliary functor to allow groupby to compare by adjacent elements.
class cmp_to_groupby_key(object):
    def __init__(self, f):
        self.f = f
        self.uninitialized = True
    def __call__(self, newv):
        if self.uninitialized or not self.f(self.oldv, newv):
            self.curkey = newv
            self.uninitialized = False
        self.oldv = newv
        return self.curkey

# returns the first and last element of an iterable with O(1) memory.
def first_and_last(iterable):
    first = next(iterable)
    last = first
    for i in iterable:
        last = i
    return (first, last)

# convert groups into list of range strings
def create_range_string_from_groups(groups):
    for _, g in groups:
        first, last = first_and_last(g)
        if first != last:
            yield "{0}-{1}".format(first, last)
        else:
            yield str(first)

def create_range_string(zones):
    groups = groupby(zones, cmp_to_groupby_key(lambda a,b: b-a<=1))
    return ','.join(create_range_string_from_groups(groups))

assert create_range_string([0,1,2,3]) == '0-3'
assert create_range_string([0, 1, 2, 4, 8]) == '0-2,4,8'
assert create_range_string([1,2,3,4,6,7,8,9,12,13,19,20,22,22,22,23,40,44]) == '1-4,6-9,12-13,19-20,22-23,40,44'