I have a list of strings, each representing a month in a year (not sorted and not consecutive):
['1/2013', '7/2013', '2/2013', '3/2013', '4/2014', '12/2013', '10/2013', '11/2013', '1/2014', '2/2014']
I'm looking for a Pythonic way to sort all of them and separate each consecutive sequence as the following suggests:
[ ['1/2013', '2/2013', '3/2013', '4/2013'],
['7/2013'],
['10/2013', '11/2013', '12/2013', '1/2014', '2/2014']
]
Any ideas?
Based on the example from the docs that shows how to find runs of consecutive numbers using itertools.groupby():
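from itertools import groupby
from pprint import pprint

months = ['1/2013', '7/2013', '2/2013', '3/2013', '4/2014', '12/2013',
          '10/2013', '11/2013', '1/2014', '2/2014']

def month_number(date):
    # Map 'M/YYYY' to one integer so that consecutive months differ by 1.
    month, year = date.split('/')
    return int(year) * 12 + int(month)

# Python 3 no longer allows tuple parameters in a lambda, so index the pair.
L = [[date for _, date in run]
     for _, run in groupby(enumerate(sorted(months, key=month_number)),
                           key=lambda pair: pair[0] - month_number(pair[1]))]
pprint(L)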
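The key to the solution is taking the difference between each item's index from enumerate() and its month number: within a run of consecutive months both increase by 1, so the difference stays constant and those months all land in the same group (run). Output: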
[['1/2013', '2/2013', '3/2013'],
['7/2013'],
['10/2013', '11/2013', '12/2013', '1/2014', '2/2014'],
['4/2014']]
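To see why the differencing works, it can help to print the intermediate keys. The following is only an illustrative sketch: it reuses month_number from the answer on a short, already sorted sample, and the names `sample` and `keys` are made up for this example.

```python
def month_number(date):
    # Same mapping as in the answer: 'M/YYYY' -> running month count.
    month, year = date.split('/')
    return int(year) * 12 + int(month)

# Hypothetical, already sorted sample with two gaps in it.
sample = ['1/2013', '2/2013', '3/2013', '7/2013', '12/2013', '1/2014']

# Index minus month number: constant inside a run, different across a gap.
keys = [i - month_number(d) for i, d in enumerate(sample)]
print(keys)
# [-24157, -24157, -24157, -24160, -24164, -24164]
```

The three distinct key values correspond to the three runs that groupby() would produce. Note that '12/2013' and '1/2014' share a key, because the year is folded into the month number.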
The groupby examples are cute but dense, and any variant that compares only the month number will break on input such as ['1/2013', '2/2017'], i.e. when adjacent months come from non-adjacent years.
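from datetime import datetime
from dateutil.relativedelta import relativedelta  # third-party: python-dateutil

def areAdjacent(old, new):
    # True when `new` is exactly one calendar month after `old`.
    return old + relativedelta(months=1) == new

def parseDate(s):
    # 'M/YYYY' -> datetime on the first day of that month.
    return datetime.strptime(s, '%m/%Y')

def generateGroups(seq):
    group = []
    last = None
    # Sort by the parsed date, but keep the original string for the output.
    for (current, formatted) in sorted((parseDate(s), s) for s in seq):
        if group and last is not None and not areAdjacent(last, current):
            # A gap was found: emit the finished run and start a new one.
            yield group
            group = []
        group.append(formatted)
        last = current
    if group:
        yield group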
Result:
[['1/2013', '2/2013', '3/2013'],
['7/2013'],
['10/2013', '11/2013', '12/2013', '1/2014', '2/2014'],
['4/2014']]
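If installing dateutil is not an option, the same adjacency test can be sketched with the standard library only. This is not the code from the answer above but a minimal variant under two assumptions: every string has the form 'M/YYYY', and no month appears twice. The names `month_index` and `split_runs` are invented for this sketch.

```python
def month_index(s):
    """Map 'M/YYYY' to a running month count (assumed input format)."""
    month, year = s.split('/')
    return int(year) * 12 + int(month)

def split_runs(seq):
    """Return chronologically sorted lists of consecutive months."""
    runs = []
    previous = None
    for s in sorted(seq, key=month_index):
        current = month_index(s)
        # Start a new run at the first item or whenever there is a gap.
        if previous is None or current != previous + 1:
            runs.append([])
        runs[-1].append(s)
        previous = current
    return runs

print(split_runs(['1/2013', '2/2017']))
# [['1/2013'], ['2/2017']]
```

Because the year is part of the index, January 2013 and February 2017 end up in separate runs, which is the case the criticism above is concerned with.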