
What is the most Pythonic way to sort date sequences?

I have a list of strings, each representing a month in a year (not sorted and not consecutive): ['1/2013', '7/2013', '2/2013', '3/2013', '4/2014', '12/2013', '10/2013', '11/2013', '1/2014', '2/2014']

I'm looking for a Pythonic way to sort them and split the result into runs of consecutive months, like this:

[ ['1/2013', '2/2013', '3/2013'], 
  ['7/2013'], 
  ['10/2013', '11/2013', '12/2013', '1/2014', '2/2014'], 
  ['4/2014'] 
]

Any ideas?

omer bach asked Mar 20 '23 21:03


2 Answers

Based on the example from the docs that shows how to find runs of consecutive numbers using itertools.groupby():

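from itertools import groupby
from pprint import pprint

# Input list from the question.
months = ['1/2013', '7/2013', '2/2013', '3/2013', '4/2014', '12/2013',
          '10/2013', '11/2013', '1/2014', '2/2014']

def month_number(date):
    # Map 'M/YYYY' to one integer so that consecutive months differ by 1.
    month, year = date.split('/')
    return int(year) * 12 + int(month)

# Python 3 has no tuple unpacking in lambda parameters, so index the pair instead.
L = [[date for _, date in run]
     for _, run in groupby(enumerate(sorted(months, key=month_number)),
                           key=lambda pair: pair[0] - month_number(pair[1]))]
pprint(L)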

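The key to the solution is subtracting each month number from its position in the sorted list, as generated by enumerate(). Within a run of consecutive months both values grow by one per step, so the difference is constant and groupby() puts the whole run in the same group.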
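
To make the differencing concrete, here is a minimal sketch that prints the grouping key for each sorted month. The names `ordered` and `keys` and the shortened input list are only for illustration and are not part of the answer's code.

```python
def month_number(date):
    # Same helper as above: 'M/YYYY' -> one integer per month.
    month, year = date.split('/')
    return int(year) * 12 + int(month)

months = ['1/2013', '7/2013', '2/2013', '3/2013']  # shortened sample input
ordered = sorted(months, key=month_number)

# Grouping key: position in the sorted list minus the month number.
keys = [i - month_number(date) for i, date in enumerate(ordered)]

for date, key in zip(ordered, keys):
    print(date, key)
```

The first three months share one key because index and month number advance together; the gap before 7/2013 makes the key change, which is what starts a new group.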

Output

[['1/2013', '2/2013', '3/2013'],
 ['7/2013'],
 ['10/2013', '11/2013', '12/2013', '1/2014', '2/2014'],
 ['4/2014']]
jfs answered Mar 22 '23 11:03


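The groupby examples are cute but dense, and any variant that compares only the month part will break on input such as ['1/2013', '2/2017'], i.e. when there are adjacent months from non-adjacent years. Here is a more explicit version that compares full dates: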
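
A small sketch of that failure mode, assuming a grouping key that looks only at the month. The helpers `runs`, `month_only` and `month_and_year` are hypothetical names made up for this illustration; they are not taken from any answer.

```python
from itertools import groupby

def runs(dates, number):
    # Sort chronologically, then group into runs where `number` grows by exactly 1.
    ordered = sorted(dates, key=lambda d: (int(d.split('/')[1]), int(d.split('/')[0])))
    return [[d for _, d in run]
            for _, run in groupby(enumerate(ordered),
                                  key=lambda pair: pair[0] - number(pair[1]))]

def month_only(date):
    # Hypothetical flawed key: ignores the year entirely.
    return int(date.split('/')[0])

def month_and_year(date):
    # Key that folds the year in, so 1/2013 and 2/2017 end up far apart.
    month, year = date.split('/')
    return int(year) * 12 + int(month)

dates = ['1/2013', '2/2017']
flawed = runs(dates, month_only)       # both dates land in a single run
correct = runs(dates, month_and_year)  # each date gets its own run
```

With the month-only key the two dates are treated as consecutive; a key that includes the year keeps them apart. The loop-based code that follows sidesteps the question by comparing full dates.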

from datetime import datetime
from dateutil.relativedelta import relativedelta

def areAdjacent(old, new):
    return old + relativedelta(months=1) == new

def parseDate(s):
    return datetime.strptime(s, '%m/%Y')

def generateGroups(seq):
    group = []
    last = None
    for (current, formatted) in sorted((parseDate(s), s) for s in seq):
        if group and last is not None and not areAdjacent(last, current):
            yield group
            group = []
        group.append(formatted)
        last = current
    if group:
        yield group

Result of list(generateGroups(dates)), where dates is the list from the question:

[['1/2013', '2/2013', '3/2013'], 
 ['7/2013'],
 ['10/2013', '11/2013', '12/2013', '1/2014', '2/2014'],
 ['4/2014']]
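
If installing python-dateutil is not an option, the same loop can probably be written with plain integer arithmetic. This is only a sketch under that assumption; `month_index` and `generate_groups` are names invented here, not part of the code above.

```python
def month_index(s):
    # 'M/YYYY' -> a single integer; consecutive months differ by exactly 1.
    month, year = s.split('/')
    return int(year) * 12 + int(month)

def generate_groups(seq):
    group = []
    last = None
    for current, formatted in sorted((month_index(s), s) for s in seq):
        # Start a new group when this month does not directly follow the previous one.
        if group and current != last + 1:
            yield group
            group = []
        group.append(formatted)
        last = current
    if group:
        yield group

months = ['1/2013', '7/2013', '2/2013', '3/2013', '4/2014', '12/2013',
          '10/2013', '11/2013', '1/2014', '2/2014']
groups = list(generate_groups(months))
```

For the list from the question, `groups` holds the same four runs as the result shown above.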
moe answered Mar 22 '23 12:03