I have a list of names, e.g. ['Agrajag', 'Colin', 'Deep Thought', ... , 'Zaphod Beeblebrox', 'Zarquon']
. Now I want to partition this list into approximately equally sized sublists, so that the boundaries of the subgroups are at the first letter of the names, e.g A-F, G-L, M-P, Q-Z, not A-Fe, Fi-Mo, Mu-Pra, Pre-Z.
I could only come up with a statically sized parition that doesn't take size of the subgroups into account:
import string, itertools
def _group_by_alphabet_key(elem):
char = elem[0].upper()
i = string.ascii_uppercase.index(char)
if i > 19:
to_c = string.ascii_uppercase[-1];
from_c = string.ascii_uppercase[20]
else:
from_c = string.ascii_uppercase[i/5*5]
to_c = string.ascii_uppercase[i/5*5 + 4]
return "%s - %s" % (from_c, to_c)
subgroups = itertools.groupby(name_list, _group_by_alphabet_key)
Any better ideas?
P.S.: this may sound somewhat like homework, but it actually is for a webpage where members should be displayed in 5-10 tabs of equally sized groups.
Here's something that might work. I feel certain there's a simpler way though... probably involving itertools
. Note that num_pages
only roughly determines how many pages you'll actually get.
EDIT: Whoops! There was a bug -- it was cutting off the last group! The below should be fixed, but note that the length of the last page will be slightly unpredictable. Also, I added .upper()
to account for possible lowercase names.
EDIT2: The previous method of defining letter_groups was inefficient; the below dict-based code is more scalable:
names = ['Agrajag', 'Colin', 'Deep Thought', 'Ford Prefect' , 'Zaphod Beeblebrox', 'Zarquon']
num_pages = 3
def group_names(names, num_pages):
letter_groups = defaultdict(list)
for name in names: letter_groups[name[0].upper()].append(name)
letter_groups = [letter_groups[key] for key in sorted(letter_groups.keys())]
current_group = []
page_groups = []
group_size = len(names) / num_pages
for group in letter_groups:
current_group.extend(group)
if len(current_group) > group_size:
page_groups.append(current_group)
current_group = []
if current_group: page_groups.append(current_group)
return page_groups
print group_names(names, num_pages)
Since your name_list
has to be sorted for groupby
to work, can't you just check every Nth value and build your divisions that way?
right_endpoints = name_list[N-1::N]
And using "A"
as your leftmost endpoint and "Z"
as your rightmost endpoint, you can construct the N divisions accordingly and they should all have the same size.
right_endpoints[0]
.right_endpoints[0]
, the next right endpoint would be right_endpoints[1]
.The issue you may run into is what if two of these right_endpoints
are the same...
edit: example
>>> names = ['Aaron', 'Abel', 'Cain', 'Daniel', 'Darius', 'David', 'Ellen', 'Gary', 'James', 'Jared', 'John', 'Joseph', 'Lawrence', 'Michael', 'Nicholas', 'Terry', 'Victor', 'Zulu']
>>> right_ends, left_ends = names[2::3], names[3::3]
>>> left_ends = ['A'] + left_ends
>>> left_ends, right_ends
>>> ["%s - %s" % (left, right) for left, right in zip(left_ends, right_ends)]
['A - Cain', 'Daniel - David', 'Ellen - James', 'Jared - Joseph', 'Lawrence - Nicholas', 'Terry - Zulu']
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With