How can I sort a list so that I end up with:
['a', 'aa', 'aaa', 'A', 'AA', 'AAA', 'b', 'bb', 'bbb', 'B', 'BB', 'BBB']
Assume a shuffled version of it for convenience:
['bb', 'a', 'B', 'BB', 'AAA', 'BBB', 'b', 'aa', 'aaa', 'A', 'AA', 'bbb']
I tried sorting by ignoring case:
l = sorted(l, key=lambda x: x.lower())
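which gives something like ['a', 'A', 'aa', 'AA', 'aaa', 'AAA'], with the two cases mixed together instead of all lowercase before all uppercase for each letter.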
The answers below give two solutions for the mixed-case situation; I'm not sure which is better.
L = ['ABC1', 'abc1', 'ABC2', 'abc2', 'Abc']
l = sorted(L, key=lambda x: "".join([y.lower() + y.swapcase() for y in x]))
print(l)
l = sorted(L, key=lambda x: [(c.lower(), c.isupper()) for c in x])
print(l)
You can use sorted() with a custom key function:
>>> L = ['bb', 'a', 'B', 'BB', 'AAA', 'BBB', 'b', 'aa', 'aaa', 'A', 'AA', 'bbb']
>>> sorted(L, key=lambda x: (x[0].lower(), x[0].isupper(), len(x)))
['a', 'aa', 'aaa', 'A', 'AA', 'AAA', 'b', 'bb', 'bbb', 'B', 'BB', 'BBB']
This works by comparing, in order: each element's first character lowercased, then whether that first character is uppercase (False sorts before True, so lowercase comes first), and finally the element's length.
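To see the comparison described above in action, it can help to print the key next to each element. This is only a sketch; the helper name first_char_key is mine, not part of the answer, and it assumes no element is an empty string (s[0] would raise IndexError otherwise).

```python
def first_char_key(s):
    # (first letter ignoring case, is the first letter uppercase?, length)
    return (s[0].lower(), s[0].isupper(), len(s))

L = ['bb', 'a', 'B', 'BB', 'AAA', 'BBB', 'b', 'aa', 'aaa', 'A', 'AA', 'bbb']
result = sorted(L, key=first_char_key)

for s in result:
    # Tuples compare element by element, and False < True,
    # so ('a', False, ...) lands ahead of ('a', True, ...).
    print(s, first_char_key(s))
```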
P.S. To also handle mixed-case and mixed-character elements you'd need to compare tuples for individual characters, e.g.:
>>> L = ['ab', 'aA', 'bb', 'a', 'B', 'BB', 'b', 'aa', 'A', 'AA']
>>> sorted(L, key=lambda x: [(c.lower(), c.isupper()) for c in x])
['a', 'aa', 'aA', 'ab', 'A', 'AA', 'b', 'bb', 'B', 'BB']
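A quick way to see why the per-character key is needed for such input is to run both keys on the same list. The helper names below are mine and purely illustrative. With the first-character key, 'ab', 'aA' and 'aa' all get the same key, so they simply keep their input order.

```python
def first_char_key(s):
    # Only looks at the first character and the length.
    return (s[0].lower(), s[0].isupper(), len(s))

def per_char_key(s):
    # One (letter, is_upper) tuple per character.
    return [(c.lower(), c.isupper()) for c in s]

L = ['ab', 'aA', 'bb', 'a', 'B', 'BB', 'b', 'aa', 'A', 'AA']

by_first = sorted(L, key=first_char_key)
by_char = sorted(L, key=per_char_key)

print(by_first)  # ['a', 'ab', 'aA', 'aa', 'A', 'AA', 'b', 'bb', 'B', 'BB']
print(by_char)   # ['a', 'aa', 'aA', 'ab', 'A', 'AA', 'b', 'bb', 'B', 'BB']
```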
TLDR
result = sorted(lst, key=lambda s: [(c.lower(), c.isupper()) for c in s])
You can transform each string to a list of tuples, one per character. A tuple for a character c takes the form (c.lower(), c.isupper()). The usual list comparison gives your desired sort.
lst = ["a", "aa", "aaa", "A", "AA", "AAA", "b", "bb", "bbb", "B", "BB", "BBB"]
lsts = [[(c.lower(), c.isupper()) for c in s] for s in lst]
# [[('a', False)],
# [('a', False), ('a', False)],
# [('a', False), ('a', False), ('a', False)],
# [('a', True)],
# [('a', True), ('a', True)],
# [('a', True), ('a', True), ('a', True)],
# [('b', False)],
# [('b', False), ('b', False)],
# [('b', False), ('b', False), ('b', False)],
# [('b', True)],
# [('b', True), ('b', True)],
# [('b', True), ('b', True), ('b', True)]]
res = ["".join(c.upper() if u else c for c, u in ls) for ls in lsts]
Recovering the result:
['a', 'aa', 'aaa', 'A', 'AA', 'AAA', 'b', 'bb', 'bbb', 'B', 'BB', 'BBB']
Note that there are many distinct ways to order mixed-case elements that are consistent with the OP's original example. This approach is the only reasonable sort I can think of that arises from an anti-symmetric order relation. In particular, this sort admits no equivalent elements that are not equal.
For example, ['aAa', 'aaA'] and ['aaA', 'aAa'] will lead to the same output of ['aaA', 'aAa'].
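A small check of that claim, as a sketch (case_key is just my name for the key above, and I am only considering plain ASCII strings here). With a purely case-insensitive key the two strings tie, so Python's stable sort keeps whatever order they came in; with the per-character key they never tie, so the input order does not matter.

```python
def case_key(s):
    return [(c.lower(), c.isupper()) for c in s]

a = ['aAa', 'aaA']
b = ['aaA', 'aAa']

# Case-insensitive key: both strings map to 'aaa', so input order survives.
print(sorted(a, key=str.lower))  # ['aAa', 'aaA']
print(sorted(b, key=str.lower))  # ['aaA', 'aAa']

# Per-character key: the keys differ, so both inputs give the same output.
print(sorted(a, key=case_key))   # ['aaA', 'aAa']
print(sorted(b, key=case_key))   # ['aaA', 'aAa']
```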
Short answer:
sorted(l, key=lambda x: "".join([y.lower() + y.swapcase() for y in x]))
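Each word is transformed by doubling each letter: the first copy is the lowercase version of the letter and the second copy is the case-swapped version. The second copy is swapped so that lowercase sorts before uppercase: 'a' becomes 'aA' and 'A' becomes 'aa', and 'aA' < 'aa' because uppercase letters have lower code points than lowercase ones.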
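Here is a small sketch that prints the doubled keys and checks the result against the list from the question. The names doubled_key and tuple_key are mine. For plain ASCII input like this, the key gives the same order as the list-of-tuples key from the other answers; I have not checked how the two compare on arbitrary Unicode text.

```python
def doubled_key(s):
    # 'a' -> 'aA', 'A' -> 'aa', 'Ab' -> 'aabB'
    return "".join(c.lower() + c.swapcase() for c in s)

def tuple_key(s):
    # The list-of-tuples key, for comparison.
    return [(c.lower(), c.isupper()) for c in s]

for s in ['a', 'A', 'ab', 'Ab']:
    print(repr(s), '->', repr(doubled_key(s)))

l = ['bb', 'a', 'B', 'BB', 'AAA', 'BBB', 'b', 'aa', 'aaa', 'A', 'AA', 'bbb']
by_doubled = sorted(l, key=doubled_key)
by_tuple = sorted(l, key=tuple_key)
print(by_doubled)
print(by_doubled == by_tuple)
```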