Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast way of crossing strings in a list

If have a list like so:

shops=['A','B','C','D']

And would like to create the following new lists (I cross each element with every other and create a string where first part is alphanumerically before the second):

['A-B', 'A-C', 'A-D']

['A-B', 'B-C', 'B-D']

['A-C', 'B-C', 'C-D']

['A-D', 'B-D', 'C-D']

I have something like this:

for a in shops:
    cons = []
    for b in shops:
        if a!=b:
            con = [a,b]
            con = sorted(con, key=lambda x: float(x))
            cons.append(con[0]+'-'+con[1])
    print(cons)

However, this is pretty slow for large lists (e.g. 1000 where I have 1000*999*0.5 outputs). I was looking for a more efficient way of doing this?

I could have used an if-else clause for the sort e.g.

for a in shops:
    cons = []
    for b in shops:
        if a<b:
            cons.append(a+"-"+b)
        elif a>b:
            cons.append(b+"-"+a)
    print(cons)

Which, I haven't timed yet - however I thought the main slow-down was the double for-loop

like image 273
mptevsion Avatar asked Feb 02 '16 15:02

mptevsion


2 Answers

You can create a nested list-comprehension with some additional checks:

>>> shops=['A','B','C','D']
>>> [["-".join((min(a,b), max(a,b))) for b in shops if b != a] for a in shops]
[['A-B', 'A-C', 'A-D'],
 ['A-B', 'B-C', 'B-D'],
 ['A-C', 'B-C', 'C-D'],
 ['A-D', 'B-D', 'C-D']]

Note that this will probably not be much faster than your code, as you still have to generate all those combinations. In practice, you could make it a generator expression, so the elements are not generated all at once but only "as needed":

gen = (["-".join((min(a,b), max(a,b))) for b in shops if b != a] for a in shops)
for item in gen:
    print(item)

Update: I did some timing analysis using IPython's %timeit. Turns out your second implementation is the fastest. Tested with a list of 100 strings (map(str, range(100))) and after turning each of the methods into generators.

In [32]: %timeit list(test.f1())         # your first implementation
100 loops, best of 3: 13.5 ms per loop

In [33]: %timeit list(test.f2())         # your second implementation
1000 loops, best of 3: 1.63 ms per loop

In [34]: %timeit list(test.g())          # my implementation
100 loops, best of 3: 3.49 ms per loop

You can speed it up by using a simple if/else instead of min/max, as in your 2nd implementation, then they are about equally fast.

(["-".join((a,b) if a < b else (b,a)) for b in shops if b != a] for a in shops)
like image 130
tobias_k Avatar answered Sep 23 '22 14:09

tobias_k


If the list is sorted and there are no duplicates, you can keep track of your position in the list to avoid having to do comparisons to get the order.

from itertools import chain, islice

combos = []
for i, s in enumerate(shops):
    combo = ['{0}-{1}'.format(a, b) for a, b in chain(
        ((c, s) for c in islice(shops, None, i),
        ((s, c) for c in islice(shops, i+1))]
    combos.append(combo)

EDIT: updated to use generators

like image 37
Brendan Abel Avatar answered Sep 23 '22 14:09

Brendan Abel