 

Efficient way to create strings from a list

I have some very inefficient code I'd like to make more general/efficient. I am trying to create strings from a set of lists.

Here is what I currently have:

#contains categories
numind = [('Length',), ('Fungus',)] 

#contains values that pertain to the categories
records = [('Length', 'Long'), ('Length', 'Med'), ('Fungus', 'Yes'), ('Fungus', 'No')] 

#contains every combination of values between the 2 categories. 
#for example, (Long, Yes) = Length=Long & Fungus = Yes.
combinations = [('Long', 'Yes'), ('Long', 'No'), ('Med', 'Yes'), ('Med', 'No')] 

Now I want to create one string for every entry in my combinations list. This is the inefficient part: I'd like to avoid hard-coding the length of the "numind" list. Any ideas?

values = combinations
valuestring = []

if len(numind) == 0:
   pass
elif len(numind) == 1:
   for a in xrange(len(values)):
      valuestring.append(numind[0][0]+values[a][0]) 

elif len(numind) == 2:
   for a in xrange(len(values)):
      valuestring.append(numind[0][0]+values[a][0]+'_'+numind[1][0]+values[a][1]) 

#and so forth until numind is 10+

Output:

['LengthLong_FungusYes', 'LengthLong_FungusNo', 'LengthMed_FungusYes', 'LengthMed_FungusNo']
nlr25 asked Oct 04 '22 03:10



1 Answer

I'd use itertools.product with collections.OrderedDict (the latter isn't strictly necessary, but means that you get the order right without having to think about it):

>>> from collections import OrderedDict
>>> from itertools import product
>>> 
>>> d = OrderedDict()
>>> for k, v in records:
...     d.setdefault(k, []).append(v)
...     
>>> d
OrderedDict([('Length', ['Long', 'Med']), ('Fungus', ['Yes', 'No'])])
>>> ['_'.join(k+v for k,v in zip(d, combo)) for combo in product(*d.values())]
['LengthLong_FungusYes', 'LengthLong_FungusNo', 'LengthMed_FungusYes', 'LengthMed_FungusNo']

itertools.product naturally produces the "every combination" part (which is really called the Cartesian product, not a combination):

>>> product(["Long", "Med"], ["Yes", "No"])
<itertools.product object at 0x96b0dec>
>>> list(product(["Long", "Med"], ["Yes", "No"]))
[('Long', 'Yes'), ('Long', 'No'), ('Med', 'Yes'), ('Med', 'No')]
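To see that the number of inputs isn't fixed, here is a small sketch with a third, made-up category (the "Red"/"Blue" values are hypothetical and not from the question). product accepts any number of iterables, and the result holds one tuple per element of the Cartesian product, so its length is the product of the input lengths:

```python
from itertools import product

# Three value lists; the third ("Red"/"Blue") is a hypothetical extra category.
value_lists = [["Long", "Med"], ["Yes", "No"], ["Red", "Blue"]]

# Unpack the outer list so each inner list becomes one argument to product().
triples = list(product(*value_lists))

print(len(triples))  # 2 * 2 * 2 = 8
print(triples[0])    # ('Long', 'Yes', 'Red')
print(triples[-1])   # ('Med', 'No', 'Blue')
```

The rightmost input varies fastest, which is why the ordering matches the nested-loop order in the question's combinations list.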

The advantage here is that it doesn't matter how many categories there are or how many values there are associated with any category: as long as they're specified in records, it should work.
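If you want this as a reusable function rather than a REPL session, something along these lines should do. This is only a sketch: the name build_value_strings and the sep parameter are my own invention, and it assumes Python 3.7+, where a plain dict keeps insertion order, so OrderedDict can be dropped:

```python
from itertools import product


def build_value_strings(records, sep="_"):
    """Return one 'CategoryValue<sep>CategoryValue...' string per combination."""
    # Group values by category. Plain dicts keep insertion order on
    # Python 3.7+, so the categories come out in first-seen order.
    grouped = {}
    for category, value in records:
        grouped.setdefault(category, []).append(value)

    # Pair each category name with the value chosen for it in this combination.
    return [
        sep.join(category + value for category, value in zip(grouped, combo))
        for combo in product(*grouped.values())
    ]


records = [("Length", "Long"), ("Length", "Med"), ("Fungus", "Yes"), ("Fungus", "No")]
print(build_value_strings(records))
# ['LengthLong_FungusYes', 'LengthLong_FungusNo', 'LengthMed_FungusYes', 'LengthMed_FungusNo']
```

One edge case to be aware of: with an empty records list this returns [''], because product() called with no arguments yields a single empty tuple. Add an explicit check if you'd rather get an empty list back.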

DSM answered Oct 07 '22 18:10
