After searching a lot without success I need help.
I have a list of lists of tuples. Each inner list represents one formula in my system. Each element of an inner list is a tuple holding the type of the element (variable, parameter, constant, operation...) and the name of the element. For example, for the formulas x1+x2+A1, x1-x3 and sin(x2)+A1 we have:
[
[('VAR', 'x1'), ('PLUS', '+'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')],
[('VAR', 'x1'), ('LESS', '-'), ('VAR', 'x3')],
[('SIN', 'sin'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')]
]
I'm trying to determine in which formulas each variable appears. In the example above, x1 appears in formulas 1 and 2, x2 in formulas 1 and 3, and x3 in formula 2, so my output should be something like:
[
['x1', 1, 2],
['x2', 1, 3],
['x3', 2],
]
At the moment I have very inefficient code that doesn't work at all, but here it is:
cont = 0
for subL1 in L:
    for subL2 in L:
        if len(subL1) != 1 and len(subL2) != 1:
            if subL1 != subL2 and subL2:
                for x,y in subL1:
                    for z,t in subL2:
                        if ( x == 'VAR'
                             and z == 'VAR'
                             and y == t
                           ):
                            print "Variable", y , "repeated"
        else:
            print "list with 1 lenght\n"
    subL1.pop(0)
    cont = cont + 1
In most programming languages, you access an item in a nested data type (such as arrays, lists, or tuples) by chaining index brackets until you reach the innermost item. For a list of tuples, the first bracket selects the tuple in the list and the second bracket selects the item within that tuple.
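As a small illustration of chained indexing on the data from the question: since it is a list of lists of tuples, three brackets are needed to reach a name. The variable name `formulas` is chosen here only for illustration.

```python
# `formulas` is an illustrative name; the data is the example from the question.
formulas = [
    [('VAR', 'x1'), ('PLUS', '+'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')],
    [('VAR', 'x1'), ('LESS', '-'), ('VAR', 'x3')],
    [('SIN', 'sin'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')],
]

first_formula = formulas[0]     # first bracket: the whole first formula
third_token = formulas[0][2]    # second bracket: one (type, name) tuple
token_name = formulas[0][2][1]  # third bracket: the name inside that tuple
print(third_token, token_name)
```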
A first approach for finding repeated tuples in a flat list is to iterate over each tuple and check its count in the list using count(); if the count is greater than one, add the tuple to the result. To avoid adding the same tuple several times, convert the result to a set using set().
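A minimal sketch of that count()-based idea, assuming a single flat list of tuples (the list `tokens` below is made up for illustration). Note that it only reports which tuples repeat within one list, not which formulas they come from, and that count() rescans the list for every element, so the cost grows quadratically.

```python
# `tokens` is a made-up flat list used only to illustrate the technique.
tokens = [('VAR', 'x1'), ('PLUS', '+'), ('VAR', 'x2'), ('PLUS', '+'), ('VAR', 'x1')]

# Keep every tuple that occurs more than once; the set drops repeated additions.
duplicates = set(t for t in tokens if tokens.count(t) > 1)
print(sorted(duplicates))
```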
In CPython, a tuple is immutable, so its items are stored in a single fixed-size allocation, whereas a list keeps its items in a separate, resizable array that may be over-allocated to make appends cheap. As a result, a tuple usually takes slightly less memory than a list holding the same elements, and creating one can be marginally faster. The difference is usually small.
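One rough way to check the memory claim is sys.getsizeof, which reports the size of the container itself (not of the objects it refers to). The exact numbers depend on the interpreter version and platform, so the sketch below only compares the two values and assumes CPython.

```python
import sys

# Same four elements stored once as a tuple and once as a list.
items_as_tuple = ('VAR', 'x1', 'PLUS', '+')
items_as_list = list(items_as_tuple)

tuple_size = sys.getsizeof(items_as_tuple)
list_size = sys.getsizeof(items_as_list)
print(tuple_size, list_size)  # on CPython the tuple is expected to be the smaller one
```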
You could use a collections.defaultdict to store the formulas (actually the indices inside your list of lists) for each variable:
from collections import defaultdict

dd = defaultdict(set)                  # use a set as factory so we don't keep duplicates
for idx, subl in enumerate(l, 1):      # iterate over the sublists with index starting at 1
    for subt in subl:                  # iterate over each tuple in each sublist
        label, val = subt              # unpack the tuple
        if label == 'VAR':             # if it's a VAR save the index in the defaultdict
            dd[val].add(idx)
For example with:
l = [[('VAR', 'x1'), ('PLUS', '+'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')],
[('VAR', 'x1'), ('LESS', '-'), ('VAR', 'x3')],
[('SIN', 'sin'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')]
]
It gives:
print(dd)
# defaultdict(set, {'x1': {1, 2}, 'x2': {1, 3}, 'x3': {2}})
To get your desired output you only need to convert that to a list again, for example (Python 3.5+ only, because of the * unpacking inside the list):
>>> [[name, *sorted(formulas)] for name, formulas in sorted(dd.items())]
[['x1', 1, 2], ['x2', 1, 3], ['x3', 2]]
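If the * unpacking is not available (the question's code uses the Python 2 print statement), list concatenation gives the same result. This is a self-contained sketch of the same defaultdict approach; the names `formulas`, `tokens` and `result` are chosen here for illustration.

```python
from collections import defaultdict

# Example data from the question; `formulas` is an illustrative name.
formulas = [
    [('VAR', 'x1'), ('PLUS', '+'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')],
    [('VAR', 'x1'), ('LESS', '-'), ('VAR', 'x3')],
    [('SIN', 'sin'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')],
]

dd = defaultdict(set)
for idx, tokens in enumerate(formulas, 1):  # formula numbers start at 1
    for label, name in tokens:
        if label == 'VAR':
            dd[name].add(idx)

# [name] + sorted(...) works on Python 2 and 3 alike.
result = [[name] + sorted(indices) for name, indices in sorted(dd.items())]
print(result)
```

Sorting by name is lexicographic, so a variable called 'x10' would be listed before 'x2'; a custom sort key would be needed if numeric order matters.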
An equivalent approach (note that the formula indices here are zero-based):
import collections

formula = [
    [('VAR', 'x1'), ('PLUS', '+'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')],
    [('VAR', 'x1'), ('LESS', '-'), ('VAR', 'x3')],
    [('SIN', 'sin'), ('VAR', 'x2'), ('PLUS', '+'), ('PAR', 'A1')]
]
# map each variable name to the set of formula indices it appears in
variables = collections.defaultdict(set)
for line_no, line in enumerate(formula):   # line_no starts at 0
    for typ, value in line:                # unpack (type, name)
        if typ == 'VAR':
            variables[value].add(line_no)
variables
defaultdict(set, {'x1': {0, 1}, 'x2': {0, 2}, 'x3': {1}})