Python list comprehension with if else conditions

I have a small (<100) list of chemical names called detected_chems.

I also have a much larger (>1000) dictionary, chem_db, with chemical names as the keys and a dictionary of chemical properties as each value. Like this:

{'chemicalx':{'property1':'smells','property2':'poisonous'},
 'chemicaly':{'property1':'stinks','property2':'toxic'}}

I am trying to match all the detected chemicals with those in the database and pull their properties.

I have studied these questions/answers but can't seem to apply them to my case (sorry):

  • Is it possible to use 'else' in a list comprehension?
  • if/else in a list comprehension?
  • if/else in a list comprehension?
  • Python Nested List Comprehension with If Else

So I am making a list of results, res, but instead of nested for loops with an if x in condition, I've created this:

res = [{chem:chem_db[chem]}
       for det_chem in detected_chems
       for chem in chem_db.keys()
       if det_chem in chem]

This works to an extent!

What I think I am doing here is creating a list of dictionaries, each holding one key:value pair: the chemical name as the key and the information about that chemical (itself a dictionary) as the value, included only if the detected chemical is found somewhere in the chemical database (chem_db).

The problem is not all the detected chemicals are found in the database. This is probably because of misspelling or name variation (e.g. they include numbers) or something similar.

So to solve the problem I need to identify which detected chemicals are not being matched. I thought this might be a solution:

not_matched=[]
res = [{chem:chem_db[chem]}
       for det_chem in detected_chems
       for chem in chem_db.keys()
       if det_chem in chem else not_matched.append(det_chem)]

I am getting a syntax error, due to the else not_matched.append(det_chem) part.

I have two questions:

1) Where should I put the else condition to avoid the syntax error?

2) Can the not_matched list be built within the list comprehension, so I don't have to create that empty list first?

res = [{chem:chem_db[chem]}
       for det_chem in detected_chems
       for chem in chem_db.keys()
       if det_chem in chem else print(det_chem)]

What I'd like to achieve is something like:

in: len(detected_chems)
out: 20
in: len(res)
out: 18
in: len(not_matched)
out: 2

in: print(not_matched)
out: ['chemical_strange_character$$','chemical___WeirdSPELLING']

That would help me troubleshoot the matching.

Westworld asked Feb 23 '26 12:02


2 Answers

You should use or instead of else:

if det_chem in chem or not_matched.append(det_chem)

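That said, if you clean up a bit as per the comments, I think there is a much more efficient way of doing what you want. The explanation of the above is that append returns None, so whenever det_chem in chem is False the whole if-condition evaluates to False and the item is left out of res, yet it is still appended to the not_matched list. Be aware that inside the nested loop this appends det_chem once for every database key it fails to match, so not_matched will contain many repeats, including chemicals that did match some other key.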
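
A minimal sketch of that short-circuit behaviour, using made-up values (values, allowed and kept are illustrative names, not part of the question's data):

```python
# Illustrative data only; not the asker's real database.
not_matched = []
values = [1, 2, 3, 4]
allowed = {2, 4}

# `v in allowed` is evaluated first. Only when it is False does Python
# evaluate the right-hand side, which appends v and returns None (falsy),
# so v is filtered out of `kept` but recorded in `not_matched`.
kept = [v for v in values if v in allowed or not_matched.append(v)]

print(kept)         # [2, 4]
print(not_matched)  # [1, 3]
```

Relying on a side effect inside a condition is compact but easy to misread, so a comment next to it is probably worthwhile.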

Re: efficiency:

not_matched = []  # must exist before the comprehension runs
res = [{det_chem:chem_db[det_chem]}
       for det_chem in detected_chems
       if det_chem in chem_db or not_matched.append(det_chem)]

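This should be drastically faster. Looping over the dictionary keys is an O(n) operation for every detected chemical, whereas dictionaries are used precisely because lookup is O(1) on average. So instead of retrieving the keys and comparing them one by one, we use the hash-based det_chem in chem_db lookup. Note that this tests for an exact key match, not the substring match (det_chem in chem) used in the question.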
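
If the substring matching from the question really is needed (an assumption; the lookup above only finds exact keys), one possible sketch is a plain loop that records each unmatched name exactly once. The sample data below is made up for illustration, and hits is a hypothetical helper name.

```python
# Made-up sample data for illustration.
chem_db = {
    "chemicalx_v2": {"property1": "smells", "property2": "poisonous"},
    "chemicaly": {"property1": "stinks", "property2": "toxic"},
}
detected_chems = ["chemicalx", "chemicaly", "chemical_strange_character$$"]

res = []
not_matched = []
for det_chem in detected_chems:
    # All database entries whose key contains the detected name as a substring.
    hits = [{chem: props} for chem, props in chem_db.items() if det_chem in chem]
    if hits:
        res.extend(hits)
    else:
        # Recorded once per detected chemical, not once per database key.
        not_matched.append(det_chem)

print(len(res))     # 2
print(not_matched)  # ['chemical_strange_character$$']
```

This keeps the O(n) scan over the keys for each detected chemical, so it trades the speed of the hash lookup for the looser matching.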

Bonus: dict comprehension (to address question 2)

I am not sure why a list of one-key dicts is built; what is probably needed is a dict comprehension, as in:

chem_db = {1: 2, 4: 5}
detected_chems = [1, 3]
not_matched = []
res = {det_chem: chem_db[det_chem] for det_chem in detected_chems if
       det_chem in chem_db or not_matched.append(det_chem)}
# output
print(res) # {1: 2}
print(not_matched) # [3]

I can't think of a way to build the not_matched list, without creating it first, while also building res in a single list/dict comprehension.
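
One alternative, sketched here with the same kind of toy data, is to use two separate comprehensions, so that no empty list has to be created beforehand. It walks detected_chems twice, which is likely negligible for a list of under 100 names.

```python
# Toy data for illustration.
chem_db = {1: 2, 4: 5}
detected_chems = [1, 3]

# One pass for the matches, one pass for the misses; no side effects needed.
res = {c: chem_db[c] for c in detected_chems if c in chem_db}
not_matched = [c for c in detected_chems if c not in chem_db]

print(res)          # {1: 2}
print(not_matched)  # [3]
```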

Mr_and_Mrs_D answered Feb 25 '26 09:02


A list comprehension formally consists of up to three parts. Let's show them in an example:

[2 * i          for i in range(10)         if i % 3 == 0]

  1. The first part is an expression, which may be, or may contain, the ternary operator (x if y else z).

  2. The second part is one or more for clauses (several in the case of nested loops), each drawing values for a variable from an iterable.

  3. The third part (optional) is a filter that selects among the values from part 2, and an else clause is not allowed here!

So if you want to use an else branch, you have to put it into the first part, for example:

[2 * i  if i < 5  else 3 * i           for i in range(10)          if i % 3 == 0]
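
Applied to the question, a sketch along these lines puts the if/else into the expression part, so every detected chemical yields an entry and the misses can be separated afterwards. The toy data, the pairs name and the None placeholder are assumptions made for illustration.

```python
# Toy data for illustration.
chem_db = {
    "chemicalx": {"property1": "smells"},
    "chemicaly": {"property1": "stinks"},
}
detected_chems = ["chemicalx", "chemical___WeirdSPELLING"]

# if/else in the expression part: one (name, properties-or-None) pair per item.
pairs = [(c, chem_db[c] if c in chem_db else None) for c in detected_chems]

# Split the pairs into matches and misses.
res = [{c: props} for c, props in pairs if props is not None]
not_matched = [c for c, props in pairs if props is None]

print(res)          # [{'chemicalx': {'property1': 'smells'}}]
print(not_matched)  # ['chemical___WeirdSPELLING']
```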

MarianD answered Feb 25 '26 08:02


