I have two lists:
main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']
I want to count how many strings in main_list contain a string from master_list, without counting the same item twice.
Example: for the two lists above, the result of my function should be 4. 'Smith' is found 3 times in main_list. 'Roger' is found 2 times, but since 'Roger-Smith' was already counted for 'Smith', it doesn't count again, so 'Roger' counts only once, which makes 4 in total.
The function I have written so far is below, but I think there is a faster way to do it:
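def string_detection(master_list, main_list):
    remaining = list(main_list)  # work on a copy so the caller's list is not mutated
    count = 0
    for substring in master_list:
        # iterate over a snapshot, because matches are removed from remaining
        for string in list(remaining):
            if substring in string:
                remaining.remove(string)  # a matched item cannot be counted again
                count += 1
    return count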
A one-liner:
>>> sum(any(m in L for m in master_list) for L in main_list)
4
This iterates over main_list and checks whether any of the values from master_list occurs in each string, which yields one bool per string. any() stops at the first match, so each string adds at most one to the count. Conveniently, sum counts all the True values (True equals 1) to give you the total.
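For readers who find the one-liner dense, here is the same logic written as explicit loops. This is only a sketch; the function name count_matches is made up for illustration and does not come from the answer above.

```python
def count_matches(master_list, main_list):
    """Count items of main_list that contain at least one name from master_list."""
    count = 0
    for item in main_list:
        for name in master_list:
            if name in item:
                count += 1
                break  # stop at the first hit, as any() does, so each item counts once
    return count


main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']
result = count_matches(master_list, main_list)
print(result)
```

The break is what keeps 'Roger-Smith' from being counted twice.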
You can use pandas (which provides fast vectorized operations) with str.contains and sum():
import pandas as pd
main_list = pd.Series(['Smith', 'Smith', 'Roger', 'Roger-Smith', '42'])
master_list = ['Smith', 'Roger']
count = main_list.str.contains('|'.join(master_list)).sum()
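One thing to keep in mind: str.contains treats its pattern as a regular expression by default, so a name containing characters such as '.' or '+' would not be matched literally after a plain '|'.join. Below is a sketch of the same alternation idea using only the standard re module, with each name escaped first. The helper name count_with_regex is hypothetical, and the early return for an empty master_list is my own assumption about the desired behaviour.

```python
import re


def count_with_regex(master_list, main_list):
    if not master_list:
        return 0  # an empty alternation would otherwise match every string
    # Escape each name so characters like '.' or '+' are matched literally
    pattern = re.compile('|'.join(re.escape(name) for name in master_list))
    return sum(1 for item in main_list if pattern.search(item))


main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']
regex_count = count_with_regex(master_list, main_list)
print(regex_count)
```

The same escaped pattern could be passed to str.contains in the pandas version.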
You can do it the other way around. Create a list that contains only the elements from main_list that have a substring from master_list:
temp_list = [string for string in main_list if any(substring in string for substring in master_list)]
Now temp_list looks like this:
['Smith', 'Smith', 'Roger', 'Roger-Smith']
So the length of temp_list is your answer.
What about this:
main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42']
master_list = ['Smith', 'Roger']
print(len([word for word in main_list if any(mw in word for mw in master_list)]))
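Since the question asks about speed, here is a rough way to compare the approaches yourself with timeit. It is only a sketch: the function names with_remove and with_any, the list size (the sample data repeated 200 times) and the repeat count are arbitrary choices of mine, and the timings will vary by machine, so no numbers are shown.

```python
import timeit

# Sample data repeated to make the timing measurable (size is an arbitrary choice)
main_list = ['Smith', 'Smith', 'Roger', 'Roger-Smith', '42'] * 200
master_list = ['Smith', 'Roger']


def with_remove(master, main):
    # The remove-based approach from the question, run on a copy
    remaining = list(main)
    count = 0
    for substring in master:
        for string in list(remaining):
            if substring in string:
                remaining.remove(string)
                count += 1
    return count


def with_any(master, main):
    # The any()/sum() approach from the answers
    return sum(any(m in s for m in master) for s in main)


# Both approaches should agree before their speed is compared
assert with_remove(master_list, main_list) == with_any(master_list, main_list)

for func in (with_remove, with_any):
    seconds = timeit.timeit(lambda: func(master_list, main_list), number=5)
    print(func.__name__, round(seconds, 4))
```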