
Pythonic way to count occurrences from a list in a string

What's the best way to count the occurrences of strings from a list in a target string? Specifically, I have a list:

string_list = [
    "foo",
    "bar",
    "baz"
]

target_string = "foo bar baz bar"

# Trying to write this function!
count = occurrence_counter(target_string) # should return 4

I'd like to minimize running time and memory usage, if that makes a difference. In terms of size, I expect that string_list may end up containing several hundred substrings.

Kevin Bedell asked May 10 '17 13:05


3 Answers

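Another way, using collections.Counter:

from collections import Counter
# Count every space-separated token in the target string once
word_counts = Counter(target_string.split(' '))
# Add up the counts of only the tokens that appear in string_list
total = sum(word_counts.get(w, 0) for w in string_list)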
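As a self-contained illustration of the Counter approach, here is a minimal sketch that also keeps a per-word breakdown. The name per_word is introduced here for illustration only, and the sample data is taken from the question.

```python
from collections import Counter

string_list = ["foo", "bar", "baz"]
target_string = "foo bar baz bar"

# Count every whitespace-separated token in a single pass
word_counts = Counter(target_string.split())

# Indexing a Counter with a missing key returns 0, so no default is needed
per_word = {w: word_counts[w] for w in string_list}
total = sum(per_word.values())

print(per_word)  # {'foo': 1, 'bar': 2, 'baz': 1}
print(total)     # 4
```

The Counter holds one entry per distinct token in the target, so for a very long target with many distinct tokens it may use more memory than a running sum.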
Aamir Rind answered Oct 25 '22 15:10


This works!

def occurrence_counter(target_string):
    return sum(map(lambda x: x in string_list, target_string.split(' ')))

The string is split into tokens on single spaces, and each token is mapped to True if it is in string_list and False otherwise. Because Python treats True as 1 and False as 0, sum adds up the number of matching tokens.
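To make the boolean arithmetic concrete, here is a small sketch of the intermediate values. The names tokens and flags, and the sample token "qux", are only for illustration.

```python
string_list = ["foo", "bar", "baz"]
tokens = "foo bar qux bar".split(" ")

# Each membership test yields a bool; bool is a subclass of int in Python
flags = [token in string_list for token in tokens]

print(flags)       # [True, True, False, True]
print(sum(flags))  # 3
```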

EDIT: also:

def occurrence_counter(target_string):
    return len(list(filter(lambda x: x in string_list, target_string.split(' '))))
gioaudino answered Oct 25 '22 15:10


This should work in Python 3:

string_list = [
    "foo",
    "bar",
    "baz"
]

set_of_counted_word = set(string_list)

def occurrence_counter(target_str, words_to_count=set_of_counted_word):
    return sum(1 for word in target_str.strip().split()
               if word in words_to_count)


for target_string in ("foo bar baz bar", " bip foo bap foo dib baz   "):
    print("Input: %r -> Count: %i" % (target_string, occurrence_counter(target_string)))

Output:

Input: 'foo bar baz bar' -> Count: 4
Input: ' bip foo bap foo dib baz   ' -> Count: 3
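Note that splitting on whitespace counts whole tokens only, so an entry embedded in a longer word or followed by punctuation is not counted. If true substring matches are wanted, one possible alternative (not taken from the answers here) is a single compiled regular expression. The function name count_substring_occurrences is hypothetical, and the sketch assumes non-overlapping matches are acceptable.

```python
import re

def count_substring_occurrences(target, substrings):
    """Count non-overlapping occurrences of any of the substrings in target."""
    # Drop empty strings and duplicates; try longer entries first so that
    # an entry is not shadowed by one of its own prefixes
    ordered = sorted({s for s in substrings if s}, key=len, reverse=True)
    if not ordered:
        return 0
    # re.escape keeps characters such as '.' or '*' literal
    pattern = re.compile("|".join(re.escape(s) for s in ordered))
    # finditer yields matches lazily, so no list of matches is built
    return sum(1 for _ in pattern.finditer(target))

print(count_substring_occurrences("foo bar baz bar", ["foo", "bar", "baz"]))  # 4
print(count_substring_occurrences("foobar, baz!", ["foo", "bar", "baz"]))     # 3
```

With several hundred entries the pattern is compiled once and can be reused across many target strings; whether it beats the set-based version depends on the data, so it is worth timing both.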
Paddy3118 answered Oct 25 '22 16:10