
python: find first string in string

Given a string and a list of substrings, I want to find the first position at which any substring occurs in the string. If no substring occurs, return 0. I want to ignore case.

Is there something more pythonic than:

given = 'Iamfoothegreat'
targets = ['foo', 'bar', 'grea', 'other']
res = len(given)
for t in targets:
    i = given.lower().find(t)
    if i > -1 and i < res:
        res = i

if res == len(given):
    result = 0
else:
    result = res

That code works, but seems inefficient.

asked Mar 04 '16 17:03 by foosion



1 Answer

Use regex

One option is to use a regex, because I expected Python's regex implementation to be very fast. My regex version is:

import re

given = 'IamFoothegreat'
targets = ['foo', 'bar', 'grea', 'other']

# Escape each target so regex metacharacters are matched literally
pattern = "|".join(re.escape(x) for x in targets)
# finditer yields matches left to right, so the first one is the earliest
firstMatch = next(re.finditer(pattern, given, re.IGNORECASE), None)
if firstMatch:
    print(firstMatch.start())
    print(firstMatch.group())

Output is

3
Foo

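If nothing matches, firstMatch is None and nothing is printed, so the no-match case is easy to check for. Note that group() returns the text as it appears in the given string, not the target as you spelled it.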
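The question asks for a single position, with 0 when nothing matches. A minimal sketch of the regex approach wrapped that way could look like the following; the function name `first_position` is made up for illustration, and the guard for an empty target list is an added assumption (an empty pattern would otherwise match at index 0).

```python
import re

def first_position(given, targets):
    """Lowest index at which any target occurs in given, ignoring case; 0 if none."""
    if not targets:
        # An empty alternation would match the empty string at index 0
        return 0
    pattern = re.compile("|".join(re.escape(t) for t in targets), re.IGNORECASE)
    match = pattern.search(given)  # search() returns the leftmost match or None
    return match.start() if match else 0

print(first_position('IamFoothegreat', ['foo', 'bar', 'grea', 'other']))
print(first_position('IamFoothegreat', ['xyz']))
```

Keep in mind that returning 0 for "not found" cannot be told apart from a real match at index 0, so returning -1 or None may be safer if that case matters.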

Plain loop, not especially Pythonic

This also gives you the matched string:

given = 'Iamfoothegreat'.lower()
targets = ['foo', 'bar', 'grea', 'other']

# Track the lowest position found so far and the target found there
dct = {'pos': -1, 'string': None}

for t in targets:
    i = given.find(t)
    # Keep this hit if it is the first one or earlier than the current best
    if i > -1 and (i < dct['pos'] or dct['pos'] == -1):
        dct['pos'] = i
        dct['string'] = t

print(dct)

Output is:

{'pos': 3, 'string': 'foo'}

If element is not found:

{'pos': -1, 'string': None}
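If you want something shorter, the same loop can probably be written with min() over a generator. This is only a sketch: the name `first_target` is hypothetical, it returns a tuple instead of a dict, and it relies on the `default` argument of min(), which needs Python 3.4 or later.

```python
def first_target(given, targets):
    """Return (position, target) for the earliest hit, or (-1, None) if none occurs."""
    lowered = given.lower()  # lowercase once, not once per target
    hits = ((lowered.find(t.lower()), t) for t in targets)
    # Drop the misses (find() returned -1) and take the smallest position
    return min((hit for hit in hits if hit[0] != -1), default=(-1, None))

print(first_target('Iamfoothegreat', ['foo', 'bar', 'grea', 'other']))
print(first_target('Iamfoothegreat', ['xyz']))
```

When two targets start at the same position, the tuple comparison falls back to comparing the target strings, so the alphabetically smaller one is returned.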

Performance comparison of both

With this string and these targets:

given = "hello world" * 5000
given += "grea" + given
targets = ['foo', 'bar', 'grea', 'other']

1000 loops with timeit:

regex approach: 4.08629107475 sec for 1000
normal approach: 1.80048894882 sec for 1000

10 loops, now with a much bigger target list (targets * 1000):

normal approach: 4.06895017624 sec for 10
regex approach: 34.8153910637 sec for 10
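The exact timing harness is not shown above, so here is one way such a comparison might be set up with timeit. The function names and the run count are assumptions made for this sketch, and the numbers will differ by machine and Python version.

```python
import re
import timeit

given = "hello world" * 5000
given += "grea" + given
targets = ['foo', 'bar', 'grea', 'other']

def regex_approach():
    # Build the alternation and return the position of the leftmost match
    pattern = "|".join(re.escape(t) for t in targets)
    m = re.search(pattern, given, re.IGNORECASE)
    return m.start() if m else -1

def loop_approach():
    # Lowercase once, then keep the smallest find() result
    lowered = given.lower()
    best = -1
    for t in targets:
        i = lowered.find(t)
        if i > -1 and (best == -1 or i < best):
            best = i
    return best

runs = 10  # kept small here; the figures above used 1000 and 10 runs
print("regex approach:", timeit.timeit(regex_approach, number=runs), "sec for", runs)
print("normal approach:", timeit.timeit(loop_approach, number=runs), "sec for", runs)
```

To try the bigger case, replace targets with targets * 1000 before timing.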
answered Sep 29 '22 20:09 by Kordi