Is there a nice(er) way to find the end index of a word in a string?
My current method looks like this:
text = "fed up of seeing perfect fashion photographs"
word = "fashion"
wordEndIndex = text.index(word) + len(word) - 1
You should use a regex (with word boundaries), since str.find only returns the first occurrence of the substring. Then use the start() method of the match object to get the starting index (end() gives the position just past the match).
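As a rough sketch of the regex suggestion, something like the following should work. The function name word_end_index_re and the choice to return -1 when nothing matches are my own assumptions, not part of the original answer.

```python
import re

def word_end_index_re(text, word):
    """Index of the last character of the first whole-word match, or -1.

    Hypothetical helper: the name and the -1 convention are assumptions.
    """
    # re.escape protects any regex metacharacters in the word;
    # \b anchors the match at word boundaries on both sides.
    match = re.search(r"\b" + re.escape(word) + r"\b", text)
    # match.end() is one past the last matched character.
    return match.end() - 1 if match else -1

text = "fed up of seeing perfect fashion photographs"
print(word_end_index_re(text, "fashion"))  # 31
```

Unlike a plain substring search, this skips "fashion" inside a longer word such as "unfashionable".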
Method 1: Get the position of a character in Python using rfind()
Python's str.rfind() method returns the highest index at which the substring is found in the given string. If the substring is not found, it returns -1.
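For illustration, here is how find() and rfind() differ when the word occurs twice. The sample string is made up for this example, and the end index is derived the same way as in the question.

```python
text = "fashion photographs of fashion"  # made-up sample with two occurrences
word = "fashion"

first_start = text.find(word)   # lowest index of the substring
last_start = text.rfind(word)   # highest index of the substring

# Both methods return -1 when the word is absent, so guard before adding.
last_end = last_start + len(word) - 1 if last_start != -1 else -1

print(first_start, last_start, last_end)  # 0 23 29
```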
A slice starts at the start_pos index (inclusive) and ends at the end_pos index (exclusive). The step parameter specifies the stride to take between the start and end indices. Python string slicing always satisfies this rule: s[:i] + s[i:] == s for any index i.
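A quick way to convince yourself of the slicing rules is to try them on a short string. The variable names below are just for this sketch.

```python
s = "fashion"

# start is included, end is excluded
middle = s[1:5]        # 'ashi'
# a step of 2 takes every other character
every_other = s[::2]   # 'fsin'

# The invariant holds for every integer i, including negative and
# out-of-range values, because slice bounds are clamped to the string.
invariant_holds = all(s[:i] + s[i:] == s for i in range(-10, 11))
print(middle, every_other, invariant_holds)
```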
It depends whether you really want to know the end index or not. Presumably you're actually more interested in the bits of the text after that? Are you then doing something like this?
>>> text[wordEndIndex:]
'n photographs'
If you really do need the index, then do what you've done, but wrap it inside a function that you can call for different texts and words so you don't have to repeat this code. Then it's simple and understandable, if you give the function a descriptive name.
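A minimal sketch of such a function might look like this. The name end_index_of is only a suggestion, and returning -1 for a missing word is my assumption; the original text.index(word) would raise ValueError instead.

```python
def end_index_of(text, word):
    """Return the index of the last character of the first occurrence
    of word in text, or -1 if word does not occur (assumed convention)."""
    start = text.find(word)
    if start == -1:
        return -1
    return start + len(word) - 1

text = "fed up of seeing perfect fashion photographs"
print(end_index_of(text, "fashion"))  # 31
print(text[end_index_of(text, "fashion"):])  # 'n photographs'
```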
On the other hand, if you're more interested in the bits of text, then don't even bother working out what the index is:
>>> text.split(word)
['fed up of seeing perfect ', ' photographs']
Of course this will get more complicated if the word can appear more than once in the text. In that case maybe you could define a different function to split on the first occurrence of the word and just give back the before and after components, without ever returning any numerical indexes.
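One way to sketch that first-occurrence split is with str.partition, which only ever splits once. The function name split_on_first and the choice to return None when the word is missing are assumptions for this example.

```python
def split_on_first(text, word):
    """Split text around the first occurrence of word.

    Returns (before, after), or None if word is not in text
    (the None convention is an assumption of this sketch).
    """
    # partition returns (before, separator, after); the separator is
    # empty when the word was not found.
    before, sep, after = text.partition(word)
    if not sep:
        return None
    return before, after

text = "fed up of seeing perfect fashion photographs"
print(split_on_first(text, "fashion"))
# ('fed up of seeing perfect ', ' photographs')
```

text.split(word, 1) would also stop after the first occurrence, but partition makes the "not found" case easier to detect.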
I cannot comment on whether this is a better way, but an alternative to what you have suggested would be to find the next space after that word and use that to get the index.
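text = "fed up of seeing perfect fashion photographs"
word = "fashion"
temp = text.index(word)
# Look for the next space from temp onwards without slicing the string;
# if the word is the last one, fall back to the end of the text.
nextSpace = text.find(' ', temp)
wordEndIndex = (nextSpace if nextSpace != -1 else len(text)) - 1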
Your approach seems more natural, and is possibly faster too.
Just for fun, here's a first-principles version that finds the index of the last character of the word in a single pass:
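def word_end_index(text, word):
    # wi counts how many characters of word are still unmatched.
    # Caveat: on a mismatch the match simply restarts, so a word with a
    # repeated prefix can be missed (e.g. "ab" in "aab" returns -1).
    wi = wl = len(word)
    for ti, tc in enumerate(text):
        wi = wi - 1 if tc == word[-wi] else wl
        if not wi:
            return ti
    return -1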
I had some shorter versions, but they used slices, which would copy strings all over the place and is rather inefficient.