Python - Locating the position of a regex match in a string?

Tags: python, regex

I'm currently using regular expressions to search through RSS feeds for certain words and phrases, and would then like to extract the text on either side of each match as well. For example:

import re

String = "This is an example sentence, it is for demonstration only"
re.search("is", String)

I'd like to know the position(s) of where the 'is' matches are found so that I can extract and output something like this:

1 match found: "This is an example sentence"

I know that it would be easy to do with splits, but I'd need to know the index of the first character of the match in the string, which I don't know how to find.


asked Apr 20 '10 10:04

nb.

2 Answers

You could use String.find("is"), which returns the position of the first "is" in the string, or call .start() on the match object returned by re.search:

>>> re.search("is", String).start()
2

Note that this actually matches the "is" inside "This".

If you need to match whole words only, put \b before and after "is"; \b is the word-boundary anchor.

>>> re.search(r"\bis\b", String).start()
5

For more information, see the Python regular expression documentation.
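One thing to watch for with both approaches: str.find returns -1 when nothing is found, while re.search returns None, so calling .start() directly on the result raises AttributeError if there is no match. A minimal sketch, assuming the same String as in the question (the helper name first_word_index is made up for illustration and is not part of re):

```python
import re

String = "This is an example sentence, it is for demonstration only"

def first_word_index(word, text):
    """Return the index of the first whole-word match, or -1 if none.

    Hypothetical helper for illustration; re.escape keeps any regex
    metacharacters in `word` literal.
    """
    match = re.search(r"\b" + re.escape(word) + r"\b", text)
    # re.search returns None when nothing matches, so guard before .start()
    return match.start() if match else -1

print(String.find("is"))                # plain substring search, hits "This"
print(first_word_index("is", String))   # whole-word search
print(first_word_index("was", String))  # no match, falls back to -1
```

Returning -1 here just mirrors what str.find does; returning None or raising would work equally well depending on the caller.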

answered Sep 19 '22 00:09

YOU

I don't think this question has been completely answered yet because all of the answers only give single match examples. The OP's question demonstrates the nuances of having 2 matches as well as a substring match which should not be reported because it is not a word/token.

To match multiple occurrences, one might do something like this:

# "matches" rather than "iter", to avoid shadowing the built-in iter()
matches = re.finditer(r"\bis\b", String)
indices = [m.start(0) for m in matches]

This gives a list of the two whole-word match indices for the original string: [5, 32].
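To get back to what the OP actually asked for (the text on either side of each match), the match objects from finditer also expose .end(), so the surrounding text can be sliced out directly. A rough sketch, where context_snippets and its width parameter are names I made up, and the window is a fixed number of characters rather than a real sentence boundary:

```python
import re

String = "This is an example sentence, it is for demonstration only"

def context_snippets(pattern, text, width=10):
    """Return (start, snippet) pairs for every match of `pattern`.

    Hypothetical helper: `width` is how many characters to keep on each
    side of the match; the window is clamped to the string bounds.
    """
    results = []
    for m in re.finditer(pattern, text):
        left = max(m.start() - width, 0)
        right = min(m.end() + width, len(text))
        results.append((m.start(), text[left:right]))
    return results

for start, snippet in context_snippets(r"\bis\b", String):
    print(start, repr(snippet))
```

If whole sentences are needed instead of a character window, the same start and end positions could be widened to the nearest punctuation, but that depends on how the feed text is structured.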

answered Sep 23 '22 00:09

demongolem
