Consider I have the following text:
Temp:C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2, adfsafd1242412,
And I want to capture all the data after Temp: and up to the first occurrence of ,
which means: C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2
I tried using the regex Temp:(.+,) without success.
How do I tell the regex that the , should be the first one found?
Match any specific character in a set: use square brackets [] to match any one of the characters in a set. Use \w to match any single alphanumeric character (0-9, a-z, A-Z) or _ (underscore). Use \d to match any single digit. Use \s to match any single whitespace character.
. means match any character (except a newline, by default) in regular expressions. * means zero or more occurrences of the SINGLE regex token preceding it.
The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a “word boundary”. This match is zero-length. There are three different positions that qualify as word boundaries: Before the first character in the string, if the first character is a word character.
The [^;] is a character class; it matches everything but a semicolon. You can specify a character class by enclosing a list of characters in [], which will match any one character from the list. If the first character after the "[" is "^", the class matches any character not in the list. This should work in most regex dialects.
This is not a regex solution, but something simple enough for your problem description: just split your string on the delimiter and take the first item from the resulting array. This will match up to the first occurrence only in each string and will ignore subsequent occurrences.
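The split approach described above can be sketched as follows. This is a minimal illustration, not code from the original answer: the helper name value_after_prefix and its default arguments are hypothetical, and the sample input is shortened.

```python
def value_after_prefix(text, prefix="Temp:", delimiter=","):
    """Return the text between `prefix` and the first `delimiter`, or None."""
    # partition() splits only at the first occurrence of the prefix.
    _, found, rest = text.partition(prefix)
    if not found:
        return None  # the prefix does not appear in the text
    # maxsplit=1 stops at the first delimiter; later ones are ignored.
    return rest.split(delimiter, 1)[0]


print(value_after_prefix("Temp:ABC123, adfsafd1242412,"))  # ABC123
```

If the delimiter never appears after the prefix, this sketch returns everything after the prefix, which may or may not be what you want.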
By default, a regex quantifier will match as much as it can (greedy). Simply add a ? after the quantifier and it will be non-greedy and match as little as possible! Good luck, hope that helps.
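To see the difference between greedy and non-greedy matching, here is a small sketch using a shortened version of the string from the question (the shortened value ABC123 is only for readability):

```python
import re

text = "Temp:ABC123, adfsafd1242412,"

# Greedy: .+ grabs as much as possible, then backs off to the LAST comma.
greedy = re.search(r"Temp:(.+),", text).group(1)

# Non-greedy: .+? grows one character at a time and stops at the FIRST comma.
lazy = re.search(r"Temp:(.+?),", text).group(1)

print(greedy)  # ABC123, adfsafd1242412
print(lazy)    # ABC123
```

This is why the original attempt Temp:(.+,) appears to fail: it does match, but it runs through to the last comma in the string.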
I tend to assume that a regex is going to be used repeatedly. In theory the compiled objects are cached internally by re, so compiling explicitly isn't a requirement, but I favor explicit solutions over implicit ones and I find keeping RegexObjects around to be more readable.
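A sketch of what keeping a compiled pattern around might look like. The names TEMP_PATTERN and extract_temp, and the sample lines, are made up for illustration; the check for None is there because search() returns None when nothing matches.

```python
import re

# Compiled once at module level and reused for every line.
TEMP_PATTERN = re.compile(r"Temp:([^,]+)")


def extract_temp(line):
    """Return the value after 'Temp:' up to the first comma, or None."""
    match = TEMP_PATTERN.search(line)
    return match.group(1) if match else None


lines = ["Temp:AAA111, x,", "no temp field here", "id=7 Temp:BBB222, y,"]
results = [extract_temp(line) for line in lines]
print(results)  # ['AAA111', None, 'BBB222']
```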
To capture the value you need, you could try lazy dot matching (.+? matches 1 or more characters, but as few as possible, that are any characters but a newline):
Temp:(.+?),
Since lazy matching might eat up more than you need, a negated character class ([^,]+ matches 1 or more characters other than a comma) looks preferable:
Temp:([^,]+)
The result is captured into Group 1 with the capturing group (parentheses).
IDEONE sample code:
import re
p = re.compile(r'Temp:([^,]+)')
test_str = "Temp:C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2, adfsafd1242412,"
print(p.search(test_str).group(1))
Output: C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2
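One way lazy matching can "eat up more than you need" shows up once something else follows the comma in the pattern. The sketch below uses a made-up string and a made-up trailing literal (", end") purely to illustrate this; it is not from the question.

```python
import re

text = "Temp:AAA, BBB, end"

# Lazy dot: prefers a short match, but will expand ACROSS commas
# if that is the only way for the rest of the pattern to match.
lazy = re.search(r"Temp:(.+?), end", text)

# Negated class: can never cross a comma, so here it fails to match at all
# rather than capturing more than one field.
negated = re.search(r"Temp:([^,]+), end", text)

print(lazy.group(1))  # AAA, BBB
print(negated)        # None
```

With the negated class the captured group is guaranteed to be comma-free, whatever comes after it in the pattern.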
NOTE that a look-around based solution is more resource-consuming than the capturing-group one that you and I are using.
You can use this lookbehind based regex:
(?<=Temp:)[^,]+
Code:
s='Temp:C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2, adfsafd1242412,'
print(re.search(r"(?<=Temp:)[^,]+", s).group())
Output:
C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2