Say I have str = "qwop(8) 5"
and I want to return the position of 8.
I have the following solution:
import re
str = "qwop(8) 5"
regex = re.compile(r"\(\d\)")
match = re.search(regex, str) # match object has span = (4, 7)
print(match.span()[0] + 1) # +1 gets at the number 8 rather than the first bracket
This seems really messy. Is there a more sophisticated solution? Preferably using re
as I've already imported that for other uses.
Use match.start() to get the start index of the match, and a capturing group to capture specifically the digit between the brackets, which avoids the +1 in the index. If you want the very start of the pattern, use match.start(); if you only want the digit, use match.start(1):
import re
test_str = 'qwop(8) 5'
pattern = r'\((\d)\)'
match = re.search(pattern, test_str)
start_index = match.start()
print('Start index:\t{}\nCharacter at index:\t{}'.format(start_index,
test_str[start_index]))
match_index = match.start(1)
print('Match index:\t{}\nCharacter at index:\t{}'.format(match_index,
test_str[match_index]))
Output:
Start index: 4
Character at index: (
Match index: 5
Character at index: 8
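One thing the snippet above does not cover is the case where the pattern is absent: re.search() then returns None, and calling .start() on it raises an AttributeError. A minimal sketch of a guard, assuming a helper named digit_index (a hypothetical name, not part of re):

```python
import re

def digit_index(text):
    """Return the index of the digit inside '(d)', or None if there is no match."""
    match = re.search(r'\((\d)\)', text)
    # re.search returns None when nothing matches, so check before calling .start()
    return match.start(1) if match else None

print(digit_index('qwop(8) 5'))    # 5
print(digit_index('no brackets'))  # None
```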
You can use:
regex = re.compile(r'\((\d+)\)')
The r prefix means that we are working with a raw string. A raw string means that if you write, for instance, r'\n', Python will not interpret this as a string with a newline character, but as a string with two characters: a backslash ('\\') and an 'n'.
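A quick way to see the difference is to compare lengths. This is only an illustrative sketch; the variable names raw_pattern and escaped_pattern are made up for the example:

```python
# A raw string keeps the backslash as a literal character.
print(len('\n'))   # 1: a single newline character
print(len(r'\n'))  # 2: a backslash followed by 'n'

# For regex patterns, the raw form saves doubling every backslash.
raw_pattern = r'\((\d+)\)'
escaped_pattern = '\\((\\d+)\\)'
print(raw_pattern == escaped_pattern)  # True
```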
The additional brackets are there to define a capture group. Furthermore, a number is a sequence of one or more digits, so the + makes sure that we will capture (1425) as well.
We can then perform a .search() and obtain a match. You can then use .start(1) to obtain the start of the first capture group:
>>> data = 'qwop(8) 5'
>>> regex.search(data)
<_sre.SRE_Match object; span=(4, 7), match='(8)'>
>>> regex.search(data).start(1)
5
If you are interested in the content of the first capture group, you can call .group(1):
>>> regex.search(data).group(1)
'8'
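To illustrate the point about + with a multi-digit number, here is a small sketch using an input string made up for the example. It also uses .span(1), which returns the start and end of the capture group in one call:

```python
import re

regex = re.compile(r'\((\d+)\)')
data = 'qwop(1425) 5'  # hypothetical input with a multi-digit number

match = regex.search(data)
print(match.group(1))  # 1425
print(match.start(1))  # 5
print(match.span(1))   # (5, 9)
```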