How to get re.search to return a string?

I'm trying to match strings in the lines of a file and write each match, minus its first and last characters, to another file:

import os, re

infile=open("~/infile", "r")
out=open("~/out", "w")
pattern=re.compile("=[A-Z0-9]*>")
for line in infile:
    out.write( pattern.search(line)[1:-1] + '\n' )

The problem is that Python says the Match object is not subscriptable. When I add .group(), it says that 'NoneType' object has no attribute 'group'; with .groups() I get an error involving a tuple, and so on.

Any idea how to get .search to return a string?

ChiseledAbs asked Dec 19 '16 10:12


2 Answers

The re.search function returns a Match object.

If the match fails, the re.search function will return None. To extract the matching text, use the Match.group method.

>>> match = re.search("a.", "abc")
>>> if match is not None:
...     print(match.group(0))
...
ab
>>> print(re.search("a.", "a"))
None

That said, it's probably a better idea to use groups to find the required section of the match:

>>> match = re.search("=([A-Z0-9]*)>", "=ABC>")  # Note the parentheses
>>> match.group(0)
'=ABC>'
>>> match.group(1)
'ABC'

This regex can then be used with findall as @WiktorStribiżew suggests.
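
Applied to the loop in the question, the None check and the capturing group could be combined roughly as follows. This is only a minimal sketch: the helper name extract_between and the sample lines are made up for illustration, and returning None for lines without a match is an assumption about what the asker wants.

```python
import re

# Capturing group around the part between "=" and ">"
pattern = re.compile(r"=([A-Z0-9]*)>")

def extract_between(line):
    """Return the text between '=' and '>' in line, or None if nothing matches."""
    match = pattern.search(line)
    if match is None:
        return None
    return match.group(1)

# Hypothetical sample lines standing in for the lines of the input file
lines = ["id=ABC123> first", "no match here", "x=42> y"]
results = [extract_between(line) for line in lines]
print(results)
```

Checking for None before calling .group() is what avoids the AttributeError described in the question.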

pradyunsg answered Oct 14 '22 09:10


You seem to need only the part of each string between = and >. In that case, it is much easier to put a capturing group around the alphanumeric pattern and use it with re.findall, which never returns None: it returns an empty list when nothing matches, or a list of the captured texts otherwise. Also, I doubt you need empty matches, so use + instead of *:

pattern=re.compile(r"=([A-Z0-9]+)>")
                      ^         ^

and then

"\n".join(pattern.findall(line))
Wiktor Stribiżew answered Oct 14 '22 11:10
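
For completeness, here is one way the findall approach might slot into the question's read-and-write loop. It is a sketch under assumptions: the function name write_matches is hypothetical, io.StringIO objects stand in for the real files, and lines with no match are skipped rather than written as blank lines.

```python
import io
import re

pattern = re.compile(r"=([A-Z0-9]+)>")

def write_matches(infile, out):
    """Write every captured match found in infile to out, one per line."""
    for line in infile:
        matches = pattern.findall(line)  # empty list when nothing matches
        if matches:
            out.write("\n".join(matches) + "\n")

# In-memory stand-ins for the real input and output files
infile = io.StringIO("a=ABC> b=123>\nnothing\n=X9>\n")
out = io.StringIO()
write_matches(infile, out)
print(out.getvalue())
```

With real files, note that open() does not expand ~ by itself; passing the path through os.path.expanduser first, and opening both files in a with block, would be the usual way to handle the paths from the question.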