
Regex: AttributeError: 'NoneType' object has no attribute 'groups'

Tags:

python

regex

I have a string which I want to extract a subset of. This is part of a larger Python script.

This is the string:

import re
htmlString = '</dd><dt> Fine, thank you.&#160;</dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)'

From it I want to pull out "Molt bé, gràcies. mohl behh, GRAH-syuhs". For that I use a regular expression with re.search:

SearchStr = '(\<\/dd\>\<dt\>)+ ([\w+\,\.\s]+)([\&\#\d\;]+)(\<\/dt\>\<dd\>)+ ([\w\,\s\w\s\w\?\!\.]+) (\(\<i\>)([\w\s\,\-]+)(\<\/i\>\))'
Result = re.search(SearchStr, htmlString)
print Result.groups()

AttributeError: 'NoneType' object has no attribute 'groups'

Since Result.groups() doesn't work, neither do the extractions I want to make (i.e. Result.group(5) and Result.group(7)). But I don't understand why I get this error. The regular expression works in TextWrangler, so why not in Python? I'm a beginner in Python.

jO. asked Mar 05 '13 19:03



2 Answers

You are getting the AttributeError because you're calling groups() on None, which has no such method.

re.search returning None means the regex couldn't find anything matching the pattern in the supplied string.

When using regex, it is good practice to check whether a match was made:

Result = re.search(SearchStr, htmlString)
if Result:
    print Result.groups()
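
The same check can be wrapped in a small helper so that both outcomes are handled in one place. This is only a sketch in Python 3 syntax: the function name extract_groups and the two sample inputs are made up for illustration and do not come from the question.

```python
import re

def extract_groups(pattern, text):
    """Return the tuple of captured groups, or None when nothing matches."""
    match = re.search(pattern, text)
    if match is None:
        # re.search returns None on failure; calling .groups() on that
        # None is what raises the AttributeError from the question.
        return None
    return match.groups()

# Hypothetical inputs, chosen only to show both branches.
found = extract_groups(r"(\w+)@(\w+)", "contact: alice@example")
missing = extract_groups(r"(\d+)", "no digits here")

print(found)    # the two captured words
print(missing)  # None, with no exception raised
```

Returning None instead of raising leaves the decision to the caller, who can test the result with "is None" before calling group(5) or group(7).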
thkang answered Oct 07 '22 18:10


import re
htmlString = '</dd><dt> Fine, thank you.&#160;</dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)'
SearchStr = '(\<\/dd\>\<dt\>)+ ([\w+\,\.\s]+)([\&\#\d\;]+)(\<\/dt\>\<dd\>)+ ([\w\,\s\w\s\w\?\!\.]+) (\(\<i\>)([\w\s\,\-]+)(\<\/i\>\))'
Result = re.search(SearchStr.decode('utf-8'), htmlString.decode('utf-8'), re.I | re.U)
print Result.groups()

It works this way. The string contains non-ASCII characters (é, à), which \w does not match in a Python 2 byte string, so the search fails. You have to decode both strings to Unicode and pass the re.U (Unicode) flag.
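
For readers on Python 3, the effect can be reproduced roughly as follows. This is a sketch under a few assumptions: str patterns in Python 3 are Unicode-aware by default, so re.ASCII is used here to approximate the Python 2 byte-string behaviour rather than reproduce it exactly. The pattern is the one from the question, rewritten as a raw string with the redundant backslashes dropped, and the snake_case variable names are mine.

```python
import re

html_string = ("</dd><dt> Fine, thank you.&#160;</dt><dd> "
               "Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)")

# The question's pattern as a raw string, without the unneeded escapes.
search_str = (r"(</dd><dt>)+ ([\w+,.\s]+)([&#\d;]+)(</dt><dd>)+ "
              r"([\w,\s?!.]+) (\(<i>)([\w\s,\-]+)(</i>\))")

# With re.ASCII, \w only covers [a-zA-Z0-9_], so "é" and "à" stop the match.
ascii_match = re.search(search_str, html_string, re.ASCII)

# Default Python 3 behaviour: \w also matches accented letters.
unicode_match = re.search(search_str, html_string)

print(ascii_match)             # None: the same failure as in the question
print(unicode_match.group(5))  # the Catalan phrase
print(unicode_match.group(7))  # the pronunciation guide
```

If the ASCII-only search is the one actually in use, it returns None and the guard from the first answer is still needed before calling groups().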

I'm a beginner too and I faced that issue a couple of times myself.

antonavy answered Oct 07 '22 16:10