
Using the Regex OR operator to accommodate user input of "A" or "An"

Tags: python, regex

I am attempting to validate user input whose second word is either 'a' or 'an', which should satisfy the if statement. If it does not, the elif blocks check whether the second word is 'about' and, failing that, 'anyone'. Unfortunately, 'about' and 'anyone' both start with 'a' or 'an', so I tried adding a space after 'a' and 'an' so that the regex could tell them apart.

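import re
from django.http import HttpResponse  # assumed: this excerpt sits inside a Django view function

# Receive user input.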
secrets = {}
secrets['text'] = request.GET.get('text')

regex_a = re.compile("(a|an)")
regex_about = re.compile('about')
regex_anyone = re.compile('anyone')

# Get second word from secrets[text]
secondword = secrets['text'].split()[1]
# If 2nd word is == 'a/an'
if regex_a.match(secondword):
    return HttpResponse("Text was (a) or (an)")

# Else if 2nd word is == about
elif regex_about.match(secondword):
    return HttpResponse("Second word was (about)")

elif regex_anyone.match(secondword):
    return HttpResponse("Second word was (anyone)")

else:
    return HttpResponse("Failed to interpret user input")

The current regex ("(a|an)") returns Text was (a) or (an) even when the user inputs "about" or "anyone" as the second word. This is expected, since both words begin with 'a'.

So I also tried ("(a\s|an\s)"), which returns Failed to interpret user input when the second word is 'a' or 'an', yet returns the correct response for 'about' and 'anyone'. This is really confusing.

I then also tried ("(a_|an_)") which returns the same results as the previous test.

Apart from these three tests I have attempted many others, but I will not list them here as there are far too many.

liam m asked Dec 25 '22 20:12


2 Answers

Use this:

(a\b|an\b)

\b is a word boundary: here it requires the word to end immediately after 'a' or 'an', so 'about' and 'anyone' no longer match.
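A minimal sketch of the word-boundary pattern run against the four words from the question. One assumption beyond the answer as written: the pattern is given as a raw string (r"..."), because in an ordinary Python string literal \b is a backspace character and never reaches the regex engine as a word boundary. The helper name is_article is hypothetical.

```python
import re

# Raw string so that \b is passed to the regex engine as a word boundary.
regex_a = re.compile(r"(a\b|an\b)")


def is_article(word):
    """Return True when word begins with 'a' or 'an' followed by a word boundary."""
    return regex_a.match(word) is not None


for word in ("a", "an", "about", "anyone"):
    print(word, is_article(word))

# Without the raw string, "\b" is the backspace character (\x08),
# so the pattern looks for a literal backspace and 'a' is not matched.
not_raw = re.compile("(a\b|an\b)")
print(not_raw.match("a"))  # None
```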

Demo here. Welcome to Stack Overflow! Take the site tour in the Help section if you haven't already! :-)

Docteur answered Feb 06 '23 18:02


You can use:

regex_a = re.compile("(a|an)$")

That way you are telling the regex that the string needs to end right there for a match.
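As a rough check of that claim, the sketch below applies the anchored pattern to the four words from the question. The comparison with re.fullmatch is an addition not found in the answer: it is a standard-library function (Python 3.4+) that requires the whole string to match, which expresses the same intent without an explicit $.

```python
import re

regex_a = re.compile("(a|an)$")

# match() anchors at the start; $ anchors at the end, so only the exact words pass.
results = {w: bool(regex_a.match(w)) for w in ("a", "an", "about", "anyone")}
print(results)  # {'a': True, 'an': True, 'about': False, 'anyone': False}

# re.fullmatch gives the same verdict for every word without the trailing $.
same = all(bool(re.fullmatch("a|an", w)) == ok for w, ok in results.items())
print(same)  # True
```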

The regex ("(a\s|an\s)") would never work, because it expects to match the substrings 'a ' and 'an ' (with a trailing space), while the split() call in secondword = secrets['text'].split()[1] returns strings with the whitespace already stripped.
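The effect of split() can be seen in a short sketch. The sample sentence is a hypothetical input, not taken from the question; the pattern is the one the asker tried, written here as a raw string.

```python
import re

text = "this a test"  # hypothetical user input
secondword = text.split()[1]

# split() with no arguments splits on runs of whitespace and discards them,
# so the second word carries no trailing space.
print(repr(secondword))  # 'a'

regex_space = re.compile(r"(a\s|an\s)")

# The pattern needs a whitespace character after 'a' or 'an', which is not there.
print(regex_space.match(secondword))  # None
```

Because this pattern never matches, the if branch is always skipped and the code falls through to the elif checks, which is why 'about' and 'anyone' appeared to work in that test.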

biomorgoth answered Feb 06 '23 19:02