Regular Expression Matching Stock Ticker

Question

I'm having trouble matching stock tickers in a string of text. I want a regular expression that matches a space, three uppercase letters, and finally a space, period, or question mark.

Below is the sample pattern that I created.

> `import re`
>
> `example = 'These are the tickers that I am trying to find: FAB. APL APL? GJA ADJ AKE EBY ZKE SPR TYL'`
>
> `re.findall('[ ][A-Z]{3}[ .!?]', example)`

The regular expression misses quite a few of the matches.

glibdud · Accepted Answer

There's a pattern to which items are missed. It's most obvious in the long run of unpunctuated symbols, where every other item is skipped.

This is because `re.findall()` finds non-overlapping matches, and your pattern matches both the space before and the space after each ticker. Once one item has matched, the leading space that the next item needs has already been consumed and cannot be used again.
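To make the consumed-space effect concrete, here is a small sketch that runs the question's pattern over the question's string and prints where each match starts and ends. The variable name `consumed` is only for illustration.

```python
import re

example = ('These are the tickers that I am trying to find: '
           'FAB. APL APL? GJA ADJ AKE EBY ZKE SPR TYL')

# The question's pattern: leading space, three capitals, trailing delimiter.
pattern = r'[ ][A-Z]{3}[ .!?]'

# Print each match with its span; the trailing space of one match is the
# leading space the next ticker would have needed.
for m in re.finditer(pattern, example):
    print(m.span(), repr(m.group()))

consumed = re.findall(pattern, example)
print(consumed)
# [' FAB.', ' APL ', ' GJA ', ' AKE ', ' ZKE ']
```

Note that `TYL` is also missed for a different reason: it sits at the end of the string, so there is no trailing character for `[ .!?]` to match.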

Use word boundaries (`\b`), which match a position without consuming any characters, instead of matching the leading and trailing spaces, and make the punctuation character class optional:

>>> re.findall(r'\b[A-Z]{3}\b[.!?]?',example)
['FAB.', 'APL', 'APL?', 'GJA', 'ADJ', 'AKE', 'EBY', 'ZKE', 'SPR', 'TYL']
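If you would rather keep the original "space before, delimiter after" rule, one possible alternative (not part of the answer above, just a sketch) is to move the trailing delimiter into a lookahead so that it is checked but not consumed. The capture group and the `|$` branch for a ticker at the end of the string are assumptions about what you want returned.

```python
import re

example = ('These are the tickers that I am trying to find: '
           'FAB. APL APL? GJA ADJ AKE EBY ZKE SPR TYL')

# Leading space is still consumed, but the trailing delimiter is only
# asserted via lookahead, so it stays available as the next leading space.
# "|$" additionally accepts a ticker at the very end of the string.
lookahead_pattern = r'[ ]([A-Z]{3})(?=[ .!?]|$)'

# With one capture group, findall returns just the captured tickers.
tickers = re.findall(lookahead_pattern, example)
print(tickers)
# ['FAB', 'APL', 'APL', 'GJA', 'ADJ', 'AKE', 'EBY', 'ZKE', 'SPR', 'TYL']
```

Unlike the `\b` version, this returns the bare tickers without any trailing punctuation, and it will not match a ticker at the very start of the string because it still requires a leading space.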

Tags:

python

regex

python-3.x

chris302107

1 Answers

glibdud
