why is \d+ not matching all digits?

Tags:

regex

ruby

I have the following regular expression:

REGEX = /^.+(\d+.+(?=AL|AK|AS|AZ|AR|CA|CO|CT|DE|DC|FM|FL|GA|GU|HI|ID|IL|IN|IA|KS|KY|LA|ME|MH|MD|MA|MI|MN|MS|MO|MT|NE|NV|NH|NJ|NM|NY|NC|ND|MP|OH|OK|OR|PW|PA|PR|RI|SC|SD|TN|TX|UT|VT|VI|VA|WA|WV|WI|WY)[A-Z]{2}[, ]+\d{5}(?:-\d{4})?).+/

I have the following string:

str = "fdsfd 8126 E Bowen AVE Bensalem, PA 19020-1642 dfdf"

Notice my capturing group begins with one or more digits that match the pattern. Yet this is what I get:

str =~ REGEX
$1
 => "6 E Bowen AVE Bensalem, PA 19020-1642"

Or:

match = str.match(REGEX)
match[1]
=> "6 E Bowen AVE Bensalem, PA 19020-1642"

Why is the capture missing the first three digits, 812?

asked Mar 14 '18 20:03

Daniel Viglione

1 Answer

The regex below works as intended, as you can check on Regex101:

REGEX = /^.+?(\d+.+(?=AL|AK|AS|AZ|AR|CA|CO|CT|DE|DC|FM|FL|GA|GU|HI|ID|IL|IN|IA|KS|KY|LA|ME|MH|MD|MA|MI|MN|MS|MO|MT|NE|NV|NH|NJ|NM|NY|NC|ND|MP|OH|OK|OR|PW|PA|PR|RI|SC|SD|TN|TX|UT|VT|VI|VA|WA|WV|WI|WY)[A-Z]{2}[, ]+\d{5}(?:-\d{4})?).+/

Note the question mark added near the beginning of the regex:

/^.+?(\d+...
    ^

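By default, your first .+ is greedy: it consumes as many characters as it can, digits included, and backtracks only as far as needed for the rest of the regex to match. That leaves \d+ with just the last digit, 6. Adding ? after the plus makes the quantifier lazy instead of greedy, so it stops at the first position where the rest of the pattern can match.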
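
To see the difference side by side, here is a minimal sketch. It assumes a shortened state list (three codes instead of the full alternation) purely for readability, and the names tail, greedy and lazy are made up for this illustration:

```ruby
str = "fdsfd 8126 E Bowen AVE Bensalem, PA 19020-1642 dfdf"

# Assumption: state list shortened to three codes for readability;
# the rest of the pattern is copied from the question.
tail = /.+(?=PA|NJ|NY)[A-Z]{2}[, ]+\d{5}(?:-\d{4})?/

greedy = /^.+(\d+#{tail}).+/   # leading .+ grabs as much as it can
lazy   = /^.+?(\d+#{tail}).+/  # leading .+? grabs as little as it can

# String#[] with a regex and a group number returns that capture group.
greedy_capture = str[greedy, 1]
lazy_capture   = str[lazy, 1]

puts greedy_capture  # 6 E Bowen AVE Bensalem, PA 19020-1642
puts lazy_capture    # 8126 E Bowen AVE Bensalem, PA 19020-1642
```

In the greedy case the leading .+ ends up holding "fdsfd 812", because giving back a single digit is enough for \d+ to succeed.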

An alternative would be to keep the leading part from matching digits at all, like this:

/^[^\d]+(\d+...

[^\d]+ matches one or more characters that are not digits (\D+ is an equivalent shorthand), so the leading part cannot swallow any of the digits.
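
A rough sketch of this variant, again assuming a shortened state list and made-up variable names. It also shows one caveat worth knowing: because \D+ needs at least one non-digit, an input that starts directly with the street number does not match at all.

```ruby
str = "fdsfd 8126 E Bowen AVE Bensalem, PA 19020-1642 dfdf"

# Assumption: state list shortened to three codes for readability.
tail = /.+(?=PA|NJ|NY)[A-Z]{2}[, ]+\d{5}(?:-\d{4})?/

# \D+ is shorthand for [^\d]+: the prefix can never consume a digit.
no_digits = /^\D+(\d+#{tail}).+/

non_digit_capture = str[no_digits, 1]
puts non_digit_capture  # 8126 E Bowen AVE Bensalem, PA 19020-1642

# Caveat: with no non-digit prefix, \D+ has nothing to match.
bare = "8126 E Bowen AVE Bensalem, PA 19020-1642 dfdf"
bare_capture = bare[no_digits, 1]
p bare_capture  # nil
```

If inputs without a leading prefix are possible, changing \D+ to \D* would be one way to allow them.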

answered Nov 15 '22 11:11

Adam
