
RegEx: How can I match all numbers greater than 954?

Tags:

regex

I tried ^([0-9]\d|\d{4,})$ but it does not give the correct result.

Grk asked Apr 30 '15 20:04




6 Answers

I would not use a regex for this, since you end up with an ugly chain of patterns.

However, if you still have to or want to use one, you can use a regex like this:

[1-9]\d{3,}|9[6-9]\d|9[5-9]{2}


The idea behind this regex is:

[1-9]\d{3,}   --> This will match numbers with 4 or more digits
9[6-9]\d      --> This will match numbers from 960 to 999
9[5-9]{2}     --> This will match 9 followed by two digits from 5 to 9;
                  the only numbers it adds are 955 to 959, so you could
                  write this pattern as `95[5-9]` if you wish (it's up to you)
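
A minimal Python sketch of how this pattern could be tried out. The pattern has no anchors, so the sketch uses `re.fullmatch` to require that the whole string is the number; the helper name `is_over_954` is made up for illustration.

```python
import re

# Pattern from the answer above, unchanged.
PATTERN = re.compile(r"[1-9]\d{3,}|9[6-9]\d|9[5-9]{2}")


def is_over_954(text: str) -> bool:
    """Return True when the whole string matches the pattern."""
    # fullmatch acts like wrapping the pattern in ^(?:...)$
    return PATTERN.fullmatch(text) is not None


for sample in ["95", "954", "955", "960", "999", "1000"]:
    print(sample, is_over_954(sample))
```

With a plain search instead of a full match, the pattern would also hit digits inside longer text (for example the 960 in 12.960), so anchoring or `fullmatch` matters when validating a single value.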
Federico Piazza answered Oct 04 '22 10:10


You can use the following:

95[5-9]|9[6-9]\d|[1-9]\d{3,}

Explanation:

  • 95[5-9] matches from 955-959
  • 9[6-9]\d matches from 960-999
  • [1-9]\d{3,} matches 1000 and above
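
One way to gain confidence in a pattern like this is to compare it against a plain integer comparison over a range of values. The sketch below does that in Python; the function name `mismatches` and the upper limit of 20000 are arbitrary choices for illustration.

```python
import re

# Pattern from the answer above, unchanged.
PATTERN = re.compile(r"95[5-9]|9[6-9]\d|[1-9]\d{3,}")


def mismatches(limit: int = 20000) -> list:
    """Integers in [0, limit) where the regex disagrees with n > 954."""
    return [
        n
        for n in range(limit)
        if (PATTERN.fullmatch(str(n)) is not None) != (n > 954)
    ]


# An empty list means the regex and the comparison agree on every value tested.
print(mismatches())
```

This only covers canonical decimal strings (no leading zeros, signs, or decimals), which is what `str(n)` produces.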
karthik manchala answered Oct 04 '22 08:10


Here it is:

([1-9]\d{3,}|9[6-9]\d|95[5-9])


uraimo answered Oct 04 '22 10:10


A bit long but designed to have only one possible path for each digit (to fail fast):

^(?:
[1-8][0-9]{3,}
|
9 (?:
      [0-4][0-9]{2,}   
    |
      [6-9][0-9]+     
    |
      5 (?:
            [5-9][0-9]*
          |
            [0-4][0-9]+
        )
  )
)$

Note that branches are sorted by probability.

Condensed:

^(?:[1-8][0-9]{3,}|9(?:[0-4][0-9]{2,}|[6-9][0-9]+|5(?:[5-9][0-9]*|[0-4][0-9]+)))$
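
The multi-line layout above contains whitespace, so it only works in a free-spacing (extended) mode. As a sketch, in Python that mode is `re.VERBOSE`; the comments inside the pattern and the helper name `all_agree` are additions for illustration, not part of the original answer.

```python
import re

# Multi-line form; re.VERBOSE ignores the whitespace and allows comments.
VERBOSE = re.compile(
    r"""
    ^(?:
        [1-8][0-9]{3,}            # 4+ digits, first digit 1-8
      |
        9 (?:
              [0-4][0-9]{2,}      # 4+ digits starting with 90-94
            |
              [6-9][0-9]+         # 3+ digits starting with 96-99
            |
              5 (?:
                    [5-9][0-9]*   # 955-959, or longer with that prefix
                  |
                    [0-4][0-9]+   # 4+ digits starting with 950-954
                )
          )
    )$
    """,
    re.VERBOSE,
)

# Condensed form from the answer, unchanged.
CONDENSED = re.compile(
    r"^(?:[1-8][0-9]{3,}|9(?:[0-4][0-9]{2,}|[6-9][0-9]+|5(?:[5-9][0-9]*|[0-4][0-9]+)))$"
)


def all_agree(limit: int = 20000) -> bool:
    """Check both forms against n > 954 for every integer below limit."""
    return all(
        (VERBOSE.match(str(n)) is not None)
        == (CONDENSED.match(str(n)) is not None)
        == (n > 954)
        for n in range(limit)
    )


print(all_agree())
```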

Note: doing this with a regex pattern is usually inappropriate and complicated (regexes are not designed to solve arithmetic problems). So if you can, cast your string to an integer and test it with a simple comparison.
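
A minimal sketch of the cast-and-compare approach in Python. The function name `is_greater_than` and the choice to treat non-numeric input as "not greater" are assumptions; adjust the error handling to your needs.

```python
def is_greater_than(text: str, threshold: int = 954) -> bool:
    """Return True when text parses as an integer greater than threshold."""
    try:
        return int(text) > threshold
    except ValueError:
        # Not an integer (e.g. "abc" or "955.5"): treated as no match here.
        return False


for sample in ["954", "955", "0955", "abc"]:
    print(sample, is_greater_than(sample))
```

Note that `int` is more lenient than the regexes on this page: it accepts surrounding whitespace, a leading sign, and leading zeros.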

Casimir et Hippolyte answered Oct 04 '22 08:10


\b(?<!\.)0*(?:[1-9]\d{3,}|9(?:[6-9]\d|5[5-9]))(?:\.\d+)?\b

This pattern will match numbers with leading zeroes and decimals, but only if the entire value is over 954. So it will match 955.62 and 0001253.125 but not 00954.999 or 125.967.

To break it down:

(?<!\.) says not to match if there is a period immediately before the number. This is there to avoid matching things like 0.957.

e2: 0* was added to make a full match with leading zeroes.

(?:[1-9]\d{3,}|9(?:[6-9]\d|5[5-9])) matches the integer part, everything to the left of the decimal point. [1-9]\d{3,} matches any number equal to or higher than 1000. The other side of the | (the "or" operator), 9(?:[6-9]\d|5[5-9]), matches three-digit numbers in the 900s using a nested alternation: the inner | matches when the tens and ones digits are 60-99 or 55-59.

(?:\.\d+)? matches an optional decimal part. The ? at the end makes it optional, so numbers without a decimal point still match.

e2: The regex was wrapped in \bs to make sure it is its own word. The regex will no longer match so1337, the769s, or 960things.

EDIT1: Forgot to make my . literal.

EDIT2: Made changes marked by "e2:"
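
The examples mentioned in this answer can be checked with a short Python sketch. The helper name `find_over_954` and the sample string are made up for illustration; Python's `re` module supports the fixed-width lookbehind used here.

```python
import re

# Pattern from the answer above, unchanged.
PATTERN = re.compile(
    r"\b(?<!\.)0*(?:[1-9]\d{3,}|9(?:[6-9]\d|5[5-9]))(?:\.\d+)?\b"
)


def find_over_954(text: str) -> list:
    """Return every substring of text that the pattern matches."""
    return [m.group(0) for m in PATTERN.finditer(text)]


sample = "955.62 0001253.125 00954.999 125.967 0.957 so1337 the769s 960things"
print(find_over_954(sample))  # ['955.62', '0001253.125']
```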

Eric Ed Lohmar answered Oct 04 '22 08:10


I was attempting to parse an application log file of database operation times captured for specific queries. The times are recorded in milliseconds (e.g. DatabaseTime=12035 means a database time of roughly 12 seconds). To elaborate further, I needed to find cases where the application UI timed out after 120 seconds (i.e. DatabaseTime > 120000) so I could capture the timestamps on that same line.

Here is what I came up with for a RegEx pattern:

[1-9][2-9][0-9]\d{3,}|[1-9]\d{6,}

Many of you here are expert enough in regex to break this down in your head, but for those who aren't, I've tested this on regex101.com, which also shows a breakdown of the regular expression: https://regex101.com/r/hG2iU7/28
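
A quick check in Python suggests the pattern above behaves a little differently from a strict "greater than 120000" test: it matches 120000 itself, and it appears to miss six-digit values whose second digit is 0 or 1 (such as 200000 or 310000). The sketch below compares it with one possible alternative; `ALTERNATIVE` and the helper `matches` are hypothetical, not from the original answer, and the alternative still matches 120000 itself.

```python
import re

# Pattern from the answer above, unchanged.
ORIGINAL = re.compile(r"[1-9][2-9][0-9]\d{3,}|[1-9]\d{6,}")

# Hypothetical alternative: 120000-199999, 200000-999999, or 7+ digits.
ALTERNATIVE = re.compile(r"1[2-9]\d{4}|[2-9]\d{5}|[1-9]\d{6,}")


def matches(pattern: re.Pattern, value: int) -> bool:
    """Return True when the whole decimal form of value matches pattern."""
    return pattern.fullmatch(str(value)) is not None


for value in [119999, 120000, 199999, 200000, 310000, 1000000]:
    print(value, matches(ORIGINAL, value), matches(ALTERNATIVE, value))
```

For a one-off search through a log in a text editor the original may still be good enough, as long as you keep in mind which ranges it skips.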

Some thoughts/considerations on this subject (again, completely relevant to the original question):

  • I feel I took a very minimalistic approach to this; however, it appears to have satisfied the use case, which required me to search through logs in Notepad++ using the regex search option.

  • I agree regex isn't the right tool to simulate such numeric comparisons (which would be better done using regex in combination with a programming/scripting language), but if you need to quickly search through a log file in a text editor such as Notepad++, and you don't have the patience or client privileges to create some pretty code or install a Python plug-in, regex may be your only quick (and admittedly GREEDY) option. If that's the case, this is a completely useful scenario to know about in the workplace.

Lastly, please allow me to state: there'll be plenty of time for pretty coding and efficiency after you do your initial research. Why take the time to code when you might not even find what you're looking for in the first place? A little research never hurt anyone, and everyone knows Rome was not built in a day.

ViceKnightTA answered Oct 04 '22 08:10