I'm using R and need a regex that matches a block of N characters which starts with zero or more whitespace characters and continues with one or more digits.

For N = 9, here are examples of valid strings:

123456789
kfasdf  3456789asdf
a        1

and examples of invalid strings:

12345 789
1       9
a     678a

The ‹ ^ › and ‹ $ › anchors ensure that the regex matches the entire subject string; otherwise, it could match 10 characters within longer text. The ‹ [A-Z] › character class matches any single uppercase character from A to Z, and the interval quantifier ‹ {1,10} › repeats the character class from 1 to 10 times.
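The pattern that this remark describes is not shown in the thread. As a rough sketch, assuming it is `^[A-Z]{1,10}$` (an assumption pieced together from the parts named above), the effect of the anchors can be checked in R like this:

```r
# Assumed pattern, reconstructed from the description; not taken from the thread
anchored   <- "^[A-Z]{1,10}$"
unanchored <- "[A-Z]{1,10}"

# Hypothetical test strings: 5 uppercase letters, 11 uppercase letters, mixed text
x <- c("HELLO", "HELLOWORLDS", "say HELLO")

# With anchors, the whole string must consist of 1 to 10 uppercase letters
anchored_hits <- grepl(anchored, x, perl = TRUE)

# Without anchors, a run of uppercase letters anywhere in the string is enough
unanchored_hits <- grepl(unanchored, x, perl = TRUE)

print(anchored_hits)
print(unanchored_hits)
```

`perl = TRUE` is used here so that the `[A-Z]` range does not depend on the locale.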
To match one of a specific set of characters, list them inside square brackets. For example, the pattern [abc] matches a single a, b, or c and nothing else.
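A small sketch of the `[abc]` class in R; the sample words are made up for illustration:

```r
# Hypothetical sample words
words <- c("apple", "kiwi", "cherry", "plum")

# grepl() reports whether the class matches anywhere in each string
has_abc <- grepl("[abc]", words)

# regexpr() finds the first match per string; regmatches() extracts it
# and silently drops the strings that have no match
first_hit <- regmatches(words, regexpr("[abc]", words))

print(has_abc)
print(first_hit)
```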
(?i) makes the regex case insensitive. (?s), for "single line mode", makes the dot match all characters, including line breaks.
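Both inline modifiers can be tried in R with `perl = TRUE`, which selects the PCRE engine. A minimal sketch with made-up input strings:

```r
# (?i): case-insensitive matching
plain_case  <- grepl("hello", "HELLO world", perl = TRUE)
ignore_case <- grepl("(?i)hello", "HELLO world", perl = TRUE)

# (?s): let the dot match line breaks as well
two_lines  <- "first\nsecond"
dot_plain  <- grepl("first.second", two_lines, perl = TRUE)
dot_single <- grepl("(?s)first.second", two_lines, perl = TRUE)

print(c(plain_case, ignore_case))
print(c(dot_plain, dot_single))
```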
Another option is to match 8 times either a digit OR a space not preceded by a digit and then match a digit at the end.
(?<![\d\h])(?>\d|(?<!\d)\h){8}\d
In parts:
(?<![\d\h]) Negative lookbehind: assert that what is on the left is not a horizontal whitespace character or a digit
(?> Atomic group (no backtracking)
\d Match a digit
| Or
(?<!\d)\h Match a horizontal whitespace character, asserting that it is not preceded by a digit
){8} Close the group and repeat it 8 times
\d Match the last digit
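To keep each part next to its explanation in R code, the pattern can be built piece by piece with `paste0()`. This is only a readability sketch: the resulting string is the same pattern as above, and the test input is one of the valid strings from the question.

```r
pattern <- paste0(
  "(?<![\\d\\h])",  # not preceded by a digit or horizontal whitespace
  "(?>",            # open atomic group
  "\\d",            # a digit
  "|",              # or
  "(?<!\\d)\\h",    # horizontal whitespace not preceded by a digit
  "){8}",           # close the group, repeat it 8 times
  "\\d"             # the final (ninth) character must be a digit
)

x <- "kfasdf  3456789asdf"
hit <- regmatches(x, regexpr(pattern, x, perl = TRUE))
print(hit)
```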
Example code, using perl=TRUE
x <- "123456789
kfasdf  3456789asdf
a        1
12345 789
1       9
a     678a"
regmatches(x, gregexpr("(?<![\\d\\h])(?>\\d|(?<!\\d)\\h){8}\\d", x, perl=TRUE))
Output
[[1]]
[1] "123456789" "  3456789" "        1"
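The example above searches one multi-line string. If the strings are stored as separate elements of a character vector instead (an assumption about the data layout), `grepl()` gives one TRUE/FALSE per element. `strrep()` is used below only to make the number of spaces explicit.

```r
pattern <- "(?<![\\d\\h])(?>\\d|(?<!\\d)\\h){8}\\d"

x <- c(
  "123456789",
  "kfasdf  3456789asdf",
  paste0("a", strrep(" ", 8), "1"),  # "a", 8 spaces, "1"
  "12345 789",
  paste0("1", strrep(" ", 7), "9"),  # "1", 7 spaces, "9"
  paste0("a", strrep(" ", 5), "678a")
)

# TRUE where the string contains a valid 9-character block
is_valid <- grepl(pattern, x, perl = TRUE)
print(is_valid)
```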
If no digit may follow the ninth character of the match, end the pattern with a negative lookahead that asserts the next character is not a digit.
(?<![\d\h])(?>\d|(?<!\d)\h){8}\d(?!\d)
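The difference shows up on a run of ten digits. A short sketch, with a made-up input string:

```r
base_pattern   <- "(?<![\\d\\h])(?>\\d|(?<!\\d)\\h){8}\\d"
strict_pattern <- paste0(base_pattern, "(?!\\d)")

x <- "1234567890"

# Without the lookahead, the first nine digits are accepted
base_hit <- regmatches(x, regexpr(base_pattern, x, perl = TRUE))

# With the lookahead, the trailing "0" rules the match out
strict_ok <- grepl(strict_pattern, x, perl = TRUE)

print(base_hit)
print(strict_ok)
```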
If no digit may appear directly on either side of the match:
 (?<!\d)(?>\d|(?<!\d)\h){8}\d(?!\d)
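Because this variant only forbids a digit on the left, the match may now start in the middle of a longer run of whitespace. A sketch comparing it with the previous pattern on a made-up string with ten spaces:

```r
no_ws_or_digit_left <- "(?<![\\d\\h])(?>\\d|(?<!\\d)\\h){8}\\d(?!\\d)"
no_digit_left       <- "(?<!\\d)(?>\\d|(?<!\\d)\\h){8}\\d(?!\\d)"

# "a", then 10 spaces, then "1"
x <- paste0("a", strrep(" ", 10), "1")

# The stricter lookbehind refuses to start inside the run of spaces
strict_ok <- grepl(no_ws_or_digit_left, x, perl = TRUE)

# The relaxed lookbehind matches the last 8 spaces plus the digit
relaxed_hit <- regmatches(x, regexpr(no_digit_left, x, perl = TRUE))

print(strict_ok)
print(nchar(relaxed_hit))
```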
Using string s from @d.b's answer. 
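The referenced answer is not reproduced here. Judging from the outputs shown in this answer, `s` is presumably a character vector holding the six example strings from the question; the definition below is that assumption, not a copy of the original code.

```r
# Assumed definition of s: one example string per element
s <- c(
  "123456789",
  "kfasdf  3456789asdf",
  paste0("a", strrep(" ", 8), "1"),  # "a", 8 spaces, "1"
  "12345 789",
  paste0("1", strrep(" ", 7), "9"),  # "1", 7 spaces, "9"
  paste0("a", strrep(" ", 5), "678a")
)
print(s)
```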
Extract optional whitespace followed by numbers.
library(stringr)
str_extract(s, '(\\s+)?\\d+')
#[1] "123456789" "  3456789" "        1" "12345"     "1"         "     678" 
Check their length using nchar. 
nchar(str_extract(s, '(\\s+)?\\d+')) == 9
#[1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
The same logic using base R functions:
nchar(regmatches(s, regexpr('(\\s+)?\\d+', s))) == 9
#[1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
If a string can contain several such instances, we can use str_extract_all:
sapply(str_extract_all(s, '(\\s+)?\\d+'), function(x) any(nchar(x) == 9))
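For completeness, a base R sketch of the same multi-match check, using `gregexpr()` in place of `str_extract_all()`. The vector `s` is redefined here under the same assumption as before, namely that it holds the six example strings from the question.

```r
# Assumed contents of s
s <- c(
  "123456789",
  "kfasdf  3456789asdf",
  paste0("a", strrep(" ", 8), "1"),
  "12345 789",
  paste0("1", strrep(" ", 7), "9"),
  paste0("a", strrep(" ", 5), "678a")
)

# gregexpr() finds every match per string; regmatches() returns them as a list
all_hits <- regmatches(s, gregexpr("(\\s+)?\\d+", s))

# TRUE if any extracted block in a string is exactly 9 characters long
has_block <- sapply(all_hits, function(hits) any(nchar(hits) == 9))
print(has_block)
```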