I want to write a simple regex, in vim, that will find all strings lexicographically smaller than another string.
Specifically, I want to use this to compare dates formatted as 2014-02-17. These dates are lexicographically sortable, which is why I use them.
My specific use case: I'm trying to run through a script and find all the dates that are earlier than today's date.
I'm also OK with comparing these as numbers, or any other solution.
I don't think there is any easy way to do this with a regex. To match any date earlier than a given date, you can use the functions below (some of this was stolen from benjifisher).
" Character class for the digits strictly less than a:cur ('1' to '9').
function! Convert_to_char_class(cur)
  if a:cur =~ '[2-9]'
    return '[0-' . (a:cur-1) . ']'
  endif
  return '0'
endfunction
" Pattern for any number with the same digit count as a:num that is smaller than a:num.
function! Match_number_before(num)
  let branches = []
  let init = ''
  for i in range(len(a:num))
    if a:num[i] =~ '[1-9]'
      call add(branches, init . Convert_to_char_class(a:num[i]) . repeat('\d', len(a:num) - i - 1))
    endif
    let init .= a:num[i]
  endfor
  return '\%(' . join(branches, '\|') . '\)'
endfunction
" Pattern for any YYYY-MM-DD date earlier than a:date.
function! Match_date_before(date)
  " Anchored so that trailing or leading junk is rejected as well.
  if a:date !~ '\v^\d{4}-\d{2}-\d{2}$'
    echo "invalid date"
    return
  endif
  let branches = []
  let parts = split(a:date, '-')
  call add(branches, Match_number_before(parts[0]) . '-\d\{2}-\d\{2}')
  call add(branches, parts[0] . '-' . Match_number_before(parts[1]) . '-\d\{2}')
  call add(branches, parts[0] . '-' . parts[1] . '-' . Match_number_before(parts[2]))
  return '\%(' . join(branches, '\|') . '\)'
endfunction
Use the following to search for all dates before 2014-02-24. The first Enter inserts the generated pattern into the search line; press Enter again to run the search.
/<C-r>=Match_date_before('2014-02-24')<CR>
You might be able to wrap it in a function to set the search register if you wanted to.
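A minimal sketch of such a wrapper follows. Set_search_pattern is a hypothetical name, not part of the functions above; it takes the generated pattern as an argument and skips anything that is not a string, since Match_date_before returns 0 for an invalid date.

```vim
" Hypothetical helper: store a generated pattern in the search register.
function! Set_search_pattern(pattern)
  " A non-string (e.g. the 0 returned for an invalid date) is ignored.
  if type(a:pattern) != type('')
    return 0
  endif
  let @/ = a:pattern
  return 1
endfunction
```

With the functions above sourced, :call Set_search_pattern(Match_date_before('2014-02-24')) should load the pattern, after which n jumps to the next match.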
The generated regex for dates before 2014-02-24 is the following.
\%(\%([0-1]\d\d\d\|200\d\|201[0-3]\)-\d\{2}-\d\{2}\|2014-\%(0[0-1]\)-\d\{2}\|2014-02-\%([0-1]\d\|2[0-3]\)\)
It does not validate dates; it assumes that anything in that format is a date.
Here is the equivalent set of functions for matching dates after the given date.
" Character class for the digits strictly greater than a:cur ('0' to '8').
function! Convert_to_char_class_after(cur)
  if a:cur =~ '[0-7]'
    return '[' . (a:cur+1) . '-9]'
  endif
  return '9'
endfunction
function! Match_number_after(num)
  let branches = []
  let init = ''
  for i in range(len(a:num))
    if a:num[i] =~ '[0-8]'
      call add(branches, init . Convert_to_char_class_after(a:num[i]) . repeat('\d', len(a:num) - i - 1))
    endif
    let init .= a:num[i]
  endfor
  return '\%(' . join(branches, '\|') . '\)'
endfunction
" Pattern for any YYYY-MM-DD date later than a:date.
function! Match_date_after(date)
  " Anchored so that trailing or leading junk is rejected as well.
  if a:date !~ '\v^\d{4}-\d{2}-\d{2}$'
    echo "invalid date"
    return
  endif
  let branches = []
  let parts = split(a:date, '-')
  call add(branches, Match_number_after(parts[0]) . '-\d\{2}-\d\{2}')
  call add(branches, parts[0] . '-' . Match_number_after(parts[1]) . '-\d\{2}')
  call add(branches, parts[0] . '-' . parts[1] . '-' . Match_number_after(parts[2]))
  return '\%(' . join(branches, '\|') . '\)'
endfunction
The regex generated for dates after 2014-02-24 is the following.
\%(\%([3-9]\d\d\d\|2[1-9]\d\d\|20[2-9]\d\|201[5-9]\)-\d\{2}-\d\{2}\|2014-\%([1-9]\d\|0[3-9]\)-\d\{2}\|2014-02-\%([3-9]\d\|2[5-9]\)\)
You do not say how you want to use this; are you sure that you really want a regular expression? Perhaps you could get away with
if DateCmp(date, '2014-02-24') < 0
  " ...
endif
In that case, try this function.
" Compare formatted date strings:
" @param String date1, date2
"   dates in YYYY-MM-DD format, e.g. '2014-02-24'
" @return Integer
"   negative, zero, or positive according to date1 < date2, date1 == date2, or
"   date1 > date2
function! DateCmp(date1, date2)
  let [year1, month1, day1] = split(a:date1, '-')
  let [year2, month2, day2] = split(a:date2, '-')
  if year1 != year2
    return year1 - year2
  elseif month1 != month2
    return month1 - month2
  else
    return day1 - day2
  endif
endfun
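If the goal is to find the lines of a buffer that carry an earlier date, a comparison-based scan might look like the sketch below. LinesWithDateBefore is a hypothetical name, and the sketch assumes that only the first zero-padded YYYY-MM-DD date on each line matters. Because such dates sort lexicographically, it uses the case-sensitive string comparison <# directly; DateCmp(found, a:date) < 0 could be substituted for that test.

```vim
" Hypothetical helper: line numbers whose first YYYY-MM-DD date is earlier than a:date.
function! LinesWithDateBefore(date)
  let result = []
  for lnum in range(1, line('$'))
    " matchstr() returns '' when the line contains no date.
    let found = matchstr(getline(lnum), '\d\{4}-\d\{2}-\d\{2}')
    if !empty(found) && found <# a:date
      call add(result, lnum)
    endif
  endfor
  return result
endfunction
```

For example, :echo LinesWithDateBefore('2014-02-24') lists the matching line numbers of the current buffer.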
If you really want a regular expression, then try this:
" Construct a pattern that matches a formatted date string if and only if the
" date is less than the input date. Usage:
" :echo '2014-02-24' =~ DateLessRE('2014-03-12')
function! DateLessRE(date)
  let init = ''
  let branches = []
  for c in split(a:date, '\zs')
    if c =~ '[1-9]'
      call add(branches, init . '[0-' . (c-1) . ']')
    endif
    let init .= c
  endfor
  return '\d\d\d\d-\d\d-\d\d\&\%(' . join(branches, '\|') . '\)'
endfun
Does that count as a "simple" regex? One way to use it would be to type :g/ and then CTRL-R and = and then DateLessRE('2014-02-24') and Enter, followed by the rest of your command. In other words,
:g/<C-R>=DateLessRE('2014-02-24')<CR>/s/foo/bar
EDIT: I added a concat (:help /\&) that matches a complete "formatted date string". Now, there is no need to anchor the pattern.
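To illustrate what the \& does, here is a small sketch using a hand-written pattern of the same shape (pat is an illustrative variable, with only one of the branches that DateLessRE would generate). Both concats must match at the same position, and the text actually matched is that of the last concat.

```vim
" First concat: a complete date. Second concat: one "smaller prefix" branch.
let pat = '\d\d\d\d-\d\d-\d\d\&2014-02-[0-1]'

" Both concats match at column 0; the match is the last concat's text.
echo matchstr('2014-02-17', pat)   " 2014-02-1

" An incomplete date fails the first concat, so there is no match.
echo '2014-02-1' =~ pat            " 0
```

Since only the prefix is matched, this suits :g and =~ tests; a :s on the pattern itself would replace just that prefix rather than the whole date.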