So I need to get hours, minutes and seconds out of entries like these:
The first two are hours, minutes and seconds. The next two are minutes and seconds. The last two are just seconds.
I came up with this regexp, which works:
\A(?<hours>\d{1,2})(?::|\.)(?<minutes>\d{1,2})(?::|\.)(?<seconds>\d{1,2})\z|\A(?<minutes>\d{1,2})(?::|\.)(?<seconds>\d{1,2})\z|\A(?<seconds>\d{1,2})\z
But it is ugly, and I want to refactor it so that it is no longer three separate expressions (mostly just to learn). I tried this:
\A(?:(?<hours>\d{1,2})(?::|\.){0,1})(?:(?<minutes>\d{1,2})(?::|\.){0,1})(?:(?<seconds>\d{1,2}){0,1})\z
But that does not work: minutes and seconds sometimes get screwed up. My brain is hurting, and I can't figure out what I am doing wrong.
My suggestion:
(?:(?:(?<hh>\d{1,2})[:.])?(?<mm>\d{1,2})[:.])?(?<ss>\d{1,2})
structured:
(?:                  # group 1 (non-capturing)
  (?:                # group 2 (non-capturing)
    (?<hh>\d{1,2})   # hours
    [:.]             # delimiter
  )?                 # end group 2, make optional
  (?<mm>\d{1,2})     # minutes
  [:.]               # delimiter
)?                   # end group 1, make optional
(?<ss>\d{1,2})       # seconds (required)
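To see why the nesting matters, here is a minimal sketch, assuming Ruby (which the `\A`/`\z` anchors in the question suggest). It puts the nested pattern, written in extended mode so the comments can stay, next to the flat attempt from the question. The constant names and the `parts` helper are made up for illustration. In the flat version hours and minutes are mandatory and only the delimiters and seconds are optional, so a two-part input fills hours and minutes instead of minutes and seconds.

```ruby
# Nested pattern from the answer, in extended mode (x flag) so the layout
# and comments can stay in the source. \A and \z anchors are added here.
NESTED_RE = /
  \A
  (?:                  # group 1 (non-capturing)
    (?:                # group 2 (non-capturing)
      (?<hh>\d{1,2})   # hours
      [:.]             # delimiter
    )?                 # end group 2, make optional
    (?<mm>\d{1,2})     # minutes
    [:.]               # delimiter
  )?                   # end group 1, make optional
  (?<ss>\d{1,2})       # seconds (required)
  \z
/x

# Flat attempt from the question, unchanged.
FLAT_RE = /\A(?:(?<hours>\d{1,2})(?::|\.){0,1})(?:(?<minutes>\d{1,2})(?::|\.){0,1})(?:(?<seconds>\d{1,2}){0,1})\z/

# Hypothetical helper: returns the named captures in pattern order
# (nil for a group that did not take part), or nil if nothing matched.
def parts(regex, str)
  m = regex.match(str)
  m && m.captures
end

p parts(NESTED_RE, "1:02:03")  # => ["1", "02", "03"]
p parts(NESTED_RE, "12.34")    # => [nil, "12", "34"]
p parts(NESTED_RE, "7")        # => [nil, nil, "7"]

p parts(FLAT_RE, "12.34")      # => ["12", "34", nil]  (read as hours and minutes)
p parts(FLAT_RE, "45")         # => ["4", "5", nil]    (one number split in two)
p parts(FLAT_RE, "7")          # => nil                (no match at all)
```

Making the whole "number plus delimiter" unit optional, rather than just the delimiter, is what lets the leftmost parts drop out first.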
If you wish, you can wrap the regex in delimiters, like word boundaries (\b) or string anchors (^ and $).
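Which delimiter to pick depends on the input. The sketch below, again assuming Ruby, wraps the same core pattern three ways; `CORE`, `WHOLE`, `LINES` and `IN_TEXT` are names invented for this example. One thing worth knowing in Ruby specifically: `^` and `$` match at line boundaries, so `\A` and `\z` (as used in the question) are the stricter choice when the whole string must be a time.

```ruby
# Core pattern from the answer, without any delimiters.
CORE = /(?:(?:(?<hh>\d{1,2})[:.])?(?<mm>\d{1,2})[:.])?(?<ss>\d{1,2})/

WHOLE   = /\A#{CORE}\z/  # the whole string must be a time
LINES   = /^#{CORE}$/    # in Ruby: some line of the string must be a time
IN_TEXT = /\b#{CORE}\b/  # a time somewhere inside longer text

p WHOLE.match?("1:02:03")          # => true
p WHOLE.match?("lap 1:02:03 done") # => false
p "lap 1:02:03 done"[IN_TEXT]      # => "1:02:03"

# Word boundaries do not reject trailing delimiters, string anchors do.
p "1:2:3:4"[IN_TEXT]               # => "1:2:3"
p WHOLE.match?("1:2:3:4")          # => false

# Line anchors accept a string with extra lines, \A and \z do not.
p LINES.match?("7\njunk")          # => true
p WHOLE.match?("7\njunk")          # => false
```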
EDIT: Thinking about it, you can restrict that further to capture only times that make sense. Use [0-5]?\d in place of \d{1,2} to capture values between 0 and 59 only, where appropriate (seconds and minutes).
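As a rough illustration of that restriction, assuming Ruby and `\A`/`\z` anchors: `STRICT_RE` and `to_seconds` are hypothetical names, and hours are deliberately left as `\d{1,2}`, since the range limit only applies to minutes and seconds.

```ruby
# Minutes and seconds limited to 0..59; hours left unrestricted.
STRICT_RE = /\A(?:(?:(?<hh>\d{1,2})[:.])?(?<mm>[0-5]?\d)[:.])?(?<ss>[0-5]?\d)\z/

# Hypothetical helper: total number of seconds, or nil if the input is
# not a valid time. Missing parts are nil, and nil.to_i is 0.
def to_seconds(str)
  m = STRICT_RE.match(str)
  return nil unless m
  m[:hh].to_i * 3600 + m[:mm].to_i * 60 + m[:ss].to_i
end

p to_seconds("1:02:03")  # => 3723
p to_seconds("12.34")    # => 754
p to_seconds("7")        # => 7
p to_seconds("1:60:00")  # => nil (60 minutes is out of range)
p to_seconds("75")       # => nil (75 seconds is out of range)
```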