I have data for tasks that were recorded with a time sheet app. I'm trying to parse the breaks for each task.
An example break string attached to a task can look like this:
1:19pm – 10:33pm ate tacos 10:35pm – 11:38pm 12:40am – 1:24am took a nap
I need to group this into time stamps with their associated descriptions. The above should be grouped like:
1:19pm – 10:33pm ate tacos
10:35pm – 11:38pm
12:40am – 1:24am took a nap
The description for a break interval can have basically any characters or be any length. Some intervals don't have descriptions.
I figure regex would be the simplest way to get an array of intervals with their descriptions (if they have one).
So far I have:
\d{1,2}:\d{2}[ap]m\s–\s\d{1,2}:\d{2}[ap]m
which matches the time stamps 1:19pm – 10:33pm, 10:35pm – 11:38pm, and 12:40am – 1:24am.
I am using JavaScript and the match function to parse this data. I want to make a regular expression that will match the time stamp and everything that follows it until the next time stamp.
I'm a beginner with regex, so go easy on me. I've been at this for hours: I've watched several videos, read tutorial blogs, and experimented with regex101. Anchors and lookaheads/lookbehinds are confusing, and I can't seem to get anything to do what I want. I'm not looking to become an expert in writing regular expressions, but I would really like to learn something new that I can apply directly to what I'm doing.
You can use the following regex:
(\d{1,2}:\d{2}[ap]m\s*–\s*\d{1,2}:\d{2}[ap]m)(\D*(?:\d(?!\d?:\d{2}[ap]m\s)\D*)*)
The problem you face is matching a text that does not match a specific pattern. This can be achieved either with a tempered greedy token or an unroll-the-loop technique. The latter is preferable since it involves less backtracking. My regex is based on that technique.
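To make the two techniques concrete, here is a rough sketch that runs both variants on the sample string from the question. The tempered greedy token version is my own reconstruction for comparison (it is not taken from the answer's regex), and the helper name collect is purely illustrative. Both are expected to produce the same groups; the difference is only in how often the lookahead is evaluated.

```javascript
// Sample input from the question; the separator is an en dash, not a hyphen.
const str = '1:19pm – 10:33pm ate tacos 10:35pm – 11:38pm 12:40am – 1:24am took a nap';

// Tempered greedy token: the negative lookahead is checked before EVERY character.
const tempered = /(\d{1,2}:\d{2}[ap]m\s*–\s*\d{1,2}:\d{2}[ap]m)((?:(?!\d{1,2}:\d{2}[ap]m\s)[\s\S])*)/gi;

// Unrolled loop: non-digits are consumed in bulk with \D*, and the lookahead
// is only checked when a digit is met.
const unrolled = /(\d{1,2}:\d{2}[ap]m\s*–\s*\d{1,2}:\d{2}[ap]m)(\D*(?:\d(?!\d?:\d{2}[ap]m\s)\D*)*)/gi;

// Hypothetical helper: collect [period, description] pairs for a given regex.
function collect(re) {
  return Array.from(str.matchAll(re), m => [m[1], m[2].trim()]);
}

const temperedResult = collect(tempered);
const unrolledResult = collect(unrolled);

console.log(unrolledResult);
console.log(JSON.stringify(temperedResult) === JSON.stringify(unrolledResult));
```

Note that matchAll requires the g flag and a reasonably recent JavaScript engine; a while loop with exec works the same way in older environments.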
Here is the regex explanation:
(\d{1,2}:\d{2}[ap]m\s*–\s*\d{1,2}:\d{2}[ap]m) - matches and captures the time period into Group #1 (I just added outer parentheses and the * quantifiers to the \s classes). As it is your regex, I won't go into detail.
(\D*(?:\d(?!\d?:\d{2}[ap]m\s)\D*)*) - this is an unrolled .*?(?=\d{1,2}:\d{2}[ap]m\s) construct matching anything up to the first \d{1,2}:\d{2}[ap]m\s pattern. It is placed in Group #2.
  \D* - 0 or more characters other than a digit
  (?:\d(?!\d?:\d{2}[ap]m\s)\D*)* - 0 or more sequences of...
    \d(?!\d?:\d{2}[ap]m\s) - a digit (\d) that is not followed by 1 or 0 digits, then :, then 2 digits, then a or p, then m, and then a whitespace
    \D* - again, 0 or more characters other than a digit.
JS demo:
var re = /(\d{1,2}:\d{2}[ap]m\s*–\s*\d{1,2}:\d{2}[ap]m)(\D*(?:\d(?!\d?:\d{2}[ap]m\s)\D*)*)/ig;
var str = '1:19pm – 10:33pm ate tacos 10:35pm – 11:38pm 12:40am – 1:24am took a nap';
var m;
while ((m = re.exec(str)) !== null) {
  document.getElementById("r").innerHTML += "Period: " + m[1] + "<br/>";
  document.getElementById("r").innerHTML += "Description: " + m[2] + "<br/><br/>";
}
<div id="r"/>
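Since the question asks for an array of intervals with their descriptions, here is a minimal DOM-free sketch built on the same unrolled pattern. The function name parseBreaks and the start/end/description property names are my own choices for illustration, and I split the period into two capture groups, which the regex above does not do.

```javascript
// parseBreaks is a hypothetical helper name, not part of any library.
function parseBreaks(breakString) {
  // Same unrolled pattern as above, with the period split into start and end groups.
  const re = /(\d{1,2}:\d{2}[ap]m)\s*–\s*(\d{1,2}:\d{2}[ap]m)(\D*(?:\d(?!\d?:\d{2}[ap]m\s)\D*)*)/gi;
  const breaks = [];
  let m;
  while ((m = re.exec(breakString)) !== null) {
    // Trim the whitespace that separates the description from the time stamps.
    breaks.push({ start: m[1], end: m[2], description: m[3].trim() });
  }
  return breaks;
}

const sample = '1:19pm – 10:33pm ate tacos 10:35pm – 11:38pm 12:40am – 1:24am took a nap';
console.log(parseBreaks(sample));

// A digit inside a description is kept, because it does not start a time stamp.
console.log(parseBreaks('1:19pm – 10:33pm ate 2 tacos 10:35pm – 11:38pm'));
```

Intervals without a description come back with an empty string, so you can test for that when displaying them.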
I'm sure this can be simplified, but the following regular expression seems to work:
/(\d{1,2}:\d{2}[ap]m\s–\s\d{1,2}:\d{2}[ap]m(?:.(?!\d{1,2}:\d{2}[ap]m))*)/g
var input = '1:19pm – 10:33pm ate tacos 10:35pm – 11:38pm 12:40am – 1:24am took a nap';
var matches = input.match(/(\d{1,2}:\d{2}[ap]m\s–\s\d{1,2}:\d{2}[ap]m(?:.(?!\d{1,2}:\d{2}[ap]m))*)/g);
for (var i = 0; i < matches.length; i++) {
  snippet.log(matches[i]);
}
<script src="http://tjcrowder.github.io/simple-snippets-console/snippet.js"></script>
Output:
1:19pm – 10:33pm ate tacos
10:35pm – 11:38pm
12:40am – 1:24am took a nap
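If you want the period and the description separately rather than as one string, a possible variation (my own sketch, not part of the regex above) is to wrap the two halves in their own capture groups and read them with matchAll. The property names period and description are just illustrative.

```javascript
const input = '1:19pm – 10:33pm ate tacos 10:35pm – 11:38pm 12:40am – 1:24am took a nap';

// Group 1: the time period. Group 2: every following character that is
// not immediately followed by the start of another time stamp.
const re = /(\d{1,2}:\d{2}[ap]m\s–\s\d{1,2}:\d{2}[ap]m)((?:.(?!\d{1,2}:\d{2}[ap]m))*)/g;

const pairs = Array.from(input.matchAll(re), m => ({
  period: m[1],
  description: m[2].trim() // drop the space between period and description
}));

console.log(pairs);
```

Keep in mind that . does not match line breaks by default, so this assumes each break string sits on a single line.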
Hope this helps:
https://regex101.com/r/dV7vY5/1
(\d{1,2}:\d{2}[ap]m) – (\d{1,2}:\d{2}[ap]m)([\s|a-z|A-Z]+)
output:
1:19pm – 10:33pm ate tacos
10:35pm – 11:38pm
12:40am – 1:24am took a nap
and you can access each captured group:
$1 - first hour (1:19pm)
$2 - second hour (10:33pm)
$3 - string ( ate tacos)
example below:
var string = '1:19pm – 10:33pm ate tacos 10:35pm – 11:38pm 12:40am – 1:24am took a nap';
var regex = /(\d{1,2}:\d{2}[ap]m) – (\d{1,2}:\d{2}[ap]m)([\s|a-z|A-Z]+)/gi;
// With the g flag, match() returns the full text of every match.
var matches = string.match(regex);
for (var i = 0; i < matches.length; i++) {
  snippet.log(matches[i]);
  // Re-apply the regex to each match to pull out the individual groups.
  snippet.log('period : ' + matches[i].replace(regex, '$1') + ' - ' + matches[i].replace(regex, '$2'));
  snippet.log('description : ' + matches[i].replace(regex, '$3'));
}
<script src="http://tjcrowder.github.io/simple-snippets-console/snippet.js"></script>
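One caveat worth checking before relying on this pattern: inside a character class, | is a literal pipe rather than an alternation, and the class only allows whitespace and letters. The sketch below is my own quick check, with illustrative variable names; it suggests that a description containing a digit gets cut off at that digit.

```javascript
// Inside [...], the pipe is an ordinary character, not "or".
const withPipes = /^[\s|a-z|A-Z]+$/;
const withoutPipes = /^[\sa-zA-Z]+$/;

const pipeAccepted = withPipes.test('a|b');    // the original class accepts a literal |
const pipeRejected = !withoutPipes.test('a|b'); // the class without pipes does not

// Same regex as in the answer above, without the g flag so exec() starts from index 0.
const thirdAnswer = /(\d{1,2}:\d{2}[ap]m) – (\d{1,2}:\d{2}[ap]m)([\s|a-z|A-Z]+)/i;
const m = thirdAnswer.exec('1:19pm – 10:33pm ate 2 tacos');

// The description group stops at the first character outside the class (the digit).
const truncated = m[3];

console.log(pipeAccepted, pipeRejected);
console.log(JSON.stringify(truncated));
```

So this approach is fine when descriptions are known to contain only letters and spaces, but the question says they can contain basically any characters.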