
Parsing a string for dates in PHP


Given an arbitrary string (for example, "I'm going to play croquet next Friday" or "Gadzooks, is it 17th June already?"), how would you go about extracting the dates from it?

If this looks like a good candidate for the too-hard basket, perhaps you could suggest an alternative. I want to be able to parse Twitter messages for dates. The tweets I'd be looking at would be ones that users direct at this service, so they could be coached into using an easier format; however, I'd like it to be as transparent as possible. Is there a good middle ground you can think of?

nickf asked Jun 16 '10 13:06



2 Answers

If you have the horsepower, you could try the following algorithm. I'm showing an example, and leaving the tedious work up to you :)

//Attempt to perform strtotime() on each contiguous subset of words...

//1st iteration
strtotime("Gadzooks, is it 17th June already")
strtotime("is it 17th June already")
strtotime("it 17th June already")
strtotime("17th June already")
strtotime("June already")
strtotime("already")

//2nd iteration
strtotime("Gadzooks, is it 17th June")
strtotime("is it 17th June")
strtotime("it 17th June")
strtotime("17th June") //date!
strtotime("June") //date!

//3rd iteration
strtotime("Gadzooks, is it 17th")
strtotime("is it 17th")
strtotime("it 17th")
strtotime("17th") //date!

//4th iteration
strtotime("Gadzooks, is it")
//etc

We can then assume that the longer match is the more specific one: strtotime("17th June") tells you more than strtotime("17th") simply because it contains more words, just as "next Friday" is more specific than "Friday".
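As a rough illustration of the idea above, here is a minimal sketch that tries every contiguous run of words and keeps the ones strtotime() accepts, longest runs first. The function name findDatePhrases() is hypothetical, and stripping punctuation from the edges of each phrase is my own assumption rather than part of the answer.

```php
<?php
// Hypothetical helper (not from the answer above): returns every contiguous
// run of words that strtotime() accepts, longest runs first.
function findDatePhrases(string $text): array
{
    // Split on whitespace; PREG_SPLIT_NO_EMPTY drops empty tokens.
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $count = count($words);
    $matches = [];

    // Window length runs from the whole sentence down to a single word.
    for ($length = $count; $length >= 1; $length--) {
        for ($start = 0; $start + $length <= $count; $start++) {
            $phrase = implode(' ', array_slice($words, $start, $length));
            // Assumption: trim punctuation at the edges so that "already?"
            // or "Gadzooks," is tested without it.
            $phrase = trim($phrase, " \t.,;:!?\"'()");
            if ($phrase !== '' && strtotime($phrase) !== false) {
                $matches[] = $phrase;
            }
        }
    }

    // Trimming can produce the same phrase twice, so de-duplicate.
    return array_values(array_unique($matches));
}

print_r(findDatePhrases('Gadzooks, is it 17th June already?'));
```

Expect some false positives: ordinary words such as "may" or "sat" parse on their own, so you would probably want to filter or rank the candidates rather than trust every hit.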

Dolph answered Sep 23 '22 19:09


I would do it this way:

First check if the entire string is a valid date with strtotime(). If so, you're done.

If not, determine how many words are in your string (split on whitespace for example). Let this number be n.

Loop over every n-1 word combination and use strtotime() to see if the phrase is a valid date. If so you've found the longest valid date string within your original string.

If not, loop over every n-2 word combination and use strtotime() to see if the phrase is a valid date. If so you've found the longest valid date string within your original string.

...and so on until you've found a valid date string or searched every single/individual word. By finding the longest matches, you'll get the most informed dates (if that makes sense). Since you're dealing with tweets, your strings will never be huge.
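The steps above could be sketched roughly as follows. This is only an illustration: the function name findLongestDate() and the optional $now parameter (a base timestamp for relative phrases such as "next Friday") are my own additions, and punctuation handling is left out to keep it short.

```php
<?php
// Hypothetical helper: returns the longest run of words that strtotime()
// accepts, together with its timestamp, or null when nothing parses.
function findLongestDate(string $text, ?int $now = null): ?array
{
    // Base timestamp for relative phrases; defaults to the current time.
    $now = $now ?? time();
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $n = count($words);

    // Length n is the "entire string" check; then n-1, n-2, ... down to 1.
    for ($length = $n; $length >= 1; $length--) {
        for ($start = 0; $start + $length <= $n; $start++) {
            $phrase = implode(' ', array_slice($words, $start, $length));
            $timestamp = strtotime($phrase, $now);
            if ($timestamp !== false) {
                // First hit at this length is the longest valid phrase.
                return ['phrase' => $phrase, 'timestamp' => $timestamp];
            }
        }
    }

    return null;
}

$result = findLongestDate("I'm going to play croquet next Friday");
if ($result !== null) {
    echo $result['phrase'], ' => ', date('Y-m-d', $result['timestamp']), "\n";
}
```

For n words this makes at most n(n+1)/2 calls to strtotime(), which is negligible for tweet-sized input. Returning on the first hit means only one date is found per string; if a tweet may contain several, you would need to keep searching the remaining words.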

Scott Saunders answered Sep 22 '22 19:09