I was wondering if I could somehow detect a date in a string and convert it into a standard date format.
Let's consider the input strings below:
Company registered on 16 March 2003
or
Activity between 10 May 2006 an 10 July 2008 - no changes.
Now I would like a PHP function that I can apply to these strings to get the dates as YYYY-mm-dd.
Example:
$date = DateExtract($string1); // output: 2003-03-16
$date = DateExtract($string2); // output: ['2006-05-10','2008-07-10']
To find a two-digit day that is not part of a longer number, the regexp would be (?<![0-9])[0-9]{2}(?![0-9])
The same idea can be applied to a four-digit year, and for the month you can use a hard-coded string search.
The code below matches day, month and year in one pattern:
<?php
$string = "Activity between 10 May 2006 an 10 July 2008 - no changes.";
preg_match_all('/(\d{1,2}) (\w+) (\d{4})/', $string, $matches);
print_r($matches);
?>
Output:
Array
(
    [0] => Array
        (
            [0] => 10 May 2006
            [1] => 10 July 2008
        )

    [1] => Array
        (
            [0] => 10
            [1] => 10
        )

    [2] => Array
        (
            [0] => May
            [1] => July
        )

    [3] => Array
        (
            [0] => 2006
            [1] => 2008
        )

)
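The lookaround pattern mentioned at the start of this answer is not used in the code above, so here is a minimal sketch of it. It assumes two-digit days and full English month names; the variable names ($days, $years, $monthNames, $months) are only illustrative.

```php
<?php
$string = "Activity between 10 May 2006 an 10 July 2008 - no changes.";

// Exactly two digits with no digit directly before or after (candidate days).
preg_match_all('/(?<![0-9])[0-9]{2}(?![0-9])/', $string, $days);

// The same lookarounds around four digits (candidate years).
preg_match_all('/(?<![0-9])[0-9]{4}(?![0-9])/', $string, $years);

// Hard-coded month names, matched as whole words.
$monthNames = array('January', 'February', 'March', 'April', 'May', 'June',
    'July', 'August', 'September', 'October', 'November', 'December');
preg_match_all('/\b(?:' . implode('|', $monthNames) . ')\b/', $string, $months);

print_r($days[0]);   // 10, 10
print_r($years[0]);  // 2006, 2008
print_r($months[0]); // May, July
```

Because the three searches run independently, nothing ties a given day to its month and year, and a single-digit day such as "5 May" is not found by the two-digit pattern. That is why the combined pattern above is usually the more practical choice.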
To find a complete date in a string you can use the pattern below.
It works for short month names like Jan as well as complete names like January.
Code:
<?php
$string = "Activity between 10 May 2006 an 10 July 2008 - no changes.";
preg_match_all('/(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}\D?)?\D?((19[7-9]\d|20\d{2})|\d{2})/', $string, $complete);
print_r($complete);
?>
Result:
Array
(
    [0] => Array
        (
            [0] => 10 May 2006
            [1] => 10 July 2008
        )

    [1] => Array
        (
            [0] => 10
            [1] => 10
        )

    [2] => Array
        (
            [0] =>
            [1] =>
        )

    [3] => Array
        (
            [0] => 20
            [1] => 20
        )

    [4] => Array
        (
            [0] => 06
            [1] => 08
        )

    [5] => Array
        (
            [0] =>
            [1] =>
        )

)
So you can fetch the complete dates from index 0 of the result and convert them into a standard date format.
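The conversion step is not shown above, so here is a minimal sketch of one way to do it. It assumes the full matches are strings that strtotime() understands; the $found array is a hypothetical stand-in for index 0 of the result.

```php
<?php
// Hypothetical sample: the full matches as listed under index 0 of the result.
$found = array('10 May 2006', '10 July 2008');

$standard = array();
foreach ($found as $text) {
    $timestamp = strtotime($text);       // false when the text cannot be parsed
    if ($timestamp !== false) {
        $standard[] = date('Y-m-d', $timestamp);
    }
}

print_r($standard); // 2006-05-10, 2008-07-10
```

Checking for false matters because date() would otherwise format the failed value as 1970-01-01.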
Rahul Dambare
Tricky. One approach might be to reason that dates always appear after certain grammatical words, as shown in your examples, e.g. "between", "on" etc. Using such words as a beginning anchor, we would then match until we find what we can reasonably assume to be the end of the date string. Here's what I hacked together:
//some strings
$strs = [
"Company was in business between 14 March 2008 and 21 November 2012 inclusive",
"I was born on 29 May 1980, 17:37 - it was a Thursday",
"The big bang did not occur at 2pm, 14 Jun 1971, that's for sure."
];
//container to store possible date matches from strings
$possible_dates = array();
//prep words that commonly precede a date, plus month names in long and short forms, to be used in matching
$date_prefix_words = array('between', 'on', 'at', 'during', 'and');
$months = array('January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December');
$months_short = array_map(function($month) { return substr($month, 0, 3); }, $months);
//iterate over and search strings - convert times like 2pm to 14:00, as, if they appear before the date, e.g. string 3, it doesn't get parsed
foreach($strs as $str) {
$str = preg_replace_callback('/\b\d{1,2}[ap]m\b/', function($time) { return date('H:i', strtotime($time[0])); }, $str);
preg_match_all('/(?<=\b'.implode('\b( |:)|\b', $date_prefix_words).'\b( |:))(\d|am|pm| |,|\'|:|'.implode('|', $months).'|'.implode('|', $months_short).')+/i', $str, $matches);
if (count($matches)) $possible_dates = array_merge($possible_dates, $matches[0]);
}
//output before and after results
foreach($possible_dates as &$pd) {
$pd = preg_replace('/, ?$/', '', $pd);
echo '<p>Before: '.$pd.'<br />After: '.date('Y-m-d', strtotime($pd)).'</p>';
}
Clearly I'm making certain assumptions about your date formats, and you may need to tweak the REGEX, but it sort of works.
First of all, you have to extract each part of the date from the string separately.
First approach:
<?php
function standard_date_format($str) {
    // capture day, month name and four-digit year for every date in the string
    preg_match_all('/(\d{1,2}) (\w+) (\d{4})/', $str, $matches, PREG_SET_ORDER);
    $all_months = array('January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December');
    $results = array();
    foreach ($matches as $m) {
        $month = array_search($m[2], $all_months);
        if ($month === false) { continue; } // skip words that are not full month names
        // zero-pad month and day so the result is always YYYY-mm-dd
        $results[] = sprintf('%04d-%02d-%02d', $m[3], $month + 1, $m[1]);
    }
    return $results;
}
$str1 = "Company registered on 16 March 2003";
$str2 = "Activity between 10 May 2006 an 10 July 2008 - no changes.";
print_r(standard_date_format($str1)); // Array ( [0] => 2003-03-16 )
print_r(standard_date_format($str2)); // Array ( [0] => 2006-05-10 [1] => 2008-07-10 )
Second approach:
<?php
function standard_date_format($str) {
preg_match_all('/(\d{1,2}) (\w+) (\d{4})/', $str, $matches);
$dates = array_map("strtotime", $matches[0]);
$result = array_map(function($v) {return date("Y-m-d", $v); }, $dates);
return $result;
}
$str1 = "Company registered on 16 March 2003";
$str2 = "Activity between 10 May 2006 an 10 July 2008 - no changes.";
print_r(standard_date_format($str1)); // Array ( [0] => 2003-03-16 )
print_r(standard_date_format($str2)); // Array ( [0] => 2006-05-10 [1] => 2008-07-10 )
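Both approaches always return an array, while the question asked for a DateExtract function that gives a single string for one date and an array for several. The following is only a sketch of such a wrapper built on the second approach. The return convention (null for no date, string for one, array for more) is an assumption, not something the question specifies.

```php
<?php
// DateExtract is the name used in the question; the return convention is assumed.
function DateExtract($str) {
    // day, alphabetic month name, four-digit year, separated by single spaces
    preg_match_all('/\b(\d{1,2}) ([A-Za-z]+) (\d{4})\b/', $str, $matches);

    $dates = array();
    foreach ($matches[0] as $candidate) {
        $timestamp = strtotime($candidate);
        if ($timestamp !== false) {      // ignore text strtotime cannot parse
            $dates[] = date('Y-m-d', $timestamp);
        }
    }

    if (count($dates) === 0) {
        return null;
    }
    return count($dates) === 1 ? $dates[0] : $dates;
}

var_dump(DateExtract("Company registered on 16 March 2003"));
var_dump(DateExtract("Activity between 10 May 2006 an 10 July 2008 - no changes."));
```

Returning different types depending on the number of matches mirrors the example in the question, but callers then have to check the type. Always returning an array, as the two approaches above do, is often simpler to work with.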