
Extracting date from a string in Python

How can I extract the date from a string like "monkey 2010-07-10 love banana"? Thanks!

dmpop asked Jul 18 '10 15:07



People also ask

How do I extract date and time from text in Python?

The Python datefinder module can locate dates in a body of text. Using find_dates(), you can search text for many different date formats; datefinder returns each date it finds as a datetime object.

How do I convert a string to a date?

strptime() converts a date and time string to a datetime object. The first parameter is the string and the second is the format specifier. One advantage of converting to a datetime is that you can then access the year, month, day, or time individually.
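As a minimal sketch of the conversion described above, using only the standard library (the sample strings and variable names are illustrative assumptions, not part of the original question):

```python
from datetime import datetime

# Parse a date string with an explicit format specifier.
parsed = datetime.strptime("2010-07-10", "%Y-%m-%d")

# Individual components are available as attributes.
year, month, day = parsed.year, parsed.month, parsed.day

# A string that does not match the format raises ValueError.
try:
    datetime.strptime("10/07/2010", "%Y-%m-%d")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

Note that strptime() expects the whole string to match the format, so it cannot pick a date out of surrounding text on its own.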


2 Answers

Using python-dateutil:

In [1]: import dateutil.parser as dparser

In [18]: dparser.parse("monkey 2010-07-10 love banana", fuzzy=True)
Out[18]: datetime.datetime(2010, 7, 10, 0, 0)

Invalid dates raise a ValueError:

In [19]: dparser.parse("monkey 2010-07-32 love banana", fuzzy=True)
# ValueError: day is out of range for month

It can recognize dates in many formats:

In [20]: dparser.parse("monkey 20/01/1980 love banana", fuzzy=True)
Out[20]: datetime.datetime(1980, 1, 20, 0, 0)

Note that it makes a guess if the date is ambiguous:

In [23]: dparser.parse("monkey 10/01/1980 love banana", fuzzy=True)
Out[23]: datetime.datetime(1980, 10, 1, 0, 0)

But the way it parses ambiguous dates is customizable:

In [21]: dparser.parse("monkey 10/01/1980 love banana", fuzzy=True, dayfirst=True)
Out[21]: datetime.datetime(1980, 1, 10, 0, 0)
unutbu answered Oct 13 '22 06:10



If the date is given in a fixed format, you can use a regular expression to extract it and datetime.datetime.strptime to parse it:

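import re
from datetime import datetime

text = "monkey 2010-07-10 love banana"  # example input from the question
# Find the first YYYY-MM-DD substring, then parse it into a date object
match = re.search(r'\d{4}-\d{2}-\d{2}', text)
date = datetime.strptime(match.group(), '%Y-%m-%d').date()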

Otherwise, if the date is given in an arbitrary form, you can't extract it easily.
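For the fixed-format case, a slightly more defensive sketch might handle text with no date, several dates, or a date-shaped string that is not a real date. The helper name extract_dates and the DATE_PATTERN constant are hypothetical, introduced here only for illustration:

```python
import re
from datetime import datetime

# Matches YYYY-MM-DD shaped substrings; validity is checked by strptime below.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_dates(text):
    """Return every valid YYYY-MM-DD date found in text, in order."""
    found = []
    for match in DATE_PATTERN.finditer(text):
        try:
            found.append(datetime.strptime(match.group(), "%Y-%m-%d").date())
        except ValueError:
            # Shaped like a date but not a real one (e.g. 2010-07-32): skip it.
            continue
    return found
```

Calling extract_dates("monkey 2010-07-10 love banana") returns a list with a single date object, and an empty list means no valid date was found, which avoids calling .group() on a failed match.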

lunaryorn answered Oct 13 '22 06:10
