
Spacy not Recognizing Date Properly

import spacy

nlp = spacy.load('en_core_web_md')
text = " Activity Date: 12/18/2019 06:00:00AM CST "
doc = nlp(text)
# Print the label and text of every entity the model found
for entity in doc.ents:
    print(entity.label_ + " " + entity.text)

Here spaCy does not extract the date and time. I also tried the 'en' and 'en_core_web_lg' models.

I also noticed that when I change the date format to DD/MM/YYYY, spaCy does recognize the date:

# Reuses the nlp pipeline loaded above
text = " 18/12/2019"
doc = nlp(text)
for entity in doc.ents:
    print(entity.label_ + " " + entity.text)

Has anyone encountered the same problem?

Mayank sharma asked Dec 05 '25 15:12


2 Answers

spaCy employs statistical models to identify named entities in natural language. This means that it predicts, based on the surrounding context, whether a span of text is an entity of a certain type (such as a date, a person or an organisation), so it can miss entities that appear in an unusual format or with little context.

You can improve the chance that a date is recognized correctly in two ways. First, make sure more contextual clues are included in the text surrounding the date, for example: The activity occurred on 12/18/2019 at 06:00:00AM CST

Alternatively, you can train the spaCy model on your own dataset, feeding it examples of the date formats it needs to recognize. More info here: https://spacy.io/usage/training

However, your use case may be better suited to a regex approach, or even to parsing the dates with Python's datetime module. This has been done before; see for example: match dates using python regular expressions
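To illustrate the regex route, here is a minimal sketch using only the standard library. It assumes the dates always look like MM/DD/YYYY followed by a 12-hour time with AM/PM, as in the question; the pattern and the helper name extract_datetimes are my own and not part of spaCy or any other library. The trailing timezone abbreviation (CST) is ignored here, since strptime does not handle such abbreviations reliably.

```python
import re
from datetime import datetime

# Assumed format: MM/DD/YYYY, whitespace, HH:MM:SS, optional whitespace, AM/PM
DATE_TIME_RE = re.compile(
    r"\b(\d{1,2}/\d{1,2}/\d{4})\s+(\d{1,2}:\d{2}:\d{2})\s*([AP]M)\b",
    re.IGNORECASE,
)

def extract_datetimes(text):
    """Return a list of naive datetime objects found in text (hypothetical helper)."""
    results = []
    for date_part, time_part, meridiem in DATE_TIME_RE.findall(text):
        candidate = f"{date_part} {time_part} {meridiem.upper()}"
        try:
            results.append(datetime.strptime(candidate, "%m/%d/%Y %I:%M:%S %p"))
        except ValueError:
            # Matched the shape but is not a valid date (e.g. month 18); skip it
            continue
    return results

found = extract_datetimes(" Activity Date: 12/18/2019 06:00:00AM CST ")
print(found)
```

If your data mixes MM/DD and DD/MM orderings, a single pattern like this cannot tell them apart, so you would need to know the ordering per source.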

T. Altena answered Dec 07 '25 05:12


For my particular use case, I resolved it by using dateparser. You can check it out here: Dateparser

Mayank sharma answered Dec 07 '25 05:12


