I have a pretty simple problem: recognizing money/currency in text. Sample test case: "Pocket money should NOT exceed INR 4000 (USD 100) per annum." This fails on the default Stanford parser online demo (with the 7-class model, which includes Currency) at http://nlp.stanford.edu:8080/ner/process, which works only with text like "$ 100".
On the Alchemy demo site, https://alchemy-language-demo.mybluemix.net/ , "$ 100" is recognised as an Entity, while "USD 100" is recognised as a Concept (United States Dollar).
Not sure this is still useful after all this time, but here goes:
I think you have two options:
1) Replace "USD" with "$". This is a simple find and replace and can be done in any tool you're likely to be using.
2) Use a different tool or program.
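Option 1 could look something like the sketch below. It is only an illustration: the `CODE_TO_SYMBOL` mapping and the `normalize_currency_codes` helper are names made up here, and the mapping holds only the "USD" to "$" pair mentioned above. The spacing is kept as-is, so "USD 100" becomes "$ 100", the form the question reports as working.

```python
import re

# Hypothetical mapping from currency codes to symbols; only the USD -> $
# pair comes from the answer above. Add further codes at your own risk,
# since the downstream NER tool may not recognise their symbols.
CODE_TO_SYMBOL = {"USD": "$"}

# Match a known code only when it is followed by optional whitespace and a
# digit, so that longer words that merely start with "USD" are left alone.
_CODE_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, CODE_TO_SYMBOL)) + r")(?=\s*\d)"
)

def normalize_currency_codes(text: str) -> str:
    """Replace currency codes that precede a number with their symbols."""
    return _CODE_RE.sub(lambda m: CODE_TO_SYMBOL[m.group(1)], text)

text = "Pocket money should NOT exceed INR 4000 (USD 100) per annum."
normalized = normalize_currency_codes(text)
print(normalized)
# Pocket money should NOT exceed INR 4000 ($ 100) per annum.
```

The normalized string can then be passed to whichever NER tool you are using. "INR 4000" is left untouched here because no symbol is mapped for it.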
Stanford NLP is great, but there are also other tools available.
Depending on what system/language you are using, there are many packages that already do the job for you.
For Python I'd recommend spaCy:
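# pip install spacy
# python -m spacy download en_core_web_sm
import spacy
# Load the small English pipeline (tokenizer, tagger, parser and NER)
nlp = spacy.load("en_core_web_sm")
text = "Pocket money should NOT exceed INR 4000 (USD 100) per annum."
doc = nlp(text)
# Keep every token that the NER component tagged as part of a MONEY entity.
# Note that this does not filter by currency; it collects all MONEY tokens.
money_tokens = [token.text for token in doc if token.ent_type_ == "MONEY"]
print("Money tokens:", money_tokens)
# Output reported for this sentence: ['100'] (the exact result can vary with the model version)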
This is just a simple example; you can find a more detailed script here.
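If the amounts always appear as a currency code followed by a number, as in the sample sentence, a plain regular expression may already be enough, and it keeps the code together with the amount. This is a rough sketch rather than a replacement for a real NER model: the `CURRENCY_CODES` tuple and the `find_amounts` helper are hypothetical names, and the list of codes is deliberately short.

```python
import re
from typing import List, Tuple

# Hypothetical, non-exhaustive set of ISO 4217 currency codes to look for.
CURRENCY_CODES = ("USD", "INR", "EUR", "GBP")

# A code, optional whitespace, then a number with optional thousands
# separators (1,234) and an optional decimal part (.50).
_AMOUNT_RE = re.compile(
    r"\b(" + "|".join(CURRENCY_CODES) + r")\s*(\d+(?:,\d{3})*(?:\.\d+)?)"
)

def find_amounts(text: str) -> List[Tuple[str, float]]:
    """Return (currency_code, amount) pairs found in the text."""
    return [
        (code, float(number.replace(",", "")))
        for code, number in _AMOUNT_RE.findall(text)
    ]

text = "Pocket money should NOT exceed INR 4000 (USD 100) per annum."
pairs = find_amounts(text)
print(pairs)
# [('INR', 4000.0), ('USD', 100.0)]
```

This only catches the "CODE amount" shape. Symbols such as "$", amounts written in words, and codes placed after the number would need extra patterns or a trained model.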