I'm new to NLP and have recently been playing with NLTK and spaCy. However, I could not find a way to search for job titles (e.g. product manager, chief marketing officer) in an article.
For example, I have 1000 articles and I want to get all the articles that mention job titles I am interested in.
Also, what entity type do job titles fall under? I checked https://spacy.io/docs/usage/entity-recognition and did not see it there. Is there a plan to add it?
Thanks.
A "job title" entity is not supported by spaCy's NER out of the box, as Nathan also stated. But you can create a custom named entity for your use case. The official spaCy documentation has a step-by-step guide to training the NER.
You would need labeled data to train your NER. As a rough guide, you would need at least 4000-5000 examples for training and 2000 examples for testing. The more training data you have, the better the NER will perform.
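To keep the test examples separate from the training examples, one option is a shuffled split. This is only a minimal sketch in plain Python: `split_examples`, its `test_fraction` default and the fixed `seed` are hypothetical names and values chosen for illustration, not part of spaCy.

    ```python
    import random


    def split_examples(examples, test_fraction=0.3, seed=0):
        """Shuffle a copy of the labeled examples and split it into (train, test)."""
        shuffled = list(examples)              # copy, so the caller's list is untouched
        random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
        n_test = int(len(shuffled) * test_fraction)
        return shuffled[n_test:], shuffled[:n_test]


    # Placeholder items standing in for labeled examples.
    labeled = ['example %d' % i for i in range(10)]
    train, test = split_examples(labeled)
    print(len(train), len(test))
    ```

Shuffling before splitting avoids a test set made up only of the last articles you happened to label.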
Here is some sample training data.
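# Each entity is (start_char, end_char, label); end_char is exclusive.
TRAIN_DATA = [
    ('Who is Shaka Khan?', {
        'entities': [(7, 17, 'PERSON')]
    }),
    ('I like London and Berlin.', {
        'entities': [(7, 13, 'LOC'), (18, 24, 'LOC')]
    }),
    ('I work as software engineer.', {
        # 'software engineer' spans characters 10 to 27
        'entities': [(10, 27, 'JOBTITLE')]
    }),
]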
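Character offsets are easy to get wrong when counted by hand, so it may help to derive them from the phrase itself. The following is a rough sketch in plain Python; `find_span` and `annotate` are hypothetical helpers written for this example, not spaCy functions, and they only handle the first occurrence of each phrase.

    ```python
    def find_span(text, phrase, label):
        """Return (start, end, label) for the first occurrence of phrase in text."""
        start = text.find(phrase)
        if start == -1:
            raise ValueError('%r not found in %r' % (phrase, text))
        return (start, start + len(phrase), label)


    def annotate(text, phrases):
        """Build one training example from a list of (phrase, label) pairs."""
        entities = [find_span(text, phrase, label) for phrase, label in phrases]
        return (text, {'entities': entities})


    example = annotate('I work as software engineer.',
                       [('software engineer', 'JOBTITLE')])
    text, annotations = example
    for start, end, label in annotations['entities']:
        # Slicing the text with the offsets should give back the phrase.
        print(label, repr(text[start:end]))  # prints: JOBTITLE 'software engineer'
    ```

Slicing the text with each (start, end) pair is also a quick way to check hand-written annotations before training.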