How can you detect / find out the meaning (the expansion) of an acronym using NLP / Information Extraction (IE) methods?
We want to detect in free text whether a word or its acronym is used, and map both to the same entity / token.
Most papers available online are about medical acronyms, and they do not provide a library to accomplish this task.
Any ideas?
Reading your question and the comments, I understand that you want to create a mapping from an acronym to its expansion.
Assuming you have a collection of text documents in which both the acronym and its expansion occur, you can apply an algorithm to extract (acronym, expansion) pairs.
"A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text" by A. S. Schwartz and M. A. Hearst does exactly this by looking at patterns. The Java implementation is available here.
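To give a feel for how the pattern matching works, here is a minimal Python sketch in the spirit of that algorithm. It is not the authors' Java code: the function names (find_best_long_form, extract_pairs) and the regular expression are my own assumptions, and it only handles the "long form (SF)" pattern. The idea is to take a short form found in parentheses, look at the few words before it (at most min(|SF| + 5, 2 * |SF|) words), and match the short form's characters against that text from right to left, requiring the first character to start a word.

```python
import re

# Hypothetical pattern: a parenthesised candidate short form of 2-10 characters.
PAREN = re.compile(r"\(([^()]{2,10})\)")


def find_best_long_form(short_form, candidate):
    """Match short_form characters right to left inside candidate.

    Returns the shortest suffix of candidate (starting at a word boundary)
    that contains the characters in order, or None if there is no match.
    """
    s = len(short_form) - 1
    l = len(candidate) - 1
    while s >= 0:
        ch = short_form[s].lower()
        if not ch.isalnum():
            s -= 1
            continue
        # Move left until the character matches; the first character of the
        # short form must additionally sit at the start of a word.
        while (l >= 0 and candidate[l].lower() != ch) or (
            s == 0 and l > 0 and candidate[l - 1].isalnum()
        ):
            l -= 1
        if l < 0:
            return None
        l -= 1
        s -= 1
    start = candidate.rfind(" ", 0, l + 1) + 1
    return candidate[start:]


def extract_pairs(text):
    """Return (acronym, expansion) pairs for 'long form (SF)' patterns."""
    pairs = []
    for m in PAREN.finditer(text):
        short_form = m.group(1).strip()
        # Simplified validity check (assumption): starts with a letter or
        # digit and contains at least one letter.
        if not short_form or not short_form[0].isalnum():
            continue
        if not any(c.isalpha() for c in short_form):
            continue
        words = text[: m.start()].split()
        max_words = min(len(short_form) + 5, len(short_form) * 2)
        candidate = " ".join(words[-max_words:])
        long_form = find_best_long_form(short_form, candidate)
        if long_form:
            pairs.append((short_form, long_form))
    return pairs


if __name__ == "__main__":
    sample = ("We use natural language processing (NLP) and "
              "information extraction (IE) methods.")
    for acronym, expansion in extract_pairs(sample):
        print(acronym, "->", expansion)
```

Once you have such pairs, you can store them in a dictionary and replace either form with one canonical token. The sketch leaves out the paper's other pattern, "SF (long form)", and its additional filters.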
I applied this algorithm to the English Wikipedia; you can see the results here. I also applied it to a collection of Portuguese news articles; the results are here.