How to automatically detect the meaning / expansion of an acronym

How can you detect the meaning (the expansion) of an acronym using NLP / Information Extraction (IE) methods?

We want to detect in free text whether a word or its acronym is used, and map both to the same entity / token.

Most papers available online are about medical acronyms, and they do not provide a library to accomplish this task.

Any ideas?

Thorsten Niehues asked Nov 03 '14 14:11

1 Answer

Reading your question and the comments, I understand that you want to create a mapping from an acronym to its expansion.

Assuming you have a collection of text documents in which both the acronym and its expansion occur, you can apply an algorithm to extract (acronym, expansion) pairs.

"A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text" by A. S. Schwartz and M. A. Hearst does exactly this by looking at patterns in the text. The Java implementation is available here.
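To give an idea of what the pattern matching looks like, here is a minimal Python sketch loosely modelled on the approach in that paper. It is not the authors' implementation: it only handles the "long form (SF)" pattern, and the function names (`best_long_form`, `extract_pairs`, `is_valid_short_form`) and the exact validity checks are my own simplifications for illustration.

```python
import re

# Text inside a single pair of parentheses is a candidate short form.
PAREN = re.compile(r"\(([^()]+)\)")


def is_valid_short_form(short_form):
    """Simplified plausibility check (an assumption of this sketch)."""
    return (
        2 <= len(short_form) <= 10
        and short_form[0].isalnum()
        and any(c.isalpha() for c in short_form)
        and len(short_form.split()) <= 2
    )


def best_long_form(short_form, candidate):
    """Match the short form's characters right-to-left inside the candidate.

    Returns the shortest suffix of `candidate` (cut at a word boundary)
    that contains the characters in order, or None if there is no match.
    The first character of the short form must start a word.
    """
    s = len(short_form) - 1
    l = len(candidate) - 1
    while s >= 0:
        ch = short_form[s].lower()
        if not ch.isalnum():
            s -= 1
            continue
        # Move left until the character matches; for the first character
        # of the short form, also require that it begins a word.
        while (l >= 0 and candidate[l].lower() != ch) or (
            s == 0 and l > 0 and candidate[l - 1].isalnum()
        ):
            l -= 1
        if l < 0:
            return None
        l -= 1
        s -= 1
    start = candidate.rfind(" ", 0, l + 1) + 1
    return candidate[start:]


def extract_pairs(text):
    """Return (short form, long form) pairs found as 'long form (SF)'."""
    pairs = []
    for match in PAREN.finditer(text):
        short_form = match.group(1).strip()
        if not is_valid_short_form(short_form):
            continue
        words = text[: match.start()].split()
        # Search window before the parenthesis: min(|SF| + 5, |SF| * 2) words.
        max_words = min(len(short_form) + 5, len(short_form) * 2)
        candidate = " ".join(words[-max_words:])
        long_form = best_long_form(short_form, candidate)
        if long_form:
            pairs.append((short_form, long_form))
    return pairs


if __name__ == "__main__":
    sentence = (
        "We use natural language processing (NLP) and "
        "information extraction (IE) methods."
    )
    print(extract_pairs(sentence))
```

Once you have such pairs, you can put them in a dictionary and replace either form with a single canonical token. Keep in mind that this sketch ignores sentence boundaries and the reverse pattern "SF (long form)", so expect to add extra filtering for real data.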

I applied this algorithm to the English Wikipedia; you can see the results here. I also applied it to a collection of Portuguese news articles; the results are here.

David Batista answered Oct 09 '22 11:10
