I need to retrieve the definition of an acronym based on the number of letters enclosed in parentheses. For the data I'm dealing with, the number of letters in parentheses corresponds to the number of words to retrieve. I know this isn't a reliable method for getting abbreviations, but in my case it will be. For example:
String = 'Although family health history (FHH) is commonly accepted as an important risk factor for common, chronic diseases, it is rarely considered by a nurse practitioner (NP).'
Desired output: family health history (FHH), nurse practitioner (NP)
I know how to extract parentheses from a string, but after that I am stuck. Any help is appreciated.
import re
a = ('Although family health history (FHH) is commonly accepted as an '
     'important risk factor for common, chronic diseases, it is rarely considered '
     'by a nurse practitioner (NP).')
# Find every parenthesized group, parentheses included
x2 = re.findall(r'(\(.*?\))', a)
for x in x2:
    length = len(x)  # note: this count includes the two parentheses
    print(x, length)
An acronym is an abbreviation that forms a word. An initialism is an abbreviation that uses the first letter of each word in the phrase (thus, some but not all initialisms are acronyms).
There are many different kinds of abbreviations, including acronyms, initialisms, portmanteau, truncations and clipped words.
Abbreviations/Acronyms: Spell out the full term at its first mention, indicate its abbreviation in parentheses, and use the abbreviation from then on, with the exception of acronyms that would be familiar to most readers, such as MCC and USAID.
Use the regex match object to find where each match starts. Slice the string up to that position, split the slice into words, and take the last n words, where n is the number of letters in the abbreviation.
import re
s = 'Although family health history (FHH) is commonly accepted as an important risk factor for common, chronic diseases, it is rarely considered by a nurse practitioner (NP).'
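for match in re.finditer(r"\((.*?)\)", s):
    start_index = match.start()   # position of the opening parenthesis
    abbr = match.group(1)         # text inside the parentheses
    size = len(abbr)
    # The last `size` words before the parenthesis form the definition
    words = s[:start_index].split()[-size:]
    definition = " ".join(words)
    print(abbr, definition)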
This prints:
FHH family health history
NP nurse practitioner
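To get the exact format asked for in the question, the same approach can be wrapped in a function. This is only a minimal sketch: the name expand_abbreviations is made up for illustration, and it assumes abbreviations consist solely of letters, so parenthesized text such as "(see Table 2)" is skipped.

```python
import re

def expand_abbreviations(text):
    """Return 'definition (ABBR)' strings, one per parenthesized abbreviation.

    Assumption: an abbreviation is one or more letters with nothing else
    inside the parentheses.
    """
    results = []
    for match in re.finditer(r"\(([A-Za-z]+)\)", text):
        abbr = match.group(1)
        # Words that appear before the opening parenthesis
        preceding = text[:match.start()].split()
        # Take as many trailing words as the abbreviation has letters
        words = preceding[-len(abbr):]
        results.append(f"{' '.join(words)} ({abbr})")
    return results

s = ('Although family health history (FHH) is commonly accepted as an '
     'important risk factor for common, chronic diseases, it is rarely '
     'considered by a nurse practitioner (NP).')
print(", ".join(expand_abbreviations(s)))
# family health history (FHH), nurse practitioner (NP)
```

If fewer words precede the parenthesis than the abbreviation has letters, the slice simply returns every preceding word rather than raising an error.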