Retrieve definition for parenthesized abbreviation, based on letter count

I need to retrieve the definition of an acronym based on the number of letters enclosed in parentheses. For the data I'm dealing with, the number of letters in parentheses corresponds to the number of words to retrieve. I know this isn't a reliable method for getting abbreviations, but in my case it will be. For example:
String = 'Although family health history (FHH) is commonly accepted as an important risk factor for common, chronic diseases, it is rarely considered by a nurse practitioner (NP).'
Desired output: family health history (FHH), nurse practitioner (NP)
I know how to extract parentheses from a string, but after that I am stuck. Any help is appreciated.
    import re

    a = 'Although family health history (FHH) is commonly accepted as an important risk factor for common, chronic diseases, it is rarely considered by a nurse practitioner (NP).'

    # Extract each parenthesized abbreviation
    x2 = re.findall(r'(\(.*?\))', a)

    for x in x2:
        length = len(x)
        print(x, length)
python regex text text-parsing abbreviation
asked Jun 2 at 2:45 by tenebris silentio (edited Jun 2 at 5:54 by cdlane)
I think you will need to write some parsing logic here, in addition to maybe using regex.
– Tim Biegeleisen
Jun 2 at 2:55
I know I can run a loop and do a len(string) to get the number of letters, but I guess it's after that point I'm lost. Like if it's 3 letters, how to capture the previous 3 words.
– tenebris silentio
Jun 2 at 2:59
You should use """ instead of ' for a multiline string.
– Keatinge
Jun 2 at 3:00
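A single-quoted Python literal cannot span several source lines, which is what the last comment is pointing at. The sketch below shows the triple-quoted form that comment suggests, next to a parenthesized alternative that is an addition here and not something proposed in the thread. The variable names `a` and `b` are only illustrative.

```python
# Option 1: triple quotes let a literal span lines; the line breaks
# become part of the string.
a = """Although family health history (FHH) is commonly accepted as an
important risk factor for common, chronic diseases, it is rarely considered
by a nurse practitioner (NP)."""

# Option 2 (not from the thread): adjacent literals inside parentheses are
# concatenated into one string, so the result contains no newlines.
b = (
    'Although family health history (FHH) is commonly accepted as an '
    'important risk factor for common, chronic diseases, it is rarely considered '
    'by a nurse practitioner (NP).'
)

# Splitting on whitespace gives the same words either way.
print(a.split() == b.split())  # True
```

Because `str.split()` with no arguments treats newlines like any other whitespace, either form works with the word-counting approaches in the answers.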
5 Answers
Use the regex match to find the position where the match starts. Then use Python string slicing to get the substring leading up to that position. Split the substring into words and take the last n words, where n is the length of the abbreviation.
    import re

    s = 'Although family health history (FHH) is commonly accepted as an important risk factor for common, chronic diseases, it is rarely considered by a nurse practitioner (NP).'

    # Find each parenthesized abbreviation, then take the words just before it
    for match in re.finditer(r"\((.*?)\)", s):
        start_index = match.start()
        abbr = match.group(1)
        size = len(abbr)
        words = s[:start_index].split()[-size:]
        definition = " ".join(words)
        print(abbr, definition)
This prints:
FHH family health history
NP nurse practitioner
answered Jun 2 at 3:07 by Keatinge (edited Jun 2 at 3:16)
Man, what a life saver. That makes sense. Thanks so much.
– tenebris silentio
Jun 2 at 3:09
You can add output = "" to the top of the code, and output += definition + ", (" + abbr + ")" to the end of the loop to get your desired output.
– MarsNebulaSoup
Jun 2 at 3:12
I would suggest matching only capital letters: re.finditer(r"\(([A-Z]*?)\)", s)
– igrinis
Jun 2 at 13:24
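The two suggestions in the comments above can be combined into one sketch. It is an illustration, not the answerer's code: the function name `definitions_with_abbreviations` is hypothetical, the entries are collected in a list and joined, because the string concatenation as written in the comment would put the comma before the parenthesis and leave no separator between entries, and `[A-Z]+` is used instead of `[A-Z]*?` so that an empty `()` never yields a zero-length slice.

```python
import re


def definitions_with_abbreviations(text):
    """Return 'definition (ABBR)' strings, one per parenthesized abbreviation."""
    results = []
    # Only all-caps abbreviations are matched, as suggested in the comments.
    for match in re.finditer(r"\(([A-Z]+)\)", text):
        abbr = match.group(1)
        # Words before the opening parenthesis; keep as many as there are letters.
        words = text[:match.start()].split()[-len(abbr):]
        results.append(" ".join(words) + " (" + abbr + ")")
    return results


s = ('Although family health history (FHH) is commonly accepted as an '
     'important risk factor for common, chronic diseases, it is rarely '
     'considered by a nurse practitioner (NP).')
print(", ".join(definitions_with_abbreviations(s)))
# family health history (FHH), nurse practitioner (NP)
```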
An idea: use a recursive pattern with the PyPI regex module.
    \b[A-Za-z]+\s+(?R)?\(?[A-Z](?=[A-Z]*\))\)?
See this pcre demo at regex101
\b[A-Za-z]+\s+ matches a word boundary, one or more alpha characters, and one or more whitespace characters
(?R)? is the recursive part: optionally paste the pattern from the start
\(? is needed to make the parenthesis optional so the recursion fits in; likewise \)?
[A-Z](?=[A-Z]*\)) matches one uppercase letter if it is followed by a closing ) with only A-Z in between
- Does not check whether the first letter of each word actually matches the letter at the same position in the abbreviation.
- Does not check for an opening parenthesis in front of the abbreviation. To check, add a variable-length lookbehind: change [A-Z](?=[A-Z]*\)) to (?<=\([A-Z]*)[A-Z](?=[A-Z]*\)).
answered Jun 2 at 10:42 by bobble bubble
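The first caveat in the answer above (word initials are not compared with the abbreviation's letters) can also be handled outside the pattern. The following is a rough sketch using only the standard `re` module and the split-based approach from the first answer, not the recursive pattern itself. The function name `checked_definitions` and the shortened sample sentence with the made-up `(XYZ)` entry are assumptions for illustration.

```python
import re


def checked_definitions(text):
    """Yield (abbr, definition) pairs whose word initials spell the abbreviation."""
    for match in re.finditer(r"\(([A-Z]+)\)", text):
        abbr = match.group(1)
        words = text[:match.start()].split()[-len(abbr):]
        # Compare the upper-cased first letter of each word with the abbreviation.
        initials = "".join(word[0].upper() for word in words)
        if initials == abbr:
            yield abbr, " ".join(words)


# Hypothetical sample: "(XYZ)" does not match the initials of the words before it.
s = ("Although family health history (FHH) is rarely considered "
     "by a nurse practitioner (NP), see the appendix (XYZ).")
print(list(checked_definitions(s)))
# [('FHH', 'family health history'), ('NP', 'nurse practitioner')]
```

The check is stricter than the letter-count rule the question asks for, so it would drop abbreviations whose definitions contain stop words or differ in letter choice.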
Does this solve your problem?
    a = 'Although family health history (FHH) is commonly accepted as an important risk factor for common, chronic diseases, it is rarely considered by a nurse practitioner (NP).'
    # Drop periods, then split on single spaces
    splitstr = a.replace('.', '').split(' ')
    output = ''
    for i, word in enumerate(splitstr):
        if '(' in word:
            # Letters of the abbreviation without the parentheses
            w = word.replace('(', '').replace(')', '').replace('.', '')
            # n = 0 is the abbreviation itself; n = 1..len(w) are the words before it
            for n in range(len(w) + 1):
                output = splitstr[i - n] + ' ' + output
    print(output)
Actually, Keatinge beat me to it.
answered Jun 2 at 3:09 by 3NiGMa
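The split-only idea in the answer above prepends every phrase to a single string, so later abbreviations end up in front of earlier ones. A variant that keeps each phrase separate and in document order might look like the sketch below. The function name `definitions_by_splitting` and the set of stripped punctuation characters are assumptions, and the `max(..., 0)` guard is added so that a short prefix does not wrap around to the end of the list through negative indexing.

```python
def definitions_by_splitting(text):
    """Collect 'definition (ABBR)' phrases using str.split only (no regex)."""
    tokens = text.split()
    phrases = []
    for i, token in enumerate(tokens):
        # Strip trailing punctuation such as '.' or ',' before testing the token.
        core = token.rstrip(".,;:")
        if core.startswith("(") and core.endswith(")"):
            n = len(core) - 2        # number of letters between the parentheses
            start = max(i - n, 0)    # never index before the first token
            phrases.append(" ".join(tokens[start:i] + [core]))
    return phrases


a = ('Although family health history (FHH) is commonly accepted as an '
     'important risk factor for common, chronic diseases, it is rarely '
     'considered by a nurse practitioner (NP).')
print(", ".join(definitions_by_splitting(a)))
# family health history (FHH), nurse practitioner (NP)
```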
Using re with a list comprehension:
    # a is the sentence from the question; re must be imported
    x_lst = [str(len(i[1:-1])) for i in re.findall(r'(\(.*?\))', a)]
    [re.search(r'(\S+\s+){' + i + r'}\(.{' + i + r'}\)', a).group(0) for i in x_lst]
    # ['family health history (FHH)', 'nurse practitioner (NP)']
answered Jun 2 at 3:17 by Transhuman (edited Jun 2 at 3:25)
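The pattern in the answer above is built only from the abbreviation's length, so two abbreviations of the same length produce the same pattern, and `re.search` returns the first match each time. The sketch below illustrates that on a made-up sentence (the sentence and the names `by_length` and `by_abbr` are assumptions) and shows one possible adjustment: including the escaped abbreviation text in the pattern.

```python
import re

# Hypothetical sentence with two abbreviations of equal length.
text = "A nurse practitioner (NP) met a social worker (SW)."

# Pattern keyed only on the length: both searches use the same pattern.
lengths = [str(len(m)) for m in re.findall(r'\((.*?)\)', text)]
by_length = [
    re.search(r'(\S+\s+){' + n + r'}\(.{' + n + r'}\)', text).group(0)
    for n in lengths
]
print(by_length)  # ['nurse practitioner (NP)', 'nurse practitioner (NP)']

# Including the escaped abbreviation ties each search to its own occurrence.
abbrs = re.findall(r'\((.*?)\)', text)
by_abbr = [
    re.search(r'(\S+\s+){' + str(len(ab)) + r'}\(' + re.escape(ab) + r'\)', text).group(0)
    for ab in abbrs
]
print(by_abbr)    # ['nurse practitioner (NP)', 'social worker (SW)']
```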
This solution isn't particularly clever; it simply searches for the acronyms and then builds up a pattern to extract the words ahead of each one:
    import re

    string = "Although family health history (FHH) is commonly accepted as an important risk factor for common, chronic diseases, it is rarely considered by a nurse practitioner (NP)."

    definitions = []

    for acronym in re.findall(r'\(([A-Z]+?)\)', string):
        length = len(acronym)
        # Match exactly `length` words followed by the parenthesized acronym
        match = re.search(r'(?:\w+\W+){' + str(length) + r'}\(' + acronym + r'\)', string)
        definitions.append(match.group(0))

    print(", ".join(definitions))
OUTPUT
> python3 test.py
family health history (FHH), nurse practitioner (NP)
>
answered Jun 2 at 3:22 by cdlane (edited Jun 2 at 3:29)
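In the answer above, `re.search` returns `None` when fewer words precede an acronym than it has letters, and `match.group(0)` would then raise `AttributeError`. A guarded variant could look like the sketch below. The function name `find_definitions`, the decision to skip such acronyms, and the sample sentence are assumptions, not part of the original answer.

```python
import re


def find_definitions(text):
    """Return 'definition (ACRONYM)' strings; skip acronyms with too few preceding words."""
    definitions = []
    for acronym in re.findall(r'\(([A-Z]+)\)', text):
        # Exactly len(acronym) words, then the parenthesized acronym itself.
        pattern = r'(?:\w+\W+){%d}\(%s\)' % (len(acronym), re.escape(acronym))
        match = re.search(pattern, text)
        if match is not None:  # re.search returns None when nothing matches
            definitions.append(match.group(0))
    return definitions


# Hypothetical input: only two words precede the four-letter "(ABCD)".
print(find_definitions("Short text (ABCD) about a nurse practitioner (NP)."))
# ['nurse practitioner (NP)']
```

Whether to skip, warn, or fall back to fewer words in that situation depends on the data; skipping is just the simplest choice for the sketch.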