
Adding punctuation back to a list?

I have a small problem with punctuation.

My assignment was to check whether a text contains any duplicated words and, if it does, to highlight the duplicates by using .upper().

Example text: I like apples, apples is the best thing i know.

So I took the original text, stripped out the punctuation, converted all words to lowercase, and then split the text into a list. With a for-loop I compared every word in the list with the others and found all the duplicated words; the results were placed in a new list.
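The preprocessing described here could look roughly like the following Python 3 sketch. The names `original` and `words` are illustrative assumptions, and `str.translate` with `string.punctuation` is only one of several ways to remove punctuation.

```python
import string

original = "I like apples, apples is the best thing i know."

# Build a translation table that deletes every ASCII punctuation character.
table = str.maketrans("", "", string.punctuation)

# Strip punctuation, lowercase everything, then split on whitespace.
words = original.translate(table).lower().split()

print(words)
```

Once the punctuation is deleted this way, the word list no longer records where it used to be, which is exactly the difficulty the question is about.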

Example (after using the for-loop): i like apples APPLES is the best thing I know

So the new list is similar to the original one, with one major exception: it lacks the punctuation.

Is there a way to add the punctuation to the new list where it is supposed to be (based on its position in the original text)? Is there some kind of method built into Python that can do this, or do I have to compare the two lists with another for-loop and then add the punctuation to the new list?

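NewList = []  # Words in their original order; repeats are stored in uppercase

# text is the list of lowercase words with the punctuation already removed
for word in text:
    if word not in NewList:
        NewList.append(word)
    else:  # The word has been seen before, so highlight it
        NewList.append(word.upper())
List2 = ' '.join(NewList)  # Join the words back into a single string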

The code above works for longer texts, and it is the code I have been using to highlight duplicated words. The only problem is that the punctuation is missing from the new text.

SoIsUrFace asked Mar 26 '26 13:03



1 Answer

Here's an example of using the sub method with a callback from the built-in re module. This solution preserves all the punctuation.

import re

txt = "I like,, ,apples, apples! is the .best. thing *I* know!!1"


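def repl(match, stack):
    # Callback for re.sub: return the word in uppercase if it was seen before.
    word = match.group(0)
    word_upper = word.upper()  # Uppercase key makes the comparison case-insensitive
    if word_upper in stack:
        return word_upper
    stack.add(word_upper)
    return word

def highlight(s):
    stack = set()  # Uppercased words seen so far
    # Only runs of ASCII letters are replaced; punctuation and spacing stay untouched.
    return re.sub(r'\b([a-zA-Z]+)\b', lambda match: repl(match, stack), s)

print(txt)
print(highlight(txt))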
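
For comparison, the same idea can be sketched with an explicit loop instead of a callback, which is closer to the list-based approach in the question. This is only an illustrative variant: the function name `highlight_tokens` and the variables `tokens` and `seen` are assumptions, and it reuses the same word pattern. It relies on `re.split` keeping the text matched by a capture group in its result.

```python
import re

def highlight_tokens(text):
    # With a capture group, re.split keeps the matched words in the result:
    # even indices hold the separators (punctuation, spaces), odd indices the words.
    tokens = re.split(r"\b([a-zA-Z]+)\b", text)
    seen = set()  # Lowercased words encountered so far
    for i in range(1, len(tokens), 2):
        key = tokens[i].lower()
        if key in seen:
            tokens[i] = tokens[i].upper()  # Repeat: highlight it
        else:
            seen.add(key)
    # Joining every token restores the original punctuation and spacing.
    return "".join(tokens)

print(highlight_tokens("I like apples, apples is the best thing i know."))
```

Because the separators stay in the list next to the words, there is no need to compare the result against the original text afterwards.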
Andrew Dunai answered Mar 29 '26 02:03



