
python rstrip or remove end of string by a pattern of characters

Tags: python, strip

I'm trying to strip the end of the strings in this column. I've seen how to rstrip a specific character, or a set number of characters at the end of a string, but how do you do it based on a pattern?

I'd like to cut each string in the 'team' column at the point where a lowercase letter is immediately followed by an uppercase letter, removing everything from that uppercase letter onward. I would like the 'team' column below:

   team                              pts/g
St. Louis RamsSt. Louis             32.875
Washington RedskinsWashington       27.6875
Minnesota VikingsMinnesota          24.9375
Indianapolis ColtsIndianapolis      26.4375
Oakland RaidersOakland              24.375
Carolina PanthersCarolina           26.3125
Jacksonville JaguarsJacksonville    24.75
Chicago BearsChicago                17.0
Green Bay PackersGreen Bay          22.3125
San Francisco 49ersSan Francisco    18.4375
Buffalo BillsBuffalo                20.0

to look like this:

   team                              pts/g
St. Louis Rams                      32.875
Washington Redskins                 27.6875
Minnesota Vikings                   24.9375
Indianapolis Colts                  26.4375
Oakland Raiders                     24.375
Carolina Panthers                   26.3125
Jacksonville Jaguars                24.75
Chicago Bears                       17.0
Green Bay Packers                   22.3125
San Francisco 49ers                 18.4375
Buffalo Bills                       20.0

asked Sep 22 '17 11:09 by chitown88



1 Answer

You can use re.sub(pattern, repl, string) for that.

Let's use this regular expression for matching:

([a-z])[A-Z].*?(  )

It matches a lowercase character ([a-z]), followed by an uppercase character ([A-Z]) and then as few characters as possible (.*?) until it hits two spaces. The lowercase character and the two spaces are each in a group, so they can be re-inserted using \1 for the first and \2 for the second group when using re.sub:

import re

# text holds the whole table as one string
new_text = re.sub(r"([a-z])[A-Z].*?(  )", r"\1\2", text)
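As a quick check, here is a minimal self-contained sketch of that call. It assumes the table is held in one multi-line string; the variable name `text` and the two sample rows (copied from the question) are only for illustration.

```python
import re

# Two rows copied from the question; `text` is an assumed variable name.
text = (
    "St. Louis RamsSt. Louis             32.875\n"
    "Chicago BearsChicago                17.0\n"
)

# Keep the lowercase letter (\1) and the two spaces (\2); drop what lies between.
new_text = re.sub(r"([a-z])[A-Z].*?(  )", r"\1\2", text)
print(new_text)
```

Because `.` does not match a newline by default, each match stays within its own row.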

Output for your example:

   team                              pts/g
St. Louis Rams             32.875
Washington Redskins       27.6875
Minnesota Vikings          24.9375
Indianapolis Colts      26.4375
Oakland Raiders              24.375
Carolina Panthers           26.3125
Jacksonville Jaguars    24.75
Chicago Bears                17.0
Green Bay Packers          22.3125
San Francisco 49ers    18.4375
Buffalo Bills                20.0

This messes up the column alignment. That might not matter to you, but if you want to replace the removed characters with spaces, you can pass a function instead of a replacement string to re.sub. The function takes a Match object and returns a str:

def replace_with_spaces(match):
    return match.group(1) + " "*len(match.group(2)) + match.group(3)

And then use it like this (notice how I put the to-be-replaced part into a regex-group too):

new_text = re.sub(r"([a-z])([A-Z].*?)(  )", replace_with_spaces, text)
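Put together as a runnable sketch, this might look as follows. The single sample row comes from the question, and the names `row` and `fixed` are assumptions for illustration. Since every removed character is replaced by exactly one space, the row keeps its original width.

```python
import re

def replace_with_spaces(match):
    # group(1): the lowercase letter to keep
    # group(2): the duplicated text, replaced by the same number of spaces
    # group(3): the two spaces that ended the lazy match
    return match.group(1) + " " * len(match.group(2)) + match.group(3)

row = "Washington RedskinsWashington       27.6875"
fixed = re.sub(r"([a-z])([A-Z].*?)(  )", replace_with_spaces, row)

print(fixed)
print(len(fixed) == len(row))  # the width of the row is unchanged
```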

This produces:

   team                              pts/g
St. Louis Rams                      32.875
Washington Redskins                 27.6875
Minnesota Vikings                   24.9375
Indianapolis Colts                  26.4375
Oakland Raiders                     24.375
Carolina Panthers                   26.3125
Jacksonville Jaguars                24.75
Chicago Bears                       17.0
Green Bay Packers                   22.3125
San Francisco 49ers                 18.4375
Buffalo Bills                       20.0
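
If the team names are available as separate strings (for example, the individual values of the 'team' column) rather than as one block of text, a simpler pattern anchored to the end of the string may be enough. The following is only a sketch under that assumption; the helper name `strip_trailing_duplicate` is hypothetical and not part of the answer above.

```python
import re

# Match from the first uppercase letter that directly follows a lowercase
# letter through to the end of the string.
TRAILING_DUPLICATE = re.compile(r"(?<=[a-z])[A-Z].*$")

def strip_trailing_duplicate(value):
    # Hypothetical helper: returns the value with the matched tail removed.
    return TRAILING_DUPLICATE.sub("", value)

teams = [
    "St. Louis RamsSt. Louis",
    "Green Bay PackersGreen Bay",
    "San Francisco 49ersSan Francisco",
]
cleaned = [strip_trailing_duplicate(team) for team in teams]
print(cleaned)
```

Note that this would also truncate any name that legitimately contains a lowercase letter followed by an uppercase one. With pandas, the same pattern could be passed to `Series.str.replace` with `regex=True`.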

answered Sep 28 '22 04:09 by Felk