How to remove non-alphanumeric characters at the beginning or end of a string

I have a list with elements that have unnecessary (non-alphanumeric) characters at the beginning or end of each string.

Ex.

'cats--'

I want to get rid of the --

I tried:

for i in thelist:
    newlist.append(i.strip('\W'))

That didn't work. Any suggestions?

user3175999 asked Mar 26 '14 02:03


4 Answers

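def strip_nonalnum(word):
    if not word:
        return word  # nothing to strip
    # index of the first alphanumeric character
    for start, c in enumerate(word):
        if c.isalnum():
            break
    else:
        return ""  # no alphanumeric characters at all, e.g. "-"
    # number of non-alphanumeric characters at the end
    for end, c in enumerate(word[::-1]):
        if c.isalnum():
            break
    return word[start:len(word) - end]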

print([strip_nonalnum(s) for s in thelist])

Or

import re

def strip_nonalnum_re(word):
    return re.sub(r"^\W+|\W+$", "", word)
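
The two approaches are not quite equivalent: str.isalnum() is false for an underscore, while \W does not match it, so only the loop-based version strips leading and trailing underscores. A minimal sketch of the difference, using hypothetical function names and made-up sample data (the loop version here is an index-based restatement, not the code above verbatim):

```python
import re

def strip_edges_isalnum(word):
    """Slice off leading/trailing characters for which str.isalnum() is false."""
    start, end = 0, len(word)
    while start < end and not word[start].isalnum():
        start += 1
    while end > start and not word[end - 1].isalnum():
        end -= 1
    return word[start:end]

def strip_edges_regex(word):
    """Remove runs of non-word characters (\\W) at either end of the string."""
    return re.sub(r"^\W+|\W+$", "", word)

samples = ["cats--", "__init__", "-", ""]  # hypothetical sample data

print([strip_edges_isalnum(s) for s in samples])  # ['cats', 'init', '', '']
print([strip_edges_regex(s) for s in samples])    # ['cats', '__init__', '', '']
```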

jfs answered Oct 18 '22 19:10


To remove one or more characters other than letters, digits and _ from both ends, you may use

re.sub(r'^\W+|\W+$', '', '??cats--') # => cats

Or, if _ is to be removed, too, wrap \W into a character class and add _ there:

re.sub(r'^[\W_]+|[\W_]+$', '', '_??cats--_')

See the Python demo:

import re
print( re.sub(r'^\W+|\W+$', '', '??cats--') )          # => cats
print( re.sub(r'^[\W_]+|[\W_]+$', '', '_??cats--_') )  # => cats
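
For the list in the question, the same pattern can be compiled once and applied to every element. This is only a sketch: EDGE_JUNK, clean_all and the sample list are hypothetical names and data chosen for illustration.

```python
import re

# Compile once; matches a run of non-word characters or underscores
# at the very start or the very end of the string.
EDGE_JUNK = re.compile(r"^[\W_]+|[\W_]+$")

def clean_all(items):
    """Return a new list with both ends of each string cleaned."""
    return [EDGE_JUNK.sub("", item) for item in items]

thelist = ["cats--", "_??dogs", "a-b"]  # made-up sample data
newlist = clean_all(thelist)
print(newlist)  # ['cats', 'dogs', 'a-b']
```

Characters in the middle of a string, such as the hyphen in 'a-b', are left alone because both alternatives are anchored.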

Wiktor Stribiżew answered Oct 18 '22 18:10


You can use a regular expression. The function re.sub() takes three arguments:

  • The pattern
  • The replacement
  • The string to search

Code:

import re

s = 'cats--'
output = re.sub("[^\\w]", "", s)

print(output)

Explanation:

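  • "\\w" matches any word character: a letter, a digit, or an underscore.
  • [^x] matches any character that is not x, so "[^\\w]" matches any non-word character.
  • The pattern is not anchored, so it removes non-word characters anywhere in the string, not only at the beginning or end.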
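
A quick comparison on a string with interior punctuation shows the effect of anchoring: "[^\\w]" with no ^ or $ deletes every match, while an anchored pattern only touches the two ends. The sample string and variable names are made up for illustration.

```python
import re

s = "--it's a cat!--"  # made-up sample with punctuation inside the text

# Unanchored: every non-word character is removed, including the
# apostrophe and the spaces in the middle.
everywhere = re.sub(r"[^\w]", "", s)

# Anchored: only runs of non-word characters at the start (^) or end ($).
ends_only = re.sub(r"^\W+|\W+$", "", s)

print(everywhere)  # itsacat
print(ends_only)   # it's a cat
```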

Christian Tapia answered Oct 18 '22 18:10


I believe that this is the shortest non-regex solution:

text = "`23`12foo--=+"

while len(text) > 0 and not text[0].isalnum():
    text = text[1:]
while len(text) > 0 and not text[-1].isalnum():
    text = text[:-1]

print(text)
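
As a side note on the attempt in the question: str.strip() treats its argument as a set of literal characters, not as a regex, so strip('\W') only removes backslashes and capital W's. If plain ASCII punctuation and whitespace are all that needs to go, a sketch like the following may be enough. EDGE_CHARS and strip_edges are hypothetical names, and the ASCII-only character set is an assumption.

```python
import string

# The argument is a set of characters: here a backslash and "W".
print("cats--".strip(r"\W"))  # cats--  (nothing to strip)

# Assumption: only ASCII punctuation and whitespace count as junk.
EDGE_CHARS = string.punctuation + string.whitespace

def strip_edges(word):
    """Strip ASCII punctuation and whitespace from both ends."""
    return word.strip(EDGE_CHARS)

print(strip_edges("`23`12foo--=+"))  # 23`12foo
```

Unlike the isalnum() loops, this leaves non-ASCII symbols at the ends untouched.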

Shadyjames answered Oct 18 '22 19:10