I have a list whose elements have unnecessary (non-alphanumeric) characters at the beginning or end of each string.
For example:
'cats--'
I want to get rid of the --.
I tried:
for i in thelist:
    newlist.append(i.strip('\W'))
That didn't work. Any suggestions?
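A likely reason the attempt has no effect: str.strip() does not interpret regular expressions. Its argument is treated as a set of literal characters, so '\W' only strips backslashes and the letter W. A small sketch illustrating this (the sample strings are made up for the demonstration):

```python
# str.strip(chars) removes any of the listed characters from both ends;
# it does not understand regex classes such as \W.
unchanged = "cats--".strip("\\W")        # only '\' and 'W' would be stripped
literal_set = "W\\cats\\W".strip("\\W")  # strips the literal '\' and 'W'
dashes_gone = "cats--".strip("-")        # naming the character explicitly works

print(unchanged)    # cats--
print(literal_set)  # cats
print(dashes_gone)  # cats
```

Listing the characters explicitly works when the unwanted characters are known in advance; otherwise a regex or an isalnum()-based loop is needed.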
In Java, the approach is to use the String.replaceAll method to replace all the non-alphanumeric characters with an empty string. The implementation begins:
class GFG {
    public static String removeNonAlphanumeric(String str) {
        str = str.replaceAll(…
If the given string doesn't contain any non-alphanumeric characters, nothing needs to be removed. Otherwise, note that the alphanumeric characters lie in the ASCII ranges [65, 90] for uppercase letters, [97, 122] for lowercase letters, and [48, 57] for digits.
Hence, traverse the string character by character and fetch the ASCII value of each character. If the value is not in any of these three ranges, the character is non-alphanumeric.
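The ASCII-range check described above could be sketched in Python roughly as follows. The function names are made up for illustration, and the sketch trims only the two ends of the string (as the question asks) rather than removing every non-alphanumeric character:

```python
def is_ascii_alnum(ch):
    """Return True if ch is an ASCII digit or letter, judged by its code point."""
    code = ord(ch)
    return 48 <= code <= 57 or 65 <= code <= 90 or 97 <= code <= 122


def strip_non_ascii_alnum(word):
    """Trim characters outside the three ASCII ranges from both ends of word."""
    start, end = 0, len(word)
    while start < end and not is_ascii_alnum(word[start]):
        start += 1
    while end > start and not is_ascii_alnum(word[end - 1]):
        end -= 1
    return word[start:end]


print(strip_non_ascii_alnum("??cats--"))  # cats
```

Unlike str.isalnum(), this treats accented letters and other non-ASCII characters as non-alphanumeric, which may or may not be what you want.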
Non-alphanumeric characters comprise all characters except letters and digits. They include punctuation such as the exclamation mark (!), at symbol (@), comma (,), question mark (?), colon (:), and dash (-), and special characters such as the dollar sign ($), equals sign (=), plus sign (+), and apostrophe (').
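If the unwanted characters are only ASCII punctuation like the ones listed above, one option is to pass string.punctuation to str.strip(). This is a sketch under that assumption; the helper name and sample list are invented for the example:

```python
import string


def strip_punctuation(word):
    """Remove ASCII punctuation from both ends of word, leaving the middle intact."""
    return word.strip(string.punctuation)


sample = ["cats--", "$dogs?!", "it's", "--"]
print([strip_punctuation(s) for s in sample])  # ['cats', 'dogs', "it's", '']
```

Whitespace and non-ASCII punctuation (such as curly quotes) are not in string.punctuation, so they would be left in place.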
def strip_nonalnum(word):
    if not word:
        return word  # nothing to strip
    for start, c in enumerate(word):
        if c.isalnum():
            break
    else:
        return ""  # no alphanumeric characters at all
    for end, c in enumerate(word[::-1]):
        if c.isalnum():
            break
    return word[start:len(word) - end]

print([strip_nonalnum(s) for s in thelist])
Or
import re
def strip_nonalnum_re(word):
    return re.sub(r"^\W+|\W+$", "", word)
To remove one or more characters other than letters, digits and _ from both ends, you may use
re.sub(r'^\W+|\W+$', '', '??cats--') # => cats
Or, if _ is to be removed too, wrap \W in a character class and add _ there:
re.sub(r'^[\W_]+|[\W_]+$', '', '_??cats--_')
See the Python demo:
import re
print( re.sub(r'^\W+|\W+$', '', '??cats--') ) # => cats
print( re.sub(r'^[\W_]+|[\W_]+$', '', '_??cats--_') ) # => cats
You can use a regular expression. The method re.sub() takes three parameters: the pattern to search for, the replacement, and the string to process.
Code:
import re
s = 'cats--'
output = re.sub("[^\\w]", "", s)
print(output)  # => cats
Explanation:
"\\w" matches any alphanumeric character or underscore. [^x] matches any character that is not x, so "[^\\w]" matches every non-word character, wherever it occurs in the string.
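Note that a pattern without anchors removes non-word characters everywhere, not only at the ends. A short comparison, using a made-up sample string, shows the difference from the anchored pattern:

```python
import re

s = "--it's--"

# Unanchored: every non-word character is removed, including the apostrophe.
everywhere = re.sub(r"[^\w]", "", s)

# Anchored with ^ and $: only leading and trailing runs are removed.
ends_only = re.sub(r"^\W+|\W+$", "", s)

print(everywhere)  # its
print(ends_only)   # it's
```

For 'cats--' both patterns give the same result, so the difference only shows up when unwanted characters also occur inside the string.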
I believe that this is the shortest non-regex solution:
text = "`23`12foo--=+"
while len(text) > 0 and not text[0].isalnum():
    text = text[1:]
while len(text) > 0 and not text[-1].isalnum():
    text = text[:-1]
print(text)  # => 23`12foo