I have a large string with brackets, commas and similar characters. I want to strip all those characters but keep the spacing. How can I do this? As of now I am using:
strippedList = re.sub(r'\W+', '', origList)
Use the isalnum() method to remove all non-alphanumeric characters from a Python string. isalnum() checks whether a given character or string is alphanumeric. Test each character of the string individually and combine the ones that pass with join().
Python string isalnum() method: isalnum() returns True if the string is non-empty and all of its characters are alphanumeric, meaning letters (such as a-z) and digits (such as 0-9). Examples of characters that are not alphanumeric: (space) and !
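As a minimal sketch of the isalnum() approach applied to the question: to keep the spacing, the filter also has to let whitespace through, which isalnum() alone would reject. The function name keep_alnum_and_spaces and the sample string are made up for illustration.

```python
def keep_alnum_and_spaces(text):
    """Drop every character that is neither alphanumeric nor whitespace."""
    # str.isalnum() is Unicode-aware, so accented letters are kept as well.
    # The underscore is not alphanumeric, so it is removed.
    return "".join(c for c in text if c.isalnum() or c.isspace())


sample = "[foo, bar] (baz_1), qux!"  # hypothetical input
print(keep_alnum_and_spaces(sample))  # foo bar baz1 qux
```

Each removed character simply disappears, so the original spaces stay exactly where they were.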
To remove all non-alphanumeric characters from a string, call re.sub(), passing it a regular expression that matches all non-alphanumeric characters as the first argument, an empty string as the second, and the input string as the third. Python's str.replace() only matches literal substrings, not regular expressions. re.sub() returns a new string with all matches replaced.
Short example: re.sub(r'\W+', '_', 'bla: bla**(bla)') replaces each run of one or more consecutive non-word characters with a single underscore, giving 'bla_bla_bla_'. Note that \W also matches whitespace and does not match the underscore itself.
re.sub(r'([^\s\w]|_)+', '', origList)
A bit faster implementation:
import re
# Compile once and reuse; the raw string avoids invalid-escape warnings for \s and \w
pattern = re.compile(r'([^\s\w]|_)+')
strippedList = pattern.sub('', origList)
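To see why the character class matters, here is a small comparison on a made-up sample string (not taken from the question): the pattern from the question removes whitespace along with the punctuation, while the pattern above leaves whitespace alone.

```python
import re

sample = "[foo, bar] (baz_1), qux!"  # hypothetical input

# \W matches anything that is not a letter, digit, or underscore, including spaces.
no_spacing = re.sub(r"\W+", "", sample)

# [^\s\w] matches anything that is neither whitespace nor a word character;
# the extra |_ branch also removes underscores, which \w would otherwise keep.
keep_spacing = re.compile(r"([^\s\w]|_)+").sub("", sample)

print(no_spacing)    # foobarbaz_1qux
print(keep_spacing)  # foo bar baz1 qux
```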
The regular-expression based versions might be faster (especially if you switch to using a compiled expression), but I like this for clarity:
import string  # string.letters exists only in Python 2; Python 3 uses string.ascii_letters
"".join([c for c in origList if c in string.ascii_letters or c in string.whitespace])
It's a bit weird with the join() call, but I think that is pretty idiomatic Python for converting a list of characters into a string.