What is an efficient way to pad punctuation with whitespace?
input:
s = 'bla. bla? bla.bla! bla...'
desired output:
s = 'bla . bla ? bla . bla ! bla . . .'
Comments:
To add a space after a dot or comma, you can use replace() in Python, e.g. replace ',' with ', ' or '.' with '. '.
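You can use a regex to strip punctuation from a string in Python. The pattern [^\w\s] matches every character that is neither a word character nor whitespace (i.e. punctuation), and re.sub() replaces each match with an empty string. Note that the underscore counts as a word character, so it is not removed.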
Note: string.punctuation contains only ASCII punctuation; it does not include Unicode symbols or whitespace characters.
One of the easiest ways to remove punctuation from a string in Python is the str.translate() method. translate() takes a translation table, which you can build with str.maketrans().
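The two removal techniques mentioned in the comments above could be sketched roughly as follows. Note that they delete punctuation rather than pad it, so they do not answer the question directly; the sample string and variable names are assumptions made for illustration.

```python
import re
import string

sample = 'bla. bla? bla.bla!'  # hypothetical sample input

# Regex approach: delete every character that is neither a word character nor whitespace.
no_punct_regex = re.sub(r'[^\w\s]', '', sample)

# translate approach: the third argument to str.maketrans lists characters to delete.
no_punct_translate = sample.translate(str.maketrans('', '', string.punctuation))

print(no_punct_regex)      # bla bla blabla
print(no_punct_translate)  # bla bla blabla
```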
You can use a regular expression to match the punctuation characters you are interested in and surround them with spaces, then use a second step to collapse multiple spaces anywhere in the string:
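import re

s = 'bla. bla? bla.bla! bla...'
# Surround each listed punctuation character with spaces.
s = re.sub(r'([.,!?()])', r' \1 ', s)
# Collapse runs of two or more whitespace characters into a single space.
s = re.sub(r'\s{2,}', ' ', s)
print(s)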
Result:
bla . bla ? bla . bla ! bla . . .
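One detail worth noting: padding the final dot leaves a trailing space at the end of the string, which the printed result hides. A minimal sketch of a reusable helper that also trims it might look like this; the function name pad_punctuation and its chars parameter are hypothetical, not part of the original answer.

```python
import re

def pad_punctuation(text, chars='.,!?()'):
    """Put a space on each side of every character in `chars`, then normalize whitespace."""
    # re.escape keeps characters such as '.' or '(' literal inside the character class.
    pattern = '([' + re.escape(chars) + '])'
    padded = re.sub(pattern, r' \1 ', text)
    # Collapse whitespace runs and drop the leading/trailing space left by padding.
    return re.sub(r'\s+', ' ', padded).strip()

print(pad_punctuation('bla. bla? bla.bla! bla...'))
# bla . bla ? bla . bla ! bla . . .
```

Passing a different chars string lets you choose which punctuation gets padded without editing the pattern by hand.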
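If you use Python 3, you can instead use str.maketrans() together with str.translate(). Note that this pads every character in string.punctuation and does not collapse the resulting double spaces: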
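import string
text = 'bla. bla? bla.bla! bla...'
# Map each ASCII punctuation character to itself surrounded by spaces.
text = text.translate(str.maketrans({key: " {0} ".format(key) for key in string.punctuation}))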
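To get the exact output asked for in the question, the translate() result still needs its whitespace normalized. One possible way, sketched here with the hypothetical names PAD_TABLE and pad_with_translate, is to split on whitespace and rejoin with single spaces.

```python
import string

# Translation table: each ASCII punctuation character -> itself padded with spaces.
PAD_TABLE = str.maketrans({ch: ' {} '.format(ch) for ch in string.punctuation})

def pad_with_translate(text):
    # split() with no arguments splits on any whitespace run and drops empty pieces,
    # so joining with single spaces also trims both ends.
    return ' '.join(text.translate(PAD_TABLE).split())

print(pad_with_translate('bla. bla? bla.bla! bla...'))
# bla . bla ? bla . bla ! bla . . .
```

Because string.punctuation also contains apostrophes and hyphens, this variant would split words such as "don't" into "don ' t"; restrict the table to the characters you need if that is not wanted.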