I have a Unicode string with some non-breaking spaces at the beginning and end. I get different results when using strip() vs. strip(string.whitespace).
>>> import string
>>> s5 = u'\xa0\xa0hello\xa0\xa0'
>>> print s5.strip()
hello
>>> print s5.strip(string.whitespace)
hello
The documentation for strip() says, "If omitted or None, the chars argument defaults to removing whitespace." The documentation for string.whitespace says, "A string containing all characters that are considered whitespace."
So if string.whitespace contains all characters that are considered whitespace, then why are the results different? Does it have something to do with Unicode?
I am using Python 2.7.6.
strip() is a built-in string method in Python. It returns a new string with leading and trailing characters removed, and it takes a single optional argument, chars, that names the characters to remove. If chars is omitted or None, whitespace (spaces, tabs, newlines and so on) is removed.
To trim only one side, use lstrip() for leading characters or rstrip() for trailing characters, i.e. those on the "right" side of the string.
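As a quick illustration of the three methods described above, here is a minimal sketch. The sample strings and variable names are made up for the example, and it assumes Python 3 syntax.

```python
# Sample text with whitespace on both sides (hypothetical input).
text = "\t  hello world \n"

both = text.strip()         # leading and trailing whitespace removed
left_only = text.lstrip()   # only leading whitespace removed
right_only = text.rstrip()  # only trailing whitespace removed

# With an argument, strip() removes those characters instead of whitespace.
custom = "xxhixx".strip("x")

print(repr(both), repr(left_only), repr(right_only), repr(custom))
```

Note that the chars argument is treated as a set of characters, not as a prefix or suffix to match.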
From the documentation of string.whitespace:
A string containing all ASCII characters that are considered whitespace. This includes the characters space, tab, linefeed, return, formfeed, and vertical tab.
It's the same under Python 3, where all non-ASCII constants were removed. (In Python 2, some constants could be influenced by locale settings.)
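To check this on your own interpreter, a small sketch like the following may help. It assumes Python 3, where string.whitespace is a fixed ASCII-only constant; the variable names are only illustrative.

```python
import string

# The non-breaking space from the question.
nbsp = "\xa0"

# string.whitespace holds six ASCII characters: space, \t, \n, \r, \x0b, \x0c.
ascii_whitespace = set(string.whitespace)

nbsp_is_listed = nbsp in string.whitespace  # is NBSP part of the constant?
nbsp_is_space = nbsp.isspace()              # does Unicode treat NBSP as whitespace?

print(sorted(ascii_whitespace))
print(nbsp_is_listed, nbsp_is_space)
```

The two flags disagree: the character counts as whitespace for Unicode purposes but is not listed in the constant.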
Hence the difference in behaviour: strip() with no argument removes any Unicode whitespace, while strip(string.whitespace) removes only ASCII whitespace. Your string contains non-ASCII spaces (U+00A0, the non-breaking space).
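The difference can be reproduced with a short sketch. This assumes Python 3, where every str is Unicode; in Python 2.7 the literal would need the u prefix as in the question. Appending "\xa0" to the chars argument is just one possible workaround, not something the documentation prescribes.

```python
import string

s5 = "\xa0\xa0hello\xa0\xa0"

# No argument: all Unicode whitespace, including NBSP, is stripped.
default_strip = s5.strip()

# ASCII-only set: the NBSP characters are not in it, so nothing is removed.
ascii_strip = s5.strip(string.whitespace)

# Workaround: extend the set with the non-breaking space explicitly.
extended_strip = s5.strip(string.whitespace + "\xa0")

print(repr(default_strip))
print(repr(ascii_strip))
print(repr(extended_strip))
```

If the goal is simply to remove all whitespace, calling strip() with no argument is the simpler choice.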