I get an error when trying to use contains in Python.
    s = u"some utf8 words"
    k = u"one utf8 word"
    if s.contains(k):
        print "contains"

How do I achieve the same result?
Example with a normal ASCII string:

    s = "haha i am going home"
    k = "haha"
    if s.contains(k):
        print "contains"

I am using Python 2.7.x.
Unicode is deliberately defined so that its first 128 code points (0 to 127) coincide with ASCII. So if any character code in your string is higher than 127, the string contains Unicode characters that are not ASCII characters. Note that ASCII covers only the unaccented English letters, digits, punctuation, and control characters.
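The "anything higher than 127" check can be sketched as below. The helper name has_non_ascii is made up for illustration, and the sketch assumes the input is already a Unicode string (a u"..." literal in Python 2).

```python
def has_non_ascii(text):
    """Return True if any character in text has a code point above 127.

    has_non_ascii is a hypothetical helper, not a built-in.
    """
    return any(ord(ch) > 127 for ch in text)


plain = u"haha i am going home"
accented = u"caf\u00e9"  # last character is U+00E9, outside the ASCII range

print(has_non_ascii(plain))     # False
print(has_non_ascii(accented))  # True
```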
There are two ways to create a Unicode string from a byte string in Python 2. Either call decode() on the byte string, or pass it to the built-in unicode() function together with its encoding, for example UTF-8. The signature is unicode(string[, encoding, errors]); when encoding or errors is given, the first argument should be an 8-bit string.
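A small sketch of both options, assuming the bytes are UTF-8 encoded. The name to_unicode is an illustrative alias, not part of Python: unicode() exists only in Python 2, so the sketch falls back to str(bytes, encoding) on Python 3, which decodes in the same way.

```python
raw = b"caf\xc3\xa9"  # five bytes: the UTF-8 encoding of a four-character word

# Option 1: decode the byte string.
decoded = raw.decode("utf-8")

# Option 2: the unicode() built-in (Python 2 only).
try:
    to_unicode = unicode   # Python 2
except NameError:
    to_unicode = str       # Python 3 fallback, an assumption for portability

converted = to_unicode(raw, "utf-8")

print(decoded == converted)  # True: both options give the same Unicode string
print(len(raw))              # 5 bytes
print(len(decoded))          # 4 characters
```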
The 'u' in front of a string literal means the string is a Unicode string. Unicode lets a string represent far more characters than a plain ASCII string can.
The same works for ASCII and UTF-8 strings:

    if k in s:
        print "contains"
There is no contains() method on either ASCII or UTF-8 strings:
    >>> "strrtinggg".contains
    AttributeError: 'str' object has no attribute 'contains'
What you can use instead of contains is find or index:
    if s.find(k) > -1:
        print "contains"
or
    try:
        s.index(k)
    except ValueError:
        pass  # ValueError: substring not found
    else:
        print "contains"
But of course, the in operator is the way to go; it's much more elegant.
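To tie this back to the question, here is a minimal sketch of the three approaches on strings that contain non-ASCII characters. The sample strings and the contains helper are invented for illustration; both operands are Unicode strings, which is the safe combination on Python 2.7.

```python
s = u"na\u00efve caf\u00e9 owner"  # sample text with two non-ASCII characters
k = u"caf\u00e9"                   # substring to look for


def contains(haystack, needle):
    """Hypothetical wrapper around the in operator."""
    return needle in haystack


print(contains(s, k))       # True
print(contains(s, u"tea"))  # False

# find returns the position of the match, or -1 when there is none.
print(s.find(k))            # 6
print(s.find(u"tea"))       # -1

# index returns the same position but raises ValueError when there is no match.
try:
    s.index(u"tea")
except ValueError:
    print("not found")
```

If one operand is a non-ASCII byte string and the other is a Unicode string, Python 2 attempts an implicit ASCII decode, which is a likely source of errors in this situation; decoding the byte string first avoids that.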