I need to make a string fit in the space allotted to it on the screen. I'm using Unicode, but len() and slices[] apparently work on bytes, so I end up cutting Unicode strings too short: € occupies only one column on the screen but counts as 3 for len() or slices[].
I have the encoding headers set up properly, and I'm willing to use something other than slices or len() to deal with this, but I really need to know how many columns the string will take and how to cut it to the available space.
$cat test.py
# -*- coding: utf-8 -*-
a = "2 €uros"
b = "2 Euros"
print len(b)
print len(a)
print a[3:]
print b[3:]
$python test.py
7
9
��uros
uros
Python strings support slicing to create substrings. Strings are immutable, so slicing creates a new string and leaves the original unchanged.
Slicing strings: you can return a range of characters by using the slice syntax. Specify the start index and the end index, separated by a colon, to return part of the string.
What are indexing and slicing? Indexing obtains an individual element; slicing obtains a sequence of elements. Both can be done on Python sequence types such as lists, strings, tuples, and range objects.
Slicing a string returns the substring that lies between a start position and an end position.
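As a quick illustration of the slice syntax described above, here is a minimal sketch using the ASCII string from the question. The variable name s is just for illustration.

```python
s = "2 Euros"

print(s[2:5])   # characters at index 2, 3 and 4: Eur
print(s[3:])    # from index 3 to the end: uros
print(s[:1])    # from the start up to, but not including, index 1: 2
print(s)        # the original string is unchanged: 2 Euros
```

Every character here is ASCII, so bytes and characters coincide and the slices land where you expect.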
You're not creating Unicode strings there; you're creating byte strings with UTF-8 encoding (which is variable-length, as you're seeing). You need to use constants of the form u"..." (or u'...'). If you do that, you get the expected result:
% cat test.py
# -*- coding: utf-8 -*-
a = u"2 €uros"
b = u"2 Euros"
print len(b)
print len(a)
print a[3:]
print b[3:]
% python test.py
7
7
uros
uros
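If the text reaches you as bytes (read from a file or a socket, say) rather than as a literal, the same idea applies: decode first, then measure and cut. Below is a rough sketch of that, not a definitive implementation. The helper names to_text and fit_to_width are made up for this example, and it assumes the input is UTF-8 and that every character takes exactly one column on screen, which does not hold for combining marks or wide East Asian characters.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a Unicode string, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def fit_to_width(value, width, encoding="utf-8"):
    """Cut value to at most `width` characters (code points, not bytes).

    Assumption: one code point == one screen column.
    """
    return to_text(value, encoding)[:width]


# UTF-8 bytes of the string from the question (U+20AC is the euro sign).
raw = b"2 \xe2\x82\xacuros"
text = to_text(raw)

print(len(raw))    # 9, counting bytes
print(len(text))   # 7, counting characters
print(fit_to_width(raw, 4) == u"2 \u20acu")   # True: the euro sign is kept whole
```

Decode once at the point where the data enters your program and work with Unicode strings from then on, so len() and slices always count characters.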