How do I get the size of a UTF-8 string in bytes with Python?

Tags:

python

Given a UTF-8 string like this:

mystring = "işğüı"

is it possible to get its (in-memory) size in bytes with Python (2.5)?

807

asked Oct 01 '10 19:10

systempuntoout

1 Answer

Assuming you mean the number of UTF-8 bytes (and not the extra bytes that Python requires to store the object), you get it the same way as the length of any other string: with len(). A string literal in Python 2.x is a string of encoded bytes, not Unicode characters, so its length is already a byte count in whatever encoding the source file or terminal uses (UTF-8 here).

Byte strings:

>>> mystring = "işğüı"
>>> print "length of {0} is {1}".format(repr(mystring), len(mystring))
length of 'i\xc5\x9f\xc4\x9f\xc3\xbc\xc4\xb1' is 9

Unicode strings:

>>> myunicode = u"işğüı"
>>> print "length of {0} is {1}".format(repr(myunicode), len(myunicode))
length of u'i\u015f\u011f\xfc\u0131' is 5

It’s good practice to maintain all of your strings in Unicode, and only encode when communicating with the outside world. In this case, you could use len(myunicode.encode('utf-8')) to find the size it would be after encoding.
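As a minimal sketch of that last suggestion, the helper below wraps the encode-then-measure step. The name utf8_size is hypothetical, not part of any library, and the string is written with escapes so the result does not depend on the source file's encoding. The sys.getsizeof call is included only to contrast the encoded byte count with the size of the whole Python object; it is an assumption that your interpreter provides it, since it was added in Python 2.6 and is not available in the 2.5 mentioned in the question.

```python
import sys


def utf8_size(text):
    """Return the number of bytes `text` occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


# The string from the question, spelled with escapes (hypothetical variable name).
sample = u"i\u015f\u011f\xfc\u0131"

char_count = len(sample)        # number of Unicode characters
byte_count = utf8_size(sample)  # number of bytes after UTF-8 encoding

# sys.getsizeof (Python 2.6+) measures the whole object, header included,
# so its value is implementation-dependent and larger than byte_count.
object_size = sys.getsizeof(sample.encode("utf-8"))

print("characters: %d, UTF-8 bytes: %d" % (char_count, byte_count))
```

For this string the helper reports 9 bytes for 5 characters, matching the two interpreter sessions above: the ASCII "i" takes one byte and each of the other four characters takes two.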

190

answered Sep 18 '22 18:09

Josh Lee
