I am looking to replicate Python 2 style len() in Python 3, as it relates to Unicode strings.
In Python 2, len() of a plain string literal containing non-ASCII text returns its encoded size in bytes, because such a literal is a byte string. For example, len("애정") returns 6 when the source is UTF-8 encoded. In Python 3, len() returns the number of characters (code points) in the string, so the same example returns 2.
sys.getsizeof() is not the solution, because it returns the size of the Python object in memory, not the size the string would have if it were written to disk.
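You can encode the string to UTF-8 and take the length of the resulting bytes object, like below. The result depends on the encoding you choose.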
>>> len('애정'.encode('utf8'))
6
>>>
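As a minimal sketch of the same idea, the encoding step can be wrapped in a small helper. The name encoded_len is hypothetical (it is not part of the standard library), and the UTF-8 default is an assumption about how the text would be written to disk.

```python
def encoded_len(text, encoding="utf-8"):
    """Return the number of bytes `text` occupies once encoded.

    `encoded_len` is an illustrative helper, not a built-in.
    """
    return len(text.encode(encoding))


s = "애정"
print(len(s))                       # 2 code points
print(encoded_len(s))               # 6 bytes in UTF-8
print(encoded_len(s, "utf-16-le"))  # 4 bytes in UTF-16 without a BOM
```

The byte count only matches the on-disk size if the file is written with the same encoding, so pass whichever encoding you actually use when opening the file.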