
memory location in unicode strings

Can someone explain why, when I create equal unicode strings in Python 2.7, they do not point to the same location in memory, as "normal" strings do?

>>> a1 = 'a'
>>> a2 = 'a'
>>> a1 is a2
True

OK, that is what I expected, but:

>>> ua1 = u'a'
>>> ua2 = u'a'
>>> ua1 is ua2
False

why? how?

Zokis asked Mar 13 '13 18:03



1 Answer

I think regular strings are interned but unicode strings are not. This simple test seems to support my theory (Python 2.6.6):

>>> intern("string")
'string'
>>> intern(u"unicode string")

Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    intern(u"unicode string")
TypeError: intern() argument 1 must be string, not unicode
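
To see what interning does to object identity, here is a minimal sketch, assuming CPython. The names `intern_str` and `make_runtime_string` are made up for this illustration. It builds two equal strings at run time, so the compiler cannot share one constant between them, and then interns both. On Python 3, where `intern` moved to `sys.intern` and every `str` is unicode, the fallback branch is used.

```python
import sys

try:
    intern_str = intern  # Python 2: built-in, accepts only str (not unicode)
except NameError:
    intern_str = sys.intern  # Python 3: the function lives in the sys module


def make_runtime_string(text):
    """Hypothetical helper: reassemble text at run time into a fresh string."""
    return "".join(list(text))


a = make_runtime_string("hello world")
b = make_runtime_string("hello world")

print(a == b)  # the two strings have equal values
print(a is b)  # identity here is an implementation detail; do not rely on it

# Interning maps equal strings to one canonical object.
ia = intern_str(a)
ib = intern_str(b)
print(ia is ib)  # True: both names now refer to the same interned object
```

Whatever the interpreter decides to share automatically, `==` is the reliable way to compare string values; `is` only reports whether two names happen to refer to the same object.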
Claudiu answered Sep 29 '22 22:09