
How are variable names stored and mapped internally?

I read https://stackoverflow.com/a/19721096/1661745 and it seems that in CPython, variables are simply names that are associated with references.

There are several things going on with the statement x=5:

  1. an int object with the value of 5 is created (or found if it already exists)
  2. the name x is created (or disassociated with the last object 'x' labeled)
  3. the reference count to the new (or found) int object is increased by 1
  4. the name x is associated with the object with the value '5' created (or found).

However, I'm still not clear on exactly how variables are implemented internally.

Namely:

  2. the name x is created (or disassociated with the last object 'x' labeled);

Then wouldn't the name also take up memory space? sys.getsizeof(x) equals sys.getsizeof(5), and I get that sys.getsizeof(x) can only return the size of the object that x refers to, but then what is the size of the name x itself?

  4. the name x is associated with the object with the value '5' created (or found)

How is this implemented internally? I think at a high level it can be done with a dict, where the key is the variable name (str?) and the value is the reference that it's associated with.

onepiece asked Oct 24 '15 15:10



1 Answer

I think at a high level it can be done with a dict, where the key is the variable name (str?) and the value is the reference that it's associated with.

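This is how it works internally too. In CPython, variable names and the objects they point to are typically stored in a Python dictionary, the very same data structure that you can use when writing Python code. (The main exception is local variables inside functions, which CPython keeps in an array on the frame rather than in a dictionary.)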

When you write x = 5 at module level, the name x is set as a key in the dictionary of global names with 5 as the corresponding value. You can retrieve and inspect this dictionary using the globals() function, which returns the global namespace of the current module.
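As a rough illustration of this, the sketch below runs x = 5 in a namespace that is just a plain dict and then reads the binding back out of it. The variable name namespace and the use of exec() are choices made for this example, not something taken from CPython itself.

```python
# A module's global namespace is an ordinary dict. Here we supply our own
# dict and let exec() use it as the globals for a one-line "module".
namespace = {}
exec("x = 5", namespace)

print(type(namespace) is dict)   # True
print(namespace["x"])            # 5

# Storing a key directly has the same effect as an assignment statement:
# code that later runs in this namespace sees y as a normal global.
namespace["y"] = 10
exec("z = x + y", namespace)
print(namespace["z"])            # 15

# globals() hands back the same kind of object for the running module.
print(isinstance(globals(), dict))  # True
```

At the top level of a script or the REPL, globals()["x"] gives the same result after x = 5.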

So you're also correct that the name x takes up space. It exists as a string object somewhere in memory, and the dictionary holds a reference to it as the key.
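A small sketch of the size question, assuming a standard CPython build: sys.getsizeof() is handed the object, so it cannot tell you anything about the name, while the name's own str object and the dict that holds it can be measured separately. The variable names here are made up for the example, and the byte counts are deliberately not shown because they vary by version and platform.

```python
import sys

x = 5

# getsizeof() receives the object that x refers to, never the name itself,
# so both calls measure the same int object.
print(sys.getsizeof(x) == sys.getsizeof(5))   # True

# The name lives separately, as a str key in the namespace dict.
namespace = {}
exec("x = 5", namespace)
name = next(key for key in namespace if key == "x")
print(type(name) is str)                      # True

# The str object for the name and the dict holding it both occupy memory.
# The exact byte counts depend on the Python version and platform.
name_size = sys.getsizeof(name)
dict_size = sys.getsizeof(namespace)
print(name_size, dict_size)
```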

If you want to peer deeper into the CPython source to see where x gets assigned to the value 5, you could have a look at ceval.c. Writing x = 5 triggers the LOAD_CONST opcode (to put the integer 5 onto the stack) and also the STORE_GLOBAL opcode* (to set the name x as a key in the dictionary with 5 as the value).

Here is the code for the STORE_GLOBAL opcode:

TARGET(STORE_GLOBAL) {
    PyObject *name = GETITEM(names, oparg);
    PyObject *v = POP();
    int err;
    err = PyDict_SetItem(f->f_globals, name, v);
    Py_DECREF(v);
    if (err != 0)
        goto error;
    DISPATCH();
}

You can see the call to PyDict_SetItem to update the globals dictionary.
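To make the C handler easier to follow, here is a loose Python model of the same three steps. The function store_global and its arguments are hypothetical stand-ins for the interpreter's internals, and the model ignores reference counting and error handling entirely.

```python
def store_global(frame_globals, names, oparg, stack):
    """Rough Python model of the C handler (hypothetical helper)."""
    name = names[oparg]          # GETITEM(names, oparg)
    value = stack.pop()          # POP()
    frame_globals[name] = value  # PyDict_SetItem(f->f_globals, name, v)


# Simulate x = 5: LOAD_CONST pushes 5, then STORE_GLOBAL binds it to names[0].
frame_globals = {}
names = ("x",)
stack = []
stack.append(5)                  # stands in for LOAD_CONST
store_global(frame_globals, names, 0, stack)

print(frame_globals)             # {'x': 5}
print(stack)                     # []
```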


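* If you inspect the bytecode generated by x = 5 (e.g. using dis) you might see the STORE_NAME opcode used instead. At module level this opcode works the same way: it stores into the current namespace, which there is the same dictionary as the globals (the dis module documentation has a brief description).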
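If you want to check this yourself, the following sketch uses the dis module to list the opcodes for x = 5 in both situations. The helper function set_global is made up for the example, and the full opcode list is not shown because it differs between Python versions.

```python
import dis

# Compile x = 5 as module-level code and list the opcodes it produces.
module_code = compile("x = 5", "<example>", "exec")
module_ops = [ins.opname for ins in dis.get_instructions(module_code)]
print(module_ops)                      # exact list varies by Python version
print("STORE_NAME" in module_ops)      # True

# The name itself is kept in the code object's tuple of names, which is what
# GETITEM(names, oparg) indexes into.
print(module_code.co_names)            # ('x',)


def set_global():
    global x   # force a global store from inside a function
    x = 5


func_ops = [ins.opname for ins in dis.get_instructions(set_global)]
print("STORE_GLOBAL" in func_ops)      # True
```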

Alex Riley answered Sep 24 '22 00:09
