Say I define the following variable using the ctypes module:
i = c_int(4)
and afterwards I try to find out the memory address of i using:
id(i)
or
ctypes.addressof(i)
which, at the moment, yield different values. Why is that?
What you are suggesting should be the case (that id() returns a memory address) is an implementation detail of CPython.
The id() function:
Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime.
CPython implementation detail: This is the address of the object in memory.
While they might be equivalent in CPython, this is not guaranteed to be true in other implementations of Python.
Why are they different values, even in CPython?
Note that a c_int:
- is a Python object. CPython's id() will return the address of this.
- contains a 4-byte C-compatible int value. ctypes.addressof() will return the address of this.
The metadata in a Python object takes up space. Because of this, that 4-byte value probably won't live at the very beginning of the Python object.
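The distinction between the two addresses can be illustrated with a short sketch. It creates a second Python object over the same C buffer using from_address; the name alias is only illustrative. The two Python objects have different identities, yet both report the same addressof() value:

```python
import ctypes

i = ctypes.c_int(4)

# Build a second Python object that views the same C int as i.
alias = ctypes.c_int.from_address(ctypes.addressof(i))

# Two distinct Python objects, hence two distinct identities...
same_object = alias is i
same_id = id(alias) == id(i)

# ...but both wrap the same C buffer.
same_buffer = ctypes.addressof(alias) == ctypes.addressof(i)

# Writing through the alias is visible through i, since the storage is shared.
alias.value = 7
shared_value = i.value

print(same_object, same_id, same_buffer, shared_value)
```

In other words, id() identifies the Python wrapper, while addressof() identifies the C data it wraps.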
Look at this example:
>>> import ctypes
>>> i = ctypes.c_int(4)
>>> hex(id(i))
'0x22940d0'
>>> hex(ctypes.addressof(i))
'0x22940f8'
We see that the addressof result is only 0x28 bytes higher than the result of id(). Playing around with this a few times, we can see that this is always the case. Therefore, I'd say that there are 0x28 bytes of Python object metadata preceding the actual int value in the overall c_int.
In my above example:
       c_int
    ___________
   |           |   0x22940d0   This is what id() returns
   | metadata  |
   |           |
   |           |
   |           |
   |           |
   |___________|
   |   value   |   0x22940f8   This is what addressof() returns
   |___________|
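As a quick check that addressof() really points at the raw C value, the bytes at that address can be read back directly. This is a minimal sketch; the variable names are illustrative, and the printed offset depends on the Python version and build, so no particular value is assumed:

```python
import ctypes
import sys

i = ctypes.c_int(4)
data_addr = ctypes.addressof(i)

# Read sizeof(c_int) raw bytes starting at the address of the C value.
raw = ctypes.string_at(data_addr, ctypes.sizeof(i))

# Decode them using the machine's native byte order.
value_from_memory = int.from_bytes(raw, sys.byteorder, signed=True)

# Distance between the Python object and its C data (CPython only,
# since id() is only an address there; the number is build-specific).
offset = data_addr - id(i)

print(hex(id(i)), hex(data_addr), offset, value_from_memory)
```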
Edit:
In the CPython implementation of ctypes, the base CDataObject (2.7.6 source) has a b_ptr member that points to the memory block used for the object's C data:
union value {
    char c[16];
    short s;
    int i;
    long l;
    float f;
    double d;
#ifdef HAVE_LONG_LONG
    PY_LONG_LONG ll;
#endif
    long double D;
};

struct tagCDataObject {
    PyObject_HEAD
    char *b_ptr;            /* pointer to memory block */
    int b_needsfree;        /* need _we_ free the memory? */
    CDataObject *b_base;    /* pointer to base object or NULL */
    Py_ssize_t b_size;      /* size of memory block in bytes */
    Py_ssize_t b_length;    /* number of references we need */
    Py_ssize_t b_index;     /* index of this object into base's
                               b_object list */
    PyObject *b_objects;    /* dictionary of references we need
                               to keep, or Py_None */
    union value b_value;
};
addressof returns this pointer as a Python integer:
static PyObject *
addressof(PyObject *self, PyObject *obj)
{
    if (CDataObject_Check(obj))
        return PyLong_FromVoidPtr(((CDataObject *)obj)->b_ptr);
    PyErr_SetString(PyExc_TypeError,
                    "invalid type");
    return NULL;
}
Small C objects use the default 16-byte b_value member of the CDataObject. As the example above shows, this default buffer is used for the c_int(4) instance. We can turn ctypes on itself to introspect c_int(4) in a 32-bit process, using a ctypes Structure (here called CDataObject, definition not shown) that mirrors the C struct above:
>>> i = c_int(4)
>>> ci = CDataObject.from_address(id(i))
>>> ci
ob_base:
    ob_refcnt: 1
    ob_type: py_object(<class 'ctypes.c_long'>)
b_ptr: 3071814328
b_needsfree: 1
b_base: LP_CDataObject(<NULL>)
b_size: 4
b_length: 0
b_index: 0
b_objects: py_object(<NULL>)
b_value:
    c: b'\x04'
    s: 4
    i: 4
    l: 4
    f: 5.605193857299268e-45
    d: 2e-323
    ll: 4
    D: 0.0
>>> addressof(i)
3071814328
>>> id(i) + CDataObject.b_value.offset
3071814328
This trick leverages the fact that id in CPython returns the base address of an object.
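For anyone who wants to try the introspection, here is a rough sketch of how such a mirror structure could be declared. It is an assumption-laden reconstruction, not the definition used for the session above: the class names PyObjectHead and Value are made up, the field order simply copies the 2.7.6 struct quoted earlier, and the header is assumed to be just a reference count plus a type pointer (not true for debug or free-threaded builds). Other CPython versions may lay the object out differently, so the sketch reports whether the layout matched instead of assuming that it does:

```python
import ctypes
import sys

class PyObjectHead(ctypes.Structure):
    # Assumed PyObject_HEAD of a standard release build.
    _fields_ = [("ob_refcnt", ctypes.c_ssize_t),
                ("ob_type", ctypes.c_void_p)]

class Value(ctypes.Union):
    # Mirrors "union value" from the C source quoted above.
    _fields_ = [("c", ctypes.c_char * 16), ("s", ctypes.c_short),
                ("i", ctypes.c_int), ("l", ctypes.c_long),
                ("f", ctypes.c_float), ("d", ctypes.c_double),
                ("ll", ctypes.c_longlong), ("D", ctypes.c_longdouble)]

class CDataObject(ctypes.Structure):
    pass

# Fields are assigned after class creation so b_base can refer to the class.
CDataObject._fields_ = [
    ("ob_base", PyObjectHead),
    ("b_ptr", ctypes.c_void_p),
    ("b_needsfree", ctypes.c_int),
    ("b_base", ctypes.POINTER(CDataObject)),
    ("b_size", ctypes.c_ssize_t),
    ("b_length", ctypes.c_ssize_t),
    ("b_index", ctypes.c_ssize_t),
    ("b_objects", ctypes.c_void_p),
    ("b_value", Value),
]

i = ctypes.c_int(4)
layout_matches = None  # stays None where id() is not an address
if sys.implementation.name == "cpython":
    mirror = CDataObject.from_address(id(i))  # view i's own object memory
    predicted = id(i) + CDataObject.b_value.offset
    layout_matches = (mirror.b_ptr == ctypes.addressof(i) == predicted)

print("layout matches this build:", layout_matches)
```

If the result is False, the assumed layout does not describe the running interpreter and the field values read through the mirror are meaningless.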