Question

Say I define the following variable using the ctypes module:

i = c_int(4)

and afterwards I try to find out the memory address of i using:

id(i)

or

ctypes.addressof(i)

which, at the moment, yield different values. Why is that?

Solution

That id() returns a memory address at all is an implementation detail of CPython.

The documentation for the id() function says:

Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime.

CPython implementation detail: This is the address of the object in memory.

So even where id() and a memory address coincide in CPython, this is not guaranteed to be true in other implementations of Python.


Why are they different values, even in CPython?

Note that a c_int:

  • is a Python Object. CPython's id() will return the address of this.

  • contains a 4-byte C-compatible int value. ctypes.addressof() will return the address of this.

The metadata in a Python object takes up space. Because of this, that 4-byte value probably won't live at the very beginning of the Python object.
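To check that addressof() really points at the C value rather than at the Python object, here is a minimal sketch. The names addr, raw, decoded and alias are only illustrative; the calls used (ctypes.string_at, ctypes.sizeof, from_address) are standard ctypes functions.

```python
import ctypes
import sys

i = ctypes.c_int(4)
addr = ctypes.addressof(i)

# Read the raw bytes stored at that address and decode them as a signed C int.
raw = ctypes.string_at(addr, ctypes.sizeof(i))
decoded = int.from_bytes(raw, sys.byteorder, signed=True)
print(decoded)

# A second c_int built over the same address shares its storage with i,
# so writing through the alias changes what i holds.
alias = ctypes.c_int.from_address(addr)
alias.value = 7
print(i.value)
```

The bytes at addressof(i) decode to the stored value, and a write through alias is visible through i, which would not happen if the address pointed at the object header.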

Look at this example:

>>> import ctypes
>>> i = ctypes.c_int(4)
>>> hex(id(i))
'0x22940d0'
>>> hex(ctypes.addressof(i))
'0x22940f8'

We see that the addressof result is 0x28 bytes higher than the result of id(). Repeating this a few times gives the same offset every time on this build. So here there are 0x28 bytes of Python object metadata preceding the actual int value in the overall c_int. The exact offset depends on the platform and the CPython version.

In my above example:

   c_int
 ___________
|           |   0x22940d0   This is what id() returns
| metadata  |
|           |
|           |
|           |
|           |
|___________|
|   value   |   0x22940f8   This is what addressof() returns
|___________|
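The claim that the offset is stable can be checked with a short sketch like the one below. It assumes CPython, where id() is the object's address; value_offset is a hypothetical helper name, and the number printed will vary with platform and interpreter version.

```python
import ctypes

def value_offset(obj):
    """Bytes between id(obj) and the C data of obj (meaningful on CPython only)."""
    return ctypes.addressof(obj) - id(obj)

# Keep the instances alive so that no address gets reused while we measure.
instances = [ctypes.c_int(n) for n in range(10)]
offsets = {value_offset(obj) for obj in instances}

# A single entry means every c_int keeps its value at the same distance
# from the start of its Python object.
print([hex(off) for off in sorted(offsets)])
```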

Edit:

In the CPython implementation of ctypes, the base CDataObject (2.7.6 source) has a b_ptr member that points to the memory block used for the object's C data:

union value {
                char c[16];
                short s;
                int i;
                long l;
                float f;
                double d;
#ifdef HAVE_LONG_LONG
                PY_LONG_LONG ll;
#endif
                long double D;
};

struct tagCDataObject {
    PyObject_HEAD
    char *b_ptr;                /* pointer to memory block */
    int  b_needsfree;           /* need _we_ free the memory? */
    CDataObject *b_base;        /* pointer to base object or NULL */
    Py_ssize_t b_size;          /* size of memory block in bytes */
    Py_ssize_t b_length;        /* number of references we need */
    Py_ssize_t b_index;         /* index of this object into base's
                                   b_object list */
    PyObject *b_objects;        /* dictionary of references we need 
                                   to keep, or Py_None */
    union value b_value;
};

addressof returns this pointer as a Python integer:

static PyObject *
addressof(PyObject *self, PyObject *obj)
{
    if (CDataObject_Check(obj))
        return PyLong_FromVoidPtr(((CDataObject *)obj)->b_ptr);
    PyErr_SetString(PyExc_TypeError,
                    "invalid type");
    return NULL;
}

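Small C objects use the default 16-byte b_value member of the CDataObject. As the example above shows, this default buffer is used for the c_int(4) instance. We can turn ctypes on itself to introspect c_int(4) in a 32-bit process. (CDataObject below is a ctypes.Structure that mirrors the C struct above; it is not part of the ctypes API, and its definition is not shown here.)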

>>> from ctypes import *
>>> i = c_int(4)
>>> ci = CDataObject.from_address(id(i))

>>> ci
ob_base: 
    ob_refcnt: 1
    ob_type: py_object(<class 'ctypes.c_long'>)
b_ptr: 3071814328
b_needsfree: 1
b_base: LP_CDataObject(<NULL>)
b_size: 4
b_length: 0
b_index: 0
b_objects: py_object(<NULL>)
b_value: 
    c: b'\x04'
    s: 4
    i: 4
    l: 4
    f: 5.605193857299268e-45
    d: 2e-323
    ll: 4
    D: 0.0

>>> addressof(i)
3071814328
>>> id(i) + CDataObject.b_value.offset
3071814328

This trick leverages the fact that id in CPython returns the base address of an object.
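The same fact allows a rough check of whether an object's C data sits in the inline b_value buffer. This is a sketch under assumptions: it relies on CPython, data_is_inline is a hypothetical helper, and the idea that data too large for b_value is allocated in a separate block is my reading of CPython's behaviour rather than something stated above.

```python
import ctypes

def data_is_inline(obj):
    """True if the C data lies inside the Python object itself (CPython only)."""
    offset = ctypes.addressof(obj) - id(obj)
    # __basicsize__ is the size in bytes of the instance struct for this type.
    return 0 <= offset < type(obj).__basicsize__

small = ctypes.c_int(4)            # 4 bytes: fits in the inline b_value buffer
large = (ctypes.c_char * 1024)()   # 1024 bytes: too big for b_value

print(data_is_inline(small))
print(data_is_inline(large))
```

For the large array, addressof() minus id() is no longer a small fixed offset, so the id(i) + CDataObject.b_value.offset identity only holds for objects that fit in b_value.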

Licensed under: CC-BY-SA with attribution