I noticed that passing Python objects to native code with ctypes
can break mutability expectations.
For example, if I have a C function like:
int print_and_mutate(char *str)
{
str[0] = 'X';
return printf("%s\n", str);
}
and I call it like this:
from ctypes import *
lib = cdll.LoadLibrary("foo.so")
s = b"asdf"
lib.print_and_mutate(s)
The value of s changed, and is now b"Xsdf".
The Python docs say: "You should be careful, however, not to pass them to functions expecting pointers to mutable memory."
Is this only because it breaks expectations of which types are immutable, or can something else break as a result? In other words, if I go in with the clear understanding that my original bytes object will change, even though bytes are normally immutable, is that OK, or will I get some kind of nasty surprise later if I don't use create_string_buffer like I'm supposed to?
Python makes assumptions about immutable objects, so mutating them will definitely break things. Here's a concrete example:
>>> import ctypes as c
>>> x = b'abc' # immutable string
>>> d = {x:123} # Used as key in dictionary (keys must be hashable/immutable)
>>> d
{b'abc': 123}
Now build a mutable ctypes buffer over the immutable object. In CPython, id(x) is the memory address of the Python object, and sys.getsizeof() returns the size of that object in bytes. PyBytes objects have some header overhead, but the bytes of the string sit at the end of the object.
>>> import sys
>>> sys.getsizeof(x)
36
>>> px=(c.c_char*36).from_address(id(x))
>>> px.raw
b'\x02\x00\x00\x00\x00\x00\x00\x000\x8fq\x0b\xfc\x7f\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\xf0\x06\xe61\xeb\x00\x1b\xa9abc\x00'
>>> px.raw[-4:] # last bytes of the object
b'abc\x00'
>>> px[-4]
b'a'
>>> px[-4] = b'y' # Mutate the ctypes buffer, mutating the "immutable" string
>>> x # Now it has a modified value.
b'ybc'
Now try to access the key in the dictionary. Keys are located in O(1) time using their hash, but that hash was computed from the original, "immutable" value, so it is now stale. The key can no longer be found by either the old or the new value:
>>> d # Note that dictionary key changed, too.
{b'ybc': 123}
>>> d[b'ybc'] # Try to access the key
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: b'ybc'
>>> d[b'abc'] # Maybe original key will work? It hashes same as the original...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: b'abc'
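To make that failure mode concrete, here is a small sketch in plain Python (no ctypes mutation needed) of why both lookups miss:

```python
# A dict locates a key by its hash, then confirms with an equality check.
# After the mutation, the stored key still sits in the slot chosen by
# its *original* hash, while its value no longer matches that hash.
old_hash = hash(b'abc')  # hash used when the key was inserted
new_hash = hash(b'ybc')  # hash of the value the key now holds

# Looking up b'ybc' probes the slot for new_hash: nothing is stored
# there, so the lookup raises KeyError.
assert old_hash != new_hash

# Looking up b'abc' probes the correct slot, but the equality check
# (b'abc' == stored key, which now reads as b'ybc') fails, so that
# lookup raises KeyError too.
assert b'abc' != b'ybc'
```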
Various objects are interned by CPython and reused. Examples are small integers (-5 to 256), but also short strings and some literals. This behaviour is entirely implementation defined and may freely change between releases. Mutating such shared objects can trigger anything from no visible effect at all to completely undefined behaviour.
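A quick way to see this sharing (a CPython implementation detail, not a language guarantee) is to compare object identity with `is`:

```python
# In CPython, small integers and single-byte bytes objects are cached
# and shared; `is` compares object identity (the same id()/address).
a = 100
b = 100
print(a is b)   # True in CPython: both names refer to one cached int

x = b'a'
y = b'a'
print(x is y)   # True in CPython: one cached single-byte bytes object

# Mutating such a shared object through ctypes would therefore corrupt
# every reference to it across the interpreter, including literals
# compiled into other modules.
```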
That "original bytes object" is not yours, it is CPython's.
This is about as close to undefined behaviour as you can get in CPython.
Even if nothing breaks at the moment, a future CPython could hand you a pointer into read-only memory, and the program would segfault.
Further, CPython could be sharing the string or subslices with other objects, and you would be modifying all of them.
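For completeness, the supported approach is to hand the C function memory that is explicitly mutable. A minimal sketch follows; the foo.so library and print_and_mutate are from the question and are left commented out, while the pure-Python part simulates the C side's write:

```python
import ctypes

# create_string_buffer copies the bytes into a mutable ctypes array
# (NUL-terminated, len(s) + 1 bytes), so native code can write to it
# without touching any Python bytes object.
s = b"asdf"
buf = ctypes.create_string_buffer(s)

# lib = ctypes.cdll.LoadLibrary("foo.so")   # as in the question
# lib.print_and_mutate(buf)                 # C writes into buf, not s

buf[0] = b"X"        # simulate the C function's str[0] = 'X'
print(buf.value)     # b'Xsdf' -- the mutable buffer changed
print(s)             # b'asdf' -- the original bytes object did not
```

Passing `buf` instead of `s` keeps the mutation confined to memory you own, so hashes, interning, and sharing of the original bytes object are all unaffected.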