I'd like to monkey-patch Python lists, in particular, replacing the __setitem__
method with custom code. Note that I am not trying to extend, but to overwrite the builtin types. For example:
>>> # Monkey Patch
... # Replace list.__setitem__ with a Noop
...
>>> myList = [1,2,3,4,5]
>>> myList[0] = "Nope"
>>> myList
[1, 2, 3, 4, 5]
Yes, I know that is a downright perverted thing to do to Python code. No, my use case doesn't really make sense. Nonetheless, can it be done?
The forbiddenfruit module allows patching of C builtins, but does not work when trying to override the list methods. I actually managed to override the methods themselves, as shown below:
import ctypes
def magic_get_dict(o):
    # find address of dict whose offset is stored in the type
    dict_addr = id(o) + type(o).__dictoffset__
    # retrieve the dict object itself
    dict_ptr = ctypes.cast(dict_addr, ctypes.POINTER(ctypes.py_object))
    return dict_ptr.contents.value

def magic_flush_mro_cache():
    ctypes.PyDLL(None).PyType_Modified(ctypes.cast(id(object), ctypes.py_object))

print(list.__setitem__)
dct = magic_get_dict(list)
dct['__setitem__'] = lambda s, k, v: s
magic_flush_mro_cache()
print(list.__setitem__)
x = [1,2,3,4,5]
print(x.__setitem__)
x.__setitem__(0,10)
x[1] = 20
print(x)
Which outputs the following:
➤ python3 override.py
<slot wrapper '__setitem__' of 'list' objects>
<function <lambda> at 0x10de43f28>
<bound method <lambda> of [1, 2, 3, 4, 5]>
[1, 20, 3, 4, 5]
But as shown in the output, this doesn't seem to affect the normal syntax for setting an item (x[0] = 0).
As a lesser alternative, if I were able to monkey-patch an individual list instance, that could work too, perhaps by changing the class pointer of the list to a custom class.
A little late to the party, but nonetheless, here's the answer.
As user2357112 hinted in the comment above, modifying the dict won't suffice, since __getitem__ (and other double-underscore names) are mapped to their slot, and won't be updated without calling update_slot (which isn't exported, so that would be a little tricky).
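To illustrate the slot behaviour without touching any builtin, here is a minimal sketch using an ordinary subclass. The names PatchableList and ignore_setitem are made up for this example. It only shows that the subscript syntax consults the type's slot rather than whatever attribute happens to be stored elsewhere, and that assigning through the type (which is allowed for user-defined classes) refreshes the slot.

```python
class PatchableList(list):
    """Hypothetical user-defined subclass, used only for illustration."""


def ignore_setitem(self, key, value):
    """Replacement that silently drops the assignment."""
    return None


items = PatchableList([1, 2, 3])

# Attribute stored on the instance: the subscript syntax looks the method
# up on the type, so this has no effect on `items[0] = ...`.
items.__setitem__ = ignore_setitem
items[0] = "changed"
after_instance_patch = list(items)

# Attribute assigned through the type: type.__setattr__ updates the slot,
# so the subscript syntax now reaches the replacement.
del items.__setitem__
PatchableList.__setitem__ = ignore_setitem
items[1] = "ignored"
after_type_patch = list(items)

print(after_instance_patch, after_type_patch)
```

The same assignment on list itself is rejected by the interpreter, which is why the rest of this answer has to go through ctypes.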
Inspired by the above comment, here's a working example of making __setitem__ a no-op for specific lists:
# assuming v3.8 (tested on Windows x64 and Ubuntu x64)
# definition of PyTypeObject: https://github.com/python/cpython/blob/3.8/Include/cpython/object.h#L177
# no extensive testing was performed and I'll let others decide if this is a good idea or not, but it's possible
import ctypes
Py_TPFLAGS_HEAPTYPE = (1 << 9)
# calculate the offset of the tp_flags field
offset = ctypes.sizeof(ctypes.c_ssize_t) * 1 # PyObject_VAR_HEAD.ob_base.ob_refcnt
offset += ctypes.sizeof(ctypes.c_void_p) * 1 # PyObject_VAR_HEAD.ob_base.ob_type
offset += ctypes.sizeof(ctypes.c_ssize_t) * 1 # PyObject_VAR_HEAD.ob_size
offset += ctypes.sizeof(ctypes.c_void_p) * 1 # tp_name
offset += ctypes.sizeof(ctypes.c_ssize_t) * 2 # tp_basicsize+tp_itemsize
offset += ctypes.sizeof(ctypes.c_void_p) * 1 # tp_dealloc
offset += ctypes.sizeof(ctypes.c_ssize_t) * 1 # tp_vectorcall_offset
offset += ctypes.sizeof(ctypes.c_void_p) * 7 # tp_getattr+tp_setattr+tp_as_async+tp_repr+tp_as_number+tp_as_sequence+tp_as_mapping
offset += ctypes.sizeof(ctypes.c_void_p) * 6 # tp_hash+tp_call+tp_str+tp_getattro+tp_setattro+tp_as_buffer
tp_flags = ctypes.c_ulong.from_address(id(list) + offset)
assert(tp_flags.value == list.__flags__) # should be the same
lst1 = [1,2,3]
lst2 = [1,2,3]
dont_set_me = [lst1] # these lists cannot be set
# define new method
orig = list.__setitem__
def new_setitem(self, *args):
    if [_ for _ in dont_set_me if _ is self]: # check for identical object in list
        print('Nope')
    else:
        return orig(self, *args)
tp_flags.value |= Py_TPFLAGS_HEAPTYPE # add flag, to allow type_setattro to continue
list.__setitem__ = new_setitem # set method, this will already call PyType_Modified and update_slot
tp_flags.value &= (~Py_TPFLAGS_HEAPTYPE) # remove flag
print(lst1, lst2) # > [1, 2, 3] [1, 2, 3]
lst1[0],lst2[0]='x','x' # > Nope
print(lst1, lst2) # > [1, 2, 3] ['x', 2, 3]
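The flag toggled in the snippet above can also be inspected read-only through the type's __flags__ attribute, with no ctypes involved. The following is a small sketch (the helper names is_heap_type and try_patch and the class HeapList are made up here) showing that list does not carry the heap-type bit while a user-defined subclass does, and that a plain attribute assignment is rejected for the former and accepted for the latter. It does not modify any builtin.

```python
Py_TPFLAGS_HEAPTYPE = 1 << 9  # same constant as in the snippet above


def is_heap_type(tp):
    """Return True if the type has the heap-type flag set."""
    return bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE)


class HeapList(list):
    """Hypothetical user-defined subclass, created at runtime."""


def try_patch(tp):
    """Try a normal attribute assignment; report whether it was accepted."""
    try:
        tp.__setitem__ = lambda self, key, value: None
    except TypeError:
        return False
    return True


builtin_is_heap = is_heap_type(list)
subclass_is_heap = is_heap_type(HeapList)
builtin_patch_ok = try_patch(list)        # rejected: list is a static builtin type
subclass_patch_ok = try_patch(HeapList)   # accepted: HeapList is a heap type

print(builtin_is_heap, subclass_is_heap, builtin_patch_ok, subclass_patch_ok)
```

Note that newer CPython releases may use additional checks beyond this flag, so the ctypes trick above should be treated as specific to the version it was tested on.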
Edit
See here why it's not supported to begin with. Mainly, as explained by Guido van Rossum:
This is prohibited intentionally to prevent accidental fatal changes to built-in types (fatal to parts of the code that you never thought of). Also, it is done to prevent the changes to affect different interpreters residing in the address space, since built-in types (unlike user-defined classes) are shared between all such interpreters.
I also searched for all usages of Py_TPFLAGS_HEAPTYPE in CPython and they all seem to be related to GC or some validations.
So I guess if:
You'll just be fine <generic disclaimer here>.
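For the lesser alternative mentioned in the question (changing behaviour for individual lists only), a much tamer sketch is possible with a plain subclass. The class name ReadOnlyItemsList is made up for this example. As far as I know, CPython refuses to reassign __class__ on a plain list, so the sketch copies the data into the subclass instead of swapping the class pointer.

```python
class ReadOnlyItemsList(list):
    """Hypothetical list subclass whose item assignment is a no-op."""

    def __setitem__(self, key, value):
        # Silently ignore `obj[key] = value`.
        pass


plain = [1, 2, 3, 4, 5]

# Swapping the class pointer of an existing plain list is expected to fail,
# because list is not a heap type.
try:
    plain.__class__ = ReadOnlyItemsList
    class_swap_allowed = True
except TypeError:
    class_swap_allowed = False

# Copying into the subclass works and needs no interpreter hacks.
guarded = ReadOnlyItemsList(plain)
guarded[0] = "Nope"

print(class_swap_allowed, guarded)
```

This only covers lists you construct yourself. List literals and lists created by other code remain ordinary list objects.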