In CPython, accessing the bytecode evaluation stack

Given a CPython frame pointer, how do I look at arbitrary evaluation stack entries? (Some specific stack entries can be found via locals(); I'm talking about the other stack entries.)

I asked a broader question like this a while ago:

getting the C python exec argument string or accessing the evaluation stack

but here I want to focus on being able to read CPython stack entries at runtime.

I'll take a solution that works on CPython 2.7 or on any Python later than 3.3. However, if you have something that works outside of that range, share it; if there is no better solution, I'll accept that.

I'd prefer not to modify the CPython source code. In Ruby, I have in fact done this to get what I want, and I can say from experience that this is probably not the way we want to work. But again, if there's no better solution, I'll take that. (My understanding of SO points is that I lose the bounty either way, so I'm happy to see it go to the person who has shown the most good spirit and willingness to look at this, assuming it works.)

Update: See the comment by user2357112. tl;dr: basically this is hard-to-impossible to do. (Still, if you think you have the gumption to try, by all means do so.)

So instead, let me narrow the scope to this simpler problem which I think is doable:

Given a Python stack frame, like inspect.currentframe(), find the beginning of the evaluation stack. In the C version of the structure, this is f_valuestack. From there we then need a way, in Python, to read off the Python values/objects.

Update 2: Well, the time period for the bounty is over and no one (including my own summary answer) has offered concrete code. I feel this is a good start, though, and I now understand the situation much better than I did. In the obligatory "describe why you think there should be a bounty" field I picked one of the proffered choices, "to draw more attention to this problem". To that extent it worked: the prior incarnation of the problem had fewer than a dozen views, and as I type this the question has been viewed a little under 190 times. So this is a success. However...

If someone in the future decides to carry this further, contact me and I'll set up another bounty.

Thanks all.

asked Jun 03 '17 16:06

rocky

1 Answer

This is sometimes possible, with ctypes for direct C struct member access, but it gets messy fast.

First off, there's no public API for this, on the C side or the Python side, so that's out. We'll have to dig into the undocumented insides of the C implementation. I'll be focusing on the CPython 3.8 implementation; the details should be similar, though probably not identical, in other versions.

A PyFrameObject struct has an f_valuestack member that points to the bottom of its evaluation stack. It also has an f_stacktop member that points to the top of its evaluation stack... sometimes. During execution of a frame, Python actually keeps track of the top of the stack using a stack_pointer local variable in _PyEval_EvalFrameDefault:

    stack_pointer = f->f_stacktop;
    assert(stack_pointer != NULL);
    f->f_stacktop = NULL;       /* remains NULL unless yield suspends frame */

There are two cases in which f_stacktop is restored. One is if the frame is suspended by a yield (or yield from, or any of the multiple constructs that suspend coroutines through the same mechanism). The other is right before calling a trace function for a 'line' or 'opcode' trace event. f_stacktop is cleared again when the frame unsuspends, or after the trace function finishes.

That means that if

- you're looking at a suspended generator or coroutine frame, or
- you're currently in a trace function for a 'line' or 'opcode' event for a frame

then you can access the f_valuestack and f_stacktop pointers with ctypes to find the lower and upper bounds of the frame's evaluation stack and access the PyObject * pointers stored in that range. You can even get a superset of the stack contents without ctypes with gc.get_referents(frame_object), although this will contain other referents that aren't on the frame's stack.
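As a rough illustration of the gc.get_referents route, here is a minimal sketch for a suspended generator. Marker, suspended_pair and stack_superset are hypothetical names made up for this example. The sketch also passes the generator object itself to gc.get_referents, on the assumption that newer CPython versions (3.11 and later, as far as I know) keep the frame's data inside the generator rather than in the frame object; on 3.8 the frame object alone should be enough.

```python
import gc


class Marker:
    """Hypothetical sentinel type so a stack entry is easy to spot."""

    def __init__(self, label):
        self.label = label


def suspended_pair():
    # The Marker is pushed onto the value stack first; the yield then
    # suspends the frame before the tuple is built, so no local variable
    # refers to the Marker while the generator is suspended.
    pair = (Marker("on-stack"), (yield))
    return pair


def stack_superset(gen):
    """Return a superset of a suspended generator's value stack entries."""
    # ASSUMPTION: depending on the CPython version, the stack is traversed
    # through the frame object or through the generator, so ask about both.
    return gc.get_referents(gen, gen.gi_frame)


g = suspended_pair()
next(g)  # run up to the yield, leaving the Marker on the value stack

found = [obj for obj in stack_superset(g) if isinstance(obj, Marker)]
print([m.label for m in found])
```

The result also contains things like the code object and the globals dict, which is why the demo filters by type instead of assuming everything returned is a stack entry.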

Debuggers use trace functions, so this gets you value stack entries for the top stack frame while debugging, most of the time. It does not get you value stack entries for any other stack frames on the call stack, and it doesn't get you value stack entries while tracing an 'exception' event or any other trace events.
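To make the ctypes route a bit more concrete, here is a hedged sketch that reads f_valuestack and f_stacktop from inside a trace function. The FrameHead38 layout is my assumption about the leading fields of PyFrameObject in a non-debug CPython 3.8 build; it is not a documented interface, and on any other version or build the helper just returns None instead of guessing. FrameHead38, value_stack, tracer, demo and seen are all hypothetical names for this example.

```python
import ctypes
import sys


class FrameHead38(ctypes.Structure):
    # ASSUMPTION: leading fields of PyFrameObject, non-debug CPython 3.8 only.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
        ("f_back", ctypes.c_void_p),
        ("f_code", ctypes.c_void_p),
        ("f_builtins", ctypes.c_void_p),
        ("f_globals", ctypes.c_void_p),
        ("f_locals", ctypes.c_void_p),
        ("f_valuestack", ctypes.c_void_p),
        ("f_stacktop", ctypes.c_void_p),
    ]


def value_stack(frame):
    """Return the frame's value stack as a list, or None if it can't be read."""
    if sys.implementation.name != "cpython" or sys.version_info[:2] != (3, 8):
        return None  # the layout above is only assumed for CPython 3.8
    head = FrameHead38.from_address(id(frame))
    if not head.f_stacktop:
        return None  # stack top currently lives in the C-level stack_pointer
    width = ctypes.sizeof(ctypes.c_void_p)
    count = (head.f_stacktop - head.f_valuestack) // width
    slots = (ctypes.c_void_p * count).from_address(head.f_valuestack)
    # NULL slots read back as None from a c_void_p array; skip them.
    return [ctypes.cast(addr, ctypes.py_object).value for addr in slots if addr]


seen = []  # (event, stack-or-None) pairs recorded by the trace function


def tracer(frame, event, arg):
    if frame.f_code is not demo.__code__:
        return None  # only trace the demo function
    frame.f_trace_opcodes = True  # also request per-instruction 'opcode' events
    if event in ("line", "opcode"):
        seen.append((event, value_stack(frame)))
    return tracer


def demo():
    total = 0
    for n in (1, 2, 3):
        total += n
    return total


sys.settrace(tracer)
try:
    demo()
finally:
    sys.settrace(None)

print(len(seen), "events recorded")
```

Calling value_stack on an ordinary running frame outside of a trace callback should give None on 3.8 as well, since f_stacktop is NULL there.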

When f_stacktop is NULL, determining the frame's stack contents is close to impossible. You can still see where the stack begins with f_valuestack, but you can't see where it ends. The stack top is stored in a C-level stack_pointer local variable that's really hard to access.

- There's the frame's code object's co_stacksize, which gives an upper bound on the stack size, but it doesn't give the actual stack size.
- You can't tell where the stack ends by examining the stack itself, because Python doesn't null out the pointers on the stack when it pops entries.
- gc.get_referents doesn't return value stack entries when f_stacktop is null. It doesn't know how to retrieve stack entries safely in this case either (and it doesn't need to, because if f_stacktop is null and stack entries exist, the frame is guaranteed reachable).
- You might be able to examine the frame's f_lasti to determine the last bytecode instruction it was on and try to figure out where that instruction would leave the stack, but that would take a lot of intimate knowledge of Python bytecode and the bytecode evaluation loop, and it's still ambiguous sometimes (because the frame might be halfway through an instruction). This would at least give you a lower bound on the current stack size, though, letting you safely inspect at least some of it.
- Frame objects have independent value stacks that aren't contiguous with each other, so you can't look at the bottom of one frame's stack to find the top of another. (The value stack is actually allocated within the frame object itself.)
- You might be able to hunt down the stack_pointer local variable with some GDB magic or something, but it'd be a mess.
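The co_stacksize and f_lasti pieces, at least, are visible from pure Python. The sketch below only reports the upper bound and the instruction the frame was last on; it does not try to work out the actual stack depth, which is the hard part described above. last_instruction and counter are hypothetical names for this example.

```python
import dis


def last_instruction(frame):
    """Return the instruction at or just before frame.f_lasti, or None."""
    if frame.f_lasti < 0:
        return None  # the frame has not executed any instruction yet
    current = None
    for instr in dis.get_instructions(frame.f_code):
        if instr.offset > frame.f_lasti:
            break
        current = instr
    return current


def counter():
    n = 0
    while True:
        received = yield n
        n += 1 if received is None else received


g = counter()
next(g)  # suspend the generator at its first yield
frame = g.gi_frame
instr = last_instruction(frame)

print("co_stacksize (upper bound):", frame.f_code.co_stacksize)
print("f_lasti:", frame.f_lasti, "->", instr.opname)
```

Which opcode name gets printed depends on the CPython version, so treat the output as something to explore rather than rely on.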

answered Sep 29 '22 10:09

user2357112 supports Monica

Tags: python, bytecode, cpython, reverse-engineering, disassembly
