
How to access a data structure from a currently running Python process on Linux?

I have a long-running Python process that is generating more data than I planned for. My results are stored in a list that will be serialized (pickled) and written to disk when the program completes -- if it gets that far. But at this rate, it's more likely that the list will exhaust all 1+ GB free RAM and the process will crash, losing all my results in the process.

I plan to modify my script to write results to disk periodically, but I'd like to save the results of the currently-running process if possible. Is there some way I can grab an in-memory data structure from a running process and write it to disk?

I found code.interact(), but since my code doesn't already have this hook, it doesn't seem useful to me (see "Method to peek at a Python program running right now").

I'm running Python 2.5 on Fedora 8. Any thoughts?

Thanks a lot.

Shahin

Shahin asked Oct 04 '10 04:10




2 Answers

There is not much you can do with a program that is already running. The only thing I can think of is to attach the gdb debugger, stop the process, and examine its memory. Alternatively, make sure that your system is set up to save core dumps, then kill the process with kill -SEGV <pid>. You should then be able to open the core dump with gdb and examine it at your leisure.
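As a rough sketch of the "attach gdb" route: gdb's own gcore command writes a core file for an attached process without killing it, which avoids sending a signal at all. The helper names build_gcore_command and dump_core below are made up for illustration, and the sketch assumes gdb is installed and that you are allowed to attach to the target process.

```python
import subprocess


def build_gcore_command(pid, core_path):
    """Build a gdb command line that attaches, dumps a core file, and detaches.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    return [
        "gdb", "-nx", "-batch",          # skip init files, exit when done
        "-ex", "attach %d" % pid,        # stop the process and attach to it
        "-ex", "gcore %s" % core_path,   # write the core file
        "-ex", "detach",                 # let the process continue
        "-ex", "quit",
    ]


def dump_core(pid, core_path):
    """Run gdb with the command above and return its exit status."""
    return subprocess.call(build_gcore_command(pid, core_path))
```

The process is paused while the core file is written, and the file can be about as large as the process's memory, so check free disk space first.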

There are some gdb macros that let you examine Python data structures and execute Python code from within gdb, but for these to work Python must have been compiled with debug symbols enabled, and I doubt that is your case. Creating a core dump first and then recompiling Python with symbols will NOT work, since all the addresses will have changed from the values in the dump.

Here are some links for introspecting python from gdb:

http://wiki.python.org/moin/DebuggingWithGdb

http://chrismiles.livejournal.com/20226.html

or google for 'python gdb'

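N.B. To make Linux create core dumps, use the ulimit command. The limit applies to the current shell and to processes started from it afterwards, so it needs to be in place before the program is launched.

ulimit -a will show you what the current limits are set to.

ulimit -c unlimited will enable core dumps of any size.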
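The same limit can also be read and raised from inside a Python program with the standard resource module, which may be handy at the top of a long-running script. This is a minimal sketch; the function names are invented for illustration, and the values reported by getrlimit are in bytes.

```python
import resource


def describe_core_limit():
    """Return the soft and hard core-file size limits as readable text."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return "soft=%s hard=%s" % (fmt(soft), fmt(hard))


def core_dumps_enabled():
    """True if the soft limit allows a core file of any non-zero size."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft != 0


def raise_core_limit():
    """Raise the soft limit to the hard limit for this process and its children."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


if __name__ == "__main__":
    raise_core_limit()
    print(describe_core_limit())
```

An unprivileged process can only raise its soft limit as far as the hard limit, so if the hard limit is 0 this will not enable core dumps.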

Dave Kirby answered Sep 21 '22 17:09


While certainly not very pretty, you could try to access your process's data through the proc filesystem: /proc/[pid-of-your-process]. The proc filesystem exposes a lot of per-process information, such as currently open file descriptors and memory maps. With a bit of digging you might be able to access the data you need.
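To give an idea of what that digging looks like, here is a small sketch that parses /proc/[pid]/maps and reports where the heap is mapped. The function names are invented for illustration, and the script inspects its own pid only as a demonstration; pass the pid of the real process instead.

```python
import os


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, pathname)."""
    # Columns: address range, permissions, offset, device, inode, pathname.
    fields = line.split(None, 5)
    start_text, end_text = fields[0].split("-")
    pathname = fields[5].strip() if len(fields) > 5 else ""
    return int(start_text, 16), int(end_text, 16), fields[1], pathname


def read_maps(pid):
    """Return the parsed memory map of the given process (Linux only)."""
    with open("/proc/%d/maps" % pid) as handle:
        return [parse_maps_line(line) for line in handle if line.strip()]


if __name__ == "__main__":
    for start, end, perms, pathname in read_maps(os.getpid()):
        if pathname == "[heap]":
            size_kib = (end - start) // 1024
            print("heap: %#x-%#x (%d KiB) %s" % (start, end, size_kib, perms))
```

This only tells you where the memory regions are. The bytes in those regions are raw interpreter memory, so turning them back into a Python list would still require knowledge of the interpreter's object layout.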

Still, I suspect you would be better off approaching this from within Python and doing some runtime logging and debugging.
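For future runs, the periodic save the question mentions could look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the checkpoint interval, and the squaring step that stands in for the real computation. It is written for a current Python and would need small changes on 2.5.

```python
import os
import pickle
import tempfile


def save_checkpoint(results, path):
    """Pickle results to a temporary file, then move it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(results, handle, pickle.HIGHEST_PROTOCOL)
        # Replacing in one step means a crash never leaves a half-written file.
        os.replace(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Read back the most recently saved results."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def run(work_items, path, every=1000):
    """Process items, saving the accumulated results every `every` items."""
    results = []
    for count, item in enumerate(work_items, 1):
        results.append(item * item)  # placeholder for the real computation
        if count % every == 0:
            save_checkpoint(results, path)
    save_checkpoint(results, path)  # final save on normal completion
    return results
```

This protects against losing results in a crash, but the whole list still lives in memory. To bound memory use as well, each batch could be written to its own file and the list cleared afterwards.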

gilligan answered Sep 17 '22 17:09