
Python: Access to external process info

I'm trying to figure out how to manage a process with Python, although maybe C++ would be better for this. I'm using Python 2.7 on Ubuntu 14.04.

Summary of what I'm trying to achieve:

  • Send actions (not signals) to a running process, i.e. interact with the UI of a process
  • Read the value stored at a memory address

My intention is to create a script that manages other software, similar to what Selenium does with browsers, but for any program. Maybe launching the process from Python with subprocess would give me a way to manage the process UI.
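For reference, here is a minimal sketch of what launching a program through subprocess actually provides: a PID, a return code, and the standard input/output pipes, but no handle on any window the program draws. It is written for Python 3, and the child command is a placeholder one-liner rather than a real GUI program.

```python
import subprocess
import sys

# Launch a child process; a tiny Python one-liner stands in for a real program.
child = subprocess.Popen(
    [sys.executable, "-c", "print(input().upper())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,  # exchange text instead of bytes
)

# Talk to the child through its pipes and wait for it to exit.
output, _ = child.communicate("hello\n")

# This is everything subprocess exposes: PID, exit status, and piped text.
print(child.pid, child.returncode, output.strip())
```

The PID obtained this way can be handed to psutil or to a window-level automation tool, which is where UI interaction would have to happen.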


Send Actions/Interact with Running Process

Right now I'm writing this script on Linux using psutil. I know there are some Windows libraries like pywin or pywindll.

I want to manage a process, for example any kind of software with a UI (Skype, Gedit, Firefox...), and I would like to know if it's possible to send it an action such as clicking a button.

I don't want to control the computer's mouse, because the window might be "hidden" under other windows:

  • Once I have the process in my script, is it possible to send a click to a button of its UI (or write into a text box)?

I'm using psutil to get the process, and I have a lot of options like:

  • Get the memory mapping
  • Get the threads of the process
  • Kill the process
  • CPU usage
  • etc

But none of these options seems to be what I'm looking for, which is to interact with the process UI...

  • Is what I'm trying to achieve even possible?

  • Would the simplest solution be to send keystrokes and mouse clicks?


Read Memory Address Value

I've been using scanmem on Linux to find the memory address of a variable. Once I've found the address I'm looking for, I want to use it in Python to read the value stored there.

The closest thing I've found was using ctypes, something like:

from ctypes import string_at
from sys import getsizeof

mem_address = 0x7c3f
value = string_at(id(mem_address), getsizeof(mem_address))

  • Is this the simplest way to access a memory address in Python?
  • Is it possible to modify/update this number in real time?
  • How could I identify the memory address using the UI?
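A caveat about the snippet above: in CPython, id(mem_address) is the address of the Python int object itself, so that call returns the bytes of the object that holds the number, not whatever lives at 0x7c3f. The sketch below shows, under stated assumptions, how ctypes reads and updates an address inside the current process, plus a hypothetical helper, read_process_memory, that reads another process through /proc/<pid>/mem. The helper is Linux-only and needs ptrace permission on the target (same user or root, depending on system settings); the counter variable is a stand-in for a value located with scanmem.

```python
import ctypes

# A C int living in *this* process; it stands in for a variable found with scanmem.
counter = ctypes.c_int(0x7C3F)
address = ctypes.addressof(counter)
size = ctypes.sizeof(counter)

# string_at(address, size) copies the raw bytes stored at that address.
raw = ctypes.string_at(address, size)

# from_address builds a typed view on the same memory (no copy),
# so assigning to it updates the original value in place.
view = ctypes.c_int.from_address(address)
view.value = 1234
print(counter.value)


def read_process_memory(pid, address, size):
    """Hypothetical helper: read `size` bytes at `address` from process `pid`.

    Linux only. Requires permission to ptrace the target process.
    """
    # buffering=0 so exactly the requested range is read from the target.
    with open("/proc/%d/mem" % pid, "rb", buffering=0) as mem:
        mem.seek(address)
        return mem.read(size)
```

ctypes on its own can only see the memory of the Python process it runs in; an address reported by scanmem belongs to the other process's address space, which is why a mechanism like /proc/<pid>/mem is needed.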

I was thinking that when a program runs, it has to hand its UI to the OS. Would it be possible to "capture" that interface with Python and then pass it on to the OS?

Something like launching the software from Python, so that it might be possible to manage the UI directly.

AlvaroAV asked Sep 11 '14 09:09




1 Answer

I like the way you think :D UI automation is awesome

On the question itself, as far as I can tell, all the software that can interact with the GUI of processes is based on computer vision with OCR or reading the memory to get the object model of the UI. The latter is probably not universal, since different widget toolkits and approaches to building the UI will have different underlying models - it'll probably be more of a pain in the ass than CV+OCR.

If you want to see some stuff that has already been made for this purpose, check out the Wikipedia list. You already know about Selenium, but there's more: AutoIt and Sikuli were the ones I checked for a similar project I want to make in Python. (AutoIt is BASIC-like -YUCK- and Windows-only, but Sikuli seems to be Python-related and is cross-platform. I checked them out ages ago, so I don't remember the details.)

The really good news is that Python has pretty good CV and OCR modules. My personal recommendation is SimpleCV, which can wrap around OpenCV and other CV software. I don't have a module of choice to recommend for OCR, but I liked python-tesseract the most when I was scouting for modules.

The approach is generally to take a snapshot of the GUI (GraphicsMagick can do that plenty well and there's a Python wrapper for it), make out where the elements are with CV, and read the labels with OCR. That way you get a model of what is on the window. Then you give your script instructions on what to do and when, based on where each element is on the GUI. Since Python can send mouse and keyboard events, you're golden. You can even use the minidom module to make an easier object model for your code.
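To make the pipeline concrete, here's a rough pure-Python sketch of the last steps: locate an element in a snapshot, store it in a minidom model, and work out where to click. Everything in it is a stand-in. The snapshot is a toy grid of numbers, find_template does exact matching where a real CV library would do fuzzy matching on pixels, and the "Send" label is hard-coded where OCR output would go. The helper names (find_template, build_model, click_point) are made up for illustration.

```python
from xml.dom import minidom


def find_template(screen, template):
    """Return (row, col) of the first exact match of template in screen, or None."""
    th, tw = len(template), len(template[0])
    for r in range(len(screen) - th + 1):
        for c in range(len(screen[0]) - tw + 1):
            if all(screen[r + i][c:c + tw] == template[i] for i in range(th)):
                return (r, c)
    return None


def build_model(elements):
    """Build an XML model of the window from (kind, label, x, y, w, h) tuples."""
    doc = minidom.Document()
    root = doc.createElement("window")
    doc.appendChild(root)
    for kind, label, x, y, w, h in elements:
        node = doc.createElement(kind)
        node.setAttribute("label", label)
        for name, value in (("x", x), ("y", y), ("w", w), ("h", h)):
            node.setAttribute(name, str(value))
        root.appendChild(node)
    return doc


def click_point(doc, kind, label):
    """Return the centre (x, y) of the element with this kind and label, or None."""
    for node in doc.getElementsByTagName(kind):
        if node.getAttribute("label") == label:
            x, y, w, h = (int(node.getAttribute(k)) for k in ("x", "y", "w", "h"))
            return (x + w // 2, y + h // 2)
    return None


# Toy "snapshot": 0 is background, 1 is a button-coloured pixel.
screen = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
button = [[1, 1, 1], [1, 1, 1]]

row, col = find_template(screen, button)
# The label would come from OCR in a real pipeline; it is hard-coded here.
doc = build_model([("button", "Send", col, row, len(button[0]), len(button))])
target = click_point(doc, "button", "Send")
print(target)
```

The resulting coordinates are what you would feed to whatever mouse-event mechanism you end up using.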

As an aside, the CV+OCR approach is also used by a Hearthstone-related app that takes snapshots of the game and reads the score, which it then tracks for the player so they can make up metrics. It's a more lightweight and easy approach than it seems - I checked out the code and it was quite easy to understand, despite the heavyweight technologies behind it.

mechalynx answered Oct 01 '22 08:10
