
Most efficient/quickest way to parse pixel data with Python?

I have created a simple Python script that gets activated whenever a specific program is running. That program sends information to the screen, which the script needs to grab and analyze.

Part of the script's logic can be expressed as follows:

while a certain condition is met:
    function to continuously check pixel information on a fixed area of the screen()
    if pixel data (e.g. RGB) changes:
        do something
    else:
        continues to check
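
The loop above can be sketched as runnable Python. This is only an illustration: `watch_pixel`, `fake_read` and `keep_running` are hypothetical names, and the pixel reader replays a fixed list of values instead of touching the screen, so the sketch runs anywhere.

```python
def watch_pixel(read_pixel, on_change, keep_running):
    """Poll read_pixel() and call on_change(old, new) whenever the value differs."""
    previous = read_pixel()
    while keep_running():
        current = read_pixel()
        if current != previous:
            on_change(previous, current)
            previous = current


# Stand-in reader (assumption): replays fixed RGB samples instead of reading the screen.
samples = iter([(0, 0, 0), (0, 0, 0), (255, 0, 0), (255, 0, 0), (0, 255, 0)])
remaining = [4]  # number of polls allowed after the initial read
changes = []


def fake_read():
    return next(samples)


def keep_running():
    remaining[0] -= 1
    return remaining[0] >= 0


watch_pixel(fake_read, lambda old, new: changes.append((old, new)), keep_running)
print(changes)
```

In a real script, `read_pixel` would wrap whatever screen-reading call is in use, and `keep_running` would test whether the target program is still active.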

I have already found something that does exactly this, but not quite as fast as I'd like. Here is a solution using Python Imaging Library (PIL) with arbitrary values:

from PIL import ImageGrab

box = (0, 0, 100, 100)  # 100x100 screen area to capture ((0, 0) is the top-left corner)
pixel = (60, 20)  # target pixel coordinates (must be within the box's boundaries)
im = ImageGrab.grab(box)  # grabs the image area (a screenshot) -> source of the bottleneck
hm = im.getpixel(pixel)  # gets the pixel's RGB value from the captured image

I can then take that RGB value and compare it with the previous value obtained by the function. If it changed, then something happened on the screen, which means the program did something, and the script can behave accordingly. However, the script needs to react fast, especially because this is just part of a larger function with its own intricacies and flaws, so I'm optimizing the code bit by bit, starting with this.

This solution limits the script to ~30 iterations per second on an i7 4770k CPU. That seems fast, but combine it with other functions that parse pixel information at a similar rate and things start to add up. My goal is at least 200, or failing that 150, iterations per second for a single function, so that the final script can run at 5-10 iterations per second.
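
To see how the per-function rates add up, here is a rough back-of-the-envelope sketch. It assumes the checks run one after another in a single loop, and the numbers of checks (3 and 20) are made up purely for illustration; `combined_rate` is a hypothetical helper.

```python
def combined_rate(rates_per_second):
    """Iterations per second of one loop that runs every check back to back."""
    seconds_per_pass = sum(1.0 / rate for rate in rates_per_second)
    return 1.0 / seconds_per_pass


# Three checks at ~30/s each vs. twenty checks at 200/s each (hypothetical counts).
print(round(combined_rate([30.0] * 3), 2))
print(round(combined_rate([200.0] * 20), 2))
```

Both cases work out to about 10 passes per second, which is why a faster single check leaves room for many more checks per pass.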

So, long story short: what other method is there to parse pixels from the screen more rapidly?

Daniel Shaw asked Jan 10 '23 18:01



1 Answer

Alright peeps, after some digging it turns out it is indeed possible to do exactly what I wanted with Python and the simple pywin32 module (thanks based Mark Hammond). There's no need for a "beefier" language or to outsource the job to numpy and whatnot. Here it is, 6 lines of code (7 with the import):

import win32ui
window_name = "Target Window Name"  # use win32gui.EnumWindows for a complete list
wd = win32ui.FindWindow(None, window_name)
dc = wd.GetWindowDC()  # get the window's device context (DC)
j = dc.GetPixel(60, 20)  # as practical and intuitive as using PIL!
print(j)
dc.DeleteDC()  # release the DC; otherwise the code starts to slow down over many iterations

And that's it. On each iteration it returns the selected pixel's color as a number (a COLORREF), which is just another way to represent color (like RGB or hex) and, most importantly, data I can parse! If you aren't convinced, here are some benchmarks on my desktop PC (standard CPython build, i7 4770k):
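
If an (R, G, B) tuple is more convenient than the raw number, a COLORREF can be unpacked with a few bit operations. A COLORREF stores its bytes as 0x00BBGGRR, so red is the lowest byte. The helper name `colorref_to_rgb` is hypothetical, and the sample values are made up for illustration.

```python
def colorref_to_rgb(colorref):
    """Split a Win32 COLORREF (0x00BBGGRR) into an (R, G, B) tuple."""
    red = colorref & 0xFF
    green = (colorref >> 8) & 0xFF
    blue = (colorref >> 16) & 0xFF
    return (red, green, blue)


# Sample values (not taken from a real screen).
print(colorref_to_rgb(0x0000FF))  # only the low byte is set, so this is pure red
print(colorref_to_rgb(0x3C14FF))
```

For plain change detection the raw number is enough, since two different colors always give two different COLORREF values.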

My previous solution, wrapped in a simple stopwatch (feel free to run these yourself and check):

    from PIL import ImageGrab
    import time
    box = (0,0,100,100) #100 x 100 square box to capture
    pixel = (60,20) #pixel coordinates (must be within the box's boundaries)
    t1 = time.time()
    count = 0
    while count < 1000:
        s = ImageGrab.grab(box) #grabs the image area
        h = s.getpixel(pixel) #gets pixel RGB value
        count += 1
    t2 = time.time()
    tf = t2-t1
    it_per_sec = int(count/tf)
    print (str(it_per_sec) + " iterations per second")

This gave 29 iterations per second. Let's use this as the baseline for our comparisons.

Here's the solution pointed out by BenjaminGolder, using ctypes:

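from ctypes import windll
import time
x, y = 60, 20  # target pixel in screen coordinates (same pixel as the other examples)
dc = windll.user32.GetDC(0)  # device context for the entire screen
count = 0
t1 = time.time()
while count < 1000:
    a = windll.gdi32.GetPixel(dc, x, y)
    count += 1
t2 = time.time()
tf = t2 - t1
windll.user32.ReleaseDC(0, dc)  # release the screen DC when done
print(int(count / tf))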

An average of 54 iterations per second. That's a fancy 86% improvement, but it is not the order-of-magnitude improvement I was looking for.

So, finally, here it comes:

import time
import win32ui
name = "Python 2.7.6 Shell"  # just an example of a window I had open at the time
w = win32ui.FindWindow(None, name)
t1 = time.time()
count = 0
while count < 1000:
    dc = w.GetWindowDC()  # get the window's device context
    dc.GetPixel(60, 20)
    dc.DeleteDC()  # release it again on every iteration
    count += 1
t2 = time.time()
tf = t2 - t1
it_per_sec = int(count / tf)
print(str(it_per_sec) + " iterations per second")

Roughly 16000 iterations a second from a pixel-thirsty script. Yes, 16000. That's at least 2 orders of magnitude faster than the previous solutions: roughly 296 times the throughput of the ctypes version, a whopping ~29500% improvement. It's so fast that the count += 1 increment slows it down. I also ran tests with 100k iterations because 1000 was too few for this piece of code; the average stays roughly the same, 14-16k iterations/second. Those runs finished in 7-8 seconds, whereas the previous ones were started when I began writing this and... well, they are still going.
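
The three benchmarks repeat the same stopwatch boilerplate, so it could be pulled into a small helper. This is only a sketch: `iterations_per_second` is a hypothetical name, and the workload timed here is a stand-in (appending to a list) so the snippet runs on any platform. On Windows you would pass a function that wraps the pixel read instead.

```python
from timeit import default_timer


def iterations_per_second(func, iterations=1000):
    """Call func() `iterations` times and return the measured calls per second."""
    start = default_timer()
    for _ in range(iterations):
        func()
    elapsed = default_timer() - start
    return iterations / elapsed if elapsed > 0 else float("inf")


# Stand-in workload (assumption): replace with the pixel-reading function to benchmark.
calls = []
rate = iterations_per_second(lambda: calls.append(None), iterations=1000)
print("%.0f iterations per second" % rate)
```

Using a `for` loop over `range` also avoids the manual `count += 1` bookkeeping inside the timed section.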

Alright, and that's it! Hope this helps anyone with a similar objective who has faced similar problems. And remember, Python finds a way.

Daniel Shaw answered Jan 17 '23 11:01
