Python Get Screen Pixel Value in OS X

Question

I'm building an automated game bot in Python on OS X 10.8.2, and while researching Python GUI automation I discovered autopy. The mouse manipulation API is great, but the screen capture methods seem to rely on deprecated OpenGL methods...

Are there any efficient ways of getting the color value of a pixel in OS X? The only way I can think of right now is os.system("screencapture foo.png"), but that seems to carry unneeded overhead, as I'll be polling very quickly.

n4p · Accepted Answer

This was all so helpful that I had to come back to comment, but I don't have the reputation. I do, however, have sample code that combines the other answers for a lightning-quick screen capture and save, thanks to @dbr and @qqg!

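import numpy as np
from imageio import imwrite  # scipy.misc.imsave was removed in SciPy 1.2
import Quartz.CoreGraphics as CG

image = CG.CGWindowListCreateImage(CG.CGRectInfinite, CG.kCGWindowListOptionOnScreenOnly, CG.kCGNullWindowID, CG.kCGWindowImageDefault)

prov = CG.CGImageGetDataProvider(image)
_data = CG.CGDataProviderCopyData(prov)

width = CG.CGImageGetWidth(image)
height = CG.CGImageGetHeight(image)
# Rows may be padded, so use the real row length rather than width * 4
bytes_per_row = CG.CGImageGetBytesPerRow(image)

# View the BGRA bytes as (height, pixels per padded row, 4)
imgdata = np.frombuffer(_data, dtype=np.uint8)[:height * bytes_per_row].reshape(height, bytes_per_row // 4, 4)
# Crop the padding, drop alpha, and reverse B, G, R into R, G, B for saving
numpy_img = imgdata[:, :width, 2::-1]
imwrite('test_fast.png', numpy_img)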

dbr · Answer

A small improvement: using the TIFF format option (-t tiff) for screencapture is a bit quicker than PNG:

$ time screencapture -t png /tmp/test.png
real        0m0.235s
user        0m0.191s
sys         0m0.016s
$ time screencapture -t tiff /tmp/test.tiff
real        0m0.079s
user        0m0.028s
sys         0m0.026s

This does have a lot of overhead, as you say (the subprocess creation, writing/reading from disc, compressing/decompressing).

Instead, you could use PyObjC to capture the screen using CGWindowListCreateImage. I found it took about 70ms (~14fps) to capture a 1680x1050 pixel screen and have the values accessible in memory.

A few random notes:

Importing the Quartz.CoreGraphics module is the slowest part, at about 1 second. The same is true for importing most of the PyObjC modules. This is unlikely to matter in this case, but for short-lived processes you might be better off writing the tool in ObjC.
Specifying a smaller region is a bit quicker, but not hugely (~40ms for a 100x100px block, ~70ms for 1680x1050). Most of the time seems to be spent in just the CGDataProviderCopyData call. I wonder if there's a way to access the data directly, since we don't need to modify it?
The ScreenPixel.pixel function is pretty quick, but accessing large numbers of pixels is still slow (0.01ms * 1680*1050 is about 17.6 seconds). If you need to access lots of pixels, it is probably quicker to struct.unpack_from them all in one go.
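The bulk-unpacking idea in the last note can be sketched on synthetic data. This is only an illustration, not part of the class below: `unpack_all` and the `fake` buffer are hypothetical names, the code is written for Python 3, and it assumes tightly packed rows (4 bytes per pixel, no padding).

```python
import struct

def unpack_all(data, width, height):
    """Unpack every BGRA byte with a single struct call.

    Assumes tightly packed rows (4 bytes per pixel, no padding).
    """
    count = width * height * 4
    flat = struct.unpack_from("%dB" % count, data, 0)
    # Regroup the flat tuple into (r, g, b, a) tuples, one per pixel
    return [
        (flat[i + 2], flat[i + 1], flat[i], flat[i + 3])
        for i in range(0, count, 4)
    ]

# Synthetic 2x2 "screenshot": each pixel is stored as (b, g, r, a)
width, height = 2, 2
fake = bytes([
    10, 20, 30, 255,   40, 50, 60, 255,
    70, 80, 90, 255,   100, 110, 120, 255,
])
pixels = unpack_all(fake, width, height)

# Pixels are stored row by row, so (x, y) lives at index y * width + x
print(pixels[1 * width + 0])
```

One struct call replaces width * height separate calls, which is where the per-pixel overhead goes.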

Here's the code:

import time
import struct

import Quartz.CoreGraphics as CG


class ScreenPixel(object):
    """Captures the screen using CoreGraphics, and provides access to
    the pixel values.
    """

    def capture(self, region = None):
        """region should be a CGRect, something like:

        >>> import Quartz.CoreGraphics as CG
        >>> region = CG.CGRectMake(0, 0, 100, 100)
        >>> sp = ScreenPixel()
        >>> sp.capture(region=region)

        The default region is CG.CGRectInfinite (captures the full screen)
        """

        if region is None:
            region = CG.CGRectInfinite
        else:
            # TODO: Odd widths cause the image to warp. This is likely
            # caused by the offset calculation in ScreenPixel.pixel, which
            # could be modified to allow odd widths
            if region.size.width % 2 > 0:
                emsg = "Capture region width should be even (was %s)" % (
                    region.size.width)
                raise ValueError(emsg)

        # Create screenshot as CGImage
        image = CG.CGWindowListCreateImage(
            region,
            CG.kCGWindowListOptionOnScreenOnly,
            CG.kCGNullWindowID,
            CG.kCGWindowImageDefault)

        # Intermediate step, get pixel data as CGDataProvider
        prov = CG.CGImageGetDataProvider(image)

        # Copy data out of CGDataProvider, becomes string of bytes
        self._data = CG.CGDataProviderCopyData(prov)

        # Get width/height of image
        self.width = CG.CGImageGetWidth(image)
        self.height = CG.CGImageGetHeight(image)

    def pixel(self, x, y):
        """Get pixel value at given (x,y) screen coordinates

        Must call capture first.
        """

        # Pixel data is unsigned char (8bit unsigned integer),
        # and there are four (blue,green,red,alpha)
        data_format = "BBBB"

        # Calculate offset, based on
        # http://www.markj.net/iphone-uiimage-pixel-color/
        offset = 4 * ((self.width*int(round(y))) + int(round(x)))

        # Unpack data from string into Python integers
        b, g, r, a = struct.unpack_from(data_format, self._data, offset=offset)

        # Return BGRA as RGBA
        return (r, g, b, a)


if __name__ == '__main__':
    # Timer helper-function
    import contextlib

    @contextlib.contextmanager
    def timer(msg):
        start = time.time()
        yield
        end = time.time()
        print("%s: %.02fms" % (msg, (end-start)*1000))


    # Example usage
    sp = ScreenPixel()

    with timer("Capture"):
        # Take screenshot (takes about 70ms for me)
        sp.capture()

    with timer("Query"):
        # Get pixel value (takes about 0.01ms)
        print("%d %d" % (sp.width, sp.height))
        print(sp.pixel(0, 0))


    # To verify screen-cap code is correct, save all pixels to PNG,
    # using http://the.taoofmac.com/space/projects/PNGCanvas

    from pngcanvas import PNGCanvas
    c = PNGCanvas(sp.width, sp.height)
    for x in range(sp.width):
        for y in range(sp.height):
            c.point(x, y, color = sp.pixel(x, y))

    with open("test.png", "wb") as f:
        f.write(c.dump())
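Regarding the TODO in capture(): one possible cause of the warping at odd widths is row padding, where each row of the captured image occupies more than width * 4 bytes. CoreGraphics reports the real row length through CGImageGetBytesPerRow. The sketch below shows the adjusted offset calculation on a synthetic buffer. `pixel_at` and its `bytes_per_row` parameter are illustrative names, not part of the class above, and the code is written for Python 3.

```python
import struct

def pixel_at(data, bytes_per_row, x, y):
    """Return (r, g, b, a) at (x, y) from BGRA data with padded rows."""
    # Step down y full rows (padding included), then across x pixels
    offset = y * bytes_per_row + 4 * x
    b, g, r, a = struct.unpack_from("BBBB", data, offset)
    return (r, g, b, a)

# Synthetic 3x2 image whose rows are padded from 12 to 16 bytes
row0 = bytes([1, 2, 3, 255] * 3) + bytes(4)
row1 = bytes([4, 5, 6, 255] * 3) + bytes(4)
fake = row0 + row1

# With the stride-based offset this reads the first pixel of row 1;
# the width-based offset 4 * (width * y + x) would land in row 0's padding.
print(pixel_at(fake, 16, 0, 1))
```

If this is indeed the cause, storing the CGImageGetBytesPerRow value in capture() and using it in pixel() would make the even-width check unnecessary.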

qqg · Answer

I came across this post while searching for a way to take screenshots in Mac OS X for real-time processing. I tried using ImageGrab from PIL, as suggested in some other posts, but couldn't get the data fast enough (only about 0.5 fps).

The answer https://stackoverflow.com/a/13024603/3322123 in this post to use PyObjC saved my day! Thanks @dbr!

However, my task requires all pixel values rather than just a single pixel. Following up on the third note by @dbr, I added a new method to this class that returns the full image, in case anyone else needs it.

The image data are returned as a numpy array with dimensions (height, width, 3), which can be used directly for post-processing in numpy, OpenCV, etc. Getting individual pixel values from it also becomes trivial with numpy indexing.

I tested the code with a 1600 x 1000 screenshot: getting the data with capture() took ~30 ms, and converting it to a numpy array with getimage() took only ~50 ms on my MacBook. So now I get >10 fps, and even more for smaller regions.

import numpy as np

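def getimage(self):
    # np.frombuffer replaces the deprecated np.fromstring; // keeps the shape an integer
    imgdata = np.frombuffer(self._data, dtype=np.uint8).reshape(len(self._data) // 4, 4)
    # Drop the alpha column, then reshape to (height, width, 3) in BGR order
    return imgdata[:self.width * self.height, :-1].reshape(self.height, self.width, 3)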

Note that I throw away the alpha channel of the 4-channel BGRA data, so the returned array is in BGR order.
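As a rough illustration of the numpy indexing mentioned above, here is the same reshaping applied to a synthetic buffer instead of a real capture. `bgra_to_bgr` and `fake` are hypothetical names used only for this sketch; it is written for Python 3 and, like getimage(), assumes rows without padding.

```python
import numpy as np

def bgra_to_bgr(data, width, height):
    """Same reshaping as getimage(), written as a free function."""
    imgdata = np.frombuffer(data, dtype=np.uint8).reshape(len(data) // 4, 4)
    return imgdata[:width * height, :-1].reshape(height, width, 3)

# Synthetic 2x2 BGRA buffer standing in for a real capture
fake = bytes([
    10, 20, 30, 255,   40, 50, 60, 255,
    70, 80, 90, 255,   100, 110, 120, 255,
])
img = bgra_to_bgr(fake, 2, 2)

b, g, r = img[1, 0]        # index by row (y) first, then column (x)
rgb = img[..., ::-1]       # reverse the channel axis: BGR -> RGB
top_row = img[0:1, :, :]   # crop a region with slices
```

Because the array is indexed as [y, x], screen coordinates need to be swapped when looking up a pixel. The channel flip is only needed for libraries that expect RGB; OpenCV works with BGR directly.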

Tags:

python

macos

automation

ui-automation

itsachen
