Question

I have created a simple Python script that gets activated whenever a specific program is running. That program sends information to the screen, which the script needs to grab and analyze.

Part of the script's logic can be expressed as follows:

while a certain condition is met:
    function to continuously check pixel information on a fixed area of the screen()
    if pixel data (e.g. RGB) changes:
        do something
    else:
        continues to check
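
For reference, the loop above can be sketched as runnable Python. This is only an illustration: watch_pixel, read_pixel, on_change and keep_running are made-up names, and the pixel reader is replaced by canned values so the sketch runs without touching the screen.

```python
def watch_pixel(read_pixel, on_change, keep_running):
    """Poll read_pixel() and call on_change(old, new) whenever the value changes."""
    previous = read_pixel()
    changes = 0
    while keep_running():              # "while a certain condition is met"
        current = read_pixel()         # check the pixel again
        if current != previous:        # pixel data changed: do something
            on_change(previous, current)
            previous = current
            changes += 1
        # otherwise just keep checking
    return changes

# Demo with canned values standing in for real screen reads (hypothetical data).
pending = [5, 5, 7, 7, 7, 9]
seen = []
n_changes = watch_pixel(
    read_pixel=lambda: pending.pop(0),
    on_change=lambda old, new: seen.append((old, new)),
    keep_running=lambda: bool(pending),
)
print(seen)
```

In the real script, read_pixel would be whatever screen-reading call turns out to be fastest.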

I have already found something that does exactly this, but not quite as fast as I'd like. Here is a solution using Python Imaging Library (PIL) with arbitrary values:

import ImageGrab

box = (0,0,100,100) # 100x100 screen area to capture (0x0 is top left corner)
pixel = (60,20) # target pixel coordinates (must be within the box's boundaries)
im = ImageGrab.grab(box) #grabs the image area (aka printscreen) -> source of bottleneck
hm = im.getpixel(pixel) # gets pixel information from the captured image in the form of an RGB value

I can then take that RGB value and compare it with the previous value obtained by the function. If it changed, then something happened on the screen, which means the program did something, and so the script can behave accordingly. However, the script needs to react fast, especially because this is just part of a larger function with its own intricacies and flaws, so I'm in the process of optimizing the code bit by bit, starting with this.

This solution limits the script to ~30 iterations per second on an i7 4770k CPU. That seems fast, but add other functions that parse pixel information at a similar rate and things start to add up. My goal is at least 150, ideally 200, iterations per second for a single function so that the end script can run at 5-10 iterations per second.

So, long story short: what other method is there to parse pixels from the screen more rapidly?


Solution

Alright peeps, after some digging it turns out it is indeed possible to do exactly what I wanted with Python and the simple pywin32 module (thanks based Mark Hammond). There's no need for a "beefier" language or to outsource the job to numpy and whatnot. Here it is, 5 lines of code (6 with the import):

import win32ui
window_name = "Target Window Name" # use win32gui.EnumWindows for a complete list
wd = win32ui.FindWindow(None, window_name)
dc = wd.GetWindowDC() # get the window's device context (DC)
j = dc.GetPixel(60,20)  # as practical and intuitive as using PIL!
print(j)
dc.DeleteDC() # release the DC; otherwise the code starts to slow down over many iterations

And that's it. On each iteration it returns a number (a COLORREF) for the selected pixel, which is just another way to represent color (like RGB or hex) and, most importantly, data I can parse! If you aren't convinced, here are some benchmarks on my desktop PC (standard CPython build and an i7 4770k):
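
If you need the familiar (R, G, B) tuple instead of the raw number, a COLORREF can be unpacked with a few bit operations. A minimal sketch, assuming the standard Win32 layout 0x00BBGGRR (red in the lowest byte); colorref_to_rgb is just a helper name picked for this example:

```python
def colorref_to_rgb(colorref):
    """Split a Win32 COLORREF (0x00BBGGRR) into an (R, G, B) tuple."""
    red = colorref & 0xFF            # lowest byte
    green = (colorref >> 8) & 0xFF   # middle byte
    blue = (colorref >> 16) & 0xFF   # highest used byte
    return (red, green, blue)

print(colorref_to_rgb(0x00332211))
```

For plain change detection the conversion isn't needed: comparing the raw numbers from two reads is enough.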

My previous solution, wrapped in a simple stopwatch (feel free to run these yourself and check):

    import ImageGrab, time
    box = (0,0,100,100) #100 x 100 square box to capture
    pixel = (60,20) #pixel coordinates (must be within the box's boundaries)
    t1 = time.time()
    count = 0
    while count < 1000:
        s = ImageGrab.grab(box) #grabs the image area
        h = s.getpixel(pixel) #gets pixel RGB value
        count += 1
    t2 = time.time()
    tf = t2-t1
    it_per_sec = int(count/tf)
    print (str(it_per_sec) + " iterations per second")

Obtained 29 iterations per second. Let's use this as the base speed to which we'll make our comparisons.

Here's the solution pointed out by BenjaminGolder, using ctypes:

from ctypes import windll
import time
x, y = 60, 20 # target pixel, in screen coordinates
dc = windll.user32.GetDC(0) # device context for the entire screen
count = 0
t1 = time.time()
while count < 1000:
    a = windll.gdi32.GetPixel(dc, x, y)
    count += 1
t2 = time.time()
tf = t2-t1
print(int(count/tf))

Average 54 iterations per second. That's a fancy 86% improvement but it is not the order of magnitude improvement I was looking for.

So, finally, here it comes:

import win32ui, time
name = "Python 2.7.6 Shell" # just an example of a window I had open at the time
w = win32ui.FindWindow(None, name)
t1 = time.time()
count = 0
while count < 1000:
    dc = w.GetWindowDC() # get the window's device context
    dc.GetPixel(60,20)
    dc.DeleteDC() # release it again
    count += 1
t2 = time.time()
tf = t2-t1
it_per_sec = int(count/tf)
print (str(it_per_sec) + " iterations per second")

Roughly 16000 iterations a second of a pixel-thirsty script. Yes, 16000. That's at least 2 orders of magnitude faster than the previous solutions and a whopping 29600% improvement (relative to the ctypes version). It's so fast that the count += 1 increment slows it down. I did some tests with 100k iterations because 1000 was too low for this piece of code, and the average stays roughly the same, 14-16k iterations/second. It also did the job in 7-8 seconds, whereas the previous ones were started when I started writing this and... well, they are still going.
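
The three benchmarks above all repeat the same stopwatch code, so it can be factored into a small helper. A rough sketch: iterations_per_second is a made-up name, and the lambda at the bottom is only a stand-in for the pixel-reading call you actually want to time.

```python
from timeit import default_timer

def iterations_per_second(func, iterations=1000):
    """Call func() `iterations` times and return the measured calls per second."""
    start = default_timer()
    for _ in range(iterations):   # no manual counter to increment
        func()
    elapsed = default_timer() - start
    return iterations / elapsed if elapsed > 0 else float("inf")

# Stand-in workload; swap in e.g. a function that grabs one pixel.
rate = iterations_per_second(lambda: sum(range(10)), iterations=10000)
print(str(int(rate)) + " iterations per second")
```

The numbers will vary between runs and machines, so treat them as ballpark figures.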

Alright, and that's it! Hope this helps anyone with a similar objective who is facing similar problems. And remember, Python finds a way.

OTHER TIPS

Actually, you should not try to check pixel by pixel in Python loops, as stated in the comments. You might try PyPy: with the proper data constructs and PyPy you can get a tenfold improvement using pure Python code and pixel-by-pixel data.

However, the usual practice is to have Python call a library in native code to do the pixel manipulation. PIL and Numpy are such libraries. Instead of checking the value of each pixel in Python, you could, for example, subtract one rectangular image area from another so that you get a matrix of the pixels that differ, and then use Numpy to treat these differences as you need. That would be fast, and you'd still be using Python for all the high-level things you need.
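
As a rough sketch of that idea, assuming each capture is available as a height x width x 3 array of RGB values (for a PIL image, np.asarray(image) should give that shape). changed_pixels is a name made up for this example, and the two frames are synthetic:

```python
import numpy as np

def changed_pixels(before, after):
    """Return (row, col) coordinates where two equally sized RGB frames differ."""
    # Cast to a signed type so the subtraction cannot wrap around at 0 or 255.
    before = np.asarray(before, dtype=np.int16)
    after = np.asarray(after, dtype=np.int16)
    diff = np.abs(after - before)   # per-channel absolute difference
    mask = diff.any(axis=-1)        # True where at least one channel changed
    return np.argwhere(mask)

# Two synthetic 4x4 frames that differ in exactly one pixel.
frame_a = np.zeros((4, 4, 3), dtype=np.uint8)
frame_b = frame_a.copy()
frame_b[1, 2] = (255, 0, 0)
coords = changed_pixels(frame_a, frame_b)
print(coords)
```

The whole comparison runs in native code, so the Python side only has to look at the (usually short) list of changed coordinates.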

Licensed under: CC-BY-SA with attribution