How to perform stable eye corner detection?

https://stackoverflow.com//questions/9645871

10-12-2019

Question

For those who find it too long, just read the bold lines.

My gaze-estimation-based HCI project for moving the screen cursor now depends on one last thing: gaze estimation, for which I'm using the eye corners as a stable reference point, relative to which I will detect the movement of the pupil and calculate the gaze.

But I haven't been able to stably detect eye corners from a live webcam feed. I've been using the cv.CornerHarris() and GFTT (cv.GoodFeaturesToTrack()) functions for corner detection. I also tried the FAST demo (the executable from their website) directly on my eye images, but the results weren't good.

These are some of my corner detection results so far on still images.

Using GFTT:

[image: good lighting, using GFTT]

Using Harris:

[image: using cv.CornerHarris]

What happens in video:

[image: corners in video using GFTT]

The green circles are the detected eye corners; the smaller pink circles are the other corners.

I used a certain heuristic: the corners will be at the left or right extremities and around the middle vertically. I did that because, after taking many snapshots in many conditions, all but fewer than 5% of the images look like these, and for them the above heuristic holds.

But these eye corner detections are for snapshots, not for the webcam feed.

When I use the same methods (Harris and GFTT) on the webcam feed, I just don't get the corners.

My code for eye corner detection using cv.CornerHarris

Eye corners using GFTT

The parameters I use in both methods obviously don't give results across different lighting conditions. But even in the same lighting conditions in which these snapshots were taken, I still don't get results for the frames I query from the webcam video.

These GFTT parameters work well for average lighting conditions:

cornerCount = 100
qualityLevel = 0.1
minDistance = 5

whereas these:

cornerCount = 500
qualityLevel = 0.005
minDistance = 30

worked well for the static image displayed above.

minDistance = 30 because the corners would obviously be at least that far apart; again, a trend I saw in my snapshots. But I lowered it for the webcam feed version of GFTT because otherwise I wasn't getting any corners at all.

Also, for the live feed version of GFTT, there's a small change I had to accommodate:

cv.CreateImage((colorImage.width, colorImage.height), 8,1)

whereas for the still image version (code on pastebin) I used:

cv.CreateImage(cv.GetSize(grayImage), cv.IPL_DEPTH_32F, 1)

Pay attention to the depths.

Would that change the quality of detection?

The eye image I was passing to the GFTT method didn't have a depth of 32F, so I had to change it, and the rest of the temporary images (eigenimg, tempimg, etc.) accordingly.

Bottom line: I have to finish gaze estimation, but without stable eye corner detection I can't progress, and I still have to get on to blink detection and template-matching-based pupil tracking (or do you know of something better?). Put simply, I want to know whether I'm making any rookie mistakes, or failing to do something, that stops me from getting in my webcam video stream the near-perfect eye corner detection I got in the snapshots I posted here.

Anyway, thanks for giving this a look. Any ideas on how I could perform eye corner detection under various lighting conditions would be very helpful.

Okay, if you didn't get what I'm doing in my code (how I'm getting the left and right corners), I'll explain:

max_dist = 0
maxL = 20
maxR = 0

lc = 0
rc = 0

maxLP = (0, 0)
maxRP = (0, 0)

for point in cornerMem:
    center = int(point[0]), int(point[1])

    x = point[0]
    y = point[1]

    # keep only candidates in the left or right band, within the vertical window
    if (x < colorImage.width/5 or x > ((colorImage.width/4)*3)) and (y > 40 and y < 70):
        #cv.Circle(image,(x,y),2,cv.RGB(155, 0, 25))

        if maxL > x:
            maxL = x
            maxLP = center

        if maxR < x:
            maxR = x
            maxRP = center

        dist = maxR - maxL

        if max_dist < dist:
            max_dist = dist
            lc = maxLP
            rc = maxRP

    cv.Circle(colorImage, center, 1, (200, 100, 255))  # for every corner

cv.Circle(colorImage, maxLP, 3, cv.RGB(0, 255, 0))  # for left eye corner
cv.Circle(colorImage, maxRP, 3, cv.RGB(0, 255, 0))  # for right eye corner

maxLP and maxRP store the (x, y) of the left and right corners of the eye respectively. I keep one variable per side, maxL and maxR, and compare them against the x-values of the detected corners. maxL has to start at something greater than 0; I set it to 20 because if the left corner is at (x, y) with x < 20, then maxL becomes x. In other words, the leftmost corner's x-coordinate is found this way, and similarly for the rightmost corner.

I also tried maxL = 50 (although that would mean the left corner is almost in the middle of the eye region) to get more candidates for the webcam feed, in which I'm not getting any corners at all.

Also, max_dist stores the maximum distance between the x-coordinates seen so far, and thus indicates which pair of corners are the left and right eye corners: the pair with the maximum distance, max_dist.

Also, I've seen from my snapshots that the eye corners' y-coordinates fall between 40 and 70, so I used that too to shrink the candidate pool.

Solution 2

I changed this

if ( x<colorImage.width/5 or x>((colorImage.width/4)*3) ) and (y>40 and y<70):

to this:

if ( x<(w/5) or x>((w/4)*3) ) and (y>int(h*0.45) and y<int(h*0.65)):

because earlier I was just manually picking pixel values for the windows in which corners could be found with the highest probability. Afterwards I realised I should make it general, so I made a horizontal window from 45% to 65% of the Y range, and kept only the left fifth and the right quarter of the X range, because that's the usual area within which the corners are.
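For anyone wanting to try the relative-window idea outside the old cv API, here is a minimal pure-Python sketch. The function name pick_eye_corners and the use of None for "no corner found" (instead of the seed values 20 and 0 used in the question) are assumptions made for illustration, not the original code.

```python
def pick_eye_corners(points, width, height):
    """Pick the leftmost and rightmost candidates inside the relative windows.

    points: iterable of (x, y) corner candidates (e.g. from GFTT or Harris).
    Returns (left_corner, right_corner); a side with no candidate is None.
    """
    # Vertical window: 45% to 65% of the eye image height.
    y_min, y_max = int(height * 0.45), int(height * 0.65)
    # Horizontal windows: left fifth and right quarter of the width.
    left_limit = width / 5.0
    right_limit = (width / 4.0) * 3

    left = right = None
    for x, y in points:
        if not (y_min < y < y_max):
            continue  # outside the vertical window
        if x < left_limit and (left is None or x < left[0]):
            left = (int(x), int(y))
        elif x > right_limit and (right is None or x > right[0]):
            right = (int(x), int(y))
    return left, right
```

Returning None for a missing side makes it easy to tell "no candidate in this frame" apart from a real corner at (0, 0), which matters on a video feed where some frames produce no usable corners.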

I'm sorry for replying late; I was busy with the later part of the project, gaze estimation. I'm going to post a question about it, as I'm stuck on it.

By the way, here are a few pictures of the eye corners and pupil detected in my eye (enlarged to 100x100):

[image: detected eye corners and pupil]

[image: detected eye corners and pupil]

Hope this will be useful for others beginning in this area.

OTHER TIPS

I think there is an easy way to help!

It looks as though you are considering each eye in isolation. What I suggest you do is combine your data for both eyes, and also use facial geometry. I will illustrate my suggestions with a picture that some people may recognise (it is not really the best example, as it's a painting and her face is a bit off centre, but it is certainly the funniest).

[image: painting annotated with the pupil axis (red dotted line) and threshold boxes 1 to 4]

It seems you have reliable estimates of the pupil position for both eyes. Provided the face is looking fairly straight on at the camera (facial rotations perpendicular to the screen will be OK using this method), we know that the corners of the eyes (from now on just 'corners') will lie on, or near, the line that passes through the pupils of both eyes (red dotted line).

We know the distance between the pupils, a, and we know that the ratio between this distance, and the distance across one eye (corner to corner), b, is fixed for an individual, and will not change much across the adult population (may differ between sexes).

i.e. a / b = constant.

Therefore we can deduce b, independent of the subject's distance from the camera, knowing only a.
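As a rough sketch of that step in Python: measure a from the two pupil centres and divide by the ratio. The function name and the ratio value used in the usage lines are hypothetical; the ratio would have to be measured for the subject, since no value is given here.

```python
import math


def eye_width_from_pupils(left_pupil, right_pupil, ratio_a_over_b):
    """Estimate the corner-to-corner eye width b from the pupil distance a.

    left_pupil, right_pupil: (x, y) pupil centres in pixels.
    ratio_a_over_b: the constant a / b, assumed to be calibrated beforehand.
    """
    a = math.hypot(right_pupil[0] - left_pupil[0],
                   right_pupil[1] - left_pupil[1])
    return a / ratio_a_over_b


# Usage with a made-up ratio of 2.0 (placeholder, not a measured value):
b = eye_width_from_pupils((100, 120), (160, 120), 2.0)
```

Because both a and b scale with the subject's distance from the camera, the same ratio keeps working as the subject moves closer or further away.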

Using this information we can construct threshold boxes for each eye corner (dotted boxes, shown in detail, labelled 1, 2, 3, 4). Each box is b by c (eye height, again determinable through the same fixed ratio principle) and lies parallel to the pupil axis. The centre edge of each box is pinned to the centre of the pupil, and moves with it. We know each corner will always be in its very own threshold box!

Now, of course the trouble is the pupils move about, and so do our threshold boxes... but we've massively narrowed down the field this way, because we can confidently discard ALL estimated eye corner positions (from Harris or GFTT or anything) falling outside of these boxes (provided we are confident about our pupil detection).
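One way this filtering could be sketched in Python: express each candidate in coordinates along and across the pupil axis, and keep it only if it is within b of the pupil along the axis and within c / 2 across it. That region covers both boxes attached to one pupil. The function names are made up for illustration, and b and c are assumed to have been derived already.

```python
import math


def in_threshold_box(candidate, pupil, other_pupil, b, c):
    """True if candidate falls inside one of the two b-by-c boxes of `pupil`.

    The boxes extend b to either side of the pupil along the pupil axis and
    c / 2 above and below it. Assumes the two pupils are distinct points.
    """
    ax, ay = other_pupil[0] - pupil[0], other_pupil[1] - pupil[1]
    norm = math.hypot(ax, ay)
    ux, uy = ax / norm, ay / norm          # unit vector along the pupil axis
    dx, dy = candidate[0] - pupil[0], candidate[1] - pupil[1]
    along = dx * ux + dy * uy              # signed distance along the axis
    across = -dx * uy + dy * ux            # signed distance across the axis
    return abs(along) <= b and abs(across) <= c / 2.0


def filter_candidates(candidates, pupil, other_pupil, b, c):
    """Discard corner candidates that fall outside the boxes of `pupil`."""
    return [p for p in candidates
            if in_threshold_box(p, pupil, other_pupil, b, c)]
```

The sign of `along` tells you which of the two boxes a surviving candidate is in (towards or away from the other eye), if you want to treat inner and outer corners separately.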

If we have high confidence in just one corner position, we can extrapolate and deduce all the other corner positions just from geometry (for both eyes!).
If there is doubt between multiple corner positions, we can use knowledge of the other corners (from either eye) to resolve it by probabilistically linking their positions and making a best guess, i.e. does any pair of estimates (within their boxes, of course) lie b apart and parallel to the pupil axis?
If you can get general 'eye' positions that do not move when the pupil moves around (or in fact any facial feature on the same plane), this is massively useful and allows you to determine the corner positions geometrically.
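The pair check from the second point could look something like the sketch below. It is a hard-threshold version rather than a truly probabilistic one, and the function name and the tolerance tol (a fraction of b) are assumptions chosen for illustration.

```python
import math
from itertools import combinations


def consistent_corner_pairs(candidates, pupil_a, pupil_b, b, tol=0.15):
    """Return candidate pairs lying roughly b apart, parallel to the pupil axis.

    candidates: (x, y) corner estimates that already passed the box test.
    tol: allowed deviation as a fraction of b (hypothetical default).
    """
    ax, ay = pupil_b[0] - pupil_a[0], pupil_b[1] - pupil_a[1]
    norm = math.hypot(ax, ay)
    ux, uy = ax / norm, ay / norm          # unit vector along the pupil axis
    pairs = []
    for p, q in combinations(candidates, 2):
        dx, dy = q[0] - p[0], q[1] - p[1]
        along = abs(dx * ux + dy * uy)     # separation along the axis
        across = abs(-dx * uy + dy * ux)   # separation across the axis
        if abs(along - b) <= tol * b and across <= tol * b:
            pairs.append((p, q))
    return pairs
```

If more than one pair survives, a score such as the deviation from b could be used to rank them instead of accepting the first match.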

I hope this might help you find the elusive d (pupil displacement from center of eye).

Have you tried sclera segmentation?

You might be able to make do with the 2 corners of the sclera as well, and this might be easier because you already have decent pupil detection working: the sclera is the brighter region surrounding the pupil.
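A very rough sketch of that idea in plain Python, assuming the eye region is available as rows of grayscale values (0 to 255): threshold for bright pixels and take the leftmost and rightmost ones as the sclera extremes. The function name and the threshold of 180 are assumptions; a real image would need noise handling and a threshold tuned to the lighting, since bright skin or reflections would also pass this test.

```python
def sclera_extremes(gray_rows, threshold=180):
    """Return ((x, y) leftmost, (x, y) rightmost) bright pixels, or None.

    gray_rows: list of rows, each a list of grayscale values (0..255).
    threshold: minimum brightness treated as sclera (hypothetical value).
    """
    left = right = None
    for y, row in enumerate(gray_rows):
        for x, value in enumerate(row):
            if value >= threshold:
                if left is None or x < left[0]:
                    left = (x, y)
                if right is None or x > right[0]:
                    right = (x, y)
    if left is None:
        return None  # no pixel was bright enough
    return left, right
```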

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow