For those who find it too long, just read the bold lines.
My project, a gaze-estimation-based HCI that moves the screen cursor, now depends on one last thing: gaze estimation. For that I'm using the eye corners as a stable reference point, relative to which I will detect the movement of the pupil and calculate the gaze.
But I haven't been able to detect eye corners stably from a live webcam feed. I've been using the cv.CornerHarris() and GFTT (cv.GoodFeaturesToTrack()) functions for corner detection. I also tried the FAST demo (the executable from their website) directly on my eye images, but the results weren't good.
These are some of my corner detection results so far on still images.
Using GFTT:
Using Harris:
What happens in video:
The green circles are the detected eye corners; the smaller pink circles are the other corner candidates.
I used a heuristic: the eye corners will be at the left or right extremities horizontally, and around the middle vertically. I did that because, after taking many snapshots in many conditions, all but fewer than 5% of the images look like these, and for them the heuristic holds.
But these eye corner detections are for snapshots, not for the webcam feed.
When I use the same methods (Harris and GFTT) on the webcam feed, I just don't get the corners.
My code for eye corner detection using cv.CornerHarris
Eye corners using GFTT
The parameters I use in both methods obviously don't give results across different lighting conditions. But even in the same lighting conditions as the ones in which these snapshots were taken, I'm still not getting results for the frames I query from the webcam video.
These GFTT parameters work well for average lighting conditions:
cornerCount = 100
qualityLevel = 0.1
minDistance = 5
whereas these:
cornerCount = 500
qualityLevel = 0.005
minDistance = 30
worked well for the static image displayed above.
I used minDistance = 30 because the corners would obviously be at least that far apart; again, that is a trend I saw in my snapshots. But I lowered it for the webcam feed version of GFTT because otherwise I wasn't getting any corners at all.
Also, for the live feed version of GFTT, there's a small change I had to accommodate:
cv.CreateImage((colorImage.width, colorImage.height), 8,1)
whereas for the still image version (code on pastebin) i used:
cv.CreateImage(cv.GetSize(grayImage), cv.IPL_DEPTH_32F, 1)
Pay attention to the depths.
Would that change the quality of detection?
The eye image I was passing to the GFTT method didn't have a depth of 32F, so I had to change it, and change the rest of the temporary images (eignenimg, tempimg, etc.) accordingly.
Bottom line: I have to finish gaze estimation, but without stable eye corner detection I can't progress, and I still have to get on to blink detection and template-matching-based pupil tracking (or do you know something better?). Put simply, I want to know whether I'm making any rookie mistakes, or leaving something out, that stops me from getting the near-perfect eye corner detection in my webcam video stream that I got in the snapshots I posted here.
Anyway, thanks for taking a look. Any idea how I could perform eye corner detection under various lighting conditions would be very helpful.
Okay, in case it isn't clear what I'm doing in my code (how I'm getting the left and right corners), I'll explain:
max_dist = 0
maxL = 20
maxR = 0
lc = 0
rc = 0
maxLP = (0,0)
maxRP = (0,0)
for point in cornerMem:
    center = int(point[0]), int(point[1])
    x = point[0]
    y = point[1]
    if ( x<colorImage.width/5 or x>((colorImage.width/4)*3) ) and (y>40 and y<70):
        #cv.Circle(image,(x,y),2,cv.RGB(155, 0, 25))
        if maxL > x:
            maxL = x
            maxLP = center
        if maxR < x:
            maxR = x
            maxRP = center
        dist = maxR-maxL
        if max_dist<dist:
            max_dist = maxR-maxL
            lc = maxLP
            rc = maxRP
    cv.Circle(colorImage, (center), 1, (200,100,255)) #for every corner
cv.Circle(colorImage,maxLP,3,cv.RGB(0, 255, 0)) # for left eye corner
cv.Circle(colorImage,maxRP,3,cv.RGB(0,255,0)) # for right eye corner
maxLP and maxRP store the (x, y) of the left and right corners of the eye respectively. I keep one variable each for left and right corner detection, maxL and maxR, which are compared against the x-values of the detected corners. maxL has to start at something greater than 0; I assigned it 20, so that if a corner is at (x, y) with x < 20, maxL becomes x, and the leftmost corner's x-coordinate is found this way. The rightmost corner is found similarly.
I tried maxL = 50 too (although that would mean the left corner is almost in the middle of the eye region) to get more candidates for the webcam feed, in which I'm not getting any corners at all.
Also, max_dist stores the maximum distance between the x-coordinates seen so far, and so gives a measure of which pair of corners would be the left and right eye corners: the pair whose distance equals max_dist.
Also, I've seen from my snapshots that the eye corners' y-coordinates fall between 40 and 70, so I used that too to shrink the candidate pool.
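To make the selection logic above concrete without the OpenCV drawing calls, here is a minimal standalone sketch in plain Python. The function name pick_eye_corners and its parameters are made up for illustration. Note one deliberate difference, which is an assumption about the intent rather than what the code above does: it starts with no candidate at all instead of starting from maxL = 20, so the leftmost candidate is kept even when its x is 20 or more.

```python
def pick_eye_corners(corners, width, y_min=40, y_max=70):
    """Return (left, right) corner candidates as integer (x, y) tuples.

    corners: iterable of (x, y) points from any corner detector.
    width:   width of the eye image in pixels.
    Only points in the left fifth or right quarter of the image, and with
    y strictly between y_min and y_max, are considered (the heuristic
    described in the text). Returns None for a side with no candidate.
    """
    left = right = None
    for x, y in corners:
        in_side_band = x < width / 5.0 or x > width * 3 / 4.0
        if not (in_side_band and y_min < y < y_max):
            continue
        # Assumption: start from "no candidate" rather than a fixed x bound.
        if left is None or x < left[0]:
            left = (x, y)
        if right is None or x > right[0]:
            right = (x, y)

    def to_int(p):
        return None if p is None else (int(p[0]), int(p[1]))

    return to_int(left), to_int(right)
```

Called with the detector output and the eye image width, it returns None for both sides when no point passes the position filter, which makes the "no corners at all" case easy to tell apart from a wrong corner.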
I think there is an easy way to help!
It looks as though you are considering each eye in isolation. What I suggest is that you combine your data for both eyes, and also use facial geometry. I will illustrate my suggestions with a picture that some people may recognise (it is not really the best example, as it's a painting and her face is a bit off centre, but it is certainly the funniest).
It seems you have reliable estimates of the pupil position for both eyes. Provided the face is looking fairly straight at the camera (rotations of the face about the axis perpendicular to the screen are fine with this method), we know that the corners of the eyes (from now on just 'corners') will lie on, or near, the line that passes through the pupils of both eyes (red dotted line).
We know the distance between the pupils, a, and we know that the ratio between this distance and the distance across one eye (corner to corner), b, is fixed for an individual and will not change much across the adult population (it may differ between sexes).
i.e. a / b = constant.
Therefore we can deduce b, independent of the subject's distance from the camera, knowing only a.
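As a rough sketch of that deduction in plain Python: the ratio value below is a placeholder I made up for illustration, not a measured constant, and the function name is hypothetical. You would calibrate the ratio once per user from a frame where the corners are known.

```python
import math

# Assumed placeholder for the constant a / b (inter-pupil distance divided by
# corner-to-corner eye width). Calibrate this for the real user.
PUPIL_TO_EYE_WIDTH_RATIO = 2.0


def eye_width_from_pupils(left_pupil, right_pupil,
                          ratio=PUPIL_TO_EYE_WIDTH_RATIO):
    """Estimate the eye width b (pixels) from the two pupil centres."""
    a = math.hypot(right_pupil[0] - left_pupil[0],
                   right_pupil[1] - left_pupil[1])
    return a / ratio
```

Because both a and b scale together as the subject moves towards or away from the camera, b only has to be recomputed from the current a in each frame.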
Using this information we can construct threshold boxes for each eye corner (dotted boxes, in detail, labelled 1, 2, 3, 4). Each box is b by c (c is the eye height, again determinable through the same fixed-ratio principle) and lies parallel to the pupil axis. The centre edge of each box is pinned to the centre of the pupil, and moves with it. We know each corner will always be in its very own threshold box!
Now, of course, the trouble is that the pupils move about, and so do our threshold boxes... but we've massively narrowed down the field this way, because we can confidently discard ALL estimated corner positions (from Harris or GFTT or anything else) that fall outside these boxes (provided we are confident about our pupil detection).
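A minimal sketch of the box test, assuming points are (x, y) tuples in image coordinates and that b and c are already known. All function names here are hypothetical. Each candidate is expressed in a frame aligned with the pupil axis, so the test still works when the head is tilted in the image plane.

```python
import math


def pupil_axis(left_pupil, right_pupil):
    """Unit vector pointing from the left pupil to the right pupil."""
    dx = right_pupil[0] - left_pupil[0]
    dy = right_pupil[1] - left_pupil[1]
    a = math.hypot(dx, dy)
    return (dx / a, dy / a)


def in_threshold_box(point, pupil, axis, b, c, side):
    """True if point lies in the b-by-c box pinned to the pupil.

    side=+1: box extends from the pupil along the axis.
    side=-1: box extends from the pupil against the axis.
    """
    px, py = point[0] - pupil[0], point[1] - pupil[1]
    along = px * axis[0] + py * axis[1]     # offset along the pupil axis
    across = -px * axis[1] + py * axis[0]   # offset perpendicular to it
    return 0 <= side * along <= b and abs(across) <= c / 2.0


def filter_candidates(candidates, pupil, axis, b, c):
    """Split candidates into the two boxes of one eye; drop everything else."""
    return {side: [p for p in candidates
                   if in_threshold_box(p, pupil, axis, b, c, side)]
            for side in (-1, +1)}
```

The result maps -1 and +1 to the candidates that survive in each of the eye's two boxes; anything Harris or GFTT reports elsewhere is simply discarded.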
If we have high confidence in just one corner position, we can extrapolate and deduce all the other corner positions just from geometry (for both eyes!).
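One possible way to do that extrapolation, sketched in plain Python. It rests on two assumptions of mine: the trusted corner is the one furthest against the pupil axis (the image-left corner of the image-left eye), and corresponding corners of the two eyes are about a apart along the axis, just like the pupils. The function name is hypothetical.

```python
def extrapolate_corners(known_corner, axis, a, b):
    """Predict all four eye corners from one trusted corner.

    known_corner: (x, y) of the image-left corner of the image-left eye.
    axis:         unit vector along the pupil axis (left pupil -> right pupil).
    a:            inter-pupil distance, b: corner-to-corner eye width.
    Returns the corners ordered from image-left to image-right.
    """
    x, y = known_corner

    def shifted(t):
        # Move t pixels along the pupil axis from the known corner.
        return (x + t * axis[0], y + t * axis[1])

    return [shifted(0.0), shifted(b), shifted(a), shifted(a + b)]
```

The predicted positions are only starting points; each one can then be snapped to the nearest detected candidate inside its own threshold box.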
If there is doubt between multiple corner positions, we can use knowledge of the other corners (from either eye) to resolve it by probabilistically linking their positions and making a best guess, i.e. does any pair of estimates (within their boxes, of course) lie b apart and parallel to the pupil axis?
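A simple, non-probabilistic version of that pair check could look like the sketch below. The tolerance tol (in pixels) and the function name are assumptions for illustration; a fuller solution would score the pairs rather than accept or reject them.

```python
from itertools import product


def consistent_pairs(first_candidates, second_candidates, axis, b, tol=3.0):
    """Return (p, q) pairs that lie about b apart and parallel to the axis.

    first_candidates:  candidates for the corner lying against the axis.
    second_candidates: candidates for the corner lying along the axis.
    axis:              unit vector along the pupil axis.
    """
    pairs = []
    for p, q in product(first_candidates, second_candidates):
        dx, dy = q[0] - p[0], q[1] - p[1]
        along = dx * axis[0] + dy * axis[1]     # separation along the axis
        across = -dx * axis[1] + dy * axis[0]   # deviation from parallel
        if abs(along - b) <= tol and abs(across) <= tol:
            pairs.append((p, q))
    return pairs
```

If exactly one pair survives, the ambiguity is resolved; if several survive, the corners of the other eye can be used in the same way to choose between them.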
I hope this might help you find the elusive d (pupil displacement from the centre of the eye).
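Once both corners of an eye are known, d itself is a short calculation. In this sketch I take the centre of the eye to be the midpoint between its two corners, which is my assumption, and measure d along the pupil axis; the function name is hypothetical.

```python
def pupil_displacement(pupil, corner_1, corner_2, axis):
    """Signed displacement d of the pupil from the eye centre, in pixels.

    The eye centre is assumed to be the midpoint of the two corners.
    d is positive when the pupil is shifted along the axis direction.
    """
    cx = (corner_1[0] + corner_2[0]) / 2.0
    cy = (corner_1[1] + corner_2[1]) / 2.0
    return (pupil[0] - cx) * axis[0] + (pupil[1] - cy) * axis[1]
```

Dividing d by b would give a value that does not depend on the subject's distance from the camera.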