A Method for Assessing the Pupil Center Coordinates in Eyetracking with a Free Head Position

IR-illuminated eye-tracking systems detect corneal reflections and the pupil center coordinates to calculate the operator's gaze fixation point. When the gaze shifts through a large angle, some frames are blurred and the resulting coordinates are unreliable. This article describes a method for determining the pupil center in a gaze fixation system so that it can operate at an increased camera frame rate. A comparison with known algorithms is given. The average execution time of the algorithm is about 1.2 ms on a typical office computer when processing image fragments of about 340x240 pixels. © 2021 Journal of Biomedical Photonics & Engineering.


Introduction
Gaze determination systems are beginning to be applied in various areas of activity. In technology, they are used for device control, primarily the remote control of unmanned devices [1,2]. In medicine, applications include assessing a person's condition in psychology [3], studying the central nervous system [4], and assessing visual acuity in ophthalmology [5]. In marketing, eye tracking is used to study how people inspect a scene when products are promoted [6]. The system considered here is designed for device control.
The most widespread approach is IR illumination of the eyes, which provides the greatest contrast between the pupil and the iris and the least influence of extraneous illumination. For each frame of the video stream, such systems run three main algorithms: finding the corneal reflections of the illuminators and determining their coordinates; finding the pupil and determining its center coordinates; and calculating the gaze point from the corneal reflection and pupil center coordinates.
With a free head position, the eye area is significantly smaller than the frame. To reduce the processing time, the eye area is localized first. Region of interest (ROI) selection is used when processing various types of images, for example, medical ones [7,8]. In this paper, the task is simplified by the presence of corneal reflections from the IR illuminators, which are brighter than the rest of the image. Corneal reflections are easy to detect, but they must be sorted into true and parasitic ones.
During saccades, the eye movement speed reaches 450-700 deg/s. In an eye-tracking system with a frame rate of 30-50 Hz, the eye image is blurred, which prevents the gaze direction from being registered with the necessary accuracy.
The proposed algorithm selects the pupil boundary by thresholding and estimates the threshold adaptively, which requires significantly fewer computational resources.
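The scale of the problem can be illustrated with a short calculation. The sketch below is not part of the described system; it assumes, as an upper bound on the blur, that the exposure lasts the whole frame period, and the function name is hypothetical.

```cpp
#include <cassert>

// Angle (in degrees) through which the eye rotates during one frame period.
// Assumption: the exposure spans the whole frame period, so this is an
// upper bound on the rotation that contributes to motion blur.
double degreesPerFrame(double eyeSpeedDegPerSec, double frameRateHz) {
    assert(frameRateHz > 0.0);
    return eyeSpeedDegPerSec / frameRateHz;
}
```

For the speeds quoted above, degreesPerFrame(450.0, 50.0) gives 9 degrees and degreesPerFrame(700.0, 30.0) gives about 23 degrees per frame, whereas at 100 Hz the range narrows to 4.5-7 degrees.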

Materials and Methods
When the camera is not mounted on the person's head, the camera image must capture an area spanning approximately twice the distance between the eyes in order to follow the movement of the operator's head. In this case, the position of the eyes in the image must be determined before searching for the pupil.
The system uses IR LED illumination, which increases the contrast between the pupil and the iris. In addition, small bright reflections are formed on the cornea of the eye. They make it possible to select fragments that contain the eye image, and image processing is carried out within these fragments. To reduce the computational load, the pupil boundary is selected with a cutoff threshold. Because the difference in brightness between the pupil and the iris is small and the lighting conditions may change, an adaptive threshold is applied, which is adjusted on each frame in which the pupil is detected. Fig. 1a shows the stages of eye image processing.
The starting point for the pupil search is marked with a blue cross. The threshold level used to obtain the red outlines is estimated inside the blue rectangle. The RANSAC (Random Sample Consensus) algorithm [9] retains the green contour, from which the parameters of the white ellipse are estimated.
After the corneal reflections have been detected and selected (Fig. 2), the fragment containing the eye image is extracted, and the pupil boundary and the ellipse parameters are determined (Fig. 1a).
To obtain the coordinates of the pupil center, the following steps are performed:
• thresholding the image;
• finding the region boundaries and selecting a contour whose length lies within a specified range (Fig. 1b);
• selecting the inlier boundary points with RANSAC [9];
• refining the ellipse parameters by the least squares method [10];
• estimating the threshold value for the current frame and filtering it recursively over time (frame number).
At the first pupil search, the threshold is estimated before the thresholding procedure. The estimate is made over the area outlined by the blue rectangle. This area is placed above the corneal reflections because the IR LEDs are located below the monitor. After filtering to reduce noise, the threshold level is set midway between the pupil brightness and the iris brightness. This approach works for both dark and light pupils (Fig. 2). The binary image is obtained by applying the threshold. The boundary contours of the regions are found with the OpenCV findContours() function, described in Ref. [11], and contours whose boundary length is outside the specified range are discarded.
The boundary points of the chosen contour are checked with RANSAC for conformity to the ellipse model. The ellipse parameters are then refined by the least squares method using the OpenCV fitEllipseDirect() function.
The change in pupil area is also checked, and if it exceeds a specified limit, the frame is discarded. Fig. 1c shows the change in pupil size when winking.
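Selecting the fragment around a detected corneal reflection can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the fragment is simply centered on the reflection and shifted where necessary to stay inside the frame; the Rect type and the function name are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

struct Rect { int x, y, width, height; };

// Fragment of size fragW x fragH centered on the corneal reflection (glint)
// and shifted, if needed, so that it lies entirely inside the frame.
Rect eyeFragment(int glintX, int glintY, int fragW, int fragH,
                 int imgW, int imgH) {
    assert(fragW > 0 && fragH > 0 && fragW <= imgW && fragH <= imgH);
    int x = std::max(0, std::min(glintX - fragW / 2, imgW - fragW));
    int y = std::max(0, std::min(glintY - fragH / 2, imgH - fragH));
    return Rect{x, y, fragW, fragH};
}
```

For example, with an assumed 1280x720 frame and a 340x240 fragment, a reflection at (640, 360) yields a fragment whose top-left corner is (470, 240), while a reflection near the frame corner at (50, 30) yields a fragment starting at (0, 0).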
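The last step, recursive filtering of the threshold over frames, can be illustrated with a first-order recursive (exponential) filter. The text does not specify the filter type or its coefficient, so both the filter form and the parameter alpha are assumptions of this sketch.

```cpp
#include <cassert>

// First-order recursive filter: moves the running threshold toward the
// estimate from the current frame. alpha in (0, 1] is a hypothetical
// smoothing coefficient; alpha = 1 disables smoothing.
double updateThreshold(double previous, double current, double alpha) {
    assert(alpha > 0.0 && alpha <= 1.0);
    return previous + alpha * (current - previous);
}
```

With a running threshold of 40, a current-frame estimate of 50, and alpha = 0.25, the filtered threshold becomes 42.5. A smaller alpha makes the threshold less sensitive to a single badly estimated frame at the cost of slower adaptation to lighting changes.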
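One possible way to place the threshold midway between the two brightness levels is sketched below. The text does not say how the pupil and iris brightness are estimated inside the rectangle; this sketch substitutes an iterative two-class split (isodata-style) and works on a plain array of gray levels rather than on an OpenCV image. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Threshold halfway between the mean of the darker class and the mean of
// the brighter class, found by iterating a two-class split of the pixels.
double midpointThreshold(const std::vector<std::uint8_t>& pixels) {
    assert(!pixels.empty());
    double sum = 0.0;
    for (std::uint8_t p : pixels) sum += p;
    double t = sum / pixels.size();  // initial guess: global mean
    for (int iter = 0; iter < 100; ++iter) {
        double lowSum = 0.0, highSum = 0.0;
        std::size_t lowN = 0, highN = 0;
        for (std::uint8_t p : pixels) {
            if (p <= t) { lowSum += p; ++lowN; }
            else        { highSum += p; ++highN; }
        }
        if (lowN == 0 || highN == 0) break;  // only one brightness class
        double next = 0.5 * (lowSum / lowN + highSum / highN);
        bool converged = std::fabs(next - t) < 0.5;
        t = next;
        if (converged) break;
    }
    return t;
}

// Number of pixels darker than the threshold (the dark-pupil candidate area).
std::size_t countBelow(const std::vector<std::uint8_t>& pixels, double t) {
    std::size_t n = 0;
    for (std::uint8_t p : pixels) if (p < t) ++n;
    return n;
}
```

For a toy area with four pixels near level 20 and four near level 60, the function returns a threshold of 40, and four pixels fall below it. In a light-pupil image the same threshold applies with the comparison reversed.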
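The inlier selection idea can be illustrated with a compact RANSAC sketch. To keep it short and free of external libraries, the sketch fits a circle through three sampled points instead of an ellipse, so it is a simplified stand-in for the ellipse model used in the paper; the iteration count, tolerance, and all names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point { double x, y; };
struct Circle { double cx, cy, r; };

// Circle through three points; returns false if they are (nearly) collinear.
bool circleFrom3(const Point& a, const Point& b, const Point& c, Circle& out) {
    double d = 2.0 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
    if (std::fabs(d) < 1e-9) return false;
    double a2 = a.x * a.x + a.y * a.y;
    double b2 = b.x * b.x + b.y * b.y;
    double c2 = c.x * c.x + c.y * c.y;
    out.cx = (a2 * (b.y - c.y) + b2 * (c.y - a.y) + c2 * (a.y - b.y)) / d;
    out.cy = (a2 * (c.x - b.x) + b2 * (a.x - c.x) + c2 * (b.x - a.x)) / d;
    out.r = std::hypot(a.x - out.cx, a.y - out.cy);
    return true;
}

// Returns the indices of the contour points that agree with the best circle
// found; the circle itself is returned through `best`.
std::vector<std::size_t> ransacCircleInliers(const std::vector<Point>& pts,
                                             int iterations, double tolerance,
                                             unsigned seed, Circle& best) {
    assert(pts.size() >= 3);
    std::mt19937 rng(seed);
    std::vector<std::size_t> bestInliers;
    for (int it = 0; it < iterations; ++it) {
        const Point& p1 = pts[rng() % pts.size()];
        const Point& p2 = pts[rng() % pts.size()];
        const Point& p3 = pts[rng() % pts.size()];
        Circle c;
        // Repeated or collinear samples are degenerate and are skipped.
        if (!circleFrom3(p1, p2, p3, c)) continue;
        std::vector<std::size_t> inliers;
        for (std::size_t i = 0; i < pts.size(); ++i) {
            double dist = std::fabs(std::hypot(pts[i].x - c.cx, pts[i].y - c.cy) - c.r);
            if (dist <= tolerance) inliers.push_back(i);
        }
        if (inliers.size() > bestInliers.size()) {
            bestInliers = inliers;
            best = c;
        }
    }
    return bestInliers;
}
```

The surviving inliers would then be passed to a least squares fit. In this sketch, points that belong to an eyelid or a corneal reflection rather than to the pupil edge fall outside the tolerance band and are excluded before that fit.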
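The area check can be sketched as a relative-change test on the fitted ellipse area. The text does not give the limit or state whether the change is measured in relative or absolute terms; the relative form, the parameter maxRelChange, and the function names are assumptions of this sketch.

```cpp
#include <cassert>
#include <cmath>

// Area of an ellipse from its semi-axes.
double ellipseArea(double semiMajor, double semiMinor) {
    return std::acos(-1.0) * semiMajor * semiMinor;
}

// True if the pupil area changed by no more than maxRelChange (a fraction
// of the previous area); otherwise the frame should be discarded.
bool areaChangeAcceptable(double prevArea, double curArea, double maxRelChange) {
    assert(prevArea > 0.0 && maxRelChange >= 0.0);
    return std::fabs(curArea - prevArea) / prevArea <= maxRelChange;
}
```

With a hypothetical limit of 0.3, a change from 1000 to 1100 pixels is accepted, while a drop from 1000 to 400 pixels, as might occur when the eyelid partly covers the pupil, causes the frame to be rejected.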
After obtaining the corneal reflection and the pupil center coordinates, the three-dimensional cornea center coordinates and the gaze direction are determined. The intersection of the gaze vector with the monitor plane defines the gaze fixation point. A computer is used to take the image from the camera. No multithreading is used.
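The intersection of the gaze vector with the monitor plane is a standard ray-plane intersection and can be sketched as follows. The coordinate frame, the representation of the monitor by a point and a normal, and all names are assumptions made for illustration; the paper does not describe its internal geometry model.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Intersects the gaze ray (origin at the cornea center, direction gazeDir)
// with the monitor plane given by a point on it and its normal.
// Returns false if the gaze is parallel to the plane or points away from it.
bool gazePointOnPlane(const Vec3& corneaCenter, const Vec3& gazeDir,
                      const Vec3& planePoint, const Vec3& planeNormal,
                      Vec3& hit) {
    double denom = dot(planeNormal, gazeDir);
    if (std::fabs(denom) < 1e-12) return false;  // parallel to the monitor
    Vec3 diff{planePoint.x - corneaCenter.x,
              planePoint.y - corneaCenter.y,
              planePoint.z - corneaCenter.z};
    double t = dot(planeNormal, diff) / denom;
    if (t < 0.0) return false;                   // monitor is behind the eye
    hit = Vec3{corneaCenter.x + t * gazeDir.x,
               corneaCenter.y + t * gazeDir.y,
               corneaCenter.z + t * gazeDir.z};
    return true;
}
```

For an eye at (0, 0, 600) looking along (0.1, 0, -1) toward a monitor in the plane z = 0, the fixation point is (60, 0, 0) in the same units.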

Results
The software modules are written in C++; image processing is performed using the OpenCV library [12].
The pupil center was determined in fragments of about 350×240 pixels containing the eye image (right or left, by choice).
Since the described frame processing algorithms contain iterative procedures, the processing time depends on the scene in the frame. Each experiment consisted of calibration, a calibration test, and holding fixation on a point for two seconds, so the recorded sequence contains a sufficient variety of scenes. Frames are processed in real time to obtain the coordinates of the gaze fixation point and are recorded at the camera frame rate. The sequence recorded to the hard disk can then be processed from the disk at the maximum speed the computer allows. By disabling a given algorithm, its average running time can be estimated from the difference in processing time.
Reading a frame file and processing the entire image take about 8 ms on a 3.6 GHz processor, and the pupil center coordinates are obtained in 1.2 ms on average. Depending on the image quality, the time RANSAC needs to reject outlier edge points may increase significantly in some frames, but buffering of the image arrays evens out such deviations. Frame loss was also monitored. The system was tested at frame rates of 50 Hz and 100 Hz. At 200 Hz, the entire frame time is occupied by the exchange with the camera, and real-time processing is impossible. Executing the program in multiple threads would make it possible to bypass this limitation.
At a camera frame rate of 50 Hz, moving the gaze fixation point from one corner of the monitor to another takes 2-3 frames. The eye image is blurred in these frames, and they have to be discarded. At 100 Hz, one image is blurred, and the pupil center position can still be determined from it, albeit with an error.
When processing a real sequence, it is impossible to separate the error introduced by the algorithm from the effect of fast eye movements. The effect of system noise on the accuracy of the ellipse center coordinates was therefore evaluated on a sequence formed from one real frame with different realizations of noise. The noise level, estimated from the difference between adjacent real frames, is 1.37 image quantization levels. The standard deviations of the ellipse center coordinates Xe and Ye over 100 sequence frames at different noise levels are given in Table 1. Table 1 shows that stability is maintained up to high noise levels. When operating at 100 Hz, the noise level is slightly higher because the camera gain must be increased when the exposure time is decreased.
The system was tested on operators of different ages, with different iris colors and pupil sizes, both in the laboratory and in the field.
Table 1. Effect of the noise level on the accuracy of pupil center coordinate estimation.
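The difference-based timing estimate amounts to the following arithmetic. The function name and the idea of passing total run times for the whole recorded sequence are assumptions of this sketch; the paper does not describe how the times were accumulated.

```cpp
#include <cassert>
#include <cstddef>

// Average per-frame time of one algorithm, estimated as the difference
// between the total time of a run with the algorithm enabled and the total
// time of a run over the same recorded frames with it disabled.
double averageAlgorithmTimeMs(double totalWithMs, double totalWithoutMs,
                              std::size_t frames) {
    assert(frames > 0 && totalWithMs >= totalWithoutMs);
    return (totalWithMs - totalWithoutMs) / static_cast<double>(frames);
}
```

As an illustration with made-up totals, a run of 1000 frames that takes 8000 ms with the algorithm and 6800 ms without it gives an average of 1.2 ms per frame. Because both runs read the same frames from disk, the file reading time cancels out in the difference.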
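The relation between frame rate and the time available per frame can be made explicit with a small calculation. The sketch below considers only the computation time and ignores the camera transfer time, which the text identifies as the actual bottleneck at 200 Hz; using the 8 ms figure (which includes file reading) as the per-frame cost is likewise an assumption.

```cpp
#include <cassert>

// Time available for one frame at a given camera frame rate.
double framePeriodMs(double frameRateHz) {
    assert(frameRateHz > 0.0);
    return 1000.0 / frameRateHz;
}

// True if single-threaded processing of one frame fits into one frame period.
bool fitsInFrame(double processingMs, double frameRateHz) {
    return processingMs <= framePeriodMs(frameRateHz);
}
```

The frame period is 20 ms at 50 Hz, 10 ms at 100 Hz, and 5 ms at 200 Hz, so a per-frame cost of about 8 ms fits at 50 Hz and 100 Hz but not at 200 Hz without overlapping acquisition and processing.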
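The scatter measure used in this experiment can be computed as shown below. The sketch assumes the sample (n - 1) form of the standard deviation, which the paper does not specify, and would be applied separately to the Xe and Ye values collected over the noisy frames.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sample standard deviation (n - 1 in the denominator) of a coordinate
// series, e.g. the Xe values of the fitted ellipse center over all frames.
double sampleStdDev(const std::vector<double>& values) {
    assert(values.size() >= 2);
    double mean = 0.0;
    for (double v : values) mean += v;
    mean /= static_cast<double>(values.size());
    double sumSq = 0.0;
    for (double v : values) sumSq += (v - mean) * (v - mean);
    return std::sqrt(sumSq / static_cast<double>(values.size() - 1));
}
```

For the toy series 1, 2, 3, 4, 5 the function returns the square root of 2.5, about 1.58; for a series of identical coordinates it returns 0.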

Discussion
The well-known Starburst algorithm [13] for finding the pupil and determining its center coordinates was developed for a head-mounted device. The MATLAB version works well but slowly, taking approximately one second per camera frame. A simplified C++ version manages to process frames at 50 fps. Nevertheless, the detected pupil border is very noisy because the radial derivative is evaluated in increments of 4-6 pixels to bring out the brightness step at low contrast. Several attempts have been made to speed up detection of the pupil center. For example, in Ref. [14] the pupil contour is approximated by a sinusoid (the SET method), and the accuracy of the Starburst and SET methods in determining the pupil center is compared. However, approximation methods generally rely on iterative approximation and are slow. That algorithm was tested on small images containing the eye area, which is typical for head-mounted devices.
The proposed algorithm selects a smoother border that is close to the real one. However, it relies on a starting point obtained after the corneal reflections have been detected. Starburst can find the border without a specified starting point, but only in images slightly larger than the eye. The eye area still has to be found and a fragment selected.
The proposed algorithm works with both light and dark pupils. Many algorithms work with the light pupil. We focused on the dark pupil option. Its hardware implementation is simpler and does not require a small-diameter lens surrounded by centrally placed LEDs. Only two side LEDs are required to calculate the 3D eye position. There is also no need to synchronize the switching of the illuminators with the camera frames.

Conclusion
The proposed algorithm for determining the pupil center coordinates provides stable results with a small scatter. The system was tested at frame rates of 50 and 100 Hz. The algorithm can be used at higher frame rates with a camera that has a higher channel bandwidth. Processing a frame in parallel with the acquisition of the next one will also allow the pupil center coordinates to be assessed at higher frame rates.
It is also necessary to reduce the computational cost of the corneal reflection extraction and selection algorithm, which takes up most of the processing time. As already mentioned, the main requirement is IR illumination, which increases the contrast between the pupil and the iris. Hence the main limitation: there must be no direct sunlight. The use of the system is therefore limited to rooms and to moving objects with enclosed interiors.
A free head position reduces operator fatigue. In medical applications, it allows the study of patient behavior in vivo.

Disclosures
All authors declare that there is no conflict of interests in this paper.