Overview of how the FacePix(30) database was created


The FacePix data capture system


To allow for a precise measure of face recognition robustness, FacePix(30) contains face images with pose and illumination angles captured at 1-degree increments. Figure 4 shows the apparatus that was used to capture these images. It consists of a flat platform that supports two motorized concentric annular rings, each of which rotates independently on a pair of circular tracks containing ball bearings.

Figure 4: The Face capture platform used to capture images for the FacePix(30) database.

Image of the Face Capture Platform with a participant

As shown in Figure 4, a video camera (mounted on a tripod) and a "green screen" backdrop (mounted on two vertical supports) are mounted on opposite sides of the inner ring. Thus, as the video camera rotates around the front of the person, the backdrop rotates in synchronism behind the person. The video camera is aimed at the vertical center axis of the ring, which can be precisely determined by a counter-weighted ball (visible at the top of Figure 4) that is suspended from the ceiling on a string, and can be raised or lowered as needed. The person being photographed is seated on a stool in the center of the ring. The ball is then lowered, the person is centered precisely under the hanging ball, and the ball is then raised out of the field of view of the video camera. Soft illumination of the person's face is provided by two Photoflex Starlite diffused light sources, which are in stationary positions at the two front corners of the platform - one to the right, and the other to the left (see Figure 4).

The inner ring (on which the video camera is mounted) is motorized to sweep the camera around the front of the seated person, while it captures a continuous video stream at 30 frames per second. The resulting quick capture of a spectrum of 181 pose angle images (over a 180-degree range) reduces the time that the participant must remain motionless. The rapid frame rate of the video camera (with respect to the camera sweep time) also ensures that, even with eye blinks, it will be possible to extract an open-eyed image from each pose angle.
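
The blink argument above can be checked with simple arithmetic: the average number of frames available per degree is the frame rate times the sweep time, divided by the angular range. The sketch below is only illustrative. The 30 frames per second and the 180-degree range come from the text, but the sweep time is not stated, so the 30-second value is an assumed placeholder.

```python
def frames_per_degree(fps: float, sweep_seconds: float, sweep_degrees: float) -> float:
    """Average number of video frames captured per degree of camera rotation."""
    return fps * sweep_seconds / sweep_degrees


# 30 frames per second and a 180-degree sweep come from the text.
FPS = 30.0
SWEEP_DEGREES = 180.0
# ASSUMPTION: the sweep time is not given in the text; 30 s is a placeholder.
ASSUMED_SWEEP_SECONDS = 30.0

print(frames_per_degree(FPS, ASSUMED_SWEEP_SECONDS, SWEEP_DEGREES))
```

With several frames per degree, a frame spoiled by a blink can be replaced by a neighboring frame at effectively the same pose angle.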

As the video camera is swept around the person, a second downward-looking video camera (also mounted on the inner ring) records the degree markings on the top surface of the outer ring, which remains stationary as the inner ring rotates. That second video stream is routed to a small LCD display on the green-screen backdrop, so that it can be captured (in the upper-left corner of each video frame) simultaneously with the face. This provides a real-time indication of the tripod-mounted video camera's angular rotation, as it rotates around the person. Figures 5, 6, and 7 show examples of frames captured with the wide format, tripod-mounted video camera, including the degrees of rotation in the upper-left corner.

Figure 5: A video frame captured with the wide format camera at a pose angle of -45°

Participant's face at -45°

Figure 6: A video frame captured with the wide format camera at a pose angle of 0°

Participant's face at 0°

Figure 7: A video frame captured with the wide format camera at a pose angle of +45°

Participant's face at +45°

Based on the angles displayed in the upper left corner of each video frame, 181 frames are (1) extracted from the video sequence (at 1-degree intervals) and (2) normalized and cropped to a standard size. The green screen background and the green apron in each cropped face image could potentially allow the creation of an alpha plane that facilitates removal of the background or insertion of an alternative background.
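
One way such an alpha plane might be derived is a simple chroma key: a pixel is treated as background when its green channel clearly dominates red and blue. The sketch below is not part of the FacePix tooling. The function names and the `dominance` threshold are hypothetical, and images are represented as nested lists of (red, green, blue) tuples to keep the example self-contained.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0-255


def alpha_from_green(pixels: List[List[Pixel]], dominance: int = 40) -> List[List[int]]:
    """Return an alpha plane: 0 where a pixel looks like green screen, 255 elsewhere.

    A pixel counts as background when its green channel exceeds both red and
    blue by at least `dominance` (a hypothetical threshold, not from the source).
    """
    alpha = []
    for row in pixels:
        alpha_row = []
        for r, g, b in row:
            is_green = (g - max(r, b)) >= dominance
            alpha_row.append(0 if is_green else 255)
        alpha.append(alpha_row)
    return alpha


def composite(pixels, alpha, background):
    """Replace every fully transparent (alpha == 0) pixel with the background pixel."""
    return [
        [bg if a == 0 else px for px, a, bg in zip(prow, arow, brow)]
        for prow, arow, brow in zip(pixels, alpha, background)
    ]


# One green-screen pixel followed by one skin-toned pixel.
frame = [[(20, 200, 30), (180, 140, 120)]]
mask = alpha_from_green(frame)
print(mask)
print(composite(frame, mask, [[(0, 0, 0), (0, 0, 0)]]))
```

A production chroma key would normally produce soft (fractional) alpha at the edges of hair and skin; the hard 0/255 mask here only illustrates the idea.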

Also visible in Figure 4 is a point light source on a vertical support, which is mounted on the outer ring. This outer ring (like the inner ring) is motorized, and can also be made to independently rotate around the person seated in the center. As the point light source rotates through a 180-degree range, it casts a range of shadows across the person's face, starting at one side of the face, and ending at the other side. Meanwhile, the two Photoflex Starlites provide diffuse (fill) light, to soften the shadows on the face. The ratio of lighting from the point source and the diffuse light sources can be adjusted, to provide the desired shadow softness.

As the point light source is swept around the person, the downward-looking video camera (mounted on the inner ring, which remains stationary during this sweep) records the degree markings on the top surface of the outer ring as the outer ring rotates. That video stream is routed to the small LCD display on the green-screen backdrop, so that it is captured (in the upper-left corner of each video frame) simultaneously with the face. This provides a real-time indication of the point light source's angular position as it rotates around the person.

FacePix(30) Capture Procedure


Figure 8 is a top view of the Face Capture Platform. The participant puts on an apron (to cover his or her clothing) and then sits on the stool in the center of the face capture platform (see Figure 4). The stool height is adjusted so that the participant's face is vertically centered within the field of view of the video camera. A tennis ball (suspended on a string above the center of the face capture platform) is then temporarily lowered from the ceiling to allow the investigator to quickly confirm that the participant's head is centered horizontally over the center of the platform. Three separate video clips are then captured, as the participant looks straight ahead with a neutral facial expression:

Video clip 1: With the Fill lights on, and the Spot light off, the video camera rotates around the subject from -90° to +90°.
Video clip 2: With the Fill lights on, and with the Spot light on and rotating from +90° to -90°, the video camera remains stationary at 0° (i.e. directly in front of the participant).
Video clip 3: With the Fill lights dimmed, and the Spot light on and rotating from -90° to +90°, the video camera remains stationary at 0° (i.e. directly in front of the participant).
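
The three clips can be summarized as a small data structure, which makes it easy to see which parameter each clip varies. This is a sketch only: the `ClipPlan` class, its field names, and the clip labels are hypothetical, while the lighting states and sweep directions are taken from the list above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ClipPlan:
    """One capture pass (hypothetical structure; values taken from the text)."""
    name: str
    fill_lights: str                         # "on" or "dimmed"
    camera_sweep: Tuple[int, int]            # (start, end) camera angle in degrees
    spot_sweep: Optional[Tuple[int, int]]    # (start, end) spot light angle, None if off


CAPTURE_PLAN = (
    ClipPlan("clip1", "on", (-90, 90), None),
    ClipPlan("clip2", "on", (0, 0), (90, -90)),
    ClipPlan("clip3", "dimmed", (0, 0), (-90, 90)),
)


def varied_parameter(clip: ClipPlan) -> str:
    """Report whether a clip varies the pose angle or the illumination angle."""
    start, end = clip.camera_sweep
    return "pose" if start != end else "illumination"


for clip in CAPTURE_PLAN:
    print(clip.name, varied_parameter(clip), "fill lights", clip.fill_lights)
```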

Figure 8: Simplified top view of the face capture platform.

The FacePix(30) Image Extraction Procedure


After the three video clips have been captured, frames are extracted from them at 1-degree intervals. This is done by launching an application called Frame Finder, and selecting the video clip from which the frames are to be extracted.

As previously shown in Figures 5, 6, and 7, an LCD display in the upper-left corner of each video frame indicates the angle of rotation. The Frame Finder application prompts the user to "find pose -90". The user then rolls the video to the requested frame, and hits the spacebar to select that frame. The application then prompts the user to "find pose -80", "find pose -70", and so on, at 10-degree intervals. As the user selects each frame, that frame number is retained by the Frame Finder application. At the conclusion of this process, the Frame Finder application has retained frame numbers for:

Video clip 1: 19 pose angles from +90 to -90 degrees, at 10-degree intervals
Video clip 2: 19 spot light angles (with fill lights) from +90 to -90 degrees, at 10-degree intervals
Video clip 3: 19 spot light angles (with dimmed fill lights) from +90 to -90 degrees, at 10-degree intervals
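
The count of 19 follows from stepping through a 180-degree range in 10-degree increments, endpoints included. The sketch below illustrates the kind of bookkeeping this implies: a mapping from each key angle to the frame number the user selected. The names `KEY_ANGLES` and `missing_angles` are hypothetical and do not describe Frame Finder's actual internals.

```python
# Key angles at 10-degree intervals, from -90 to +90 inclusive.
KEY_ANGLES = list(range(-90, 91, 10))


def missing_angles(selected: dict) -> list:
    """Return the key angles for which no frame number has been recorded yet."""
    return [angle for angle in KEY_ANGLES if angle not in selected]


# Example: the user has so far marked frames for the first two prompts only.
# The frame numbers are made up for illustration.
selected_frames = {-90: 12, -80: 61}

print(len(KEY_ANGLES))
print(missing_angles(selected_frames)[:3])
```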

The Frame Finder application then presents to the user (one at a time):

(1) the +90, 0, and -90 degree images from Video clip 1
(2) the 0 degree image from Video clip 2
(3) the 0 degree image from Video clip 3

As it does so, it prompts the user to use the mouse to superimpose:

(1) a vertical line over the centerline of the face in the +90 and -90 degree profile images from Clip 1 (see Figures 9 and 10)
(2) a vertical line over the centerline of the face in the 0 degree images from Clips 1, 2, and 3 (see Figure 11)
(3) a horizontal line across the centers of the eyes in the 0 degree images from Clips 2 and 3 (see Figure 12)
(4) a horizontal line between the lips in the 0 degree images from Clips 2 and 3 (see Figure 13)

The positions of these vertical and horizontal lines are then saved to a text file with the same name as the video, but with a .txt extension.
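
Deriving the annotation file name from the video name is a one-line operation in most languages. The sketch below shows one way to do it, together with a possible way to serialize the line positions. The key=value layout, the key names, and the `.avi` extension in the example are assumptions made for illustration; the actual format of the .txt file is not described here.

```python
from pathlib import Path


def annotation_path(video_path: str) -> Path:
    """Return the path of the text file that sits next to the video file."""
    return Path(video_path).with_suffix(".txt")


def format_annotations(lines: dict) -> str:
    """Serialize line positions as sorted key=value rows (a hypothetical layout)."""
    rows = [f"{name}={position}" for name, position in sorted(lines.items())]
    return "\n".join(rows) + "\n"


# Hypothetical file name and pixel positions, for illustration only.
print(annotation_path("session/clip1.avi"))
print(format_annotations({"eye_line_y": 210, "center_x_0": 320}), end="")
```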

The user then launches an application called Cropper, which uses this .txt file to extract 181 frames (one frame for each degree of rotation) from each of the 3 video clips, for a total of 543 images per participant. These images are then resampled and cropped to a standard size, and stored in the lossless PNG format.
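
The text does not say how Cropper gets from the 19 marked frames to 181 frames per clip. One plausible approach, shown here purely as an assumption, is to interpolate linearly between neighboring key frames to estimate a frame number for every whole degree. The function name and the example frame numbers are hypothetical.

```python
def interpolate_frames(key_frames: dict) -> dict:
    """Estimate a frame number for every whole degree between the key angles.

    `key_frames` maps key angles (degrees) to frame numbers. Linear
    interpolation is an ASSUMED method, not one confirmed by the source.
    """
    angles = sorted(key_frames)
    result = {}
    for lo, hi in zip(angles, angles[1:]):
        f_lo, f_hi = key_frames[lo], key_frames[hi]
        for angle in range(lo, hi):
            t = (angle - lo) / (hi - lo)          # 0.0 at lo, approaching 1.0 at hi
            result[angle] = round(f_lo + t * (f_hi - f_lo))
    result[angles[-1]] = key_frames[angles[-1]]   # include the final key angle
    return result


# Made-up key frames: 19 angles, 50 frames apart (a perfectly even sweep).
key_frames = {angle: (angle + 90) * 5 for angle in range(-90, 91, 10)}
per_degree = interpolate_frames(key_frames)

print(len(per_degree))        # frames per clip
print(3 * len(per_degree))    # frames per participant across the three clips
```

With 19 key angles spanning -90 to +90 degrees, this yields 181 entries per clip, and three clips give the 543 images per participant mentioned above.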

Figure 9: Vertical center line over the +90 degree view of Clip 1

Vertical center line over the +90° view of Clip 1

Figure 10: Vertical center line over the -90 degree view of Clip 1

Vertical center line over the -90° view of Clip 1

Figure 11: Vertical center line over the frontal view of Clip 1

Vertical center line over the frontal (0°) view of Clip 1

Figure 12: Horizontal eye line over the 0 degree view of Clip 2

Horizontal eye line over the 0° view of Clip 2

Figure 13: Horizontal mouth line over the 0 degree view of Clip 2

Horizontal mouth line over the 0° view of Clip 2