MAS622J: FACE RECOGNITION PROJECT

Niloy Mukherjee, Nikolaos Mavridis, MIT Media Lab, Fall '02

 

Introduction

The traditional problem

Our problem setting

Overview

 

 

 

The mean face image of the training set:
as expected, highly symmetric, and often in symmetry lies beauty!

 

Introduction

Automated face recognition is a challenging pattern recognition problem, with an already vast existing literature covering various aspects of it, and a number of commercially available systems. Its applications are numerous, and range from biometric identification to generalised surveillance to advanced human-computer interaction and beyond.

A somewhat outdated but nevertheless concise review can be found in [1].

 

Unfortunately, given the latest escalation of world terrorism, face recognition, together with other biometric identification techniques, is set to become much more widespread in the future. Surveys even suggest that public opinion has become more tolerant of such measures and sometimes demands them (a figure of 86% is quoted in [2]). Entry/exit logging in many public places may therefore not be very far away, and successful deployment of smaller-scale non-governmental applications, such as cardholder verification at ATMs, would offer huge immediate savings to banks.

 

The cognitive science / neuroscience viewpoint of face recognition is also very interesting. For example, there is strong evidence for the conjecture that the human face recognition system is distinct from general object recognition. The rare disorder of “prosopagnosia” [3] (prosopo (πρόσωπο) = face, agnosia (α-γνωσία) = not knowing) provides the main clues for this conjecture: people with the disorder have normal general object recognition performance, yet have great difficulty recognising faces.

 

The traditional problem

The traditional setting is either an identity recognition problem or an authentication problem. In the first case, the system is given samples of the faces of a set of people, and is asked to match an unlabelled picture to one of the people it has been trained on (sometimes also with a “reject” option). In the second case, the system is again trained on a set of people, but is given a novel picture together with the claimed identity of the person, and should decide whether the novel picture really matches that identity. A typical application of the first is matching the photo of an unknown suspect against a criminal database; a typical application of the second is cardholder verification at ATMs.
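The difference between the two cases can be made concrete with a minimal sketch. It assumes that each face has already been reduced to a numeric feature vector, and it uses a nearest-neighbour rule with Euclidean distance; the function names, the thresholds and the toy gallery are all illustrative assumptions, not the method used in this project.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(gallery, probe, reject_threshold=None):
    """Identification: return the enrolled identity nearest to the probe.

    gallery maps identity -> list of feature vectors (training samples).
    If reject_threshold is given and even the best match is farther away
    than that, return None (the "reject" option).
    """
    best_id, best_dist = None, math.inf
    for identity, samples in gallery.items():
        for sample in samples:
            d = euclidean(sample, probe)
            if d < best_dist:
                best_id, best_dist = identity, d
    if reject_threshold is not None and best_dist > reject_threshold:
        return None
    return best_id

def verify(gallery, probe, claimed_identity, accept_threshold):
    """Verification: accept only if the probe lies within accept_threshold
    of at least one sample of the claimed identity."""
    samples = gallery.get(claimed_identity, [])
    return any(euclidean(s, probe) <= accept_threshold for s in samples)

# Toy gallery with made-up two-dimensional "features" (hypothetical data).
gallery = {
    "alice": [[0.0, 0.0], [0.1, 0.0]],
    "bob": [[1.0, 1.0]],
}

print(identify(gallery, [0.9, 1.0]))                        # closest enrolled person
print(identify(gallery, [5.0, 5.0], reject_threshold=1.0))  # too far from everyone
print(verify(gallery, [0.05, 0.0], "alice", 0.2))           # claim checked against alice only
```

Note that identification searches the whole gallery, whereas verification compares the probe only against the samples of the claimed identity, so its cost does not grow with the number of enrolled people.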

 

A main source of the difficulty of the problem is the wide variation in the appearance of faces. Humans can successfully recognise faces (also see [4]) under short-term variations caused by the environment or the viewpoint, such as pose, added artifacts such as glasses, other occlusions, different illumination conditions and expressions, but also under longer-term variations such as hairstyle, beard growth and the effects of ageing. Of course, humans often exploit multimodal information (whole body, gait, speaker identity) and larger contextual information in reaching their final decision; but even when the information is constrained to a simple 2D picture, humans do an impressive job. However, modern face recognition systems have been shown to outperform humans in some cases: for example, Moghaddam and Yang report superiority in gender identification with images not containing hair (see [5]).

 

Our problem setting

There are many other useful related problems, apart from traditional face identification/verification. Examples include other forms of categorisation, based on gender, age, colour or expression, and the recognition of face artifacts such as a moustache or a hat. This is exactly what we will deal with in this project: the categorisation of faces in a discrete and finite nine-dimensional space, along suitably quantized dimensions of gender, age, colour, expression, moustache, beard, glasses, bandana and hat.
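One way to picture this label space is as a tuple of nine small integers, one per dimension. The sketch below is only an illustration: the nine dimension names come from the text, but the quantisation levels listed for each (including the placeholder names for colour and the present/absent coding of the artifacts) are hypothetical assumptions.

```python
from math import prod

# Dimension names follow the text; the levels per dimension are HYPOTHETICAL.
LEVELS = {
    "gender": ("male", "female"),
    "age": ("child", "teen", "adult", "senior"),
    "colour": ("group_a", "group_b", "group_c"),   # placeholder level names
    "expression": ("serious", "smiling", "funny"),
    "moustache": ("absent", "present"),
    "beard": ("absent", "present"),
    "glasses": ("absent", "present"),
    "bandana": ("absent", "present"),
    "hat": ("absent", "present"),
}

def encode(description):
    """Map {dimension: level name} to a tuple of nine integer indices."""
    return tuple(LEVELS[dim].index(description[dim]) for dim in LEVELS)

def decode(indices):
    """Inverse of encode: map nine integer indices back to level names."""
    return {dim: LEVELS[dim][i] for dim, i in zip(LEVELS, indices)}

def space_size():
    """Number of distinct points in the discrete label space."""
    return prod(len(levels) for levels in LEVELS.values())

example = {
    "gender": "female", "age": "adult", "colour": "group_b",
    "expression": "smiling", "moustache": "absent", "beard": "absent",
    "glasses": "present", "bandana": "absent", "hat": "present",
}

print(encode(example))
print(space_size())
```

Under these assumed levels the space is finite but not tiny, which is one reason to treat each dimension as its own classification problem rather than to classify over all joint combinations at once.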

 

This information provides a description of the face (useful in metadata / description-based searching etc.) and an estimate of expression (useful for emotional state estimation and affective HCI). It might also help boost performance in the traditional problem. For example, if no matching pictures with moustaches were found, the moustache area could be identified, localised and treated as “missing info”, and the search could then proceed without it. One can think of many other uses; their value of course remains to be proved in practice.

 

Overview

A discussion of the problem and a description of the data set serve as a natural first step. Preprocessing and some possible representations of face images follow. An extensive discussion of feature selection, aiming not only at recognition directly but also at localising the relevant information in a subarea of the facial image, is then presented, together with some first results. Further classification methods and the results obtained so far, followed by a section on future directions, conclude the report.

 

References:

[1] Chellappa, R.; Wilson, C.L.; Sirohey, S., “Human and machine recognition of faces: a survey”, Proceedings of the IEEE, Volume 83, Issue 5, May 1995, pp. 705-741

[2] Sullivan, B., “Warming to big brother”, MSNBC Tech-Science News, http://www.msnbc.com/news/654959.asp?0si=-&cp1=1

[3] Burman, C., “Prosopagnosia pages”, http://www.prosopagnosia.com/

[4] Bruce, V.; Hancock, P.J.B.; Burton, A.M., “Comparisons between human and computer recognition of faces”, Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, 1998, pp. 408-413

[5] Moghaddam, B.; Yang, M-H., “Gender Classification with Support Vector Machines”, IEEE International Conference on Automatic Face and Gesture Recognition (FG), March 2000

 
