Non-parametric Classification of Facial Features

 

 

Hyun Sung Chang

 

Department of Electrical Engineering and Computer Science

Massachusetts Institute of Technology

E-mail: hyunsung@mit.edu

 

 

 

Problem statement

 

In this project, I attempted to classify facial images based on various external characteristics, such as gender, expression, and accessories the subjects are wearing.

Rather than extracting particular parameters describing faces, e.g., the distances among the eyes, nose, and mouth, I used the grayscale face images themselves, fitted to a 128x128 window, as the inputs.

 

 

Dataset

 

The dataset used for this project, together with a detailed description, is available at the course website. It consists of 2,000 training face images (faceR, 1,997 of them labeled) and 2,000 test face images (faceS, 1,996 of them labeled). Because the image size is 128x128, each image can be considered as a point in a 16,384-dimensional space. Dimensionality reduction was conducted using principal component analysis (PCA) on 100 sample faces, all from the training dataset, so that each image can be represented by 99 eigenface coefficients together with the mean face.
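
As a concrete illustration of this step, the following is a minimal NumPy sketch of the eigenface computation; the sample_faces array is a hypothetical placeholder for the 100 sample images, not the course-provided data or code.

    import numpy as np

    # Hypothetical placeholder for the 100 sample faces (128x128 grayscale).
    sample_faces = np.random.rand(100, 128, 128)

    # Stack the samples as rows of a (100, 16384) data matrix.
    X = sample_faces.reshape(100, -1).astype(np.float64)

    mean_face = X.mean(axis=0)      # the mean face
    A = X - mean_face               # centered data; rank is at most 99

    # The right singular vectors of the centered data are the eigenfaces.
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    eigenfaces = Vt[:99]            # 99 orthonormal eigenfaces

    def encode(face):
        """Project a 128x128 face onto the eigenfaces: 99 coefficients."""
        return eigenfaces @ (face.ravel() - mean_face)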

 

The composition of the dataset is shown in Table 1. Notice that, in terms of expression, "funny" faces were significantly fewer than the other two classes, and that few people wore glasses or a bandana. One interesting point is that no bandana image was included among the samples used to generate the eigenfaces.

 

 

Table 1. Dataset composition

 

gender

                              male           female
Eigenface generating data     61/100         39/100
Training data (faceR)         1,150/1,997    847/1,997
Testing data (faceS)          1,277/1,996    719/1,996

expression

                              serious        smiling        funny
Eigenface generating data     45/100         51/100         4/100
Training data (faceR)         917/1,997      1,043/1,997    37/1,997
Testing data (faceS)          1,097/1,996    836/1,996      63/1,996

glasses

                              on             off
Eigenface generating data     4/100          96/100
Training data (faceR)         59/1,997       1,938/1,997
Testing data (faceS)          8/1,996        1,988/1,996

bandana

                              on             off
Eigenface generating data     0/100          100/100
Training data (faceR)         13/1,997       1,984/1,997
Testing data (faceS)          8/1,996        1,988/1,996

 

 

Objective of this project

 

The objective of this project is twofold:

1)      to practice on a meaningful classification problem using the methods learned in class;

2)      to look into the inherent limitations of the PCA approach.

 

 

Eigenface representation

 

Let $u_1, \dots, u_M$ be the eigenfaces and $x_1, \dots, x_N$ be the sample faces used to generate them. PCA finds the $u_i$ so that the $x_j$ can be well represented by their linear combinations. Let $x$ be an arbitrary face and $\hat{x}$ be its eigenface representation, that is, $\hat{x} = \bar{x} + \sum_{i=1}^{M} w_i u_i$, where $\bar{x}$ is the mean face and $w_i = u_i^\top (x - \bar{x})$.

Note that $\hat{x}$ is essentially a linear combination of the $u_i$, which raises a sensitivity issue. For example, because there was no bandana image among $x_1, \dots, x_N$, $x$ and $\hat{x}$ may differ significantly when $x$ is a facial image of a person wearing a bandana.

The approximation error between $x$ and $\hat{x}$ can be measured in terms of the peak signal-to-noise ratio (PSNR), defined by

$\mathrm{PSNR} = 10 \log_{10} \dfrac{255^2}{\frac{1}{K} \sum_{k=1}^{K} (x_k - \hat{x}_k)^2}$ [dB],

where $K$ is the number of pixels.
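
Continuing the sketch above, the eigenface approximation $\hat{x}$ and its PSNR can be computed as follows; 255 is assumed as the peak value for 8-bit grayscale pixels.

    def reconstruct(face):
        """Eigenface approximation: mean face plus weighted eigenfaces."""
        w = eigenfaces @ (face.ravel() - mean_face)
        return mean_face + eigenfaces.T @ w

    def psnr(face):
        """PSNR (in dB) between a face and its eigenface approximation."""
        x = face.ravel().astype(np.float64)
        mse = np.mean((x - reconstruct(face)) ** 2)  # (1/K) * sum of squared errors
        return 10.0 * np.log10(255.0 ** 2 / mse)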

 

 

Figure 1 shows a bandana image example. The PSNR is as low as 14.47 dB. Note that, in the eigenface representation, regions other than the bandana pixels were also severely distorted, as a result of the best-effort attempt to compensate for the bandana region. This may lead to classification errors not only for the bandana criterion but also for the others.

Figure 1. Bandana image example and its eigenface approximation.

 

 

Figure 2 shows the actual PSNR distribution for the training dataset and the test dataset. The images in the test dataset show somewhat lower PSNR, and for some particular samples the PSNR was significantly low. This low PSNR may contribute to the classification error.

 

Figure 2. PSNR distribution of the face images in the training dataset and the test dataset. (left: faceR, right: faceS)

 

 

Figure 3 illustrates how the discriminant value relates to the PSNR for gender classification (+ = male, - = female) when a linear discriminant was used. Note that the majority of male samples with high PSNR were correctly classified. For the female samples, such a correlation was not noticeable; instead, the discriminant value and the PSNR looked rather uncorrelated. I suspect this is due to the inherently poor performance of the classifier on female images (see Table 2).

 

Figure 3. Plot of discriminant value versus PSNR for male and female face samples. (left: male, right: female)
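
The report does not record how the linear discriminant was trained, so the following is only one plausible sketch: a least-squares linear discriminant on the eigenface coefficients, with hypothetical labels +1 for male and -1 for female.

    import numpy as np

    def fit_linear_discriminant(coeffs, labels):
        """Least-squares fit of (w, b) so that sign(w.x + b) predicts the
        label. `coeffs` has one row of 99 eigenface coefficients per face;
        `labels` holds +1 (male) or -1 (female)."""
        X = np.hstack([coeffs, np.ones((coeffs.shape[0], 1))])  # bias column
        wb, *_ = np.linalg.lstsq(X, labels.astype(np.float64), rcond=None)
        return wb[:-1], wb[-1]

    def discriminant_value(w, b, coeff):
        """Positive values are classified as male, negative as female."""
        return coeff @ w + b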

 

 

 

Classification practice

 

For this part of the experiment, four classification schemes were tested: k-nearest neighbor (k-NN), a linear discriminant (LD), and two neural network classifiers (NN-2, NN-3). Their performance was also compared against two random-guess baselines (RG-1, RG-2). A sketch of the k-NN scheme is given below.
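
The k-NN scheme amounts to a majority vote in the 99-dimensional coefficient space; a minimal sketch follows (the value of k actually used in the experiments is not recorded here, so k=5 is just an assumption).

    import numpy as np

    def knn_classify(train_coeffs, train_labels, test_coeff, k=5):
        """Majority vote among the k training faces nearest to the test
        face in eigenface-coefficient space (Euclidean distance)."""
        dists = np.linalg.norm(train_coeffs - test_coeff, axis=1)
        nearest = np.argsort(dists)[:k]
        classes, counts = np.unique(train_labels[nearest], return_counts=True)
        return classes[np.argmax(counts)]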

 

Table 2 through Table 5 show the classification results. In most cases, LD and NN-2 performed better than the other two schemes as well as the two RG baselines.

Nearly all classifiers failed to detect the glasses, the bandana, and the "funny" expression, which are all extreme minority classes, i.e., classes whose prior probability is very low.

 

 

Table 2. Comparison of k-NN, LD, NN-2, NN-3, RG-1, and RG-2 for gender classification.

k-NN      detect    miss           LD        detect    miss
male      823       454            male      1,026     251
female    402       317            female    375       344

NN-2      detect    miss           NN-3      detect    miss
male      1,008     269            male      763       514
female    378       341            female    544       175

RG-1      detect    miss           RG-2      detect    miss
male      1,277     0              male      753       542
female    0         719            female    305       414

 

 

Table 3. Comparison of k-NN, LD, NN-2, NN-3, RG-1, and RG-2 for expression classification.

k-NN      detect    miss           LD        detect    miss
serious   586       511            serious   936       161
smiling   468       368            smiling   623       213
funny     0         63             funny     0         63

NN-2      detect    miss           NN-3      detect    miss
serious   932       165            serious   963       134
smiling   617       219            smiling   593       243
funny     0         63             funny     0         63

RG-1      detect    miss           RG-2      detect    miss
serious   0         1,097          serious   504       593
smiling   836       0              smiling   437       399
funny     0         63             funny     1         62

 

 

Table 4. Comparison of k-NN, LD, NN-2, NN-3, RG-1, and RG-2 for glasses detection.

k-NN      detect    miss           LD        detect    miss
on        0         8              on        0         8
off       1,988     0              off       1,986     2

NN-2      detect    miss           NN-3      detect    miss
on        2         6              on        0         8
off       1,962     16             off       1,958     30

RG-1      detect    miss           RG-2      detect    miss
on        0         8              on        0         8
off       1,988     0              off       1,988     0

 

 

Table 5. Comparison of k-NN, LD, NN-2, NN-3, RG-1, and RG-2 for bandana detection.

k-NN      detect    miss           LD        detect    miss
on        0         8              on        0         8
off       1,988     0              off       1,988     0

NN-2      detect    miss           NN-3      detect    miss
on        0         8              on        0         8
off       1,988     0              off       1,986     2

RG-1      detect    miss           RG-2      detect    miss
on        0         8              on        0         8
off       1,988     0              off       1,988     0

 

 

From this experiment, I concluded that

1)      samples from a minority class (with very low prior probability) tend to be misclassified by any of the classifiers tested;

2)      the eigenface approach is good for identity recognition, being robust to noise and partial loss of data, but not as good for classification involving extraneous face samples, i.e., faces unlike those used for the eigenface generation.

 

 

Remarks

 

After the Monday presentation, I applied AdaBoost on LD and a Parzen window classifier to each classification task and obtained preliminary results, but the classification performance did not improve much. In particular, I am still looking into the working details of the Parzen window method, since my preliminary results were far from those in [2]. Mostly due to limited time, the multilinear analysis method has not been attempted; future work should include an analytical and experimental study of it.
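
For reference, the Parzen window classifier I experimented with follows the standard kernel density formulation of [1]; the sketch below uses a Gaussian window, and the window width h is a hypothetical choice rather than the value I actually tuned.

    import numpy as np

    def parzen_classify(train_coeffs, train_labels, test_coeff, h=1.0):
        """Pick the class with the largest Gaussian-kernel score at the
        test point; summing (not averaging) per class folds in the prior."""
        d2 = np.sum((train_coeffs - test_coeff) ** 2, axis=1)
        kernel = np.exp(-d2 / (2.0 * h ** 2))
        classes = np.unique(train_labels)
        scores = [kernel[train_labels == c].sum() for c in classes]
        return classes[int(np.argmax(scores))]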

 

 

References

 

[1]         R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York, NY: John Wiley & Sons, 2001.

[2]         Tiwari, “Face recognition: eigenfaces with 99 PCA coefficients,” MAS.622J/1.126J Project Presentation, MIT, Fall 2004.

[3]         W. S. Yambor, “Analysis of PCA-based and Fisher discriminant-based image recognition algorithms,” Master's thesis, Dept. of Comp. Sci., Colorado State Univ., July 2000.

[4]         M. Turk and A. Pentland, “Eigenfaces for recognition,” J. Cogn. Neurosci., vol. 3, no. 1, pp. 71-86, 1991.

[5]         H. A. Rowley, S. Baluja, and T. Kanade, “Neural network-based face detection,” IEEE Trans. Pattern Anal. Machine Intell., vol. 20, no. 1, pp. 23-38, Jan. 1998.

[6]         Face recognition homepage. [Online]. Available: http://www.face-rec.org

 


*This page was written as a final project report for MAS.622J/1.126J Pattern Recognition and Analysis, taken in Fall 2006.