On the class web page you will find four ASCII data files extracted from the Eigenface database. Two of them, faceR and faceS, contain 99 coefficients for each of 2000 faces; use faceR as training data and faceS for testing. Each row contains 100 elements: the first is a face number (running from 1223 to 5223), and the remaining 99 are coefficients measuring how much that face projects onto the corresponding eigenvector.
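In Matlab, that layout can be unpacked like this (a sketch; the variable names other than faceR are my own):

```matlab
load faceR                    % 2000 x 100 matrix, one row per face
face_ids = faceR(:, 1);       % face numbers, 1223 .. 5223
coeffs   = faceR(:, 2:100);   % 99 eigenface coefficients per face
```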
The other two files, faceDR and faceDS, corresponding to faceR and faceS, contain ASCII descriptors of each face, for example:

1269 (_sex male) (_age teen) (_race white) (_face serious) (_prop '())
1361 (_sex male) (_age child) (_race black) (_face serious) (_prop '(hat ))
2147 (_sex male) (_age adult) (_race white) (_face serious) (_prop '())
2148 (_sex female) (_age adult) (_race white) (_face smiling) (_prop '())
2456 ... (_prop '(hat glasses ))
2473 ... (_prop '(moustache beard ))

Since this is real-world data, some of it is missing. Faces 1228, 1808, 4056, 4135, 4136, and 5004 are missing from the database, so the coefficients for these faces are all zeros; the corresponding descriptors read, e.g., 1228 (missing descriptor). In addition there is a "missing descriptor" entry for face 1232. Furthermore, some of the descriptors may be wrong; for example, two female faces are described as having moustaches.
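Since the missing faces have all-zero coefficient rows, one way to drop them before training is to flag those rows directly (a sketch; variable names are my own):

```matlab
load faceR
coeffs  = faceR(:, 2:100);
missing = all(coeffs == 0, 2);     % true for the six missing faces
faceR_clean = faceR(~missing, :);  % keep only faces with real data
```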
You might want to consider detectors which are non-separable: for example, you may need different smile detectors for children and for adults; or different age detectors for smiling and serious faces.
Which feature detectors are separable, and which features require non-separable detectors?
Do the different ages lie on a path through face space? Is the path linear?
Which features are most easily detected? How can you eliminate outliers?
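As a starting point for such detectors, here is a minimal nearest-centroid classifier in coefficient space, sketched for the sex feature. It assumes you have already parsed faceDR into a logical vector is_male with one entry per training face; the parsing itself, and anything better than a centroid model, is up to you.

```matlab
% Sketch: nearest-centroid sex detector in 99-dimensional coefficient space.
% Assumes is_male is a 2000 x 1 logical vector parsed from faceDR.
load faceR
load faceS
train = faceR(:, 2:100);
test  = faceS(:, 2:100);
m0 = mean(train(~is_male, :));     % centroid of female training faces
m1 = mean(train( is_male, :));     % centroid of male training faces
n  = size(test, 1);
d0 = sum((test - repmat(m0, n, 1)).^2, 2);   % squared distance to each centroid
d1 = sum((test - repmat(m1, n, 1)).^2, 2);
predicted_male = d1 < d0;          % classify each test face by nearer centroid
```

Remember to exclude the missing (all-zero) faces first, or they will distort both centroids.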
An example of how you might use this is:

load ev.mat
load faceR
v = faceR(5, 2:100)';
i = eigenfaces'*v + mean_face';
imagesc(reshape(i, 128, 128)'); colormap(gray(256));

This reconstructs image 1227 from its coefficients. In this case the reconstruction is perfect, since 1227 was one of the training images.
To read the raw images from the rawdata directory, use:

fid = fopen('rawdata/1223');
I = fread(fid);
fclose(fid);
imagesc(reshape(I, 128, 128)'); colormap(gray(256));
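Going the other way, from a raw image to its coefficients, is the corresponding projection. This is a sketch, assuming eigenfaces is 99 x 16384 and mean_face is 1 x 16384, as the dimensions in the reconstruction example imply:

```matlab
load ev.mat
fid = fopen('rawdata/1223');
I = fread(fid);                    % 16384 x 1 pixel vector
fclose(fid);
c = eigenfaces * (I - mean_face'); % 99 x 1 coefficient vector
% c should approximate the stored coefficients for face 1223,
% up to the accuracy of the truncated eigenbasis.
```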