IEEE TRANSACTIONS ON NEURAL NETWORKS, vol.17, no.6, pp.1550-1565, 2006 (SCI-Expanded)
In some pattern recognition tasks, the dimension of the sample space is larger than the number of samples in the training set. This is known as the "small sample size problem." Linear discriminant analysis (LDA) techniques cannot be applied directly to the small sample size case. The small sample size problem is also encountered when kernel approaches are used for recognition. In this paper, we attempt to answer the question of "How should one choose the optimal projection vectors for feature extraction in the small sample size case?" Based on our findings, we propose a new method called the kernel discriminative common vector method. In this method, we first nonlinearly map the original input space to an implicit higher dimensional feature space, in which the data are hoped to be linearly separable. Then, the optimal projection vectors are computed in this transformed space. The proposed method yields an optimal solution for maximizing a modified Fisher's linear discriminant criterion, discussed in the paper. Thus, under certain conditions, a 100% recognition rate is guaranteed for the training set samples. Experiments on test data also show that, in many situations, the generalization performance of the proposed method compares favorably with other kernel approaches.