Masliy R.V., Kulyk A.Y.
Vinnitsia National Technical University
Design of a verification stage for a boosting-based face detection method
Boosting-based face detection methods can detect faces in images in real time with high probability (above 90%), but their false-alarm probability is not low enough. The reason is the low discriminative power of the simple classifiers in the last stages of the cascade of strong classifiers.
This paper proposes to improve the boosting-based face detection method presented in [1] by adding a verification stage. Verification checks the image regions that the cascade of strong classifiers identified as faces (“face candidates”) with a classifier that must have high detection accuracy but need not be very fast.
A classifier that meets these requirements can be built with the approach proposed in [2]. This approach yields a classifier that uses Bayes' rule in the form of a likelihood ratio. In this case, a “face candidate” is accepted as a face image when the value of the likelihood ratio exceeds a certain threshold:
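$$\frac{P(\text{image} \mid \text{face})}{P(\text{image} \mid \text{non-face})} > \lambda, \tag{1}$$
where $P(\text{image} \mid \text{face})$ is the conditional probability distribution of the image pixel values given that a face image is input to the classifier, $P(\text{image} \mid \text{non-face})$ is the conditional probability distribution of the image pixel values given that a non-face image is input to the classifier, and $\lambda$ is the threshold.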
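The decision rule in (1) can be sketched as follows. This is a minimal illustration that assumes both conditional probabilities have already been evaluated for a candidate and that the threshold is positive; the function name `is_face` and the comparison in the log domain are illustrative choices, not part of the method description.

```python
import math

def is_face(p_image_given_face: float,
            p_image_given_nonface: float,
            threshold: float) -> bool:
    """Accept the candidate when P(image|face) / P(image|non-face) > threshold.

    The comparison is done on logarithms so that very small probabilities
    do not have to be divided directly. `threshold` must be positive.
    """
    if p_image_given_face <= 0.0:
        return False   # the face model gives the candidate no support
    if p_image_given_nonface <= 0.0:
        return True    # only the face model gives the candidate support
    log_ratio = math.log(p_image_given_face) - math.log(p_image_given_nonface)
    return log_ratio > math.log(threshold)

# Made-up probabilities: the first ratio is 8, the second is 2.
accepted = is_face(0.08, 0.01, threshold=4.0)
rejected = is_face(0.02, 0.01, threshold=4.0)
```

Raising the threshold rejects more candidates, which lowers the false-alarm rate at the cost of missing some faces.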
The advantage of using the likelihood ratio to construct the classifier is that the conditional probability distributions P (image | face) and P (image | non-face) are simpler to build, from separate sets of "face" and "non-face" images, than the probability distribution P (face | image). The disadvantage is the difficulty of obtaining a representative set of training images, especially for building the distribution P (image | non-face).
There are two approaches to image representation for the classifier: global and local.
The global approach tries to model the joint behavior of all input variables. However, limits on computational and memory resources do not allow a function of all input variables to be built directly. The practical solution is to reduce the dimensionality of the input data using different transforms (cosine, Fourier, principal component analysis). Higher-order models and nonparametric methods are then used to describe the joint behavior of the reduced set of variables.
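As an illustration of the dimensionality reduction used by the global approach, the sketch below projects flattened images onto their leading principal components with NumPy. The function name `pca_reduce`, the synthetic data, and the number of components are assumptions made for the example only.

```python
import numpy as np

def pca_reduce(images: np.ndarray, n_components: int) -> np.ndarray:
    """Project flattened images (one per row) onto the top principal components."""
    centered = images - images.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
patches = rng.random((20, 56 * 56))          # 20 synthetic 56x56 images, flattened
reduced = pca_reduce(patches, n_components=5)  # 3136 variables reduced to 5
```

The reduced variables are mutually uncorrelated on the training set, which is what makes a joint model over them tractable.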
In the local approach, input variables are combined into sets such that the relations within each set are modeled more accurately than the relations between sets. A set can be defined as a group of pixels or a group of transformed variables that satisfy certain mathematical characteristics. The same transformed variable or pixel may also belong to several sets. The local approach is based on the assumption that, for a given object, each pixel is statistically related to some pixels more than to others; in this case a global representation is not optimal for representing face images.
The local approach fits better when the image representation has been decorrelated, because the statistical dependence is then concentrated in a small set of variables that can form local features. The wavelet transform of the image is chosen as a suitable decorrelating transform; it makes the input data simpler to model in wavelet space than in the original data space. In addition, the locality and multiscale properties of the wavelet transform make it possible to describe a wide range of data features effectively by building a set of local features. Local binary patterns (LBP) are used as the local features; their characteristics are presented in Table 1.
Table 1 – Characteristics of the LBPs used

| LBP mark | Number of wavelet coefficients | LBP kernel, wavelet coefficients |
|---|---|---|
| LBP4 | 4 | 1 |
| LBP8 | 8 | 1 |
|  | 16 | 2x2 |
|  | 32 | 2x2 |
|  | 32 | 4x4 |
|  | 64 | 4x4 |
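The text does not spell out how an LBP value is computed from wavelet coefficients, so the following is only a sketch of the standard local binary pattern: each neighbouring coefficient is compared with the centre coefficient and the results are packed into an integer. The neighbour orderings, the `>=` comparison, and the names `OFFSETS_4`, `OFFSETS_8` and `lbp_code` are assumptions; the variants with 2x2 and 4x4 kernels from Table 1 are not covered.

```python
import numpy as np

# Neighbour offsets (row, col) for 4 and 8 compared coefficients (assumed ordering).
OFFSETS_4 = [(-1, 0), (0, 1), (1, 0), (0, -1)]
OFFSETS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
             (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(coeffs: np.ndarray, row: int, col: int, offsets) -> int:
    """Pack the neighbour-vs-centre comparisons into one integer code."""
    centre = coeffs[row, col]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if coeffs[row + dr, col + dc] >= centre:
            code |= 1 << bit
    return code

# A made-up 3x3 block of wavelet coefficients.
subband = np.array([[5, 9, 1],
                    [4, 6, 7],
                    [2, 6, 3]])
code4 = lbp_code(subband, 1, 1, OFFSETS_4)   # value in 0..15
code8 = lbp_code(subband, 1, 1, OFFSETS_8)   # value in 0..255
```

Because the code is a small integer, it can be used directly as an index into a histogram of feature values.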
The set of local features used for image representation can be divided into several groups:
1. Wavelet coefficients taken from a single subband. These local features are the most localized in frequency and orientation. Seven such local features are defined on the following subbands: level 1 LL (LBP8), level 1 LH (LBP8), level 1 HL (LBP8), level 2 LH ( ), level 2 HL ( ), level 3 LH ( ), level 3 HL ( ).
2. Inter-frequency wavelet coefficients: coefficients of the same orientation taken from subbands of different frequencies. These local features describe visual patterns that span the frequency bands of several subbands. Six such local features are defined using the following pairs: level 1 LL (LBP4) – level 1 HL (LBP4), level 1 LL (LBP4) – level 1 LH (LBP4), level 1 LH (LBP4) – level 2 LH ( ), level 1 HL (LBP4) – level 2 HL ( ), level 2 LH ( ) – level 3 LH ( ), level 2 HL ( ) – level 3 HL ( ).
3. Wavelet coefficients from subbands of the same frequency but different orientations. These local features describe the horizontal and vertical components of visual patterns. Three such local features are defined using the following pairs: level 1 LH (LBP8) – level 1 HL (LBP8), level 2 LH ( ) – level 2 HL ( ), level 3 LH ( ) – level 3 HL ( ).
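The subbands and the three groups listed above can be sketched in code. The sketch makes several assumptions that the text does not confirm: a Haar wavelet is used because the wavelet family is not named, level 1 is taken to be the coarsest scale (the only level with an LL subband), the LH/HL naming follows one common convention, and all function and constant names are hypothetical. The LBP variant attached to each subband is left out.

```python
import numpy as np

def haar_level(image: np.ndarray):
    """One level of a 2-D Haar transform; returns the LL, LH, HL, HH subbands."""
    a = image[0::2, 0::2]
    b = image[0::2, 1::2]
    c = image[1::2, 0::2]
    d = image[1::2, 1::2]
    ll = (a + b + c + d) / 4.0
    lh = (a + b - c - d) / 4.0   # upper row minus lower row
    hl = (a - b + c - d) / 4.0   # left column minus right column
    hh = (a - b - c + d) / 4.0
    return ll, lh, hl, hh

def haar_pyramid(image: np.ndarray, levels: int = 3) -> dict:
    """Return {(level, name): subband}; level 1 is the coarsest (assumption)."""
    subbands = {}
    current = image.astype(float)
    for level in range(levels, 0, -1):       # the finest level is computed first
        ll, lh, hl, _ = haar_level(current)
        subbands[(level, "LH")] = lh
        subbands[(level, "HL")] = hl
        current = ll
    subbands[(1, "LL")] = current
    return subbands

# The three groups of local features, written as subbands or subband pairs.
INTRA_SUBBAND = [(1, "LL"), (1, "LH"), (1, "HL"),
                 (2, "LH"), (2, "HL"), (3, "LH"), (3, "HL")]
INTER_FREQUENCY = [((1, "LL"), (1, "HL")), ((1, "LL"), (1, "LH")),
                   ((1, "LH"), (2, "LH")), ((1, "HL"), (2, "HL")),
                   ((2, "LH"), (3, "LH")), ((2, "HL"), (3, "HL"))]
INTER_ORIENTATION = [((1, "LH"), (1, "HL")), ((2, "LH"), (2, "HL")),
                     ((3, "LH"), (3, "HL"))]

candidate = np.arange(56 * 56, dtype=float).reshape(56, 56)  # synthetic 56x56 input
subbands = haar_pyramid(candidate)
```

With a 56x56 input, the three levels have 28x28, 14x14 and 7x7 coefficients, and the three groups together give 7 + 6 + 3 = 16 local features.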
Because the wavelet coefficients are decorrelated, we assume that the wavelet coefficients within a local feature are strongly statistically dependent, while different local features are statistically independent of each other. In this case the likelihood ratio can be written as follows:
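$$\prod_{k} \frac{P_k(\text{feature}_k \mid \text{face})}{P_k(\text{feature}_k \mid \text{non-face})} > \lambda, \tag{2}$$
where $\text{feature}_k$ is the value of the k-th local feature in the “face candidate”, $P_k(\text{feature}_k \mid \text{face})$ is the conditional probability distribution of the values of $\text{feature}_k$ given that a "face" image is input to the classifier, $P_k(\text{feature}_k \mid \text{non-face})$ is the conditional probability distribution of the values of $\text{feature}_k$ given that a "non-face" image is input to the classifier, and $\lambda$ is the threshold.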
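Under the independence assumption, the product in (2) becomes a sum of per-feature log-ratios, which is easier to compute without numerical underflow. The sketch below assumes each feature takes a small integer value and that its two distributions are stored as lookup tables; the tables, the threshold value, and the function name are made up for illustration.

```python
import math

def log_likelihood_ratio(feature_values, face_tables, nonface_tables) -> float:
    """Sum over k of log P_k(feature_k | face) - log P_k(feature_k | non-face)."""
    total = 0.0
    for value, p_face, p_nonface in zip(feature_values, face_tables, nonface_tables):
        total += math.log(p_face[value]) - math.log(p_nonface[value])
    return total

# Two hypothetical features, each taking values 0..3 (made-up probabilities).
face_tables = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
nonface_tables = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]

score = log_likelihood_ratio([3, 0], face_tables, nonface_tables)
accepted = score > math.log(2.0)   # threshold of 2 on the ratio itself
```

Here each feature contributes a ratio of 0.4 / 0.25 = 1.6, so the overall ratio is 1.6 × 1.6 = 2.56 and the candidate passes a threshold of 2.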
Given that local features can be computed at different spatial coordinates within the “face candidate”, the likelihood ratio can be written as follows:
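$$\prod_{x,y} \prod_{k} \frac{P_k(f_k(x,y) \mid \text{face})}{P_k(f_k(x,y) \mid \text{non-face})} > \lambda, \tag{3}$$
where $k$ is the index of the local feature, $(x,y)$ are the coordinates of the location of the local feature, $f_k(x,y)$ is the value of the k-th local feature at the point with coordinates $(x,y)$, $P_k(f_k(x,y) \mid \text{face})$ is the conditional probability distribution of the values of $f_k(x,y)$ given that a "face" image is input to the classifier, and $P_k(f_k(x,y) \mid \text{non-face})$ is the conditional probability distribution of the values of $f_k(x,y)$ given that a "non-face" image is input to the classifier.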
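The product over positions and features can be evaluated as a sum of table lookups. The NumPy sketch below assumes the feature values have been quantised to integers and that one log-ratio table per feature is shared by all positions; the array layout, the names, and the numbers are illustrative assumptions.

```python
import numpy as np

def verify_candidate(codes: np.ndarray, log_ratio_tables: np.ndarray,
                     log_threshold: float):
    """Score a candidate by summing log-ratios over all features and positions.

    codes[k, y, x]         integer value of feature k at position (x, y)
    log_ratio_tables[k, v] log P_k(v | face) - log P_k(v | non-face)
    """
    k_index = np.arange(codes.shape[0])[:, None, None]   # broadcasts over (y, x)
    score = log_ratio_tables[k_index, codes].sum()
    return score, bool(score > log_threshold)

# Two hypothetical binary features on a 2x2 grid, with made-up ratios.
log_ratio_tables = np.log(np.array([[0.5, 2.0],
                                    [1.0, 4.0]]))
codes = np.array([[[0, 1], [1, 1]],
                  [[0, 0], [1, 0]]])

score, accepted = verify_candidate(codes, log_ratio_tables, np.log(10.0))
```

In this example the first feature contributes 0.5 × 2 × 2 × 2 = 4 and the second 1 × 1 × 4 × 1 = 4, so the overall ratio is 16 and the candidate passes a threshold of 10.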
Using a normalized set of "face" and "non-face" training images, the conditional probability distributions $P_k(f_k(x,y) \mid \text{face})$ and $P_k(f_k(x,y) \mid \text{non-face})$ of each local feature are built as histograms.
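Building such a histogram amounts to counting how often each feature value occurs in the training set and normalising the counts. The sketch below adds a small constant to every bin so that unseen values do not get probability zero; this smoothing, the function name, and the sample data are assumptions and are not described in the text.

```python
import numpy as np

def estimate_histogram(codes, n_values: int, smoothing: float = 1.0) -> np.ndarray:
    """Normalised histogram of integer feature values with additive smoothing."""
    counts = np.bincount(np.ravel(codes), minlength=n_values).astype(float)
    counts += smoothing    # assumption: avoids zero-probability bins
    return counts / counts.sum()

face_codes = np.array([0, 1, 1, 3, 3, 3])      # made-up values from "face" images
nonface_codes = np.array([0, 0, 2, 2, 1, 3])   # made-up values from "non-face" images

p_face = estimate_histogram(face_codes, n_values=4)
p_nonface = estimate_histogram(nonface_codes, n_values=4)
log_ratio_table = np.log(p_face) - np.log(p_nonface)   # one entry per feature value
```

Storing the logarithm of the ratio of the two histograms gives a single lookup table per local feature for use at verification time.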
To simplify processing at the verification stage, the face candidates are normalized by resizing them to 56x56 pixels.
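The size normalization can be sketched with nearest-neighbour resampling in NumPy. The interpolation method is an assumption, since the text does not say which one is used, and `resize_nearest` is a hypothetical helper.

```python
import numpy as np

def resize_nearest(image: np.ndarray, size: int = 56) -> np.ndarray:
    """Resample a 2-D grayscale array to size x size by nearest-neighbour lookup."""
    rows = (np.arange(size) * image.shape[0]) // size
    cols = (np.arange(size) * image.shape[1]) // size
    return image[np.ix_(rows, cols)]

candidate = np.arange(112 * 84).reshape(112, 84)   # synthetic 112x84 candidate
normalized = resize_nearest(candidate)             # 56x56 result
```

A fixed input size means that every candidate yields the same number of feature positions, so the same histograms can be applied to all candidates.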
The decision on whether a “face candidate” is a face image is made using formula (3).
Adding the proposed verification stage to the boosting-based method [1] reduced the number of false detections from 45 to 15, with a slight decrease in the detection probability (from 95.3% to 93.9%). Testing was performed on a set of frontal face images from the BioID database.
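The reported figures can be expressed as relative changes with a short calculation; the variable names are illustrative and the numbers are those given above.

```python
false_before, false_after = 45, 15
detect_before, detect_after = 95.3, 93.9   # detection probability, percent

false_reduction = (false_before - false_after) / false_before
detection_drop = detect_before - detect_after   # in percentage points

print(f"false detections reduced by {false_reduction:.1%}")
print(f"detection probability lower by {detection_drop:.1f} percentage points")
```

The verification stage removes two thirds of the false detections at a cost of 1.4 percentage points of detection probability.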
References:
1. Masliy R. Face detection in grayscale images / R. Masliy, A. Kulyk // Advanced Computer Systems and Networks: Design and Application: Proceedings of the 4th International Conference ACSN-2009. – Lviv, 2009. – P. 170–172.
2. Schneiderman H. Feature-Centric Evaluation for Efficient Cascaded Object Detection // Proc. of IEEE Conf. Computer Vision and Pattern Recognition. – 2004. – Vol. 2. – P. 29–36.