Vehicle Detection Using Principal Component Analysis

Given a graytone intensity image and a specification of a vehicular (rectangular) shape of interest (length, width, and orientation), it is desired to know whether such a vehicle is located at a specific point. In this report, this problem is posed within a statistical framework based on principal component analysis.


Introduction
Statistical pattern recognition begins with units, such as image regions or projected segments, on which a variety of measurements have been made. Each unit has an associated measurement vector. The purpose of statistical pattern recognition is to classify each unit on the basis of its measurement vector. The classification matches the unit, via its feature vector, to the "closest" category. It does so by means of a decision rule. The decision rule is designed to optimally assign each unit to a class or category on the basis of its measurement vector. Optimally can mean, for example, with the smallest classification error for a given set of measurements and a given computational complexity of the decision rule. Hence statistical pattern recognition techniques include: feature selection and extraction techniques, either to reduce the number of measurements to be made or to reduce the dimensionality of the vectors representing the measurements presented to the decision rule; decision rule construction techniques; and techniques for the estimation of decision rule error.

Feature Selection
Features that can best tell the vehicle from the background are desired. After some careful study of the images in the image database FHN, the following features are proposed: the average gray level inside the vehicle, the average gradient magnitude inside the vehicle, and the average gray level outside the vehicle, where the inside, outside, and boundary of a vehicle are defined as Figure 1 shows. The combination of these features is treated as the feature vector associated with the central pixel of the vehicle.
To collect these features, as Figure 1 shows, a template of rectangular shape with the desired vehicle dimensions and orientation is moved through the images pixel by pixel, with its centroid located at a specific pixel, and the proposed features associated with that pixel are collected. According to the ground truth, all pixels in the images can be classified into two categories, target-vehicle and non-target-vehicle, denoted c1 and c2 respectively. All of these feature vectors are then used as the training set.
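The template sweep above can be sketched as follows. This is a minimal illustration, not the report's implementation: it assumes an axis-aligned rectangle (the report allows arbitrary orientation), a simple finite-difference gradient, and a hypothetical fixed-width band as the "outside" region; the function name and all parameters are illustrative.

```python
import numpy as np

def template_features(image, cy, cx, h, w, margin=2):
    """Feature vector for the template centered at (cy, cx):
    average gray level inside the 2h-by-2w rectangle, average gradient
    magnitude inside it, and average gray level in a surrounding band."""
    inside = image[cy - h:cy + h, cx - w:cx + w].astype(float)
    # Finite-difference gradient magnitude inside the template.
    gy, gx = np.gradient(inside)
    grad_mag = np.sqrt(gy**2 + gx**2)
    # "Outside": a band of `margin` pixels around the rectangle
    # (a simplification of the report's Figure 1 regions).
    outer = image[cy - h - margin:cy + h + margin,
                  cx - w - margin:cx + w + margin].astype(float)
    outside_sum = outer.sum() - inside.sum()
    outside_count = outer.size - inside.size
    return np.array([inside.mean(), grad_mag.mean(),
                     outside_sum / outside_count])

# Bright 6x10 synthetic "vehicle" on a dark background.
img = np.zeros((40, 40), dtype=float)
img[17:23, 15:25] = 200.0
f = template_features(img, 20, 20, 3, 5)
```

For a uniform bright rectangle on a dark background, the inside mean is high, the interior gradient is zero, and the outside mean is low, which is exactly the contrast the three proposed features are meant to capture.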

Principal Component Analysis
We begin by presenting some notational conventions. Let U = {U^n : n = 1, ..., N} be the unit training set to be used to design the table lookup classifier, let U^n be a unit in the training set with an associated measurement vector X^n of known true class, and let {X^n_k : k = 1, ..., K} be the elements of X^n.
The first principal axis is the one-dimensional subspace for which the projected variance is larger than for any other one-dimensional subspace. Thus, if the measurement vectors X^n are projected on this axis, the coordinates of the projected vectors have the largest variance. Since the coordinates of the projected vectors are widely spread on this axis, if the projected coordinates are used as values for the discriminant function, they are expected, but not guaranteed, to carry significant discriminatory information. Therefore we are interested in finding an axis V such that the variance of the projected coordinates Z^n = V'X^n on this axis is maximum. It turns out that the axis V is the eigenvector associated with the largest eigenvalue of the covariance matrix of the measurement vectors associated with the units in the unit training set U.
Let m and Σ be the estimated mean vector and the estimated covariance matrix of the measurement vectors X^n. Then the variance of the projected coordinates, σ²_Z, can be expressed as

σ²_Z = V'ΣV,

where the covariance matrix Σ is a positive semidefinite, real symmetric matrix.
To maximize the projected variance, we must find a V such that V'ΣV is maximized subject to the constraint V'V = 1. Using a Lagrange multiplier λ, we set the following derivative to 0:

∂/∂V [V'ΣV − λ(V'V − 1)] = 0.

Taking the required derivative results in

ΣV = λV.

Hence V must be an eigenvector of Σ with corresponding eigenvalue λ. To determine which eigenvector maximizes V'ΣV, notice that

V'ΣV = V'(λV) = λV'V = λ

since V'V is constrained to have unit norm. Therefore V is the eigenvector of Σ having the largest eigenvalue.
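The eigenvector result above can be checked numerically. The sketch below (synthetic data, illustrative only) verifies that the variance of the coordinates projected onto the top eigenvector of the sample covariance matrix equals the largest eigenvalue, as the derivation predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic measurement vectors with one dominant direction of variance.
X = rng.normal(size=(500, 3)) * np.array([5.0, 1.0, 0.2])

cov = np.cov(X, rowvar=False)           # estimated covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
v = eigvecs[:, -1]                      # eigenvector of largest eigenvalue

# Variance of the projected coordinates Z^n = v'X^n equals the
# largest eigenvalue of the covariance matrix.
proj_var = np.var(X @ v, ddof=1)
```

Projecting onto any other unit vector yields a smaller variance, which is the maximality property the Lagrange-multiplier argument establishes.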
By the same reasoning, the variances of the projected coordinates on the orthonormal eigenvectors are ordered as their corresponding eigenvalues. To reduce the computational complexity, we can keep just the first m principal axes, hoping that they retain most of the significant discriminant information, and perform the classification in the reduced subspace.
In the reduced feature space, we can assign bits to each principal axis according to the entropy of its projected coordinates and then partition the space to form a lookup table. For each table entry, we count the number of samples for each class and assign the class that is most probable among its associated training samples.
This principal component approach consists of two steps, offline training and online detection, which are described in detail in the following sections.

Offline Training Algorithm Outline
The main purpose of the offline training algorithm is to form a lookup table which will assist the online detection algorithm in performing the classification. It can be further divided into the following stages:

Coordinates Normalization
In the original feature space, the axes are of different scales: some are gray levels, which range from 0 to 255; some are gradient magnitudes, which range from one digit to two digits; and some are average directions, which range from 0 to 1. In order to weight them equally, we divide every coordinate by the projected standard deviation of the training set.
Suppose the sample mean and sample variance for the projected coordinates on the axis that spans the k-th coordinate are denoted by m_k and s_k² respectively, namely

m_k = (1/N) Σ_{n=1}^{N} X^n_k,    s_k² = (1/(N−1)) Σ_{n=1}^{N} (X^n_k − m_k)².

After normalization, X^n_k becomes Y^n_k, where

Y^n_k = (X^n_k − m_k) / s_k

for every n = 1, ..., N and k = 1, ..., K, and Y^n is the measurement vector associated with unit U^n in the normalized feature space. The following operations will be conducted in the normalized feature space.

Principal Components Computation
In the normalized feature space, the principal axes are computed. As discussed above, these axes correspond to the orthonormal eigenvectors of the sample covariance matrix, so the principal components computation is actually the computation of the eigenvectors of the sample covariance matrix.
If the sample mean vector and sample autocorrelation matrix are defined by

m = (1/N) Σ_{n=1}^{N} Y^n,    S = (1/N) Σ_{n=1}^{N} Y^n (Y^n)',

then the unbiased estimate of the covariance matrix is

Σ = (N/(N−1)) (S − m m').

The orthonormal eigenvectors of Σ are the principal axes that we want to find. To reduce computational complexity, we reduce the dimensionality by using just the first several of them. They are expected, but not guaranteed, to carry significant discriminatory information.
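The covariance estimate and the reduced projection can be sketched as follows (synthetic data; m = 2 retained axes is an arbitrary choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.normal(size=(200, 5))        # normalized measurement vectors
N = Y.shape[0]

m = Y.mean(axis=0)                   # sample mean vector
S = (Y.T @ Y) / N                    # sample autocorrelation matrix
cov = N / (N - 1) * (S - np.outer(m, m))  # unbiased covariance estimate

# Principal axes: eigenvectors sorted by descending eigenvalue.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
V = eigvecs[:, order[:2]]            # keep the first m = 2 principal axes
Z = Y @ V                            # projected coordinates in the subspace
```

The identity Σ = (N/(N−1))(S − m m') reproduces the usual mean-subtracted unbiased covariance, so the autocorrelation form is purely a computational convenience.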

Histogram Computation
To compute the histogram for the coordinates projected onto the j-th principal component axis V^j, we first find the minimum and maximum coordinate values in that one-dimensional space. If the projected coordinate of Y^n is denoted by Z^n_j, and the maximum and minimum values for this axis are denoted by Max_j and Min_j respectively, then

Z^n_j = (V^j)' Y^n,    Max_j = max_n Z^n_j,    Min_j = min_n Z^n_j.

According to the precision we want, the histogram can be obtained by partitioning the principal component axis into bins and then counting the number of samples falling in each bin.
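A minimal sketch of this per-axis histogram step (the bin count of 32 and the synthetic coordinates are illustrative):

```python
import numpy as np

def axis_histogram(Z_j, n_bins):
    """Histogram of the coordinates projected onto one principal axis,
    spanning [Min_j, Max_j] with equal-width bins."""
    lo, hi = Z_j.min(), Z_j.max()
    counts, edges = np.histogram(Z_j, bins=n_bins, range=(lo, hi))
    return counts, edges

rng = np.random.default_rng(2)
z = rng.normal(size=1000)            # projected coordinates Z^n_j
counts, edges = axis_histogram(z, 32)
```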

Entropy Computation
Entropy is also called self-information. The larger the entropy, the more information it contains. If the bin width is denoted by w and the number of samples falling into the m-th bin is N_m, m = 1, ..., M, then the entropy of the coordinates projected onto principal axis V^j is defined as

H_j = −Σ_{m=1}^{M} P_j(m) log(P_j(m)/w),

where P_j(m) = N_m / N is the relative frequency of the m-th bin.
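This entropy can be computed directly from the histogram counts. The sketch below assumes the relative-frequency interpretation P_j(m) = N_m / N and skips empty bins (for which the summand is taken as zero); the example counts are illustrative.

```python
import numpy as np

def projected_entropy(counts, width):
    """H_j = -sum_m P_j(m) * log(P_j(m) / w) over nonempty bins,
    with P_j(m) the relative frequency of bin m and w the bin width."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p / width))

# A uniformly spread axis carries more information than a peaked one.
H_uniform = projected_entropy([250, 250, 250, 250], 1.0)
H_peaked = projected_entropy([970, 10, 10, 10], 1.0)
```

With unit bin width, the uniform four-bin histogram attains the maximum value log 4, consistent with the claim that widely spread projected coordinates carry the most information.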

Bits Allocation
As stated above, the larger the entropy, the more information it contains, so a principal component axis with large entropy should get more bits. Thus we allocate the number of bits to a principal component axis in proportion to its entropy.
The number of bits that the j-th principal axis gets is computed as

(#bits for V^j) = (H_j / Σ_{j=1}^{J} H_j) × (total #bits).
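Since the proportional shares are generally not integers, some rounding rule is needed; the report does not specify one, so the sketch below uses a hypothetical largest-remainder scheme that preserves the total bit budget.

```python
import numpy as np

def allocate_bits(entropies, total_bits):
    """Assign bits to each principal axis in proportion to its entropy:
    (#bits for V^j) = H_j / sum(H) * total_bits, rounded to integers
    by a largest-remainder rule (an assumption; the report leaves
    the rounding unspecified)."""
    H = np.asarray(entropies, dtype=float)
    raw = H / H.sum() * total_bits
    bits = np.floor(raw).astype(int)
    # Hand leftover bits to the axes with the largest remainders.
    for i in np.argsort(raw - bits)[::-1][: total_bits - bits.sum()]:
        bits[i] += 1
    return bits

b = allocate_bits([4.0, 2.0, 2.0], 8)
```

An axis with twice the entropy receives twice the bits, and the per-axis bit counts always sum to the total budget.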

Histogram Equalization
After the number of bins has been decided, we repartition the principal component axes using the histogram equalization method, based on the previously computed histograms. The purpose of histogram equalization is to ensure that a new sample will fall into any one of these bins with equal probability. Thus, for all the principal axes, the boundary points of the bins are recalculated; they will be used to form the lookup table.
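Equal-probability boundaries are just the quantiles of the projected training coordinates. The sketch below computes them directly from the samples rather than from the stored histogram, which is an equivalent but simplified route.

```python
import numpy as np

def equalized_boundaries(Z_j, n_bins):
    """Bin boundaries on one principal axis such that each bin holds
    an (approximately) equal share of the training samples."""
    qs = np.linspace(0.0, 1.0, n_bins + 1)
    return np.quantile(Z_j, qs)

rng = np.random.default_rng(3)
z = rng.normal(size=4000)                 # projected training coordinates
edges = equalized_boundaries(z, 8)
counts, _ = np.histogram(z, bins=edges)   # roughly 500 samples per bin
```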

Lookup Table Formation
With the new bin boundaries previously computed, the reduced feature space is partitioned into hyperbins. In each of these bins, the number of vehicle samples and the number of non-vehicle samples are counted, and a lookup table is formed. Moreover, the conditional probability that a sample will fall into a specific bin given its class is computed. Suppose the total sample sizes for c1 and c2 are denoted by N_c1 and N_c2 respectively. Let N^i_c1 and N^i_c2 be the numbers of samples of classes c1 and c2 falling in the i-th bin B_i; then the conditional probabilities that a sample falls in this bin given its true class are computed as

P(B = B_i | C = c1) = N^i_c1 / N_c1,    P(B = B_i | C = c2) = N^i_c2 / N_c2.

Online Detection Algorithm

As in the offline training algorithm, each newly arriving sample is first normalized. After the bin, say B_i, that the sample falls in is found, the conditional probabilities P(B = B_i | C = c1) and P(B = B_i | C = c2) are retrieved. Then the Bayes decision rule is formed as follows.
Since we have only two classes, target-vehicle (c1) and non-target-vehicle (c2), the gain matrix is just a 2×2 matrix, with entries e(t, a) for true class t and assigned class a.
The Bayes rule f is given by

f(a | X) = 1 if Σ_{t∈C} e(t, a) P(t, X) = max_{b∈C} Σ_{t∈C} e(t, b) P(t, X), and 0 otherwise,

where a and t, each of which can be c1 or c2, stand for the assigned and true classes respectively, and X is the measurement vector. Here P(t, X) = P(X | t) P(t), where P(X | t) is the conditional probability previously retrieved and P(t) is the prior probability of t.
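The lookup and decision can be sketched as below. All numbers are hypothetical: the bin counts, class totals, and the e(c2, c2) and e(c2, c1) gain entries are illustrative assumptions (the report fixes only the prior at 0.01 and the e(v, v), e(v, nv) entries at 1.0 and −0.5).

```python
# Hypothetical per-bin counts from the lookup table for bin B_i
# (c1 = target vehicle, c2 = non-vehicle) and class totals.
N_c1, N_c2 = 100, 9900
bin_counts = {"c1": 8, "c2": 40}

# Gain matrix e[(true, assigned)]; c2-row values are assumptions.
e = {("c1", "c1"): 1.0, ("c1", "c2"): -0.5,
     ("c2", "c2"): 1.0, ("c2", "c1"): -0.5}
prior = {"c1": 0.01, "c2": 0.99}

# Conditional probabilities P(B = B_i | C = t) retrieved from the table.
p_bin_given_class = {"c1": bin_counts["c1"] / N_c1,
                     "c2": bin_counts["c2"] / N_c2}

def expected_gain(assigned):
    """sum over true classes t of e(t, assigned) * P(X | t) * P(t)."""
    return sum(e[(t, assigned)] * p_bin_given_class[t] * prior[t]
               for t in ("c1", "c2"))

# Bayes rule: assign the class maximizing the expected gain.
decision = max(("c1", "c2"), key=expected_gain)
```

With the small vehicle prior, even a bin where vehicles are relatively common can still yield a non-vehicle decision, which is why the gain matrix, rather than the likelihoods alone, controls the operating point.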

Post Processing
The candidates passing the Bayesian classifier procedure are arranged in descending order of their total gain, which has been determined by the Bayesian classifier procedure. Then the candidates are examined pairwise by the distance between their centroids. The pruning step is illustrated in Figure 2. Two candidates whose centroids are A and B are shown in the figure. Their lengths and widths are 2h and 2w respectively. V_h and V_w are unit vectors parallel and perpendicular to the orientation of the vehicles. Let AB be the vector from A to B. The two constants are fixed at 0.9 and 0.9.

The following is the result we get when applying the algorithm to the whole image database FHN, where the prior probability for a vehicle is fixed at 0.01 and e(v, v) and e(v, nv) are fixed at 1.0 and −0.5 respectively; also e(nv, nv) + e(nv, v) = −0.5. The result shows some performance improvement, but not much, which may be the consequence of factors such as: the similarity of the subimages of the vehicle database, the representativity of the training set, and the completeness of the feature vector.
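The pairwise pruning can be sketched as a greedy suppression pass. The report's actual test compares the centroid displacement against the template dimensions along V_h and V_w with the two 0.9 constants; the sketch below substitutes a plain Euclidean distance threshold, so it illustrates only the gain-ordered greedy structure, not the exact geometric criterion.

```python
import numpy as np

def prune(candidates, min_dist):
    """Keep higher-gain candidates; drop any candidate whose centroid
    lies within min_dist of an already-kept one. The Euclidean threshold
    stands in for the report's orientation-aware overlap test."""
    ordered = sorted(candidates, key=lambda c: c["gain"], reverse=True)
    kept = []
    for c in ordered:
        if all(np.hypot(c["x"] - k["x"], c["y"] - k["y"]) >= min_dist
               for k in kept):
            kept.append(c)
    return kept

cands = [{"x": 10, "y": 10, "gain": 0.9},
         {"x": 12, "y": 11, "gain": 0.4},   # too close to the first
         {"x": 40, "y": 40, "gain": 0.7}]
kept = prune(cands, min_dist=5.0)
```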

Figure 1 :
Figure 1: A template is moved through the images pixel by pixel, with its centroid located at a specific pixel, and the proposed features associated with it are collected.
Assume that the support ratio of candidate A is larger than that of B; candidate B is then pruned. In this experiment, we use the whole data set as the training set to form the lookup table and then use the lookup table to perform the classification. To evaluate the performance, we use the following definitions. A correctly detected target vehicle (CDG) is a target vehicle in a specified neighborhood of whose centroid there exists the centroid of at least one declared vehicle. A Type I mis-detection (MD1) is any target vehicle that is not a CDG. A detected non-target vehicle (DETNT) is a non-target vehicle in a specified neighborhood of whose centroid there exists the centroid of at least one declared vehicle. A Type II mis-detection (MD2) is any non-target vehicle that is not a DETNT. A correct detection (CDD) is a declared vehicle whose centroid lies within a specified neighborhood of the centroid of a target vehicle. A Type I false alarm (FA1) is a declared vehicle whose centroid lies within a specified neighborhood of the centroid of a non-target vehicle, and not within that of any target vehicle. A Type II false alarm (FA2) is any declared vehicle that is neither a CDD nor an FA1; its centroid is not within the specified neighborhood of any target vehicle or non-target vehicle.