Patricio Vela: Course Wiki

Module #2: Target Modelling and Re-Identification

Week #2: Testing the Method

Once we know the mixture models are sound, the next step is to use them to test the proximity of one model to another. If two models are close, or if new data is consistent with an existing model, then the two should correspond to the same target. In this manner, we can identify when the same person leaves and re-enters the scene, or when they get occluded and re-appear. This action is called re-identification.

Matching Data to Model: Theory

To start simply, let's first consider the case of checking data against a model for consistency. Imagine that 10 people have walked in, each with their own Gaussian mixture model as a signature, and that they have subsequently left the scene. One of those 10 returns. Which of them is it? … Well, we have a new set of image data from the binary-masked image region of the person who just entered. Our question gets converted to the question: Which of the existing models is the new data a good fit for? That naturally raises the question: How can we create a scoring mechanism for testing the fitness of data to existing models?

We have been doing this in the course homework using a scoring energy principally based on squared distance. For Gaussian models, the plain squared distance is not so appropriate, since there is a known covariance matrix which describes how the space should be warped to respect the spread of the data. Such a warped squared distance is known as the (squared) Mahalanobis distance, though in other domains it is simply a weighted $L_2$ norm. Matlab's implementation of the Gaussian mixture (GM) model class has a member function for computing the Mahalanobis distance of data to a GM model (called 'mahal'). The equation for the scoring energy is:
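To make the "warping" concrete, here is a minimal sketch in Python with NumPy (not the course's Matlab code) comparing the plain squared distance with the squared Mahalanobis distance for a single Gaussian. The function name `sq_mahalanobis` and the numbers are made up for illustration.

```python
import numpy as np

def sq_mahalanobis(x, mu, cov):
    """Squared Mahalanobis distance (x - mu)^T cov^{-1} (x - mu)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    # Solve cov * z = diff rather than forming the inverse explicitly.
    return float(diff @ np.linalg.solve(cov, diff))

# Illustrative Gaussian: wide spread along axis 0, narrow along axis 1.
mu = np.array([0.0, 0.0])
cov = np.array([[9.0, 0.0],
                [0.0, 1.0]])

a = np.array([3.0, 0.0])  # 3 units out along the wide axis
b = np.array([0.0, 3.0])  # 3 units out along the narrow axis

sq_euclid_a = float(np.sum((a - mu) ** 2))  # 9.0
sq_euclid_b = float(np.sum((b - mu) ** 2))  # 9.0
sq_mahal_a = sq_mahalanobis(a, mu, cov)     # 9 / 9 = 1.0
sq_mahal_b = sq_mahalanobis(b, mu, cov)     # 9 / 1 = 9.0
```

Both points are equally far from the mean in the plain squared distance, but the point along the narrow axis is much less consistent with the Gaussian, and only the Mahalanobis distance reflects that.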

$$E(T ; M) = \int_{\mathcal{D}} \min_{i} (T(x) - \mu_i)^T \Sigma_i^{-1} (T(x) - \mu_i) \, dx \approx \sum_{t_\alpha \in T} \;\min_{i} (t_\alpha - \mu_i)^T \Sigma_i^{-1} (t_\alpha - \mu_i) $$

where $T$ is the template image data extracted from the image of the just-entered or newly detected target/person, and $M$ is a pre-existing target Gaussian mixture model whose components $(\mu_i, \Sigma_i)$ are indexed by $i$. The two versions of the energy are the continuous model and its discrete approximation. Let's review their details.

Recall that we treat an image as a 2D function $I:\mathbb{R}^2 \rightarrow \mathbb{R}^d$ whose output dimension is $d$. Based on the earlier module, our template model appends the pixel coordinate position in a template-centered frame, so it is $T:\mathbb{R}^2 \rightarrow \mathbb{R}^{d+2}$. The template is only valid, however, on the subset $\mathcal{D} \subset \mathbb{R}^2$, which is the domain of the template coordinate system. In the discretized approximation, we think of the “vectorized” or sequentially listed discrete pixel elements, hence the indexing via $\alpha$ into $T$ as $t_\alpha$. Secondly, only one component within the mixture model should be selected for the scoring or energy process, hence the $\min$ evaluation. The scores for all components are computed, but only the best is taken. The minimization makes sense because if a data point $t_\alpha$ is spot on for one component, it will be far from the others. That farness from the others should not be included since it is an unnecessary penalty.
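The vectorized template described above can be sketched as follows in Python with NumPy. This is only an illustration: the function name `vectorize_template` is hypothetical, and it assumes the template-centered frame has its origin at the centroid of the binary mask, which may differ from the convention used in the earlier module.

```python
import numpy as np

def vectorize_template(image, mask):
    """Return an (N, d+2) matrix with one row per masked pixel.

    image : (H, W, d) array of pixel values (d = 3 for a color image).
    mask  : (H, W) boolean array, True on the target.
    Assumption: the template frame is centered at the mask centroid.
    """
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask selects no pixels")
    feats = image[rows, cols, :].astype(float)             # (N, d) pixel values
    coords = np.column_stack([cols, rows]).astype(float)   # (N, 2) as (x, y)
    coords -= coords.mean(axis=0)                          # shift to centered frame
    return np.hstack([feats, coords])                      # rows are the t_alpha

# Tiny synthetic example: a 2 x 3 patch of constant color inside a 4 x 5 image.
image = np.zeros((4, 5, 3))
image[1:3, 1:4, :] = [10.0, 20.0, 30.0]
mask = np.zeros((4, 5), dtype=bool)
mask[1:3, 1:4] = True

T = vectorize_template(image, mask)  # shape (6, 5): six pixels, d + 2 = 5 columns
```

Each row of `T` plays the role of one $t_\alpha$ in the discrete energy; pixels outside the mask never appear, which is the discrete counterpart of restricting the template to $\mathcal{D}$.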

Of course, that gives the energy for one single model. For multiple stored target models, the matching model for the test template would be the one with the smallest score. Let the model set be $\mathcal{M}$, which stores all of the known models $M_r$ indexed by $r$. The best match is:

$$ r^* = \arg \min_{r} E(T; M_r). $$
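The selection rule itself is short. Below is a hedged Python sketch in which the energy is passed in as a function, so the arg-min logic stays separate from how the score is computed. The stand-in `toy_score` (squared distance to a single mean) and the data values are assumptions for the demo only, not the mixture energy defined earlier.

```python
import numpy as np

def match_target_data(data, models, score_fn):
    """Return (index of the best model, its score) under the energy score_fn."""
    scores = np.array([score_fn(model, data) for model in models])
    best = int(np.argmin(scores))  # 0-based here; Matlab indexing would be best + 1
    return best, float(scores[best])

# Stand-in energy for the demo only: each "model" is a single mean vector and
# the energy is the summed squared Euclidean distance of the data to it.
def toy_score(mean, data):
    return float(np.sum((data - mean) ** 2))

models = [np.array([0.0, 0.0]), np.array([5.0, 5.0]), np.array([-4.0, 2.0])]
data = np.array([[4.8, 5.1],
                 [5.3, 4.9],
                 [5.0, 5.2]])

best, best_score = match_target_data(data, models, toy_score)
```

Here the data clusters around the second mean, so `best` is `1` in 0-based indexing. Returning the score alongside the index makes it easier to judge later how decisive a match was.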

Matching Data to Model: Practice

Based on the earlier module, you should be comfortable with “vectorizing” or converting the template data to sequential form (basically into a 2D matrix where the data is column-wise, or possibly row-wise). What is needed then is to simply score the data. For that, Matlab has the 'mahal' member function. You should see that it returns multiple (squared) distances per data point, one for each component in the mixture. Only the minimum value should be taken and used. It should be somewhat clear what to do for that step and for the final summation step.
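For readers who want to see the min-then-sum steps spelled out, here is a minimal Python/NumPy sketch. It is not the Matlab solution: the helper `component_sq_mahal` is a hypothetical stand-in that produces a table of the shape described above (one row per data point, one column per mixture component), and the model is assumed to be a plain `(means, covs)` pair rather than a Matlab GM class instance.

```python
import numpy as np

def component_sq_mahal(data, means, covs):
    """(N, K) squared Mahalanobis distances of N data rows to K components."""
    data = np.asarray(data, dtype=float)
    dists = np.empty((data.shape[0], len(means)))
    for k, (mu, cov) in enumerate(zip(means, covs)):
        diff = data - mu                        # (N, D)
        solved = np.linalg.solve(cov, diff.T)   # (D, N), equals cov^{-1} diff^T
        dists[:, k] = np.sum(diff * solved.T, axis=1)
    return dists

def score_target(model, data):
    """Energy: keep the closest component for each row, then sum over rows."""
    means, covs = model  # assumed model representation for this sketch
    dists = component_sq_mahal(data, means, covs)
    return float(dists.min(axis=1).sum())

# Illustrative two-component model and three data points.
means = np.array([[0.0, 0.0], [10.0, 0.0]])
covs = np.array([[[1.0, 0.0], [0.0, 1.0]],
                 [[4.0, 0.0], [0.0, 1.0]]])
data = np.array([[1.0, 0.0],
                 [10.0, 2.0],
                 [8.0, 0.0]])

dists = component_sq_mahal(data, means, covs)  # rows: [1, 20.25], [104, 4], [64, 1]
score = score_target((means, covs), data)      # 1 + 4 + 1 = 6
```

The row-wise minimum picks the first component for the first point and the second component for the other two, so the large distances to the non-matching component never enter the sum.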

Implementation

Create a function that takes in a model and the (sequential) template data matrix, and returns the score. Call it scoreTarget.
Create a second function that takes in the template matrix data, plus the set of models, and returns both the index into the best scoring model (first argument) and its associated score (second argument). Call it matchTargetData. Since your model is a Gaussian mixture class instance, and you will have multiple such ones, they should be stored in an array (or at least you should do/have done something to that effect).
Test out the matchTargetData function using the exact same data that generated the model. In practice you should get back the index to the model that the data was used to create. If you pass it in the same order as you created the models, then your matchTargetData output for the data should basically go from 1 to the number of models you have, in order. Comment on whether this happened or not. If some of the outputs were wrong, plot the cartoon visualization of the confused template models, plot the actual data being tested, and explain why the matching could have been mistaken.
Test out the matchTargetData function using the second template or paired data for the person. Assess the accuracy of the output by computing (total correct) / (total test cases). Are the model assignment errors consistent with the mistakes made in the previous step when the exact same data was used? If different, is the mistake sensible or not? Support with visual and scoring evidence.
Download someone else's data from the Google Drive location (hopefully there is more than one set of data there), and apply the same procedure to it. That is, use the first set of the data to train a set of Gaussian mixture models, then use the second set to test with. Report the accuracy, and answer the same questions as in the previous step.
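The accuracy measure in the testing steps, (total correct) / (total test cases), can be tallied as in the Python sketch below. The function name `matching_accuracy` and the index lists are invented for illustration; they are not results from any dataset.

```python
def matching_accuracy(predicted, expected):
    """Return (accuracy, mistakes) for paired lists of model indices.

    mistakes lists (test case, expected index, predicted index) for each error,
    which shows which pairs of template models are being confused.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("need at least one test case")
    mistakes = [(case, e, p)
                for case, (p, e) in enumerate(zip(predicted, expected))
                if p != e]
    accuracy = (len(expected) - len(mistakes)) / len(expected)
    return accuracy, mistakes

# Made-up outcome: five test templates, the third matched to the wrong model.
expected = [0, 1, 2, 3, 4]
predicted = [0, 1, 3, 3, 4]

accuracy, mistakes = matching_accuracy(predicted, expected)
```

With these made-up indices, four of five cases are correct, so the accuracy is 0.8 and the single entry in `mistakes` says test case 2 was assigned to model 3. Those are the two models whose visualizations and scores would be worth comparing.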
