Object Recognition
/*
(1) object detector with boosting: http://people.csail.mit.edu/torralba/shortCourseRLOC/boosting/boosting.html
*/
Module #1
Clustering
- Study the k-means clustering algorithm and its algorithmic steps.
- Download (or clone) the clustering skeleton code here
- Implement the k-means clustering algorithm in RGB space by following the algorithmic steps. You are welcome to implement it from scratch without the skeleton code.
- Test your algorithm by segmenting the image segmentation.jpg using k=3.
- Try different random initializations and show the corresponding results.
- Comment on your different segmentation results.
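The algorithmic steps above (random initialization, assignment, update) can be sketched as follows. This is a minimal illustration in plain Python, not the course skeleton code: the function names `kmeans`, `dist2` and `sse` are hypothetical, image loading is left out, and the input is assumed to be a flat list of (R, G, B) tuples.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(pixels, k, iters=20, seed=0):
    """Cluster RGB tuples with the standard k-means iteration.

    Returns (centers, labels). The seed parameter is an assumption added
    here so that different random initializations can be compared.
    """
    rng = random.Random(seed)
    # Initialization: pick k pixels at random as the starting centers.
    centers = [tuple(float(c) for c in p) for p in rng.sample(pixels, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: label each pixel with its nearest center.
        new_labels = [
            min(range(k), key=lambda j: dist2(p, centers[j])) for p in pixels
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # Update step: move each center to the mean of its assigned pixels.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


def sse(pixels, centers, labels):
    """Sum of squared distances from each pixel to its assigned center."""
    return sum(dist2(p, centers[lab]) for p, lab in zip(pixels, labels))
```

Calling `kmeans(pixels, 3, seed=s)` for several values of `s` and comparing `sse` is one way to show how the segmentation depends on the initialization. Replacing each pixel with the center of its cluster gives the segmented image.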
Module #2
Object Recognition
- Study the bag-of-words (BoW) approach for classification/recognition tasks.
- We begin by implementing a simple but powerful recognition system to classify faces and cars.
- Check here for skeleton code. First, follow the README to set up the dataset and the vlfeat library.
- In our implementation, you will find the vlfeat library very useful. One may use vl_sift, vl_kmeans and vl_kdtreebuild.
- Now, use the first 40 images in both categories for training.
- Extract SIFT features from each image
- Derive k codewords using the k-means clustering from module 1.
- Compute the histogram of codewords using the kd-tree algorithm in vlfeat.
- Use the remaining 50 images in both categories to test your implementation.
- Report the accuracy and computation time for different values of k.
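The histogram and evaluation steps above can be sketched as follows. This is an illustration under several assumptions, not the skeleton code: descriptors and codewords are plain tuples of numbers, a brute-force nearest-codeword search stands in for the vlfeat kd-tree, the histogram is normalized so that images with different numbers of features are comparable, and a 1-nearest-neighbor classifier is used because the steps do not name a classifier. All function names are hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_histogram(descriptors, codewords):
    """Normalized histogram of nearest-codeword counts for one image."""
    hist = [0.0] * len(codewords)
    for d in descriptors:
        # Brute-force search; a kd-tree answers the same query faster.
        nearest = min(range(len(codewords)), key=lambda j: dist2(d, codewords[j]))
        hist[nearest] += 1.0
    total = sum(hist)
    # An image with no descriptors yields an all-zero histogram.
    return [h / total for h in hist] if total else hist


def classify(hist, train_hists, train_labels):
    """Return the label of the closest training histogram (assumed 1-NN)."""
    best = min(range(len(train_hists)), key=lambda i: dist2(hist, train_hists[i]))
    return train_labels[best]


def accuracy(predicted, actual):
    """Fraction of test images whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)
```

To report computation time for each k, one option is to wrap the training and testing loop in calls to `time.perf_counter()` from the Python standard library and record the difference.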
Module #3
Spatial Pyramid Matching (SPM)
- Study Spatial Pyramid Matching, which can improve the BoW approach by concatenating histogram vectors.
- We will implement a simplified version of SPM based on your module 2.
- First, for each training image, divide it equally into 2 × 2 spatial bins.
- Second, for each of the 4 bins, extract the SIFT features and compute the histograms of codewords as in module 2.
- Third, concatenate the 4 histogram vectors in a fixed order. (Hint: the resulting vector has 4k dimensions.)
- Fourth, concatenate the vector you have in module 2 with this vector (both weighted by 0.5 before concatenation).
- Finally, use this 5k-dimensional representation to re-run the training and testing.
- Compare the results from module 3 and module 2. Explain what you observe.
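The 5k-dimensional representation described above can be sketched as follows. This is a simplified illustration with hypothetical names. It assumes each feature is given as a pair ((x, y), descriptor) and sorts already-extracted features into the 2 × 2 bins by keypoint position, rather than re-running SIFT on each sub-image. It also assumes the module 2 vector comes first in the concatenation and that the bins are ordered top-left, top-right, bottom-left, bottom-right.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_histogram(descriptors, codewords):
    """Normalized histogram of nearest-codeword counts (as in module 2)."""
    hist = [0.0] * len(codewords)
    for d in descriptors:
        nearest = min(range(len(codewords)), key=lambda j: dist2(d, codewords[j]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def spm_vector(features, width, height, codewords):
    """Build the 5k vector: 0.5 * whole-image histogram + 0.5 * four bin histograms."""
    bins = [[] for _ in range(4)]
    for (x, y), desc in features:
        col = 0 if x < width / 2 else 1
        row = 0 if y < height / 2 else 1
        bins[2 * row + col].append(desc)  # fixed order: TL, TR, BL, BR
    # Whole-image histogram, i.e. the module 2 representation (k values).
    whole = bow_histogram([desc for _, desc in features], codewords)
    # One histogram per spatial bin, concatenated in fixed order (4k values).
    spatial = []
    for bin_descriptors in bins:
        spatial.extend(bow_histogram(bin_descriptors, codewords))
    # Weight both parts by 0.5 and concatenate to 5k values.
    return [0.5 * v for v in whole] + [0.5 * v for v in spatial]
```

The same bin order must be used for every training and test image, otherwise corresponding entries of two vectors would describe different image regions.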
ece4580/module_recognition.1485253632.txt.gz · Last modified: 2024/08/20 21:38 (external edit)