ece4580:module_pcd:localneighborhood [2017/02/21 12:56] – [Using Sub-Sampling and Clustering via Nearest Neighbors] pvela | ece4580:module_pcd:localneighborhood [2024/08/20 21:38] (current) – external edit 127.0.0.1 |
---|
===== Using Greedy Point Selection ===== | ===== Using Greedy Point Selection ===== |
| |
This is roughly the dual of the Matlab nearest neighbor search process and is more in line with the connectivity matrix approach. Matlab has a function called ''rangesearch'' that finds all the points in a source set that lie within a given radius of each point in the target set. The return values are cell arrays that give the necessary connectivity and distance information. Since the maximum radius is specified when invoking the function, only the index cell array is needed; it provides the connectivity information. As noted in the [[Matlab documentation|https://www.mathworks.com/help/stats/rangesearch.html]] for the function, the call goes as follows: | This is roughly the dual of the Matlab nearest neighbor search process and is more in line with the connectivity matrix approach. Matlab has a function called ''rangesearch'' that finds all the points in a source set that lie within a given radius of each point in the target set. The return values are cell arrays that give the necessary connectivity and distance information. Since the maximum radius is specified when invoking the function, only the index cell array is needed; it provides the connectivity information. As noted in the [[https://www.mathworks.com/help/stats/rangesearch.html| Matlab documentation]] for the function, the call goes as follows: |
<code> | <code> |
[idx, dists] = rangesearch(source, target, radius) | [idx, dists] = rangesearch(source, target, radius) |
</code> | </code> |
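As a minimal sketch of how the ''rangesearch'' call can be used for connectivity, the snippet below runs it on made-up data. The point cloud, the choice of query points, the radius value, and the variable names ''numNeighbors'' and ''C'' are all assumptions for illustration; only the call signature comes from the text above. It requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical toy data: 200 random 3-D points, one point per row.
rng(0);                          % reproducible random numbers
source = rand(200, 3);
target = source(1:5, :);         % query points (here a subset of the source)
radius = 0.2;                    % assumed neighborhood radius

% idx{i} lists the rows of source within radius of target(i,:);
% dists{i} holds the matching distances in ascending order.
[idx, dists] = rangesearch(source, target, radius);

% Only idx is needed for connectivity: count the neighbors of each target.
numNeighbors = cellfun(@numel, idx);

% Optional: sparse 0/1 connectivity matrix (targets x source points).
rows = repelem((1:numel(idx))', numNeighbors);
cols = [idx{:}]';
C = sparse(rows, cols, 1, size(target, 1), size(source, 1));
```

Because the query points here are drawn from the source set, each neighbor list includes the query point itself at distance zero; drop it if self-connections are not wanted.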
Radius selection for locality calculations has been studied a fair amount, with many folks coming up with different rules. Choosing this number by guess-and-check is unlikely to be optimal, and it definitely does not lend itself to automation. Here are a couple of ideas: | Radius selection for locality calculations has been studied a fair amount, with many folks coming up with different rules. Choosing this number by guess-and-check is unlikely to be optimal, and it definitely does not lend itself to automation. Here are a couple of ideas: |
| |
- Compute the median distance. For that, the ''pdist'' function could be useful. | - Compute the median distance of all pair-wise point distances. For that, the ''pdist'' function could be useful. |
- Compute the average distance to only the $k$-nearest neighbors, searching the point set against itself. Basically, the $k$-nearest neighbors are sought for each point, then the distances to these neighbors are collected and their mean is computed (or median if you'd prefer). No ''for'' loops should be necessary: Matlab's ''knnsearch'' function does most of the work, and a couple of lines later you are good to go. | - Compute the average distance to only the $k$-nearest neighbors, searching the point set against itself. Basically, the $k$-nearest neighbors are sought for each point, then the distances to these neighbors are collected and their mean is computed (or median if you'd prefer). No ''for'' loops should be necessary: Matlab's ''knnsearch'' function does most of the work, and a couple of lines later you are good to go. \\ The value $k$ should be chosen so that the chosen radius almost always yields the number of neighbors you need. Say you need 10 points; then a good value for $k$ would be something like 15. The aim is to get 10 points more than 70% of the time. |
| |
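The two radius ideas above can be sketched as follows. This is an illustration under assumptions, not a prescribed implementation: the random point cloud ''pts'' and the names ''rMedian'', ''rKnn'', ''needed'', and ''fracEnough'' are hypothetical, and the values $k = 15$ and 10 needed neighbors are taken from the example in the text. The final lines check how often the $k$-nearest-neighbor radius actually yields enough neighbors.

```matlab
% Hypothetical point cloud: 500 random 3-D points, one point per row.
rng(0);
pts = rand(500, 3);

% Idea 1: median of all pairwise distances (pdist returns each pair once).
rMedian = median(pdist(pts));

% Idea 2: mean distance to the k nearest neighbors of each point.
k = 15;                                     % example value from the text
[~, D] = knnsearch(pts, pts, 'K', k + 1);   % k+1: each point also finds itself
D = D(:, 2:end);                            % drop the zero self-distance column
rKnn = mean(D(:));                          % use median(D(:)) if preferred

% Check how often the knn-based radius gives at least the needed neighbors.
needed = 10;
idx = rangesearch(pts, pts, rKnn);
counts = cellfun(@numel, idx) - 1;          % exclude the point itself
fracEnough = mean(counts >= needed);
fprintf('median radius %.3f, knn radius %.3f, fraction with >= %d neighbors: %.2f\n', ...
        rMedian, rKnn, needed, fracEnough);
```

If ''fracEnough'' comes out too low for the application, increasing $k$ enlarges the radius; the median of all pairwise distances is usually much larger and behaves more like a global scale than a local one.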
-------------------------- | -------------------------- |