[View]  [Edit]  [Lock]  [References]  [Attachments]  [History]  [Home]  [Changes]  [Search]  [Help] 

Comments of Dave Grossman on cvBlobsLib's algorithm

cvBlobsLib is based on earlier code that I wrote. Here is the background on the code that I wrote:

I first saw this algorithm around 1975 in a presentation by Gerry Agin at SRI International (then called Stanford Research Institute). I was unable to find it in any journal articles, so I implemented it by memory. The basic idea is to raster scan, numbering any new regions that are encountered, but also merging old regions when they prove to be connected on a lower row.
There is a final step that compresses the numbering.

The computations for area, moments, and bounding box are very straightforward. The computation of perimeter is very complicated.

The algorithm has two passes. The first pass converts the binary image to run code. The second pass processes the run codes for two consecutive rows at a time. It looks at the starting and ending columns of a region in each row. It also looks at their colors (black or white), whether or not they were previously encountered, and whether a region is new or old, or a bridge between two old regions. There are lots of possible states, but the implementation deals explicitly with the only states that are important.

The x centroid Xbar is determined by adding the contribution for each row.
In each row, if the pixels of a region are at y1, y2, ..., yn, then just add up all these values y1+y2+...+yn. (Or equivalently, n[y1+yn]/2.) After finishing all the rows, you have to divide the accumulated value by the area A.

The y centroid Ybar is determined as follows: For each row x, let the length of the run be n. Then just add x*n for all rows. When finished, divide by the area A.

Perimeter is really complicated for 2 reasons:
  1. A region may contain an interior region, so there is an interior perimeter as well as an exterior perimeter. Since people usually want perimeter to be only the exterior perimeter, the perimeter of the interior region has to be subtracted at the end of the computation.
  2. When analyzing a particular row, you can only compute the perimeter
contribution of the prior row. (This is different from area and centroid, where the computation is for the current row.) The perimeter computation is so complicated that I no longer remember how it works.

- Dave Grossman