
Haar-based Detectors For Pedestrian Detection

by Hannes Kruppa and Bernt Schiele, ETH Zurich, Switzerland

This archive provides the following three detectors:
- upper body detector
- lower body detector
- full body detector

These detectors have been successfully applied to pedestrian detection in still images. They can be directly passed as parameters to the program HaarFaceDetect.
NOTE: These detectors deal with frontal and backside views but not with side views (also see "Known limitations" below).
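As a rough illustration of passing a detector to a program, here is a minimal sketch that uses OpenCV's Python bindings instead of HaarFaceDetect. The file names in CASCADE_FILES are assumptions (they follow the names used by the cascades bundled with OpenCV, not anything stated in this archive), and the helper names cascade_path and detect are hypothetical. The scaleFactor and minNeighbors values are illustrative defaults, not tuned settings.

```python
import os

try:
    import cv2  # OpenCV Python bindings; an assumption, not the original HaarFaceDetect program
except ImportError:
    cv2 = None

# Assumed cascade file names (as bundled with OpenCV); adjust to the files in your archive.
CASCADE_FILES = {
    "upper body": "haarcascade_upperbody.xml",
    "lower body": "haarcascade_lowerbody.xml",
    "full body": "haarcascade_fullbody.xml",
}


def cascade_path(detector, directory):
    """Return the path of the cascade file for the named detector."""
    return os.path.join(directory, CASCADE_FILES[detector])


def detect(image_path, detector, directory):
    """Run one detector on a still image and return a list of (x, y, w, h) boxes."""
    if cv2 is None:
        raise RuntimeError("OpenCV (cv2) is not installed")
    classifier = cv2.CascadeClassifier(cascade_path(detector, directory))
    if classifier.empty():
        raise IOError("could not load cascade for " + detector)
    image = cv2.imread(image_path)
    if image is None:
        raise IOError("could not read image " + image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    boxes = classifier.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
    return [tuple(int(v) for v in box) for box in boxes]
```

A call such as detect("frame.png", "upper body", "cascades") would then return the detected rectangles for one still image; both the image name and the directory are placeholders.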


If you use any of the detectors or the ideas behind them, please cite this paper (available at http://www.vision.ethz.ch/publications/):
  author =       "Hannes Kruppa, Modesto Castrillon-Santana and Bernt Schiele",
  title =        "Fast and Robust Face Finding via Local Context",
  booktitle =    "Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance",
  year =         "2003",
  month =        "October"


If you have any commercial interest in this work please contact hkruppa@inf.ethz.ch


Check out the demo movie using any (Windows or Linux) player that can play back .mpg movies.
Under Linux, for example: ffplay demo.mpg or: mplayer demo.mpg

The movie shows a person walking towards the camera in a realistic indoor setting. Using ffplay or mplayer you can pause and continue the movie by pressing the space bar.

Detections coming from the different detectors are visualized using different line styles:
upper body : dotted line
lower body : dashed line
full body : solid line

You will notice that successful detections containing the target do not sit tightly on the body but also include some of the background to the left and right. This is not a bug: it accurately reflects the training data, which also includes portions of the background to ensure proper silhouette representation. If you want to get a feeling for the training data, check out the CBCL data set: http://www.ai.mit.edu/projects/cbcl/software-datasets/PedestrianData.html
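If an application needs a tighter rectangle, one option is to shrink each detection horizontally after the fact. The sketch below is only an illustration: the function name tighten_box is hypothetical, and the default keep_fraction of 0.75 is an arbitrary assumption, not a value derived from the training data.

```python
def tighten_box(box, keep_fraction=0.75):
    """Shrink an (x, y, w, h) box horizontally around its centre.

    keep_fraction is the share of the original width to keep;
    0.75 is an illustrative assumption, not a measured margin.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    x, y, w, h = box
    new_w = max(1, int(round(w * keep_fraction)))
    new_x = x + (w - new_w) // 2  # split the removed width evenly left and right
    return (new_x, y, new_w, h)


print(tighten_box((100, 50, 80, 160)))  # prints (110, 50, 60, 160)
```

The height is left unchanged because the text above only mentions extra background to the left and right of the body.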

There is also a small number of false alarms in this sequence.
NOTE: This is per-frame detection, not tracking (which is also one of the reasons why it is not misled by the person's shadow on the back wall).

On an Intel Xeon 1.7 GHz machine the detectors operate at between 6 Hz and 14 Hz (frames per second, on 352 x 288 frames), depending on the detector. The detectors also work on much lower image resolutions, which is always an interesting possibility for speed-ups or "coarse-to-fine" search strategies.
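To get a feeling for why lower resolutions help, the sketch below estimates how many sliding-window positions a multi-scale search has to evaluate. It is a simplified model under stated assumptions: the 14 x 28 base window, the scale factor of 1.1 and the fixed one-pixel step are illustrative values, and real detectors may step and scale differently. The function name count_windows is hypothetical.

```python
def count_windows(img_w, img_h, win_w, win_h, scale_factor=1.1, step=1):
    """Count the window positions visited by a simple multi-scale sliding-window search.

    The window grows by scale_factor until it no longer fits in the image;
    at each scale it is moved in steps of `step` pixels in both directions.
    """
    total = 0
    scale = 1.0
    while True:
        w = int(round(win_w * scale))
        h = int(round(win_h * scale))
        if w > img_w or h > img_h:
            break
        total += ((img_w - w) // step + 1) * ((img_h - h) // step + 1)
        scale *= scale_factor
    return total


# Assumed 14 x 28 base window; compare full and half resolution.
full = count_windows(352, 288, 14, 28)
half = count_windows(176, 144, 14, 28)
print(full, half, full / half)
```

Halving the resolution cuts the number of candidate windows by well over half in this model, which is the motivation for running a coarse pass first and refining only promising regions.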

Additional information e.g. on training parameters, detector combination, detecting other types of objects (e.g. cars) etc. is available in my PhD thesis report (available end of June). Check out http://www.vision.ethz.ch/kruppa/


Known limitations

1) The detectors only support frontal and back views, not side views. Side views are trickier, and it makes a lot of sense to include additional modalities for their detection, e.g. motion information. I recommend Viola and Jones' ICCV 2003 paper if this interests you further.

2) Don't expect these detectors to be as accurate as a frontal face detector. A frontal face is a pretty distinct pattern compared with other patterns occurring in the world (i.e. image "background"). This is not so for upper, lower and especially full bodies, because their detectors have to rely on fragile silhouette information rather than internal (facial) features. Still, we found the upper body detector in particular to perform amazingly well. In contrast to a face detector, these detectors will also work at very low image resolutions.


Thanks to Martin Spengler, ETH Zurich, for providing the demo movie.