INTRODUCTION

  This is the C++ implementation of the folded hierarchy of
  classifiers for cat detection described in

     F. Fleuret and D. Geman, "Stationary Features and Cat Detection",
     Journal of Machine Learning Research (JMLR), 2008, to appear.

  Please cite this paper when referring to this software.

INSTALLATION

  This program was developed on Debian GNU/Linux computers with the
  following main tool versions:

   * GNU bash, version 3.2.39
   * g++ 4.3.2
   * gnuplot 4.2 patchlevel 4

  If you have installed the RateMyKitten images provided at

    http://www.idiap.ch/folded-ctf

  in the source directory, everything should work seamlessly when you
  invoke the ./run.sh script. It will:

   * Compile the source code entirely

   * Generate the "pool file" containing the uncompressed images
     converted to gray levels, labeled with the ground truth.

   * Run 20 rounds of training/testing (ten rounds for each of the HB
     and H+B detectors, with different random seeds)

  If you have wget installed, you can also download everything and run
  the full experiment with the following commands:

     wget http://www.idiap.ch/folded-ctf/not-public-yet/data/folding-gpl.tgz
     tar zxvf folding-gpl.tgz
     cd folding
     wget http://www.idiap.ch/folded-ctf/not-public-yet/data/rmk.tgz
     tar zxvf rmk.tgz
     ./run.sh

  Note that each of the twenty rounds of training/testing takes more
  than three days on a powerful PC. However, the script detects
  computations that are already running by checking for the presence
  of the corresponding result directory. Hence, it can be run in
  parallel on several machines as long as they share the same result
  directory.
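
  The directory-based coordination described above can be illustrated
  with a minimal C++17 sketch. The run.sh script itself is a shell
  script and its actual logic may differ; the helper name claim_round
  is hypothetical and not part of the distributed code. The sketch
  assumes that directory creation is atomic on the shared file system,
  which should be checked for network mounts.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper, not part of the distributed code: try to claim one
// experimental round by creating its result directory.
// Returns true if this process created the directory and should run the
// round; false if the directory already existed (the round is taken) or
// could not be created.
bool claim_round(const fs::path &result_dir) {
  assert(!result_dir.empty());
  std::error_code ec;
  // create_directory() reports true only when a new directory was made.
  const bool created = fs::create_directory(result_dir, ec);
  return created && !ec;
}
```

  A machine would call claim_round once per round and skip the rounds
  for which it returns false.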

  When all or some of the experimental rounds are over, you can
  generate the ROC curves by invoking the ./graph.sh script.

  You are welcome to send bug reports and comments to fleuret@idiap.ch

PARAMETERS

  To set the value of a parameter during an experiment, add an
  argument of the form --parameter-name=value before the commands that
  should take that value into account.
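
  The ordering rule above (a parameter only affects the commands that
  come after it) can be sketched as follows. This is an illustration
  under assumptions, not the program's actual parser: the names Params,
  Invocation and parse_arguments are hypothetical, and all values are
  kept as plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Params = std::map<std::string, std::string>;

// One command together with the parameter values visible when it is reached.
struct Invocation {
  std::string command;
  Params params;
};

// Hypothetical sketch: walk the arguments from left to right. Arguments of
// the form --name=value update the current parameters; any other argument is
// treated as a command and receives a snapshot of the parameters set so far.
std::vector<Invocation> parse_arguments(const std::vector<std::string> &args,
                                        Params params = {}) {
  std::vector<Invocation> result;
  for (const std::string &arg : args) {
    if (arg.rfind("--", 0) == 0) {
      const std::size_t eq = arg.find('=');
      // In this sketch a malformed parameter is a programming error.
      assert(eq != std::string::npos && eq > 2);
      params[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
    } else {
      result.push_back({arg, params});
    }
  }
  return result;
}
```

  For instance, with the arguments "--pool-name=cats.pool open-pool
  --nb-levels=2 train-detector" (cats.pool being a made-up file name),
  open-pool would see only pool-name, while train-detector would see
  both pool-name and nb-levels.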

  For every parameter below, the default value is given in
  parentheses.

  * niceness (5)

    Process priority

  * random-seed (0)

    Global random seed

  * pictures-for-article ("no")

    Should the pictures be generated so that they remain clear in
    black and white.

  * pool-name (no default)

    Name of the pool file containing the data to use.

  * test-pool-name (no default)

    Name of a separate pool file to use for testing. If it is set,
    proportion-for-test is ignored.

  * detector-name ("default.det")

    Where to write or from where to read the detector.

  * result-path ("/tmp/")

    In what directory should we save all the files produced during the
    computation.

  * loss-type ("exponential")

    What kind of loss to use for the boosting. While different losses
    are implemented in the code, only the exponential one has been
    thoroughly tested.

  * nb-images (-1)

    How many images to process in list_to_pool or when using the
    write-pool-images command.

  * tree-depth-max (1)

    Maximum depth of the decision trees used as weak learners in the
    classifier.

  * proportion-negative-cells-for-training (0.025)

    Overall proportion of negative cells to use during learning (we
    sample among them)

  * nb-negative-samples-per-positive (10)

    How many negative cells to sample for every positive cell during
    training.

  * nb-features-for-boosting-optimization (10000)

    How many pose-indexed features to use at every step of boosting.

  * force-head-belly-independence ("no")

    Should we force the independence between the two levels of the
    detector (i.e. make an H+B detector)

  * nb-weak-learners-per-classifier (10)

    Number of weak learners per classifier. This parameter corresponds
    to the value U in the JMLR paper, and should be set to 100 (the
    default here is 10).

  * nb-classifiers-per-level (25)

    This parameter corresponds to the value B in the JMLR paper.

  * nb-levels (1)

    How many levels in the hierarchy. This is 2 for the JMLR paper
    experiments.

  * proportion-for-train (0.5)

    The proportion of scenes from the pool to use for training.

  * proportion-for-validation (0.25)

    The proportion of scenes from the pool to use for estimating the
    thresholds.

  * proportion-for-test (0.25)

    The proportion of scenes from the pool to use to test the
    detector.

  * write-validation-rocs ("no")

    Should we compute and save the ROC curves estimated on the
    validation set during training.

  * write-parse-images ("no")

    Should we save one image for every test scene with the resulting
    alarms.

  * write-tag-images ("no")

    Should we save the (very large) tag images when saving the
    materials.

  * wanted-true-positive-rate (0.5)

    What is the target true positive rate. Note that this is the rate
    without post-processing and without pose tolerance in the
    definition of a true positive.

  * nb-wanted-true-positive-rates (10)

    How many true positive rates to visit to generate the pseudo-ROC.

  * min-head-radius (25)

    What is the radius of the smallest heads we are looking for.

  * max-head-radius (200)

    What is the radius of the largest heads we are looking for.

  * root-cell-nb-xy-per-radius (5)

    What is the size of a (x,y) square cell with respect to the radius
    of the head.

  * pi-feature-window-min-size (0.1)

    What is the minimum size of the pose-indexed feature windows with
    respect to the frame they are defined in.

  * nb-scales-per-power-of-two (5)

    How many scales do we visit between two powers of two.

  * progress-bar ("yes")

    Should we display a progress bar.
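
  As an illustration of how min-head-radius, max-head-radius and
  nb-scales-per-power-of-two could interact, the sketch below
  enumerates head radii on a geometric grid. It assumes that
  consecutive scales differ by a factor of 2^(1/n), where n is
  nb-scales-per-power-of-two. This is one natural reading of the
  parameter descriptions, not necessarily the discretization used in
  the code, and the function name head_radii is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical sketch: list the radii min_radius * 2^(k / n), k = 0, 1, ...,
// that do not exceed max_radius, where n is the number of scales visited
// between two consecutive powers of two.
std::vector<double> head_radii(double min_radius, double max_radius,
                               int scales_per_power_of_two) {
  assert(min_radius > 0 && max_radius >= min_radius);
  assert(scales_per_power_of_two > 0);
  std::vector<double> radii;
  for (int k = 0;; ++k) {
    const double r =
        min_radius * std::pow(2.0, static_cast<double>(k) / scales_per_power_of_two);
    // Small relative tolerance so that max_radius itself is kept despite
    // floating-point rounding.
    if (r > max_radius * (1.0 + 1e-9)) break;
    radii.push_back(r);
  }
  return radii;
}
```

  Under this assumption, the default values (25, 200 and 5) give 16
  radii, since 200 / 25 = 8 spans three powers of two with five steps
  each, plus the starting radius.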

COMMANDS

   open-pool
   train-detector
   compute-thresholds
   test-detector
   sequence-test-detector
   write-detector
   read-detector
   write-pool-images

  --
  Francois Fleuret
  October 200