Removed the definition of basename, which conflicted with the existing system one.

diff --git a/README.txt b/README.txt

index e74ec22..61d8446 100644
--- a/README.txt
+++ b/README.txt
@@ -1,64 +1,94 @@
  
  INTRODUCTION
+------------
  
-  This is the C++ implementation of the folded hierarchy of
-  classifiers for cat detection described in
+  This is the documentation for the open-source C++ implementation of
+  the folded hierarchy of classifiers for cat detection described in
  
       F. Fleuret and D. Geman, "Stationary Features and Cat Detection",
-     Journal of Machine Learning Research (JMLR), 2008, to appear.
+     Journal of Machine Learning Research (JMLR), 9, 2549-2578, 2008.
  
-  Please cite this paper when referring to this software.
+  Please use that citation and the URL
  
-INSTALLATION
+     http://www.idiap.ch/folded-ctf/
  
-  This program was developed on Debian GNU/Linux computers with the
-  following main tool versions
+  when referring to this software.
  
-   * GNU bash, version 3.2.39
-   * g++ 4.3.2
-   * gnuplot 4.2 patchlevel 4
+  Contact Francois Fleuret at francois.fleuret@idiap.ch for comments
+  and bug reports.
  
-  If you have installed the RateMyKitten images provided on
+INSTALLATION
+------------
  
-    http://www.idiap.ch/folded-ctf
+  If you have installed the RateMyKitten images, available on the same
+  web page as the source code, in the same directory as the source
+  code, everything should work seamlessly by invoking the ./run.sh
+  script.
  
-  in the source directory, everything should work seamlessly by
-  invoking the ./run.sh script. It will
+  It will
  
-   * Compile the source code entirely
+  * Compile the source code entirely
  
-   * Generate the "pool file" containing the uncompressed images
-     converted to gray levels, labeled with the ground truth.
+  * Generate the "pool file" containing the uncompressed images
+    converted to gray levels, labelled with the ground truth.
  
-   * Run 20 rounds of training / test (ten rounds for each of HB and
-     H+B detectors with different random seeds)
+  * Run 20 rounds of training / test (ten rounds for each of HB and
+    H+B detectors with different random seeds)
  
-  You can also run the full thing with the following commands if you
-  have wget installed
+  You can run the full thing with the following commands if you have
+  wget installed
  
-     wget http://www.idiap.ch/folded-ctf/not-public-yet/data/folding-gpl.tgz
-     tar zxvf folding-gpl.tgz
-     cd folding
-     wget http://www.idiap.ch/folded-ctf/not-public-yet/data/rmk.tgz
-     tar zxvf rmk.tgz
-     ./run.sh
+  > wget http://www.idiap.ch/folded-ctf/data/folding-v1.0.tgz
+  > tar zxvf folding-v1.0.tgz
+  > cd folding
+  > wget http://www.idiap.ch/folded-ctf/data/rmk-v1.0.tgz
+  > tar zxvf rmk-v1.0.tgz
+  > ./run.sh
  
-  Note that every one of the twenty rounds of training/testing takes
-  more than three days on a powerful PC. However, the script detects
+  Note that for every round, we have to fully train a detector and run
+  the test through all the test scenes at 10 different thresholds,
+  including at very conservative thresholds for which the
+  computational effort is very high. Hence, each round takes more
+  than three days on a powerful PC. However, the script detects
    already running computations by looking at the presence of the
-  corresponding result directory. Hence, it can be run in parallel on
-  several machines as long as they see the same result directory.
+  corresponding result directories. Hence, it can be run in parallel
+  on several machines as long as they see the same result directory.
  
    When all or some of the experimental rounds are over, you can
-  generate the ROC curves by invoking the ./graph.sh script.
+  generate ROC curves by invoking the ./graph.sh script. You need a
+  fairly recent version of Gnuplot.
  
-  You are welcome to send bug reports and comments to fleuret@idiap.ch
+  If you pass the argument "pics" to the ./graph.sh script, it will
+  save images from the data set with the ground truth plotted on them,
+  the pose-indexed referential, and examples of the pose-indexed
+  feature windows.
  
-PARAMETERS
+  This program was developed on Debian GNU/Linux computers with the
+  following main tool versions
+
+  * GNU bash, version 3.2.39
+  * g++ 4.3.2
+  * gnuplot 4.2 patchlevel 4
  
-  To set the value of a parameter during an experiment, just add an
-  argument of the form --parameter-name=value before the commands that
-  should take into account that value.
+  Due to approximations in the optimized arithmetic operations with
+  g++, results may vary with different versions of the compiler and/or
+  different levels of optimization.
+
+EXECUTING THE PROGRAM
+---------------------
+
+  The main command has to be invoked with a list of parameter values,
+  followed by commands to execute. A parameter value is modified by
+  adding an argument of the form --parameter-name=value.
+
+  For instance, to open a scene pool ./something.pool, train a
+  detector and save it with all other parameters kept at their default
+  value, you would do
+
+    ./folding --pool-name=./something.pool open-pool train-detector write-detector
+
+PARAMETERS
+----------
  
    For every parameter below, the default value is given between
    parenthesis.
@@ -73,16 +103,17 @@ PARAMETERS
  
    * pictures-for-article ("no")
  
-    Should the pictures be generated to be clear in b&w
+    Should the pictures be generated for printing in black and white.
  
-  * pool-name (no default)
+  * pool-name (none)
  
-    Where are the data to use
+    The scene pool file name.
  
-  * test-pool-name (no default)
+  * test-pool-name (none)
  
-    Should we use a separate pool file, and ignore proportion-for-test
-    then.
+    Should we use a separate test pool file. If none is given, then
+    the test scenes are taken at random from the main pool file
+    according to proportion-for-test.
  
    * detector-name ("default.det")
  
@@ -90,14 +121,14 @@ PARAMETERS
  
    * result-path ("/tmp/")
  
-    In what directory should we save all the produced file during the
+    In what directory should we save all the produced files during the
      computation.
  
    * loss-type ("exponential")
  
-    What kind of loss to use for the boosting. While different loss are
-    implementer in the code, only the exponential has been thoroughly
-    tested.
+    What kind of loss to use for the boosting. While different losses
+    are implemented in the code, only the exponential has been
+    thoroughly tested.
  
    * nb-images (-1)
  
@@ -107,42 +138,41 @@ PARAMETERS
    * tree-depth-max (1)
  
      Maximum depth of the decision trees used as weak learners in the
-    classifier.
+    classifier. The default value of 1 corresponds to stumps.
  
    * proportion-negative-cells-for-training (0.025)
  
      Overall proportion of negative cells to use during learning (we
-    sample among them)
+    sample among them for boosting).
  
    * nb-negative-samples-per-positive (10)
  
-    How many negative cell to sample for every positive cell during
+    How many negative cells to sample for every positive cell during
      training.
  
    * nb-features-for-boosting-optimization (10000)
  
-    How many pose-indexed features to use at every step of boosting.
+    How many pose-indexed features to look at for optimization at
+    every step of boosting.
  
-  * force-head-belly-independence (no)
+  * force-head-belly-independence ("no")
  
      Should we force the independence between the two levels of the
      detector (i.e. make an H+B detector)
  
-  * nb-weak-learners-per-classifier (10)
+  * nb-weak-learners-per-classifier (100)
  
-    This parameter corresponds to the value U in the JMLR paper, and
-    should be set to 100.
+    This parameter corresponds to the value U in the article.
  
    * nb-classifiers-per-level (25)
  
-    This parameter corresponds to the value B in the JMLR paper.
+    This parameter corresponds to the value B in the article.
  
-  * nb-levels (1)
+  * nb-levels (2)
  
-    How many levels in the hierarchy, this is 2 for the JMLR paper
-    experiments.
+    How many levels in the hierarchy.
  
-  * proportion-for-train (0.5)
+  * proportion-for-train (0.75)
  
      The proportion of scenes from the pool to use for training.
  
@@ -164,14 +194,16 @@ PARAMETERS
    * write-parse-images ("no")
  
      Should we save one image for every test scene with the resulting
-    alarms.
+    alarms. This option generates a lot of images for every round and
+    is switched off by default. Switch it on to produce images such as
+    the full page of results in the paper.
  
    * write-tag-images ("no")
  
      Should we save the (very large) tag images when saving the
      materials.
  
-  * wanted-true-positive-rate (0.5)
+  * wanted-true-positive-rate (0.75)
  
      What is the target true positive rate. Note that this is the rate
      without post-processing and without pose tolerance in the
@@ -205,19 +237,52 @@ PARAMETERS
  
    * progress-bar ("yes")
  
-    Should we display a progress bar.
+    Should we display a progress bar during long computations.
  
  COMMANDS
+--------
+
+  * open-pool
+
+    Open the pool of scenes.
+
+  * train-detector
+
+    Create a new detector from the training scenes.
+
+  * compute-thresholds
+
+    Compute the thresholds of the detector classifiers from the
+    validation set to obtain the required wanted-true-positive-rate.
+
+  * test-detector
+
+    Run the detector on the test scenes.
+
+  * sequence-test-detector
+
+    Visit nb-wanted-true-positive-rates rates between 0 and
+    wanted-true-positive-rate. For each rate, compute the detector
+    thresholds on the validation set and estimate the error rate on
+    the test set.
+
+  * write-detector
+
+    Write the current detector to the file detector-name
+
+  * read-detector
+
+    Read a detector from the file detector-name
+
+  * write-pool-images
+
+    For each of the first nb-images of the pool, save one PNG image
+    with the ground truth, one with the corresponding referential at
+    the reference scale, and one with the feature material-feature-nb
+    from the detector. This last image is not saved if either no
+    detector has been read/trained or if no feature number has been
+    specified.
  
-   open-pool
-   train-detector
-   compute-thresholds
-   test-detector
-   sequence-test-detector
-   write-detector
-   read-detector
-   write-pool-images
-
-  --
-  Francois Fleuret
-  October 200
+--
+Francois Fleuret
+October 2008
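
The INSTALLATION text in this patch says that the run script detects rounds that are already running by looking for their result directory, which is what lets several machines share the work. A minimal sketch of that idea follows. It is not taken from run.sh: the names RESULT_ROOT, claim_round and run_round, the round-N directory layout, and the choice of three rounds are all assumptions made for illustration, and the real work is replaced by a placeholder that writes a log file.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual run.sh: each round is claimed by
# creating its result directory. mkdir fails when the directory already
# exists, so a round claimed elsewhere is skipped.

# Assumed shared result directory; defaults to a fresh temporary one.
RESULT_ROOT="${RESULT_ROOT:-$(mktemp -d)}"

claim_round() {
    # Succeeds only for the caller that actually creates the directory.
    mkdir "$1" 2>/dev/null
}

run_round() {
    # Placeholder for the real training/test invocation.
    echo "running round $1" > "$2/log.txt"
}

for round in 0 1 2; do
    dir="$RESULT_ROOT/round-$round"
    if claim_round "$dir"; then
        run_round "$round" "$dir"
    else
        echo "round $round already claimed, skipping"
    fi
done
```

Creating the directory serves as both the check and the claim, so there is no gap between testing for the directory and taking the round. A round that crashes leaves its directory behind and has to be removed by hand before it can be rerun.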