# Introduction #

This is a port of the Synthetic Visual Reasoning Test (SVRT) problems to
the PyTorch framework, with an implementation of two convolutional
networks to solve them.

# Installation and test #

Executing

```
make -j -k
./test-svrt.py
```

should generate an image
[`example.png`](https://fleuret.org/git-extract/pysvrt/example.png) in
the current directory.

Note that the image generation does not take advantage of GPUs or
multiple cores. On a 4GHz i7-6700K, it can be as fast as 10,000
vignettes per second and as slow as 40.

# Vignette generation and compression #

## Vignette sets ##

The file [`svrtset.py`](https://fleuret.org/git-extract/pysvrt/svrtset.py) implements the classes `VignetteSet` and
`CompressedVignetteSet` with the following constructor

```
__init__(problem_number, nb_samples, batch_size, cuda = False, logger = None)
```

and the following method to return one batch

```
(torch.FloatTensor, torch.LongTensor) get_batch(b)
```

as a pair composed of a 4d 'input' Tensor (i.e. a batch of
single-channel 128x128 images), and a 1d 'target' Tensor (i.e. Boolean
labels).

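As a usage sketch, the loop below walks through every batch of an
object exposing the `get_batch(b)` method described above. It assumes
that `b` is a zero-based batch index and that a set holds
`nb_samples // batch_size` batches, neither of which is stated here.
`iterate_batches` and `FakeVignetteSet` are hypothetical names, and the
stand-in class returns plain Python lists instead of tensors so that
the example runs without PyTorch or the compiled `svrt` module.

```python
def iterate_batches(vignette_set, nb_samples, batch_size):
    """Yield (input, target) pairs from any object with a get_batch(b) method.

    Assumption: batches are indexed from 0 to nb_samples // batch_size - 1.
    """
    for b in range(nb_samples // batch_size):
        yield vignette_set.get_batch(b)


class FakeVignetteSet:
    """Hypothetical stand-in for VignetteSet, for illustration only."""

    def __init__(self, problem_number, nb_samples, batch_size):
        self.problem_number = problem_number
        self.nb_samples = nb_samples
        self.batch_size = batch_size

    def get_batch(self, b):
        # The real classes return a 4d FloatTensor and a 1d LongTensor.
        # Here each "image" is a placeholder list and the labels alternate.
        inputs = [[b] for _ in range(self.batch_size)]
        targets = [i % 2 for i in range(self.batch_size)]
        return inputs, targets


fake_set = FakeVignetteSet(problem_number=1, nb_samples=10, batch_size=4)
nb_seen = 0
nb_positive = 0
for inputs, targets in iterate_batches(fake_set, nb_samples=10, batch_size=4):
    nb_seen += len(targets)
    nb_positive += sum(targets)

print(nb_seen, nb_positive)
```

With the real classes, `fake_set` would presumably be replaced by an
instance built with the constructor shown above, and `sum(targets)` by
the corresponding tensor operation.
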
## Low-level functions ##

The main function for generating vignettes is

```
torch.ByteTensor svrt.generate_vignettes(int problem_number, torch.LongTensor labels)
```

where

 * `problem_number` indicates which of the 23 problems to use
 * `labels` indicates the Boolean labels of the vignettes to generate

The returned ByteTensor has three dimensions:

 * Vignette index
 * Pixel row
 * Pixel column

The two additional functions

```
torch.ByteStorage svrt.compress(torch.ByteStorage x)
```

and

```
torch.ByteStorage svrt.uncompress(torch.ByteStorage x)
```

provide a lossless compression scheme adapted to the ByteStorage of
the vignette ByteTensor (i.e. expecting a lot of 255s, a few 0s, and
no other value).

This compression reduces the memory footprint by a factor of about 50,
and may be useful to deal with very large data-sets and avoid
re-generating images at every batch. It induces a small overhead for
decompression and for moving data from CPU to GPU memory.

See [`svrtset.py`](https://fleuret.org/git-extract/pysvrt/svrtset.py)
for the class `CompressedVignetteSet`, which uses it.

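To illustrate why data made of long runs of 255s and a few 0s
compresses well, here is a minimal run-length encoding sketch in pure
Python. It is not necessarily the scheme implemented by
`svrt.compress`, whose format is not described here. `rle_compress` and
`rle_uncompress` are hypothetical names, and the toy vignette is a
made-up 128x128 white image with a single black segment.

```python
def rle_compress(data):
    """Encode bytes holding only 0s and 255s as a list of run lengths.

    Runs alternate between 255 and 0, starting with 255. If the data
    starts with a 0, the first run has length 0.
    """
    runs = []
    current = 255
    length = 0
    for value in data:
        if value not in (0, 255):
            raise ValueError("only the values 0 and 255 are supported")
        if value == current:
            length += 1
        else:
            runs.append(length)
            current = value
            length = 1
    runs.append(length)
    return runs


def rle_uncompress(runs):
    """Invert rle_compress by expanding alternating runs of 255s and 0s."""
    out = bytearray()
    value = 255
    for length in runs:
        out.extend(bytes([value]) * length)
        value = 255 - value
    return bytes(out)


# Toy vignette: all white, with a black segment of 20 pixels on row 10.
width = 128
pixels = bytearray([255]) * (width * width)
start = 10 * width + 30
pixels[start:start + 20] = bytes(20)
vignette = bytes(pixels)

runs = rle_compress(vignette)
restored = rle_uncompress(runs)
print(len(vignette), "pixels ->", len(runs), "runs, lossless:", restored == vignette)
```

Real vignettes contain many more transitions than this toy image, so
the gain is far smaller than what this example suggests.
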
# Testing convolution networks #

The file
[`cnn-svrt.py`](https://fleuret.org/git-extract/pysvrt/cnn-svrt.py)
provides the implementation of two deep networks designed by Afroze
Baqapuri during an internship at Idiap, and allows training them with
several million vignettes on a PC with 16GB of RAM and a GPU with 8GB
of memory.
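
A back-of-the-envelope calculation suggests why this fits in memory.
It assumes one byte per pixel, as in the ByteTensor returned by
`svrt.generate_vignettes`, and the compression factor of about 50
quoted earlier. The figure of 4 million vignettes is an arbitrary
example, not a number taken from the original experiments.

```python
nb_vignettes = 4_000_000          # arbitrary example of "several million"
bytes_per_vignette = 128 * 128    # one byte per pixel
compression_factor = 50           # approximate figure

raw_bytes = nb_vignettes * bytes_per_vignette
compressed_bytes = raw_bytes / compression_factor

print(f"raw: {raw_bytes / 1e9:.1f} GB")                # raw: 65.5 GB
print(f"compressed: {compressed_bytes / 1e9:.2f} GB")  # compressed: 1.31 GB
```

Under these assumptions the uncompressed set would not fit in 16GB of
RAM, while the compressed one would take up only a small part of it.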