Setup
-----
- Install Weka from http://www.cs.waikato.ac.nz/ml/weka/
- In code/filter.pl, set the variable $javaClassPath at the top of the
  file to point to the installation location of weka.jar.

Running
-------
- The system is a set of scripts that are run using a makefile. The
  typical sequence of commands to run is as follows:
      make clean-all
      make filter-data
      make resample-data
      make perform-test
      make class-stats
      make tex-output
- There is also a shell script called runTests.sh which puts the
  system through a number of runs using different thresholds. This
  is the script I used to produce the latest data sets for Tim.
  On my system this script takes about four hours to run.
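- As a convenience, the typical sequence of make targets listed above
  could be wrapped in a small shell function that stops at the first
  failing step. This is only a sketch and is not part of the system:
  the function name run_pipeline and the MAKE override are
  hypothetical. Note that it starts with clean-all, which deletes
  existing data.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original scripts): run the
# pipeline targets in order and stop at the first one that fails.
# Set MAKE to another command (for example MAKE=echo) for a dry run.
run_pipeline() {
    for target in clean-all filter-data resample-data \
                  perform-test class-stats tex-output; do
        echo "== make $target"
        "${MAKE:-make}" "$target" || {
            echo "target $target failed" >&2
            return 1
        }
    done
}
```

  Calling run_pipeline from the directory containing the makefile runs
  the real targets; "MAKE=echo run_pipeline" only prints them.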

Overview of the system
----------------------
- This system is complex and I will not attempt to describe
  everything. I anticipate that it will take time to understand the
  details and that you will have to ask me questions from time to
  time.
- "make clean-all"
   - This removes files from all data folders. Only do this if you
     are certain that you no longer need the data.
- "make filter-data"
   - This filters all files from the raw-data folder into the
     filtered-data folder.
   - In this process all low PD classes are removed from the data.
   - The script filter.pl contains the details of how this is done.
- "make resample-data"
   - This resamples the filtered data into the test-data folder.
   - Random sampling with replacement is used to produce 10 sets of
     test and training data for each filtered data file.
   - For each set of files, there is a pair of test and training files
     per class in the filtered data file.
   - The script resample.pl contains details of how this is done.
- "make perform-test"
   - This runs the modified sawtooth.awk script on each of the
     resampled data pairs.
   - The script sawtooth.awk for the most part is not my code and I
     have not gone out of my way to clean it up. The majority of my
     work is within the "Pass==2" section.
   - The data produced is written to the test-results folder.
- "make class-stats"
   - This combines the test results for each data set together and
     writes the results to the final-stats folder.
   - There is one CSV file per data set, which summarizes the results
     of the analysis for each class of the data set and then for the
     entire data set.
- "make tex-output"
   - This produces the final LaTeX output, suitable for inclusion in
     a TeX document.
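
To illustrate the random sampling with replacement mentioned under
"make resample-data", here is a minimal sketch using awk. It is not
taken from resample.pl, which remains the reference for how the test
and training pairs are actually built. The function name
sample_with_replacement and its seed argument are hypothetical, and
the sketch assumes a plain file with one data row per line and no
header.

```shell
#!/bin/sh
# Hypothetical illustration only; see resample.pl for the real logic.
# sample_with_replacement FILE SEED
# Prints as many lines as FILE contains, each drawn uniformly at
# random from FILE. A given line may appear several times or not at
# all, which is what "with replacement" means.
sample_with_replacement() {
    awk -v seed="$2" '
        BEGIN { srand(seed) }      # fixed seed makes a run repeatable
        { row[NR] = $0 }           # keep every input line in memory
        END {
            for (i = 1; i <= NR; i++)
                print row[int(rand() * NR) + 1]
        }
    ' "$1"
}
```

Running the function ten times with ten different seeds would give
ten resampled copies of one file; how each copy is then divided into
test and training data per class is left to resample.pl.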