Notes on the data sets:
- 4000 rows * 9 columns
- 2 different space applications, physics involved very different:
  - crew re-entry
  - pad abort
- bike data

Here are the results on the bicycle data set. This is similar to the results
I sent out a little while back. The setup was 10 repeats of margins (4 runs
of each algorithm per margins run). Results are for median values (data for
mean and std. dev. are also available if you want them).

Runtime
  TAR4:      0.13,  0.14,  0.18,  0.19,   0.22,   [ - |+ ]
  TAR3:      0.22,  0.22,  0.23,  0.23,   0.25,   [ |+ ]
  SA_TAR4:   72.1,  72.1, 158.1, 291.4,  473.5
  SA_TAR3:  139.6, 185.9, 398.5, 541.2, 1352.9

Pct. Right (PD)
  TAR4:     39.5, 39.5, 39.7, 39.7, 40.3,   [ | ]
  SA_TAR4:  20.6, 20.6, 24.0, 25.0, 26.8,   [ -|+ ]
  TAR3:     20.6, 20.6, 21.3, 21.4, 21.4,   [ | ]
  SA_TAR3:  14.6, 17.8, 18.8, 19.2, 21.5,   [ --|+ ]

Pct. Wrong (PF)
  TAR3:      5.9,  6.0,  6.0,  6.9,  9.3,   [ -|+ ]
  SA_TAR3:   5.4,  7.0, 10.6, 11.1, 14.5,   [ - |+ ]
  SA_TAR4:  15.9, 15.9, 21.7, 25.1, 32.1,   [ - | ++++ ]
  TAR4:     31.0, 31.1, 31.2, 31.2, 31.4,   [ | ]

Sensitivity (PD/(1+PF))
  TAR4:     30.1, 30.1, 30.2, 30.2, 30.7,   [ | ]
  TAR3:     19.5, 19.5, 19.9, 20.0, 20.0,   [ | ]
  SA_TAR4:  17.0, 17.0, 18.2, 21.6, 22.7,   [ -|+ ]
  SA_TAR3:  13.7, 16.0, 17.1, 17.6, 19.4,   [ --| ]

Precision
  TAR3:     41.3, 49.5, 53.5, 54.0, 54.8,   [ ----- | ]
  SA_TAR3:  29.0, 29.9, 31.9, 36.2, 44.1,   [ -| ++++ ]
  TAR4:     29.6, 29.7, 29.8, 30.1, 30.2,   [ |+ ]
  SA_TAR4:  19.2, 19.2, 23.1, 23.2, 27.4,   [ - |++ ]

F-Measure
  TAR4:     33.9, 33.9, 33.9, 34.0, 34.5,   [ |+ ]
  TAR3:     29.7, 29.7, 29.9, 30.0, 30.3,   [ |+ ]
  SA_TAR3:  21.6, 23.4, 25.4, 25.9, 27.2,   [ - | ]
  SA_TAR4:  20.8, 20.8, 21.8, 26.9, 29.7,   [ | ++ ]

Runtime
  #key,     ties, win, loss, win-loss
  TAR4,        0,   3,    0,        3
  TAR3,        0,   2,    1,        1
  SA_TAR4,     1,   0,    2,       -2
  SA_TAR3,     1,   0,    2,       -2

Pct. Right (PD)
  #key,     ties, win, loss, win-loss
  TAR4,        0,   3,    0,        3
  SA_TAR4,     0,   2,    1,        1
  TAR3,        0,   1,    2,       -1
  SA_TAR3,     0,   0,    3,       -3

Pct. Wrong (PF)
  #key,     ties, win, loss, win-loss
  TAR3,        0,   3,    0,        3
  SA_TAR3,     0,   2,    1,        1
  TAR4,        1,   0,    2,       -2
  SA_TAR4,     1,   0,    2,       -2

Sensitivity (PD/(1+PF))
  #key,     ties, win, loss, win-loss
  TAR4,        0,   3,    0,        3
  TAR3,        1,   1,    1,        0
  SA_TAR4,     1,   1,    1,        0
  SA_TAR3,     0,   0,    3,       -3

Precision
  #key,     ties, win, loss, win-loss
  TAR3,        0,   3,    0,        3
  SA_TAR3,     0,   2,    1,        1
  TAR4,        0,   1,    2,       -1
  SA_TAR4,     0,   0,    3,       -3

------

- As before, the choice between TAR3 and TAR4 is a trade-off between
  precision and recall. TAR4 obtains a much higher probability of detection,
  but at the cost of also grabbing a much larger number of false positives.
  I don't know which one is the "winner." I would lean towards TAR3, because
  its median false-positive rate is below 10% of the data, but that is the
  more conservative approach: with TAR3's lower detection rate, you have to
  spend more time tracking down all of the sought-after data. This is an
  interesting relationship that we need to discuss in the paper.

- TAR4.1 still wins on runtime, but it is pretty close to TAR3 on this data
  set. Pretty interesting.

- Simulated Annealing is still our strawman. It can be made to act like TAR3
  or TAR4, but its results are always weaker and its runtimes on the bike
  data are ridiculous (in some cases over an hour).

------

I also collected data for the cliff-discretization versions of TAR3 and
TAR4.1, but I think those are beyond the scope of this paper. The trend is
the same as before: the new discretizer performs almost identically to the
old one. I suspect that this is due to the low number of bins being used,
but I haven't had a chance yet to raise that number and rerun the
comparisons.

If you guys see anything else in here worth commenting on, please let me
know!
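
P.S. In case it helps with the metrics discussion in the paper, here is a
minimal sketch of how the measures above can be computed from raw
confusion-matrix counts. The function and names are just illustrative, not
our actual harness code; note that, to be consistent with the numbers above,
PF has to enter the sensitivity formula as a fraction rather than a
percentage.

    def metrics(tp, fp, tn, fn):
        """Sketch: the measures reported above, from confusion-matrix counts."""
        pd   = 100.0 * tp / (tp + fn)         # Pct. Right (PD): probability of detection (recall)
        pf   = 100.0 * fp / (fp + tn)         # Pct. Wrong (PF): probability of false alarm
        prec = 100.0 * tp / (tp + fp)         # Precision
        sens = pd / (1.0 + pf / 100.0)        # Sensitivity = PD / (1 + PF), PF as a fraction
        f    = 2.0 * prec * pd / (prec + pd)  # F-Measure: harmonic mean of precision and PD
        return pd, pf, prec, sens, f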
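
The win-loss tables are just pairwise tallies over the same measures; the
sketch below shows the general idea. Again the names are only illustrative,
and the real harness may decide ties with a statistical rank test rather
than the simple median comparison used here. For runtime and PF, pass
higher_is_better=False, since lower values win on those measures.

    from statistics import median

    def win_loss(results, higher_is_better=True, tie_tolerance=0.0):
        """Sketch: tally ties/win/loss per method for one measure.

        `results` maps a method name (e.g. "TAR3") to its list of scores.
        """
        tally = {name: {"ties": 0, "win": 0, "loss": 0} for name in results}
        for a in results:
            for b in results:
                if a == b:
                    continue
                ma, mb = median(results[a]), median(results[b])
                if abs(ma - mb) <= tie_tolerance:
                    tally[a]["ties"] += 1
                elif (ma > mb) == higher_is_better:
                    tally[a]["win"] += 1
                else:
                    tally[a]["loss"] += 1
        for counts in tally.values():
            counts["win-loss"] = counts["win"] - counts["loss"]
        return tally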