TOE

TOE = Timm's theory of everything. It aims to simplify knowledge-level
modeling with a little data mining.

KNOWLEDGE-LEVEL PROBLEM SOLVING METHODS

Note: the following text references certain terms that aren't explained
till below. So just relax and go with the flow.

anomaly detector (hmmm... that's odd)
  : walk through the data in "eras" of, say, 100 instances
  : report if the median "likelihood(1)" of era[i] is less than half
    that of era[i-1]
verification (do I trust what is going on now?)
  : alert if any app runs on an "era" with anomalies
classification (give me the executive summary)
  : "likelihood(n)"
mode identification (what is happening now?)
  : classification using the labels of previous eras
  : if the classification is anomalous, declare a new label
prediction (what will happen now?)
  : classify this era; the typical values of that class are the
    expected values
planning (how to get there?)
  : find a "contrast set" between a current and a goal era
control (how to sail upwards)
  : find a "contrast set" between the current era and all eras with a
    higher weight
monitor (are we currently smiling?)
  : classification over the utility labels
explanation
  : contrast set between two eras
diagnosis (how did we go bad?)
  : explanation, from an era with a lower to one with a higher utility
repair (how can we go good?)
  : diagnosis, but flip the weights
  : also a "contrast set" between bad and good, favoring attributes
    that have the highest frequency difference and are cheapest to
    control
insert your own here

FUNCTIONS

supervised

count
  : build a frequency table for all attribute/range/class values
    f[Attr,Range,Class]
  : e.g. f[sex,male,pregnant] = 0
  : note that f[class,label,class] is the frequency of class label
    "label", which we'll denote f[class] (and "F" is the sum of all "f")
  : (sketched below)
likelihood(1)
  : every instance is labeled "seen"
  : compute the likelihood that you have seen this before:
    prod(f[a,r,"seen"] / f["seen"]) * (f["seen"]/F)
  : (since every instance is labeled "seen", the prior f["seen"]/F = 1)
likelihood(N)
  : every instance is labeled L
  : compute the likelihood that a new instance has label L
  : report the label with the highest likelihood
contrast
  : given two populations,
  : find ranges more frequent in one than in the other
  : for the top-ranked ranges, try rule generation

unsupervised

discretization
  : N bins, equal frequency (equal number of items in each bin)
test
  : using one table with numeric columns
bore (best or rest)
  : discretization on a numeric utility score
  : label the top scores "best" and the others "rest"
test
  : using one table with a numeric class
normalize
  : for all values in undiscretized numeric columns, replace them with
    (value - min)/(max - min)
test
  : using one table with numeric columns
distance
  : reports the distance between two rows using
    sqrt((x2-x1)^2 + (y2-y1)^2 + ...)
  : note that x,y are NORMALIZED numerics
  : also, if x,y are discrete, then their distance is ZERO if they are
    the same and ONE otherwise
  : also also, do not use the class columns in the distance measure
  : (sketched below)
test
  : using one table with discrete and numeric columns
median
  : propose a row halfway between two others
  : if columns are numeric, go halfway between their values
  : for discrete columns, if their values are the same, use that value;
    if they differ, flip half of them (at random) to the other row's
    values
test
  : using one table with discrete and numeric columns, pick any two
    rows at random
GAC
  : builds a tree of nearest pairs (see the sketch below)
  : if too slow, use sub/micro sampling as a precursor
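To make these concrete: below is a minimal Python sketch of count and
likelihood(N), assuming a table is a list of rows, each row a dict
mapping attribute names to values, with the class stored under a
"class" key (the key name and all signatures here are illustrative
assumptions, not a fixed design).

    from collections import defaultdict

    def count(rows, klass="class"):
        # f[(attr, range, label)] counts attribute ranges per class;
        # n[label] is f[class], the per-class row count
        f, n = defaultdict(int), defaultdict(int)
        for row in rows:
            label = row[klass]
            n[label] += 1
            for attr, value in row.items():
                if attr != klass:
                    f[(attr, value, label)] += 1
        return f, n

    def likelihood(row, f, n, klass="class"):
        # return (best label, its likelihood) for one new row
        F = sum(n.values())
        best, best_like = None, -1
        for label in n:
            like = n[label] / F               # the prior f[class]/F
            for attr, value in row.items():
                if attr != klass:
                    like *= f[(attr, value, label)] / n[label]
            if like > best_like:
                best, best_like = label, like
        return best, best_like

likelihood(1) is the same code run on rows that all carry the label
"seen": the prior is then one, and the product alone reports how
familiar the new row looks.

Under the same table-of-dicts assumption, normalize and distance might
look like this (the set of numeric column names, nums, is assumed to
be known in advance):

    import math

    def normalize(rows, nums):
        # map every numeric value to (value - min)/(max - min), in place
        for col in nums:
            lo = min(row[col] for row in rows)
            hi = max(row[col] for row in rows)
            for row in rows:
                row[col] = (row[col] - lo) / ((hi - lo) or 1)
        return rows

    def distance(row1, row2, nums, klass="class"):
        # Euclidean distance over normalized numerics; discrete columns
        # add 0 if equal, 1 otherwise; class columns are skipped
        d = 0
        for col in row1:
            if col == klass:
                continue
            if col in nums:
                d += (row1[col] - row2[col]) ** 2
            else:
                d += 0 if row1[col] == row2[col] else 1
        return math.sqrt(d)

And one reading of GAC's "tree of nearest pairs" is greedy
agglomerative clustering: repeatedly fuse the closest two nodes, using
median (described above, assumed to return a row halfway between two
rows) to build the combined node. The quadratic nearest-pair search
below is why sub/micro sampling helps as a precursor.

    def gac(rows, dist, median):
        # pass e.g. dist=lambda a, b: distance(a, b, nums);
        # each node keeps a representative row plus its fused children
        nodes = [{"row": r, "kids": []} for r in rows]
        while len(nodes) > 1:
            i, j = min(((i, j) for i in range(len(nodes))
                               for j in range(i + 1, len(nodes))),
                       key=lambda p: dist(nodes[p[0]]["row"],
                                          nodes[p[1]]["row"]))
            b, a = nodes.pop(j), nodes.pop(i)  # pop larger index first
            nodes.append({"row": median(a["row"], b["row"]),
                          "kids": [a, b]})
        return nodes[0]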
sampling

randomizer
  : randomly re-order the rows of the data
test
  : using one table
eras
  : spits out the data, X instances at a time (sketched at the end of
    this note)
test
  : using one table; note that each "spit" should be a new table
utility
  : add a label to each row based on a scoring function
  : note: the simplest one is to just apply the class symbol
sub-sampling
  : report all rows of the minority class
  : use the same number of every other class (picked at random)
over-sampling
  : report all rows of the majority class
  : use the same number of every other class (repeating picks at
    random)
micro-sampling
  : pick N instances (at random) of all classes

EXPERIMENT

hypothesis
  : once the above is working, building a whole range of
    knowledge-level tasks is a trivial process
tools
  : we'll need a generator of data to test this all out

generator

sampler(L,P)
  : ascend L levels in the GAC tree
  : find the average distance D of things at level L
  : return random instances within D*L
alienator
  : take classified data
  : generate eras with the same class frequency as the original data
    set
  : at interval I, inject different class frequencies with
    probability P
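The sampling primitives really are small. A sketch of randomizer,
eras, and micro-sampling, under the same table-of-dicts assumption as
the sketches above (x and n are illustrative parameter names):

    import random
    from collections import defaultdict

    def randomizer(rows):
        # randomly re-order the rows of the data
        rows = list(rows)
        random.shuffle(rows)
        return rows

    def eras(rows, x=100):
        # spit out the data x instances at a time;
        # each spit is a new table
        for i in range(0, len(rows), x):
            yield rows[i:i + x]

    def micro_sample(rows, n, klass="class"):
        # pick n instances (at random) of every class
        by_class = defaultdict(list)
        for row in rows:
            by_class[row[klass]].append(row)
        out = []
        for group in by_class.values():
            out += random.sample(group, min(n, len(group)))
        return out

sub-sampling is micro_sample with n set to the size of the minority
class; over-sampling is the same idea using random.choices (sampling
with replacement) up to the size of the majority class.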