Project 2b

For 500-level students only.

Note: Apologies for getting this out late. I'll make it due Friday of Week 7. If you moan loudly enough in Week 7, I'll make it due Monday of Week 8 without any late fines.

Theory

Consider a model with N inputs (i.e. details about a case study, some environmental factors, or some control distribution for a random number generator). Assume each input Ni comes from some space of possibilities, range(Ni).

Notice that we can handle both discrete and numeric values the same way: as a set of bins, with only one difference, which we call ordinalp.

Note that we can initialize the probability distribution of Ni just by drawing the bins using "~":

(defconstant ~ 0)
(defconstant ~~ 1)
(defconstant ~~~ 2)
(defconstant ~~~~ 3)
(defconstant ~~~~~ 4)
(defconstant ~~~~~~ 5)
(defconstant ~~~~~~~ 6)
(defconstant ~~~~~~~~ 7)
(defconstant ~~~~~~~~~ 8)
(defconstant ~~~~~~~~~~ 9)
(defconstant ~~~~~~~~~~~ 10)
(defconstant ~~~~~~~~~~~~ 11)
(defconstant ~~~~~~~~~~~~~ 12)
(defconstant ~~~~~~~~~~~~~~ 13)
(defconstant ~~~~~~~~~~~~~~~ 14)
(defconstant ~~~~~~~~~~~~~~~~ 15)
(defconstant ~~~~~~~~~~~~~~~~~ 16)
(defconstant ~~~~~~~~~~~~~~~~~~ 17)
(defconstant ~~~~~~~~~~~~~~~~~~~ 18)
(defconstant ~~~~~~~~~~~~~~~~~~~~ 19)

(define age 
      20 ~
         ~~~
         ~~~~~~
         ~~~~~~~~~~
         ~~~~~~~~~~
         ~~~~~~
         ~~~
      60 ~)

This will need a little support code (and that is something you need to write).
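To give a feel for what that support code might look like, here is a minimal sketch. It is one possible reading, not the required design. It assumes "define" is a macro, so that the numbers and the "~" symbols arrive unevaluated and can be told apart. It also assumes that a bin's height is the number of tildes minus one (mirroring the first rows of the constant table above, where a single "~" is 0) and that the two plain numbers are the low and high bounds. The helper names tilde-height and parse-drawing, and the plist layout stored in *dists*, are made up for illustration.

```lisp
(defvar *dists* (make-hash-table)
  "Name -> property list describing one drawn distribution.")

(defun tilde-height (item)
  "Bin height for a symbol such as ~~~ (tildes minus one), or NIL otherwise."
  (when (symbolp item)
    (let ((name (symbol-name item)))
      (when (and (plusp (length name))
                 (every (lambda (ch) (char= ch #\~)) name))
        (1- (length name))))))

(defun parse-drawing (spec)
  "Split SPEC into a list of bin heights and a list of numeric bounds."
  (loop for item in spec
        if (numberp item) collect item into bounds
        else if (tilde-height item) collect (tilde-height item) into bins
        else do (error "Unexpected item in drawing: ~s" item)
        finally (return (values bins bounds))))

(defmacro define (name &rest spec)
  "Store the distribution drawn in SPEC under NAME in *dists*."
  `(multiple-value-bind (bins bounds) (parse-drawing ',spec)
     (setf (gethash ',name *dists*)
           (list :lo (first bounds) :hi (second bounds) :bins bins))))

;; A small drawing: bounds 1 and 3, four bins.
(define toy
      1 ~
        ~~~
        ~~
      3 ~)

(getf (gethash 'toy *dists*) :bins) ; => (0 2 1 0)
```

Because the macro reads the tildes directly, this sketch does not need the defconstant table at all. If you would rather use the constants, replace tilde-height with a call to symbol-value.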

Anyway, note that the model input is now a vector "V" with one slot for each range of each input.

At any time, some oracle has demanded that we only use a subset U ⊆ V of this vector.

Completion

The model requires one input for each Ni variable. We call the process of finding those inputs the "completion" of "U".
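A minimal sketch of completion, under some loudly stated assumptions: "V" is represented as an alist from input name to all of its ranges, "U" is an alist naming only the ranges the oracle allows, and the choice inside the allowed ranges is delegated to a pick function. None of these representations come from the class code. The names complete and pick-first are hypothetical.

```lisp
(defun pick-first (options)
  "Deterministic stand-in for an oracle or a random draw."
  (first options))

(defun complete (v u &optional (pick #'pick-first))
  "Return an alist with exactly one chosen range per input in V.
Inputs that U constrains are chosen from U's ranges; the rest from V's."
  (loop for (name . ranges) in v
        for allowed = (or (cdr (assoc name u)) ranges)
        collect (cons name (funcall pick allowed))))

;; Hypothetical inputs: the oracle only allows ages 40 or 50.
(defparameter *v* '((age 20 30 40 50 60) (size small medium large)))
(defparameter *u* '((age 40 50)))

(complete *v* *u*) ; => ((AGE . 40) (SIZE . SMALL))
```

Passing a random picker, such as (lambda (xs) (nth (random (length xs)) xs)), gives a different completion on each call while still respecting "U".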

Using the Bins

Conceptually, there are multiple copies of the bins:

  1. "PD": The raw bins counting how often new observations fall into a bin. Defines the probability distribution for the data:
    bin1 : 10
    bin2 : 200
    bin3 : 10
    bin4 : 2
    
  2. "CD": The raw cumulative frequencies. Defines the cumulative probability distributions for the data:
    bin1 : 10
    bin2 : 210
    bin3 : 220
    bin4 : 222
    
  3. "Sorted PD": The raw bins, sorted by decreasing frequency:
    bin2 : 200
    bin1 : 10
    bin3 : 10
    bin4 : 2
    
  4. "Sorted CD": The cumulative frequencies computed over the sorted bins:
    bin2 : 200
    bin1 : 210
    bin3 : 220
    bin4 : 222
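
The four copies above can be derived from the raw counts with two small helpers. This is only a sketch: the alist representation and the names cumulate and sort-bins are assumptions, not part of the handout. A stable sort is used so that tied bins (bin1 and bin3) keep their original order, as in the listing.

```lisp
(defun cumulate (bins)
  "Running totals over BINS, an alist of (bin . count)."
  (loop with total = 0
        for (bin . n) in bins
        collect (cons bin (incf total n))))

(defun sort-bins (bins)
  "Copy of BINS ordered by decreasing count; ties keep their order."
  (stable-sort (copy-list bins) #'> :key #'cdr))

(defparameter *pd* '((bin1 . 10) (bin2 . 200) (bin3 . 10) (bin4 . 2)))
(defparameter *cd* (cumulate *pd*))
(defparameter *sorted-pd* (sort-bins *pd*))
(defparameter *sorted-cd* (cumulate *sorted-pd*))

*cd*        ; => ((BIN1 . 10) (BIN2 . 210) (BIN3 . 220) (BIN4 . 222))
*sorted-pd* ; => ((BIN2 . 200) (BIN1 . 10) (BIN3 . 10) (BIN4 . 2))
*sorted-cd* ; => ((BIN2 . 200) (BIN1 . 210) (BIN3 . 220) (BIN4 . 222))
```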
    

Note that as soon as we enter a new observation into "PD", the other distributions become "stale", and we can't reuse them until we re-sort and recalculate them.
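
One way to keep track of staleness is a flag that is set on every new observation and cleared when the derived bins are rebuilt. The sketch below does this for "CD" only. The struct layout and the names observe, fresh-cd and sample are assumptions. The sample function is one plausible reading of Task7 (draw a bin with probability proportional to its count), included to show why a stale "CD" would give wrong answers. It assumes the distribution has at least one observation.

```lisp
(defstruct dist
  (pd '())    ; alist of (bin . count): the raw "PD"
  (cd '())    ; cached running totals: the "CD"
  (stale t))  ; true when CD no longer matches PD

(defun observe (dist bin)
  "Count one new observation of BIN; the cached CD is now stale."
  (let ((cell (assoc bin (dist-pd dist))))
    (if cell
        (incf (cdr cell))
        (setf (dist-pd dist)
              (append (dist-pd dist) (list (cons bin 1))))))
  (setf (dist-stale dist) t)
  dist)

(defun fresh-cd (dist)
  "Return the CD, recomputing it from PD only if it is stale."
  (when (dist-stale dist)
    (setf (dist-cd dist)
          (loop with total = 0
                for (bin . n) in (dist-pd dist)
                collect (cons bin (incf total n)))
          (dist-stale dist) nil))
  (dist-cd dist))

(defun sample (dist &optional r)
  "Pick a bin with probability proportional to its count.
R, if given, replaces the random draw (useful for tests)."
  (let* ((cd (fresh-cd dist))
         (total (cdr (first (last cd))))
         (r (or r (random total))))
    (car (find-if (lambda (pair) (< r (cdr pair))) cd))))
```

With the counts from the listing above, (sample d 9) lands in bin1 and (sample d 10) lands in bin2. The same trick extends to the sorted copies by rebuilding them inside fresh-cd.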

These bins are used for different purposes:

Tasks

  1. Task1: Implement POM1

    Implement it as described in the paper. See if you can generate the three figures for low, medium, and high dynamism.

  2. Task2: Monkey's Banana

    Here's a little throw-away AI task, just to make sure you don't feel AI-starved.

    Using DFS, BFS, and DFID, adapt the tree search algorithm I gave you in class to the monkey/banana problem shown in class (and yes, you have to use the (op ...) syntax).

    If you answer this question properly then your search engine should be very general and the only place we see domain details is in (op).

    Important: the 400-level students are also solving this problem, but their solution is due 3 weeks after yours. Please ensure that they do not see your code.

    Testing Task2: I will run demo-task2 and expect to see a trace of the monkey getting the banana. Then I will edit the "-op" list, run again, and expect a different behavior.


  3. Task3: Implement POM2

    Same deliverables as above, but now generate 32 graphs: one for each combination of low or high settings across {dynamism, size, culture, criticality, personnel} (2^5 = 32).

  4. Task4: Implement "define"

    Implement the define function shown above. Make it store the generated dists in a global *dists*.

    Testing Task4: I will run various defines and then inspect *dists*.

  5. Task5: Updates

    Implement the update functionality described above. Hint: see any.lisp.

    Testing Task5: I will run the various demo-task5 functions that you write, and I expect to see values pulled from *dists* printed before and after the update.

  6. Task6: Nudging

    Implement the nudging functionality described above.

    Testing Task6: I will run the various demo-task6 functions that you write, and I expect to see values pulled from *dists* printed before and after nudging.

  7. Task7: Sampling

    Implement the sampling functionality described above.

    Testing Task7: I will run the various demo-task7 functions that you write, and I expect to see printed values, sampled from *dists* vars, that conform to the distributions defined in your define calls.

  8. Task8: Completion

    Implement the completing functionality described above.

    Testing Task8: I will run the various demo-task8 functions that you write, and I expect to see a print of a vector containing some undecided values. Then, after completing, there will be another vector with an atom filled in for each value.
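
For Task2, the point about generality can be made concrete with a sketch in which the search engine never mentions monkeys, boxes or bananas. The class algorithm is not reproduced on this page, so what follows is a generic textbook-style tree search and may differ from the version given in class. The operator format (plists with :name, :pre, :add and :del) is a hypothetical stand-in, NOT the required (op ...) syntax, and the operator names are invented.

```lisp
(defun tree-search (states goal-p successors combiner)
  "Expand STATES until one satisfies GOAL-P; COMBINER sets the strategy."
  (cond ((null states) nil)
        ((funcall goal-p (first states)) (first states))
        (t (tree-search (funcall combiner
                                 (funcall successors (first states))
                                 (rest states))
                        goal-p successors combiner))))

(defun depth-first (new old) (append new old))    ; DFS: newest states first
(defun breadth-first (new old) (append old new))  ; BFS: oldest states first

;; Hypothetical operators; all domain knowledge lives in this list.
(defparameter *ops*
  '((:name walk-to-window :pre (at-door) :add (at-window) :del (at-door))
    (:name walk-to-box :pre (at-door) :add (at-box) :del (at-door))
    (:name push-box :pre (at-box) :add (under-bananas) :del (at-box))
    (:name climb-box :pre (under-bananas on-floor) :add (on-box) :del (on-floor))
    (:name grab :pre (on-box) :add (has-bananas) :del ())))

(defun successors-fn (ops &optional limit)
  "Build a successor function over nodes of the form (facts plan).
With LIMIT, nodes whose plan is already LIMIT steps long are not expanded."
  (lambda (node)
    (destructuring-bind (facts plan) node
      (unless (and limit (>= (length plan) limit))
        (loop for op in ops
              when (subsetp (getf op :pre) facts)
                collect (list (union (getf op :add)
                                     (set-difference facts (getf op :del)))
                              (append plan (list (getf op :name)))))))))

(defun solve (ops start goal combiner &optional limit)
  "Return the list of operator names that reaches GOAL, or NIL if none.
(NIL is also returned when START already contains GOAL.)"
  (second (tree-search (list (list start '()))
                       (lambda (node) (member goal (first node)))
                       (successors-fn ops limit)
                       combiner)))

(defun dfid (ops start goal &optional (max-depth 10))
  "Iterative deepening: depth-first search with limits 1, 2, ... MAX-DEPTH."
  (loop for limit from 1 to max-depth
        thereis (solve ops start goal #'depth-first limit)))

(solve *ops* '(at-door on-floor) 'has-bananas #'breadth-first)
;; => (WALK-TO-BOX PUSH-BOX CLIMB-BOX GRAB)
```

Swapping #'breadth-first for #'depth-first, or calling dfid, changes the strategy without touching the operators. Removing climb-box from the list makes every search return NIL, which is the kind of change in behavior the Task2 test looks for. Note that this toy domain has no cycles. A real operator set probably will, and then plain DFS needs some protection against revisiting states.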


What's Next?

That's all for this project. But if you want to get started on the next one: