Projects

For this work, you will need to know your group.

CS472

    1. basic start up stuff
    2. simple lisp funs
    1. In a new directory called 2/a, complete the following exercise. As always,generate your handing using ~timm/bin/handin.

      Copy the file stagger.lisp to guide.lisp. Modify the code to create a function "guide" that prompts the user for the next move. Valid commands are (L,R,F,C,O,B,?), where "?" prints the current stuff being carried by the agent, If you code this correctly, then:

      • if you try and walk into an obstacle,
      • if you try to chop or blow up or open anything and you don't have the right tool
      then nothing happens (but an error message is printed).

      To show me that this all works, run a session with you typing commands into the game in an emacs SLIME buffer, then save that buffer to disk (C-x,C-s). Modify "main.lisp" so it cats the saved buffer.

    2. In a new directory called 2/b, complete the following exercises. As always,generate your handin using ~timm/bin/handin.
      • Add commands (*,=). "=" lets you switch to one of sample boards and "*" lets you set the "life" of the agent Every time you take a step, the agents ages -0; at life=0, agent dies screaming; try and get to the gold and get home in a life of 40 for the sample boards. To show me that this all works, run your session with the game in the emacs SLIME buffer, then save that buffer to disk (C-X,C-S). Modify "main.lisp" so it cats the saved buffer.
      • Using DFS adapt the tree search algorithm i gave you in class to the monkey/banana problem shown in class (and yes, you to use the (op ...) syntax.

        If you answer this question properly then your search engine should be very general and the only place we see domain details is in (op). Write one inference file (contains DFS, BFS) and two "op" files containing an incomplete set of operators and another containing the write operators. Write a "main.lisp" file that loads your code, cats your code, then shows a trace of the program running with the incomplete, then the correct, operators.

        If you do all this correctly, the behavior of the knowledge base should change as a result of using different operators.

    3. In a new directory called 2/c, complete the following exercise. As always,generate your handin using ~timm/bin/handin.

      The task is the same as last week, but use BFS, Best-first, and DFID for monkeys and bananas. If you do this right, the changes to your code should be minimal.

    1. In a new directory called 3/a, implement the following. As always,generate your handin using ~timm/bin/handin.

      Add an evil environment. At probability P,

      • bushes grow in vacant spaces
      • open doors blow shut
      • wet squares with dry adjacent squares either flood to adjacent spare or evaporate (become dry squares)

      Conduct six manual walk around the board (using the code from 2a). Each walk will be conducted increasing evil from 0 to 0.025 to 0.05 to 0.1 to 0.2 to 0.4.

    2. In a new directory called 3/a, complete the following exercise. As always,generate your handin using ~timm/bin/handin.

      Implement a set of feature extractors from the board (e.g cant-get-to-gold, key-is-nearby, etc).

      Implement a score function for a board that use the feature extractors. Amongst other things, the score should (a) decrease if you can't get to the gold, and (b) increase if you are near the gold; or you have the gold and you are near the start; or if on the way to the gold, you are close by a key (or dynamite). Etc (the stuff in the last para is by no means exhaustive- they are many many more things you need to add).

      Add state to your search functions. That state should carry around a board and if you do something to it, you copy it and pass a modified board to a sub-routine. This will be the way you code up your search- each new branch is a new what-if with a new copy of the board.

    3. In a new directory called 3/c, complete the following exercise. Implement best-first search for the board game, using the above score function. Note: you don't have to win the game, just show that (a) sometimes the agent is guided by the score; (b) the more evil the board, the worst the performance.
    1. Optimize your code. The files lisp101/debug.lisp and lisp101/lispfuns.lisp contain code for collecting statistics on your system. If you have completed 3c, then you should have an automatic player of your game. Now, disable the print functions (so the boards are not shown as you play) and let the game play by itself. Wrap the top-level call to the game in (watch (call-the-game)). If that works, you should see an output like the following. Note how, of the 0.707 seconds of runtime, 0.620 seconds are spent in functions ("!", "EM2EFFORT", and "EM2RIN"). So if you could somehow speed up just those functions, you'd save a lot of time in your code.
        seconds  |   consed   |   calls   |  sec/call  |  name  
      ------------------------------------------------------------
           0.447 |          0 |   990,001 |  0.0000005 | !
           0.105 |  6,819,840 |   170,000 |   0.000001 | EM2EFFORT
           0.073 |  6,795,264 |   170,000 |  0.0000004 | EM2RIN
           0.027 |  6,778,880 |   170,000 |  0.0000002 | EM2CIN
           0.020 |  1,585,152 |    50,000 |  0.0000004 | SF2CIN
           0.016 |  1,630,208 |    50,000 |  0.0000003 | SF2DIN
           0.012 |  6,742,016 |   170,000 |  0.0000001 | EM2DIN
           0.006 |    946,176 |    30,000 |  0.0000002 | DR2ROUT
           0.001 |          0 |        14 |   0.000067 | MAKE-R15
           0.001 |      8,192 |         1 |   0.000539 | COCOMO-DEFAULTS
           0.000 |          0 |         5 |   0.000000 | MAKE-SFS
           0.000 |          0 |         1 |   0.000000 | LINE-Y
           0.000 |          0 |        27 |   0.000000 | ?
           0.000 |          0 |         1 |   0.000000 | INIT-DB
           0.000 |          0 |         5 |   0.000000 | MAKE-R16
           0.000 |          0 |         7 |   0.000000 | MAKE-RIN+
           0.000 |  1,695,744 |    50,000 |   0.000000 | SF2EFFORT
           0.000 |          0 |         1 |   0.000000 | MAKE-R26
           0.000 |          0 |         9 |   0.000000 | MAKE-EM-
           0.000 |          0 |         6 |   0.000000 | MAKE-DIN+
           0.000 |          0 |         1 |   0.000000 | ?ONE-B
           0.000 |          0 |         1 |   0.000000 | MAKE-ONE-A
           0.000 |          0 |        27 |   0.000000 | GUESS
           0.000 |          0 |        25 |   0.000000 | ?ELT
           0.000 |          0 |         5 |   0.000000 | MAKE-DSF
           0.000 |    974,848 |    30,000 |   0.000000 | DR2DOUT
           0.000 |          0 |        34 |   0.000000 | MAKE-EM
           0.000 |          0 |        25 |   0.000000 | ?BAG
           0.000 |          0 |         6 |   0.000000 | MAKE-DR
           0.000 |          0 |        10 |   0.000000 | MAKE-RIN-
           0.000 |          0 |         5 |   0.000000 | MAKE-RSF
           0.000 |          0 |         3 |   0.000000 | ?DR
           0.000 |          0 |         1 |   0.000000 | POINT-TO-LINE
           0.000 |          0 |         3 |   0.000000 | MAKE-RDR
           0.000 |          0 |         1 |   0.000000 | DEMO456
           0.000 |          0 |         1 |   0.000000 | MAKE-LINE
           0.000 |          0 |         3 |   0.000000 | MAKE-R25
           0.000 |          0 |         3 |   0.000000 | MAKE-DDR
           0.000 |          0 |         5 |   0.000000 | MAKE-CSF
           0.000 |          0 |        10 |   0.000000 | MAKE-SF
           0.000 |          0 |         6 |   0.000000 | MAKE-CIN+
           0.000 |          0 |        11 |   0.000000 | MAKE-CIN-
           0.000 |          0 |        28 |   0.000000 | GETA
           0.000 |  1,482,752 |    50,000 |   0.000000 | SF2RIN
           0.000 |          0 |         5 |   0.000000 | ?SF
           0.000 |      8,192 |        17 |   0.000000 | ?EM
           0.000 |          0 |         8 |   0.000000 | MAKE-EM+
           0.000 |          0 |         2 |   0.000000 | MAKE-NUM
           0.000 |          0 |         1 |   0.000000 | ?ONE-A
           0.000 |    958,464 |    30,000 |   0.000000 | DR2COUT
           0.000 |          0 |         3 |   0.000000 | MAKE-CODR
           0.000 |          0 |         1 |   0.000000 | MAKE-ONE-B
           0.000 | 24,076,288 |         1 |   0.000000 | DEMO456A
           0.000 |          0 |        25 |   0.000000 | COCO
           0.000 |          0 |         2 |   0.000000 | MAKE-R36
           0.000 |          0 |        11 |   0.000000 | MAKE-DIN-
           0.000 |          0 |         1 |   0.000000 | MAKE-DB
           0.000 |    155,648 |    10,124 |   0.000000 | MY-RANDOM
      ------------------------------------------------------------
           0.707 | 60,657,664 | 1,970,493 |            | Total
      

      To complete this homework, generate three documents:

      • before.txt : the above trace output, before you do any optimizations.
      • what2do.txt : a discussion of what you see in before.txt and suggestions on what you might change in order to speed up the code.
      • after.txt: the above watch output, after you do some optimizations.

      Note: to get most of the marks on this homework, you only have to try to speed up the code (offer some intelligent comment in what2do.txt). Obviously, sometimes, no optimizations are possible. However, to get full marks on this homework, you need at least a 30% speed in after.txt.

    2. Now that you've optimized the code, take it for a spin.

      This week, you have two tasks: create some baselines and think about improvements.

      When thinking about improvements, reflect on more than just mere systems-level optimizations. Think about the search space and how to reduce its complexity- that will be your work for next week.

      As to creating baselines, you have to to collect some stats. And next week's task will be to show an improvement on those stats.

      To collect those stats:

      • for every test board,
        • for evil in (0,N,2N,4N) where 4N is LARGE and N is small,
          • Twenty times repeat
            • Collect the runtimes and percent failures and number of steps to get to finish the game (the following acts take zero steps: picking things up, blowing things up, chopping things down, opening doors). Keep the 0, 15, 50, 75, 100 percentile values; where 0,100=min,max and 50=median.

      Report these in a text table where the columns line up neatly and the columns are labelled:

      RUNTIMES:
      
      board, evil, 0, 15, 50, 75, 100
      ----- ----- -- --- --- --- ----
      ...
      

      What to hand in:

      1. One file "hypothesis.txt" describing what you think is the bottleneck in the search. Note that your homework next week will be to explore that hypothesis.
      2. One file "runtimes0.txt" (the above table for runtimes)
      3. One file "steps0.txt" (the above table for number of steps)
      4. One file "failures0.txt" (the above table for failures)

      Practicalities:

      • Some of the test boards will destroy your search engine. Accept.
      • You'll need two tables, one for runtimes and one for percent failures. If you die, report that result as "died".
      • You'll probably have to set a max give up time for the really hard boards. If a game blows that max time, report the result as "never".
      • Don't bother reporting more than one decimal place.
      • You are about to melt down CPUs. Make sure you run with
        nice -n 20
        
        or the sysadmin people will cut you off.
      • This will take a long time. Start now. Why are you still reading? Go!
    3. Last homework. Explore the hypothesis your proposed last week. See if you can improve the runtimes/ percent failures/ steps. What to hand-in:
      1. One file "runtimes1.txt" (expressed as percentages of the runtimes0.txt values)
      2. One file "steps1.txt" (expressed as percentages of the steps0.txt values)
      3. One file "failures1.txt" (expressed as percentages of the failures0.txt values).

    CS572

    Homework

    Week one: please complete the Cs472 exercise, number 1a.

    Week two:

    Week three: read http://web.engr.oregonstate.edu/~wong/workshops/icml2006/slides/menzies.pdf and http://portal.acm.org/ft_gateway.cfm?id=1194910&type=pdf&coll=GUIDE&dl=GUIDE&CFID=73493180&CFTOKEN=43852520.

    Project

    1. Due Feb 3: Using the demo function, show that you can code randomizer, eras, median, distance, normalize, bore, discretize from the TOE document. Using that code, run a demo session with the lecturer.
    2. Due March 3:
      • Report: Notes on KE theory and CBR. References:
        • My paper on expert system patterns.
        • http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.4555: Knowledge Engineering: Principles and Methods (1998) http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4555&rep=rep1&type=pdf
        • http://groups.google.com/group/cbr4se/web/readings (in particular, see "Intro to CBR" and "General CBR theory" and "http://home.cc.gatech.edu/ccl/45").
      • Coding: distance, GAC, utility, sub:sampling, over:sampling, micro-sampling, sampler, alienator
    3. Due March 31:

      Targets: anomaly detection, verification, classification, mode identification

      • Coding:
        • Count, likelihood(1), likelihood(N)
        • Contrast set learner. Uses KEYS2 (cause it is so fast) or, if you are creative, some other method.
      • Experiments
        • Using likelihood(1), implement some analog of the BAD anomaly detection experiment. Include some "limits to anomaly detection" result (i.e. find the weirdness barrier, below which we cannot distinguish anomalies).
        • Implement verification (hint: trivial after the anaomaly detector).
        • Using likelihood(N), try to build mode identification. i.e. run all the sampled data in N eras where every so often, an era switches to just instances of one class. See if you can detect the new class, then label it "newClass+1", then if you see it again, you can find it again.
      • Update your march 3 report with
        • responses to my comments
        • notes on any new technologies developed this month
        • any results generated this month
    4. Due April 21: Preliminary report and seminars.
      • Experiments (note : for all these, and all the above, we are after compeency and incompetencies; i.e we need show what we can do and what we can't do; i.e. case studies on interesting problems as well as limit results showing where these capabilities fall off the cliff).
        • Planning (note: trivial, if contrast set learning works)
        • Prediction (note: trivial, if classification working)
        • Control (note: trivial if planning working)
        • Monitor (note: trivial if classification working)
        • Explanation (note: trivial if contrast set learning works)
        • Diagnosis (note: trivial if contrast set learning working; except you may want to generate N competing diagnosis if there are similar competing best contrast sets).
        • Repair (note: trivial if diagnosis working)
      • Update your march 31 report with
        • responses to my comments
        • notes on any new technologies developed this month
        • any results generated this month
        • Finally, for what class of problems would you recommend for or against using the above techniques?
    5. Due April 28: Mock PC meeting.
    6. Due May 5: Final report.