This project is due in week 14 (April 20).
The following spec offers ways to earn 30 marks. Note that 20 marks are enough to get full marks for this project; the rest are bonus.
Note that at code inspection time, you must show results from hundreds to thousands of runs of your system (depending on which parts of this project you decide to do). So you cannot do this project the night before: it can take days to weeks to collect the required runtime information, realize you made a mistake, then restart all the data collection from scratch.
In the last project you wrote feature extractors that reflect over the board to generate a set of conditions. Operators then match on these features, and add or delete their own.
Let's call your current layer of operators "tactical" in that (except for the higher-level stuff) they run using information extracted from the immediate neighborhood (e.g. "I am next to a bush and I have dynamite, so I can blow it up").
A second layer of operators, which we call "strategic", reflects over the tactical operators found by your code. In this second layer, the features aren't so much about the board as about preference knowledge such as "if I have keys and dynamite and I can use both, then favor using the keys" (this is useful since keys can be used many times but using dynamite exhausts that resource).
Note that the second strategic layer works the same way as the tactical layer: feature extractors reflect over X to generate a set of conditions; operators then match on these features, and add or delete their own. This strategic layer is not hard to implement (in fact, the code looks exactly like your current operator search code), but:
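To make the shape concrete, here is a minimal sketch of one strategic preference operator, under a deliberately toy assumption that tactical operators are plain symbols (your real operators will be richer structures; all names here are mine):

```lisp
;; A strategic operator's "features" describe the tactical ops found,
;; not the board. This one reorders: when both a key op and a dynamite
;; op are applicable, favor the reusable resource (keys) and save the
;; dynamite. USE-KEY / USE-DYNAMITE are hypothetical op names.
(defun prefer-keys (ops)
  "If both a key op and a dynamite op are applicable, move the key op
to the front of OPS."
  (if (and (member 'use-key ops) (member 'use-dynamite ops))
      (cons 'use-key (remove 'use-key ops))
      ops))
```

The point is only that the strategic layer's "act" reorders or rewrites the tactical layer's output, rather than touching the board.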
Note also that the strategic layer might run in multiple passes. At pass=1, we might work out that we need to walk from (1 1) to (4 4) to get a key, so we add

(walk 1 1 to 4 4)

At pass=2, we then replace "(walk 1 1 to 4 4)" with

(step 1 1 to 1 2) (step 1 2 to 2 2) (step 2 2 to 2 3) (step 2 3 to 3 3) (step 3 3 to 3 4) (step 3 4 to 4 4)

Alternatively, we could use Lisp macros to expand (walk 1 1 to 4 4) into

((step 1 1 to 1 2) (step 1 2 to 2 2) (step 2 2 to 2 3) (step 2 3 to 3 3) (step 3 3 to 3 4) (step 3 4 to 4 4))
The choice is up to you (it depends on how strong your defmacro fu is).
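If you take the macro route, one possible sketch follows. WALK and STEP are the spec's names; STEP-CHAIN and the expansion order (alternate y and x moves, chosen to match the example above) are my own assumptions:

```lisp
;; Expand (walk x1 y1 to x2 y2) at macroexpansion time into the
;; literal list of single steps.
(defun step-chain (x1 y1 x2 y2)
  "List the (step ...) forms from (x1 y1) to (x2 y2)."
  (cond ((and (= x1 x2) (= y1 y2)) nil)
        ;; advance whichever axis has farther to go (y wins ties)
        ((>= (abs (- y2 y1)) (abs (- x2 x1)))
         (let ((ny (if (< y1 y2) (1+ y1) (1- y1))))
           (cons `(step ,x1 ,y1 to ,x1 ,ny) (step-chain x1 ny x2 y2))))
        (t
         (let ((nx (if (< x1 x2) (1+ x1) (1- x1))))
           (cons `(step ,x1 ,y1 to ,nx ,y1) (step-chain nx y1 x2 y2))))))

(defmacro walk (x1 y1 to x2 y2)
  "Expand into a quoted list of steps, computed at macroexpansion time."
  (declare (ignore to))
  `',(step-chain x1 y1 x2 y2))
```

(walk 1 1 to 4 4) then expands to exactly the six-step list shown above.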
Maintain the file/directory standards of proj 2a.
At demo time, run demo-walk7 from proj2 with a random walk (just so I know you can)
At demo time, show me where your code implements beam search using strategic and tactical operators. Beam search will sort the operators using whatever ordering comes out of the strategic layer. It may then ignore all but the top, say, 20 (why 20? I dunno. You'll have to work out your own beam width).
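A minimal sketch of the beam pruning step, assuming the strategic layer gives each operator a numeric score (SCORE-FN is a hypothetical stand-in for whatever accessor you define):

```lisp
;; Beam pruning: sort candidate ops best-first by the strategic
;; layer's score, then drop all but the top *BEAM-WIDTH*.
(defparameter *beam-width* 20) ; the spec's "say, 20" -- tune it yourself

(defun beam-select (ops score-fn)
  "Return the *BEAM-WIDTH* highest-scoring members of OPS, best first."
  (let ((sorted (sort (copy-list ops) #'> :key score-fn)))
    (subseq sorted 0 (min *beam-width* (length sorted)))))
```

Note the COPY-LIST: Common Lisp's SORT is destructive, and you probably don't want it rearranging the list the tactical layer handed you.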
At demo time, show me one run of your search engine on 3 random boards (of my choosing). Note that your agent should walk to the gold, then return to the start position.
Show a graph of runtimes of 20 repeats of running your search on all the example files "game/eg/*" (so that is 200 runs in all). That graph is an ascii bar chart with 10 bars showing median runtimes (50th percentile points), sorted by median runtime; e.g.
eg  runtime
--  -------
3   *********************************** [200]
4   ******************* [98]
5   *************** [70]
1   ************** [68]
2   ************* [60]
5   ******** [30]
8   ***** [20]
etc
Note that if a search dies (never finishes because your agent is locked behind bushes with no dynamite), then give that search a runtime that is 20% larger than the max runtime of any other search.
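One way to implement that rule, assuming dead searches are recorded as NIL in the collected times (the NIL sentinel is my assumption; use whatever marker your runner produces):

```lisp
;; Replace each NIL (a search that never finished) with a penalty
;; runtime 20% larger than the max runtime of any finished search.
(defun patch-dead-runs (times)
  "Return TIMES with every NIL replaced by 120% of the finished max."
  (let ((biggest (reduce #'max (remove nil times) :initial-value 0)))
    (substitute (round (* 12/10 biggest)) nil times)))
```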
Draw a second chart with 10 bars showing the spread in runtimes (75th minus 50th percentile points), sorted by spread.
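A sketch of the chart plumbing: a nearest-rank percentile helper (so spread is just the 75th minus the 50th) plus a scaled ascii bar printer. All names here are mine, not the spec's:

```lisp
(defun percentile (p sorted)
  "Nearest-rank P-th percentile (0-100) of an ascending list SORTED."
  (nth (min (1- (length sorted))
            (floor (* p (length sorted)) 100))
       sorted))

(defun bar-chart (rows &key (width 40))
  "ROWS is a list of (label . value) pairs. Print one bar per row,
biggest value first, with stars scaled so the biggest bar is WIDTH wide."
  (let ((biggest (max 1 (reduce #'max rows :key #'cdr))))
    (dolist (row (sort (copy-list rows) #'> :key #'cdr))
      (format t "~3a ~a [~a]~%"
              (car row)
              (make-string (round (* width (cdr row)) biggest)
                           :initial-element #\*)
              (cdr row)))))
```

For the spread chart, each row's value would be (- (percentile 75 times) (percentile 50 times)) for that eg file's sorted times.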
DO NOT generate the runtime file at the demo; it will take too long. Do it before the meeting and cache it.
To generate runtimes, use
(let (times)
  (dotimes (i 20 (sort times #'<))
    (let ((before (get-internal-run-time)))
      (call-search-on-one-eg-file)
      (push (- (get-internal-run-time) before) times))))
Use the same code to process the strategic layer as the tactical layer (exception: as discussed above, "act" in tactical does something different from "act" in strategic). Hint: you may have to add lambda bodies to slots in the operators (exception the call
Add a random environment. At probability P,
Generate 5 sets of bar charts showing the effects of P in {0.025, 0.05, 0.1, 0.2, 0.4}; i.e. (5*20*10) runs. Hint: as P increases, more searches will die.
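The full experiment is 5 values of P times 10 boards times 20 repeats. A sketch of the driver loop, written against a runner function you supply (RUN-EXPERIMENTS and the RUNNER calling convention are my own names, not the spec's):

```lisp
(defun run-experiments (eg-files runner
                        &key (ps '(0.025 0.05 0.1 0.2 0.4)) (repeats 20))
  "Call (RUNNER eg p) REPEATS times for each eg file and each value of
P. Return an alist of (p . runtimes) for later charting."
  (loop for p in ps
        collect (cons p
                      (loop for eg in eg-files
                            append (loop repeat repeats
                                         collect (funcall runner eg p))))))
```

Cache the result to disk; per the note above, you do not want to regenerate this at the demo.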
Implement 4 different variants of search.
Generate bar charts for each variant.
Shout "hooray" if one search engine does better for higher P than the others.
Implement "desire". Desires are plans that persist between moves. Right now, all the above reasoning is per-move: at each move, we extract features, run the ops, do a search that explores all options, then make 1 move. So it is possible (though unlikely) that your search engine will decide at moves i and i+1 to move right then left then right then left then right then left then...
"Desires" are intentions from the past that inform the present. Internally, a desire is some working memory of goals decided on before, used to control the goals selected in the current move. They are like a memo that the search engine writes to itself saying "you know, what we are really trying to do here is XYZ".
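A minimal sketch of desire as persistent working memory. All names are mine; ACHIEVABLE-P stands in for a predicate you would write against your board features:

```lisp
(defvar *desire* nil
  "The goal committed to on an earlier move, or NIL if none.")

(defun choose-goal (fresh-goal achievable-p)
  "Keep pursuing the remembered desire while ACHIEVABLE-P says it is
still achievable; otherwise adopt (and remember) FRESH-GOAL, the
result of this move's search."
  (if (and *desire* (funcall achievable-p *desire*))
      *desire*
      (setf *desire* fresh-goal)))
```

The memo persists across moves via the special variable, so the engine cannot flip-flop between right and left: it only re-searches for a goal once the old one becomes unachievable (or, in a fuller design, once it is achieved).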
To show the effects of desire: