/timm's /charming /python /tricks
Download
do.py.
Read more on How to be Charming (in Python).
001: import sys
002: sys.dont_write_bytecode=True # no .pyc files
003: from the import *
004: from about import *
005: from lib import *
006:
007:
008: def _do():
009: """A every iteraction, some performance score
010: is logged back with the 'Do' loop, using 'seen'.
011: Note the use of 'controlled' to print the eras."""
012: for x,d in do(range(0,50),
013: era=10,
014: haltOn='fx',
015: epsilon=0.05):
016: y = 1.0/(1 + e**(-0.1*x))
017: d.seen(x,fx=y)
018: done(d,0,1,value=lambda s: '%4.2f' % s)
019:
020: def do(iterator=range(0,100),
021: epsilon=0.01,
022: haltOn= None,
023: era = 100,
024: also = None,
025: cohen = The.math.cohen,
026: cmp = lambda old, new: new - old):
027: """syntactic sugar for calling the Do class.
028: Without this helper function, you need to do
029:
030: for x,d in Do(....).loop():
031: ...
032:
033: With this function, you can do:
034:
035: for x,d in do(....):
036: ...
037: """
038: for tick,doings in Do(iterator=iterator,
039: epsilon= epsilon, haltOn=haltOn, era= era,
040: also=also, cohen=cohen, cmp=cmp).loop():
041: yield tick,doings
042:
043: class Do:
044: """'Do' adds to features to a standard
045: iterator. 'Do' lets you run an optimizer many
046: times, while incrementally storing performance
047: results. Using those results, 'Do' also lets
048: you determine in the optimizer should quit early.
049:
050: For a simple example of its usage, see
051: '_do()', above. Note that during each
052: iteration, we log some performance value using the
053: 'seen' method.
054:
055: For a more typical example of usage, see search.py
056: and the 'sa' function. From that example,here's 50
057: runs of SA from k=0 to 1000 where all the results
058: are grouped together in 'era's of size 30. On the
059: left is era number followed by [sampleSize] of
060: items in that era. Then there is a quantile chart
061: showing the 10,30,50,70,90-th percentile of the
062: performance scores in that era.
063:
064: This first run stops if the mean results in any
065: era are within an epsilon on 20% of max;
066: i.e. epsilon=0.2. 'Best' is the 'eb' scores and
067: 'every' are the 'e' scores.
068:
069: FIRST RUN:
070:
071: -----| best |----------------------------------------- ----------------
072:
073: 0 [ 128] | -- * - ,0.62, 0.71, 0.79, 0.85, 0.91
074: 30 [ 128] | -- * ,0.79, 0.85, 0.88, 0.91, 0.94
075: 60 [ 120] | - *- ,0.75, 0.79, 0.82, 0.86, 0.89
076: 90 [ 30] | * ,0.80, 0.80, 0.80, 0.80, 0.80
077:
078: ------| every |--------------------------------------------------------
079:
080: 0 [ 128] |--- *-- ,0.53, 0.67, 0.77, 0.82, 0.90
081: 30 [ 128] | -- *- ,0.74, 0.81, 0.87, 0.90, 0.94
082: 60 [ 120] | - *- ,0.74, 0.78, 0.82, 0.86, 0.89
083: 90 [ 30] |-------* ,0.53, 0.80, 0.80, 0.80, 0.80
084:
085: This second run stops if we reach an epsilon of
086: within 2.5% of max; i.e. epsilon=0.025. Notice we
087: run for more eras than in the FIRST RUN.
088:
089: SECOND RUN:
090:
091: ------| best |--------------------------------------------------------
092:
093: 0 [ 128] | ---- *-- ,0.60, 0.76, 0.79, 0.84, 0.91
094: 30 [ 128] | -- * ,0.77, 0.84, 0.89, 0.91, 0.95
095: 60 [ 128] | *- ,0.84, 0.87, 0.92, 0.94, 0.97
096: 90 [ 128] | - * ,0.87, 0.91, 0.92, 0.94, 0.97
097: 120 [ 128] | --* ,0.86, 0.93, 0.94, 0.94, 0.97
098: 150 [ 30] | *,0.97, 0.97, 0.97, 0.97, 0.97
099:
100: ------| every |--------------------------------------------------------
101:
102: 0 [ 128] | --- * -- ,0.60, 0.70, 0.76, 0.81, 0.88
103: 30 [ 128] | --- *-- ,0.69, 0.83, 0.86, 0.91, 0.96
104: 60 [ 128] | ----- *- ,0.67, 0.85, 0.89, 0.93, 0.96
105: 90 [ 128] | -- *- ,0.79, 0.87, 0.91, 0.94, 0.97
106: 120 [ 128] | --- * - ,0.66, 0.80, 0.88, 0.94, 0.97
107: 150 [ 30] |--* ,0.51, 0.62, 0.63, 0.83, 0.83
108:
109: Another thing we can 'Do' is stop if the
110: improvement in the new era is very small compared
111: to the last one. Here's the last run
112: (epsilon=0.025) but in this run, we say "very
113: small" is less than one standard deviation of the
114: all the scores seen in the run. This is the Cohen
115: test for effect size so we denote it cohen=1.
116:
117: THIRD RUN:
118:
119: -----| best |---------------------------------------------------------
120:
121: 0 [ 128] | -- *- ,0.64, 0.75, 0.81, 0.86, 0.91
122: 30 [ 128] | - *- ,0.80, 0.84, 0.89, 0.91, 0.97
123: 60 [ 128] | -*- ,0.84, 0.89, 0.92, 0.94, 0.98
124: 90 [ 120] | *- ,0.84, 0.88, 0.91, 0.91, 0.98
125:
126: ------| every |-------------------------------------------------------
127:
128: 0 [ 128] | ---- * - ,0.59, 0.72, 0.80, 0.87, 0.91
129: 30 [ 128] | -- *-- ,0.73, 0.83, 0.87, 0.91, 0.97
130: 60 [ 128] | ----- *- ,0.62, 0.83, 0.89, 0.92, 0.98
131: 90 [ 120] | - *- ,0.81, 0.88, 0.89, 0.91, 0.98
132:
133: Finally, our last run shows a run with typical
134: settings for this `Do'-ing;
135: i.e. epsilon=0.01 (1% of max) and cohen=0.2
136: (i.e. "very small" is less than 20% of one
137: standard deviation). Note that its the same as
138: the SECOND RUN shown above since, in this model,
139: achieving 1% of max is not possible.
140:
141: FOURTH RUN:
142:
143: ------| best |--------------------------------------------------------
144:
145: 0 [ 128] | ---- *-- ,0.60, 0.76, 0.79, 0.84, 0.91
146: 30 [ 128] | -- * ,0.77, 0.84, 0.89, 0.91, 0.95
147: 60 [ 128] | *- ,0.84, 0.87, 0.92, 0.94, 0.97
148: 90 [ 128] | - * ,0.87, 0.91, 0.92, 0.94, 0.97
149: 120 [ 128] | --* ,0.86, 0.93, 0.94, 0.94, 0.97
150: 150 [ 30] | *,0.97, 0.97, 0.97, 0.97, 0.97
151:
152: ------| every |--------------------------------------------------------
153:
154: 0 [ 128] | --- * -- ,0.60, 0.70, 0.76, 0.81, 0.88
155: 30 [ 128] | --- *-- ,0.69, 0.83, 0.86, 0.91, 0.96
156: 60 [ 128] | ----- *- ,0.67, 0.85, 0.89, 0.93, 0.96
157: 90 [ 128] | -- *- ,0.79, 0.87, 0.91, 0.94, 0.97
158: 120 [ 128] | --- * - ,0.66, 0.80, 0.88, 0.94, 0.97
159: 150 [ 30] |--* ,0.51, 0.62, 0.63, 0.83, 0.83
160: """
161: def __init__(i,
162: iterator=range(0,100),
163: epsilon= 0.005,
164: haltOn=None,
165: era= 100,
166: also=None,
167: cohen=The.math.cohen,
168: cmp=lambda old, new: new - old):
169: i.iterator=iterator
170: i.era = era
171: i.epsilon = epsilon
172: i.all = Nums()
173: i.now = Nums()
174: i.also = also
175: i.cohen = cohen
176: i.haltOn = haltOn
177: i.cmp = cmp
178: def thisEra(i,tick):
179: """Internally, 'Do' instances keep time
180: rounded off to the nearest era size."""
181: return int(tick/i.era)*i.era
182: def small(i,where) :
183: "Small is, say, 30% of std.dev seen so far"
184: return i.all[where].s*i.cohen
185: def close2Best(i,where,when) :
186: "mean within epsilon of best"
187: return i.now[(where,when)].mu > (1 - i.epsilon)
188: def improving(i,where,when):
189: """The difference in means between now and
190: last is not small."""
191: before = when - i.era
192: new = i.now[(where,when)]
193: old = i.now[(where,before)]
194: return i.cmp(old.mu, new.mu) > i.small(where)
195: def earlyStop(i,when):
196: "Early stop if close to best or not improving."
197: where = i.haltOn
198: if where:
199: if i.close2Best(where,when) or \
200: not i.improving(where,when):
201: return True
202: return False
203: def seen(i,tick,**d):
204: "Syntactic sugar for adding new items"
205: when = i.thisEra(tick)
206: for where,what in d.items():
207: i.seen1(when,what,where)
208: def seen1(i,when,what,where):
209: "Primitive data add. Also adds to parent."
210: if i.also:
211: i.also.seen1(when,what,where)
212: i.all[where].seen(what)
213: i.now[(where, when)].seen(what)
214: def loop(i):
215: "Every time we change eras, check for early stop."
216: before = 0
217: for tick in i.iterator:
218: now = i.thisEra(tick)
219: if before and now != before:
220: if i.earlyStop(before):
221: break
222: before = now
223: yield tick,i
224:
225: def done(doings,lo,hi,
226: key =lambda z:'%10s' % z,
227: value=lambda z: '%2d' % z
228: ):
229: d = doings.now
230: wheres= {}
231: whens = {}
232: for where,when in d.keys():
233: wheres[where] = 1
234: whens[ when ] = 1
235: wheres = sorted(wheres.keys())
236: whens = sorted(whens.keys())
237: for where in wheres:
238: print '\n------|',str(where),'|'+'-'*75,'\n'
239: for when in whens:
240: if (where,when) in d:
241: all = d[(where,when)].cached()
242: s = "?"
243: if len(all) > 5:
244: s= xtile(all,show = value,
245: lo=lo,hi=hi,width=25)
246: print '%10s' % key(when),\
247: '[%5s]' % len(all), s
248:
249: """
250: ### xtile: Pretty Print Distributions
251:
252: The function _xtile_ takes a list of (possibly) unsorted numbers and
253: presents them as a horizontal xtile chart (in ascii format). The default
254: is a contracted _quintile_ that shows the 10,30,50,70,90 breaks in the
255: data (but this can be changed- see the optional flags of the function).
256:
257: """
258: def xtile(lst,lo=0,hi=100,width=50,
259: chops=[0.1 ,0.3,0.5,0.7,0.9],
260: marks=["-" ," "," ","-"," "],
261: bar="|",star="*",show= lambda s:" %3.0f" % s):
262: def pos(p) : return ordered[int(len(lst)*p)]
263: def place(x) :
264: tmp= int(width*float((x - lo))/(hi - lo))
265: if tmp == width: tmp += -1
266: return tmp
267: def pretty(lst) :
268: return ', '.join([show(x) for x in lst])
269: ordered = sorted(lst)
270: lo = min(lo,ordered[0])
271: hi = max(hi,ordered[-1])
272: what = [pos(p) for p in chops]
273: where = [place(n) for n in what]
274: out = [" "] * width
275: for one,two in pairs(where):
276: for i in range(one,two):
277: out[i] = marks[0]
278: marks = marks[1:]
279: out[int(width/2)] = bar
280: loc = place(pos(0.5))
281: #print loc, len(out)
282: out[loc] = star
283: return ''.join(out) + "," + pretty(what)
284:
285: def _tile() :
286: import random
287: r = random.random
288: def show(lst):
289: return xtile(lst,lo=0, hi=1,width=25,
290: show= lambda s:" %3.2f" % s)
291: print "one", show([r()**0.5 for x in range(1000)])
292: print "two", show([r()**2 for x in range(1000)])w
293:
294: def _tile2():
295: def show(lst):
296: return xtile(lst,lo=0, hi=1,width=25,
297: show= lambda s:" %3.2f" % s)
298: print "one", show([0.21, 0.29, 0.28, 0.32, 0.32,
299: 0.28, 0.29, 0.41, 0.42, 0.48])
300: print "two", show([0.71, 0.92, 0.80, 0.79, 0.78,
301: 0.9, 0.71, 0.82, 0.79, 0.98])
302:
303: if __name__ == '__main__' :
304: eval(cmd('_do()'))
305:
This file is part of Timm's charming Python tricks.
© 2014, Tim Menzies:
tim.menzies@gmail.com,
http://menzies.us.
Timm's charming Python tricks are free software: you can redistribute it and/or modify it under the terms of the GNU Lesser Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Timm's charming Python tricks are distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU Lesser Public License along with Foobar. If not, see http://www.gnu.org/licenses.