/timm's /charming /python /tricks
Download
do.py.
Read more on How to be Charming (in Python).
001: import sys 002: sys.dont_write_bytecode=True # no .pyc files 003: from the import * 004: from about import * 005: from lib import * 006: 007: 008: def _do(): 009: """A every iteraction, some performance score 010: is logged back with the 'Do' loop, using 'seen'. 011: Note the use of 'controlled' to print the eras.""" 012: for x,d in do(range(0,50), 013: era=10, 014: haltOn='fx', 015: epsilon=0.05): 016: y = 1.0/(1 + e**(-0.1*x)) 017: d.seen(x,fx=y) 018: done(d,0,1,value=lambda s: '%4.2f' % s) 019: 020: def do(iterator=range(0,100), 021: epsilon=0.01, 022: haltOn= None, 023: era = 100, 024: also = None, 025: cohen = The.math.cohen, 026: cmp = lambda old, new: new - old): 027: """syntactic sugar for calling the Do class. 028: Without this helper function, you need to do 029: 030: for x,d in Do(....).loop(): 031: ... 032: 033: With this function, you can do: 034: 035: for x,d in do(....): 036: ... 037: """ 038: for tick,doings in Do(iterator=iterator, 039: epsilon= epsilon, haltOn=haltOn, era= era, 040: also=also, cohen=cohen, cmp=cmp).loop(): 041: yield tick,doings 042: 043: class Do: 044: """'Do' adds to features to a standard 045: iterator. 'Do' lets you run an optimizer many 046: times, while incrementally storing performance 047: results. Using those results, 'Do' also lets 048: you determine in the optimizer should quit early. 049: 050: For a simple example of its usage, see 051: '_do()', above. Note that during each 052: iteration, we log some performance value using the 053: 'seen' method. 054: 055: For a more typical example of usage, see search.py 056: and the 'sa' function. From that example,here's 50 057: runs of SA from k=0 to 1000 where all the results 058: are grouped together in 'era's of size 30. On the 059: left is era number followed by [sampleSize] of 060: items in that era. Then there is a quantile chart 061: showing the 10,30,50,70,90-th percentile of the 062: performance scores in that era. 063: 064: This first run stops if the mean results in any 065: era are within an epsilon on 20% of max; 066: i.e. epsilon=0.2. 'Best' is the 'eb' scores and 067: 'every' are the 'e' scores. 068: 069: FIRST RUN: 070: 071: -----| best |----------------------------------------- ---------------- 072: 073: 0 [ 128] | -- * - ,0.62, 0.71, 0.79, 0.85, 0.91 074: 30 [ 128] | -- * ,0.79, 0.85, 0.88, 0.91, 0.94 075: 60 [ 120] | - *- ,0.75, 0.79, 0.82, 0.86, 0.89 076: 90 [ 30] | * ,0.80, 0.80, 0.80, 0.80, 0.80 077: 078: ------| every |-------------------------------------------------------- 079: 080: 0 [ 128] |--- *-- ,0.53, 0.67, 0.77, 0.82, 0.90 081: 30 [ 128] | -- *- ,0.74, 0.81, 0.87, 0.90, 0.94 082: 60 [ 120] | - *- ,0.74, 0.78, 0.82, 0.86, 0.89 083: 90 [ 30] |-------* ,0.53, 0.80, 0.80, 0.80, 0.80 084: 085: This second run stops if we reach an epsilon of 086: within 2.5% of max; i.e. epsilon=0.025. Notice we 087: run for more eras than in the FIRST RUN. 088: 089: SECOND RUN: 090: 091: ------| best |-------------------------------------------------------- 092: 093: 0 [ 128] | ---- *-- ,0.60, 0.76, 0.79, 0.84, 0.91 094: 30 [ 128] | -- * ,0.77, 0.84, 0.89, 0.91, 0.95 095: 60 [ 128] | *- ,0.84, 0.87, 0.92, 0.94, 0.97 096: 90 [ 128] | - * ,0.87, 0.91, 0.92, 0.94, 0.97 097: 120 [ 128] | --* ,0.86, 0.93, 0.94, 0.94, 0.97 098: 150 [ 30] | *,0.97, 0.97, 0.97, 0.97, 0.97 099: 100: ------| every |-------------------------------------------------------- 101: 102: 0 [ 128] | --- * -- ,0.60, 0.70, 0.76, 0.81, 0.88 103: 30 [ 128] | --- *-- ,0.69, 0.83, 0.86, 0.91, 0.96 104: 60 [ 128] | ----- *- ,0.67, 0.85, 0.89, 0.93, 0.96 105: 90 [ 128] | -- *- ,0.79, 0.87, 0.91, 0.94, 0.97 106: 120 [ 128] | --- * - ,0.66, 0.80, 0.88, 0.94, 0.97 107: 150 [ 30] |--* ,0.51, 0.62, 0.63, 0.83, 0.83 108: 109: Another thing we can 'Do' is stop if the 110: improvement in the new era is very small compared 111: to the last one. Here's the last run 112: (epsilon=0.025) but in this run, we say "very 113: small" is less than one standard deviation of the 114: all the scores seen in the run. This is the Cohen 115: test for effect size so we denote it cohen=1. 116: 117: THIRD RUN: 118: 119: -----| best |--------------------------------------------------------- 120: 121: 0 [ 128] | -- *- ,0.64, 0.75, 0.81, 0.86, 0.91 122: 30 [ 128] | - *- ,0.80, 0.84, 0.89, 0.91, 0.97 123: 60 [ 128] | -*- ,0.84, 0.89, 0.92, 0.94, 0.98 124: 90 [ 120] | *- ,0.84, 0.88, 0.91, 0.91, 0.98 125: 126: ------| every |------------------------------------------------------- 127: 128: 0 [ 128] | ---- * - ,0.59, 0.72, 0.80, 0.87, 0.91 129: 30 [ 128] | -- *-- ,0.73, 0.83, 0.87, 0.91, 0.97 130: 60 [ 128] | ----- *- ,0.62, 0.83, 0.89, 0.92, 0.98 131: 90 [ 120] | - *- ,0.81, 0.88, 0.89, 0.91, 0.98 132: 133: Finally, our last run shows a run with typical 134: settings for this `Do'-ing; 135: i.e. epsilon=0.01 (1% of max) and cohen=0.2 136: (i.e. "very small" is less than 20% of one 137: standard deviation). Note that its the same as 138: the SECOND RUN shown above since, in this model, 139: achieving 1% of max is not possible. 140: 141: FOURTH RUN: 142: 143: ------| best |-------------------------------------------------------- 144: 145: 0 [ 128] | ---- *-- ,0.60, 0.76, 0.79, 0.84, 0.91 146: 30 [ 128] | -- * ,0.77, 0.84, 0.89, 0.91, 0.95 147: 60 [ 128] | *- ,0.84, 0.87, 0.92, 0.94, 0.97 148: 90 [ 128] | - * ,0.87, 0.91, 0.92, 0.94, 0.97 149: 120 [ 128] | --* ,0.86, 0.93, 0.94, 0.94, 0.97 150: 150 [ 30] | *,0.97, 0.97, 0.97, 0.97, 0.97 151: 152: ------| every |-------------------------------------------------------- 153: 154: 0 [ 128] | --- * -- ,0.60, 0.70, 0.76, 0.81, 0.88 155: 30 [ 128] | --- *-- ,0.69, 0.83, 0.86, 0.91, 0.96 156: 60 [ 128] | ----- *- ,0.67, 0.85, 0.89, 0.93, 0.96 157: 90 [ 128] | -- *- ,0.79, 0.87, 0.91, 0.94, 0.97 158: 120 [ 128] | --- * - ,0.66, 0.80, 0.88, 0.94, 0.97 159: 150 [ 30] |--* ,0.51, 0.62, 0.63, 0.83, 0.83 160: """ 161: def __init__(i, 162: iterator=range(0,100), 163: epsilon= 0.005, 164: haltOn=None, 165: era= 100, 166: also=None, 167: cohen=The.math.cohen, 168: cmp=lambda old, new: new - old): 169: i.iterator=iterator 170: i.era = era 171: i.epsilon = epsilon 172: i.all = Nums() 173: i.now = Nums() 174: i.also = also 175: i.cohen = cohen 176: i.haltOn = haltOn 177: i.cmp = cmp 178: def thisEra(i,tick): 179: """Internally, 'Do' instances keep time 180: rounded off to the nearest era size.""" 181: return int(tick/i.era)*i.era 182: def small(i,where) : 183: "Small is, say, 30% of std.dev seen so far" 184: return i.all[where].s*i.cohen 185: def close2Best(i,where,when) : 186: "mean within epsilon of best" 187: return i.now[(where,when)].mu > (1 - i.epsilon) 188: def improving(i,where,when): 189: """The difference in means between now and 190: last is not small.""" 191: before = when - i.era 192: new = i.now[(where,when)] 193: old = i.now[(where,before)] 194: return i.cmp(old.mu, new.mu) > i.small(where) 195: def earlyStop(i,when): 196: "Early stop if close to best or not improving." 197: where = i.haltOn 198: if where: 199: if i.close2Best(where,when) or \ 200: not i.improving(where,when): 201: return True 202: return False 203: def seen(i,tick,**d): 204: "Syntactic sugar for adding new items" 205: when = i.thisEra(tick) 206: for where,what in d.items(): 207: i.seen1(when,what,where) 208: def seen1(i,when,what,where): 209: "Primitive data add. Also adds to parent." 210: if i.also: 211: i.also.seen1(when,what,where) 212: i.all[where].seen(what) 213: i.now[(where, when)].seen(what) 214: def loop(i): 215: "Every time we change eras, check for early stop." 216: before = 0 217: for tick in i.iterator: 218: now = i.thisEra(tick) 219: if before and now != before: 220: if i.earlyStop(before): 221: break 222: before = now 223: yield tick,i 224: 225: def done(doings,lo,hi, 226: key =lambda z:'%10s' % z, 227: value=lambda z: '%2d' % z 228: ): 229: d = doings.now 230: wheres= {} 231: whens = {} 232: for where,when in d.keys(): 233: wheres[where] = 1 234: whens[ when ] = 1 235: wheres = sorted(wheres.keys()) 236: whens = sorted(whens.keys()) 237: for where in wheres: 238: print '\n------|',str(where),'|'+'-'*75,'\n' 239: for when in whens: 240: if (where,when) in d: 241: all = d[(where,when)].cached() 242: s = "?" 243: if len(all) > 5: 244: s= xtile(all,show = value, 245: lo=lo,hi=hi,width=25) 246: print '%10s' % key(when),\ 247: '[%5s]' % len(all), s 248: 249: """ 250: ### xtile: Pretty Print Distributions 251: 252: The function _xtile_ takes a list of (possibly) unsorted numbers and 253: presents them as a horizontal xtile chart (in ascii format). The default 254: is a contracted _quintile_ that shows the 10,30,50,70,90 breaks in the 255: data (but this can be changed- see the optional flags of the function). 256: 257: """ 258: def xtile(lst,lo=0,hi=100,width=50, 259: chops=[0.1 ,0.3,0.5,0.7,0.9], 260: marks=["-" ," "," ","-"," "], 261: bar="|",star="*",show= lambda s:" %3.0f" % s): 262: def pos(p) : return ordered[int(len(lst)*p)] 263: def place(x) : 264: tmp= int(width*float((x - lo))/(hi - lo)) 265: if tmp == width: tmp += -1 266: return tmp 267: def pretty(lst) : 268: return ', '.join([show(x) for x in lst]) 269: ordered = sorted(lst) 270: lo = min(lo,ordered[0]) 271: hi = max(hi,ordered[-1]) 272: what = [pos(p) for p in chops] 273: where = [place(n) for n in what] 274: out = [" "] * width 275: for one,two in pairs(where): 276: for i in range(one,two): 277: out[i] = marks[0] 278: marks = marks[1:] 279: out[int(width/2)] = bar 280: loc = place(pos(0.5)) 281: #print loc, len(out) 282: out[loc] = star 283: return ''.join(out) + "," + pretty(what) 284: 285: def _tile() : 286: import random 287: r = random.random 288: def show(lst): 289: return xtile(lst,lo=0, hi=1,width=25, 290: show= lambda s:" %3.2f" % s) 291: print "one", show([r()**0.5 for x in range(1000)]) 292: print "two", show([r()**2 for x in range(1000)])w 293: 294: def _tile2(): 295: def show(lst): 296: return xtile(lst,lo=0, hi=1,width=25, 297: show= lambda s:" %3.2f" % s) 298: print "one", show([0.21, 0.29, 0.28, 0.32, 0.32, 299: 0.28, 0.29, 0.41, 0.42, 0.48]) 300: print "two", show([0.71, 0.92, 0.80, 0.79, 0.78, 301: 0.9, 0.71, 0.82, 0.79, 0.98]) 302: 303: if __name__ == '__main__' : 304: eval(cmd('_do()')) 305:
This file is part of Timm's charming Python tricks.
© 2014, Tim Menzies:
tim.menzies@gmail.com,
http://menzies.us.
Timm's charming Python tricks are free software: you can redistribute it and/or modify it under the terms of the GNU Lesser Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Timm's charming Python tricks are distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU Lesser Public License along with Foobar. If not, see http://www.gnu.org/licenses.