This section defines the assumptions of POM2. There are five sources of these assumptions: (1) the literature reviewed by Port08; (2) the Boehm and Turner paper; (3) the COCOMO literature (used to determine, say, the effects on cost of increased reliability); (4) [Penharkar07] (used to determine team size); and (5) a set of values extrapolated from Port08 that handle larger projects. For example, a close reading of Port08 shows that that paper tacitly assumed small projects (about 10 developers per project with a maximum of 25 tasks). Here, we work with up to 300 developers, so we expand the maximum number of tasks to 300/10*25 = 750.
We acknowledge that our conclusions are sensitive to the assumptions of POM2. However, in defense of our current assumption set, we note that, unlike many other publications in this area, we at least fully state our assumptions. We believe that other studies reporting (say) that agile development is always best may be glossing over the assumptions of those studies.
Port08 implemented a simulator of teams implementing requirements. For the sake of brevity, POM2 calls requirements ``tasks''.
A project is divided into teams and teams perform tasks. Each task has a value and a cost. As we shall see, these values may change over the lifetime of a task (using the mechanics discussed below).
The semantics of the tree is that all child tasks must be completed before a parent task can start. We say that a task becomes "ready" when it has no uncompleted child tasks. Initially, all leaf tasks are "ready" since, having no sub-tasks, they have no sub-tasks that can block their start.
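The readiness rule above can be sketched as follows. This is a minimal sketch, assuming a task is a dict with a "kids" list and a "done" flag; these names are ours, not POM2's.

```python
# Minimal sketch of the "ready" rule: a task is ready once every child
# task is completed. Task representation (dicts) is our assumption.

def ready(task):
    """Leaf tasks have no kids, so they are always ready."""
    return all(kid["done"] for kid in task["kids"])

# Example: a parent with one finished and one unfinished child.
leaf1 = {"kids": [], "done": True}
leaf2 = {"kids": [], "done": False}
parent = {"kids": [leaf1, leaf2], "done": False}

print(ready(leaf2))   # True: leaves are always ready
print(ready(parent))  # False: blocked until leaf2 is done
```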
Initially, all teams are assigned different trees. After the trees are generated, links are added between trees from different teams such that one team must wait for another team to finish some tasks before it can finish its own. Inter-team dependencies are generated as follows:
This method is used since it mimics a common structure seen in industry: teams often share large sub-systems (tasks that use many sub-tasks) rather than very small task assemblies.
The tree is generated as follows:
kids = 2 + 8*rand()^(1/T)
At large T values, the number of kids approaches 10 and, at low T values, the number of kids approaches 2.
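A minimal sketch of this rule, taking the exponent as 1/T (the only form consistent with the limits just stated, since rand() lies in [0,1)):

```python
import random

# Sketch of the kid-count rule with exponent 1/T: as T grows,
# rand()**(1/T) -> 1 so kids -> 10; as T shrinks, rand()**(1/T) -> 0
# so kids -> 2.

def kids(T):
    return 2 + 8 * random.random() ** (1.0 / T)

random.seed(1)
print(kids(100.0))  # typically close to 10
print(kids(0.01))   # typically close to 2
```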
In the usual case, the set of tasks in one tree is handled by one team. However, sometimes the children of a task are completed by multiple teams. For example, in the above, there is a dotted line between task "l" in tree2 and task "d" in tree1. This shows that team1's task "d" requires two sub-tasks from the same tree ("e" and "f") plus one task ("l") from another team implementing tree2.
As a project progresses, teams may discover that they have new tasks. We model this as follows. Suppose that the project size dictates we have, say, 500 tasks. We then build task trees with 500 tasks. At first, all tasks are marked as "hidden". Then, starting at the leaves and working up the project trees, we mark fewer than 500 tasks as "visible" (ensuring that all descendants of visible nodes are also visible). This means that, above the visible tasks, hidden tasks may be lurking that the teams cannot yet see. As the project continues, we may make more parent tasks "visible", at which point teams will "see" that they have new tasks to perform.
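One way to sketch the bottom-up visibility marking, assuming tasks are dicts with a "kids" list and a "visible" flag (names and representation are ours):

```python
# Sketch of the "hidden task" mechanism: build all tasks up front, then
# mark only some of them visible, leaves first, so that every descendant
# of a visible task is also visible.

def mark_visible(root, quota):
    """Post-order walk: children are considered before their parent,
    and we stop once the quota of visible tasks is spent."""
    count = 0
    def walk(task):
        nonlocal count
        for kid in task["kids"]:
            walk(kid)
        # a parent only becomes visible if all its kids already are
        if count < quota and all(k["visible"] for k in task["kids"]):
            task["visible"] = True
            count += 1
    walk(root)
    return count

leaf = lambda: {"kids": [], "visible": False}
root = {"kids": [leaf(), leaf(), leaf()], "visible": False}
print(mark_visible(root, 3))  # three leaves visible, root still hidden
print(root["visible"])
```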
The above process handles inter-team dependencies. As we move further up the project tree, it becomes more likely that we will make "visible" a parent node that has children assigned to different teams. That is, it is very unlikely that tasks near the leaves will be completed by more than one team. However, as more tasks are completed, large sub-systems are built which, sooner or later, must be combined with sub-systems built by other teams.
Each node in the tree has N kids at probability 1/(d^(N+1)). By increasing "d", we increase the number of dependencies. For our sims, we'll use d = 2..10.
Teams maintain a plan: a set of tasks taken from their heap.
Teams process their plans in iterations. The total number of iterations is 6. After each iteration_i, a project continues with probability 0.9. This means that a project assigned 6 iterations has only a 0.9^6 = 53.1% chance of lasting that long. This early termination rule models the industry reality that many projects are canceled before their originally planned end date.
The number of tasks that can be completed in each iteration is controlled by the budget. Following Port08, we say that the budget for an iteration is
budget = (total initial cost of all tasks) / num_iter
where "total initial cost" is computed from the costs of the plans, as they are known at the start of the project, after criticality is applied (see below). Note that this base budget will be adjusted to reflect the effects of using less skilled programmers (see below).
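The budget rule above can be sketched as follows; task_costs holds the initial, criticality-adjusted costs (names are ours):

```python
# Sketch of the per-iteration budget rule: the total known up-front cost
# is split evenly across the planned iterations.

def iteration_budget(task_costs, num_iter):
    return sum(task_costs) / num_iter

costs = [10, 20, 30, 40, 50, 60]   # hypothetical initial task costs
print(iteration_budget(costs, 6))  # 210 / 6 = 35.0
```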
At each iteration, teams copy some unfinished tasks from the heap to their plan. The teams may also reflect over the relative cost/value of the tasks in their plan and (possibly) reorder their tasks using a planning "policy". POM2 experiments with four policies: three of which are variants of agile programming while the fourth emulates traditional non-agile software development:
At the end of an iteration, all uncompleted tasks left in the plan are passed to the next iteration. Also, all the tasks completed in that iteration are (a) marked "not waiting" and (b) moved to a "done" set.
As each new task is added to "done", statistics are kept on the team's performance:
By tracking (sum of costs,sum of values) on a (X,Y) plane, for X = 1 to N, we can visualize the time-varying performance of the team:
        ^                  KEY:
        |          ** *      * = from policy
 total  |      ****
 value  |  ******
        | *
        |*
        |----------------->
             total cost
When the project is finished, this plot can be compared to an optimal frontier obtained by sorting all the "done" tasks using the final value/cost of those tasks. Note that the sort used to generate the optimal frontier relies on information not available halfway through the simulation (specifically, the _final_ cost/value of each task). This is the sort order offered by a god-like being who knows the future cost/value of tasks. No method can do better than the optimal frontier.
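A sketch of how such a frontier could be computed from the final (cost, value) pairs of the "done" tasks (representation as tuples is our assumption):

```python
# Sketch of the optimal frontier: sort finished tasks by their *final*
# value/cost ratio (hindsight information) and accumulate cost and value.

def optimal_frontier(done):
    """done: list of (cost, value) pairs with their final numbers.
    Returns the cumulative (cost, value) points of the frontier."""
    ordered = sorted(done, key=lambda cv: cv[1] / cv[0], reverse=True)
    points, cost, value = [], 0.0, 0.0
    for c, v in ordered:
        cost += c
        value += v
        points.append((cost, value))
    return points

tasks = [(4, 2), (1, 3), (2, 4)]  # hypothetical (cost, value) pairs
print(optimal_frontier(tasks))    # [(1.0, 3.0), (3.0, 7.0), (7.0, 9.0)]
```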
If the project finishes after completing (say) 100 tasks, then we score each run as follows:
POM2 explores a set of options for five factors identified by Boehm and Turner that distinguish agile projects from traditional plan-based projects. This section summarizes Boehm and Turner's descriptions of those factors, as well as other literature that explicates further details.
Agile methods are untested on safety-critical products: they present potential difficulties with simple design and lack of documentation. Plan-based methods, on the other hand, evolved to handle highly critical products but are hard to tailor down efficiently to low-criticality products.
Criticality is measured in terms of losses due to the impact of defects and ranges from "none" (best for agile development) to "impact on discretionary funds" to "impact on essential funds" to "loss of single life" to "loss of many lives". Plan-based methods are best suited to projects that must be carefully planned, lest defects cause loss of many lives.
According to the COCOMO research, criticality affects cost as follows:
      |     1    |       2       |     3     |      4      |     5     |
      | very low |      low      |  nominal  |    high     | very high |
loss  |   none   | impact on     | impact on | single life | many      |
      |          | discretionary | essential |             | lives     |
      |          | funds         | funds     |             |           |
cost' |   0.82   |     0.92      |   1.00    |    1.10     |   1.26    |
POM2 assumes that all tasks performed by one team are of equal criticality (this reflects the industrial reality that specialist teams work on particular portions of the code base). Once that criticality is known, we adjust the iteration budget as follows:
budget' = budget / criticality
where budget comes from above and criticality is one of 0.82, 0.92, 1, 1.1, 1.26. That is, increasing criticality means we can code less each iteration (since we need to take greater care with each line of code).
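A sketch of the criticality adjustment; the rating names follow the COCOMO table above:

```python
# Sketch of the criticality adjustment: a more critical project gets a
# smaller effective budget per iteration.

CRITICALITY = {"very low": 0.82, "low": 0.92, "nominal": 1.00,
               "high": 1.10, "very high": 1.26}

def adjust_for_criticality(budget, rating):
    return budget / CRITICALITY[rating]

print(adjust_for_criticality(100.0, "nominal"))    # unchanged: 100.0
print(adjust_for_criticality(100.0, "very high"))  # about 79.4: less gets coded
```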
While agile methods work best for simple design and continuous refactoring in highly dynamic environments, they present a source of potentially expensive rework for highly stable environments.
Plan-based methods, on the other hand, are better suited for detailed plans and "big design up front". This approach is excellent for highly stable environments.
According to Boehm and Turner, dynamism is measured in terms of the percent of requirements changed each month and has the range 50% (best for agile) to 30 to 10 to 5 to 1 (best for plan-based).
According to Port08, dynamism affects tasks as follows:
value'= value + (value*N(0,sigma))
Note that, as dynamism increases, we discover more new tasks and the value of the tasks becomes more variable.
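The value-update rule can be sketched as follows; the sigma setting in the example is hypothetical:

```python
import random

# Sketch of the dynamism effect on task value: each iteration, a task's
# value drifts by a zero-mean Gaussian whose sigma grows with dynamism.

def perturb_value(value, sigma):
    return value + value * random.gauss(0.0, sigma)

random.seed(3)
v = 100.0
for _ in range(3):
    v = perturb_value(v, 0.1)  # sigma = 0.1 is a hypothetical setting
    print(round(v, 1))
```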
Note also a special case of the above process. If the dynamism parameter tells a team to find some number of "new" tasks, but there are not that many "visible" or "ready" tasks, then a team's plan may cost less to complete than the budget for that iteration. In this case, a team can finish an iteration with leftover budget and no remaining tasks that it can complete. If this happens, the budget is kept until the next round. This is accomplished by attaching two numbers to each team: totalAccumulatedBudget and totalSpentBudget. Budget is burned by setting totalSpentBudget to totalAccumulatedBudget. At the start of each iteration, totalAccumulatedBudget is incremented by the budget for the iteration.
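The bookkeeping can be sketched as follows. We model burning with an explicit amount, which reduces to setting totalSpentBudget to totalAccumulatedBudget when a team spends everything it has; the class and method names are ours.

```python
# Sketch of the leftover-budget bookkeeping: each team carries two running
# totals, and unspent budget simply accumulates into the next iteration.

class TeamBudget:
    def __init__(self):
        self.total_accumulated = 0.0
        self.total_spent = 0.0

    def start_iteration(self, budget):
        self.total_accumulated += budget

    def burn(self, amount):
        self.total_spent += amount

    def available(self):
        return self.total_accumulated - self.total_spent

team = TeamBudget()
team.start_iteration(35.0)
team.burn(20.0)            # ran out of visible, ready tasks
team.start_iteration(35.0)
print(team.available())    # 15 left over + 35 new = 50.0
```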
Before continuing, we digress to discuss another method (which we do not use) for handling leftover budgets. We considered allowing a team to work on tasks from other teams. However, Brooks's Law (adding programmers to a late task makes it later) convinced us of the folly of that approach. Teams are specialists in the quirks and capabilities of their own code base. An outsider coming in to work temporarily on a small part of another team's code base can be quite unproductive (since they do not know the quirks of that code). Further, they can slow down the remaining team (while the team teaches the newcomer the tricks of that code).
For a fully agile project, changing task values means re-sorting the tasks in the plan to ensure that the most cost-effective task is completed first. However, as discussed in this section, the corporate culture may inhibit that re-sorting process.
Agile processes thrive in a culture where people feel comfortable and empowered by having many degrees of freedom and thrive on chaos. Plan-based methods, on the other hand, thrive in a culture where people feel comfortable and empowered by having their roles defined by clear policies and procedures. Personnel in plan-based projects thrive on order.
Culture is measured in terms of the percent of staff thriving on chaos and has the range 90% (best for agile) to 70 to 50 to 30 to 10 (best for plan-based). At culture=90%, the changes to task value described above are used when re-sorting tasks for the next iteration. However, at culture=10%, developers are loath to change the initial project plan since this introduces a degree of disorder into their work life.
We therefore distinguish between the true value and the accepted value of a task, calculated as follows:
accepted = value + (value * N(0,sigma) * culture)
So, as the percent thriving on chaos decreases, "culture" drops towards 0 and the accepted value stays close to the old value. Note that the accepted value is used to re-sort the tasks (exception: the plan-based policy, which never re-sorts) but, when performance statistics are gathered, we use the true value.
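The accepted-value rule can be sketched as follows; the culture scaling is taken directly from the formula above:

```python
import random

# Sketch of the accepted-value rule: culture scales how much of a value
# change the team is willing to acknowledge when re-sorting its plan.

def accepted_value(value, sigma, culture):
    """culture in [0,1]: 0.9 = thrives on chaos, 0.1 = thrives on order."""
    return value + value * random.gauss(0.0, sigma) * culture

# At culture = 0 the accepted value never moves off the true value:
print(accepted_value(100.0, 0.5, 0.0))  # always 100.0
```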
Agile and plan-based development require different kinds of personnel. Before discussing those differences, we introduce Cockburn's personnel levels.
According to Cockburn, level 3 developers are able to revise a method, breaking its rules to fit an unprecedented new situation. Level 2 developers, on the other hand, prefer to tailor a method to fit a precedented new situation.
Moving down the scale, level 1A developers can, with training, perform discretionary method steps such as sizing stories to fit increments, composing patterns, compound refactoring, or complex COTS integration. Next on the scale are the level 1B developers. With training, these developers are able to perform procedural method steps such as coding a simple method, simple refactoring, following coding standards and CM procedures, or running tests.
According to Boehm and Turner, agile software development requires the continuous presence of a critical mass of scarce Cockburn Level 2 or 3 experts. In agile projects, it is risky to use non-agile Level 1B people.
Plan-based methods, on the other hand, need a critical mass of scarce Cockburn Level 2 and 3 experts during project definition, but can work with fewer later in the project.
Personnel is scored according to the following table (from Boehm and Turner). In this table, agile methods are best suited to left-hand-side projects while plan-based methods are best suited to right-hand-side projects:
A    | % alpha-level programmers (level 2 and 3) | 45  | 50  | 55  | 60  | 65  |
B    | % beta-level programmers (level 1a)       | 40  | 30  | 20  | 10  |  0  |
O    | % gamma-level programmers (level 1b)      | 15  | 20  | 25  | 30  | 35  |
SUM: | 1*A + 1.22*B + 1.6*O                      | 118 | 119 | 119 | 120 | 121 |
Personnel affects the cost of tasks.
After Port08, we say that the base cost of a task is a random variable from 0 to 100. Personnel skill changes this cost.
Personnel have different productivity scales, given by the COCOMO [Chulani99] dimensions:
     | vl   | l    | n    | h    | vh   |
acap | 1.42 | 1.19 | 1.00 | 0.85 | 0.71 |
pcap | 1.34 | 1.15 | 1.00 | 0.88 | 0.76 |
Assuming that vh, h, n, l, vl represent alpha, alpha, beta, gamma, gamma developers respectively, we can compute the cost implications of using different kinds of personnel. This leads to values describing how using beta and gamma programmers slows down task completion: gamma = 1.6; beta = 1.22; alpha = 1.
These values are used to adjust the budget available to an iteration as follows:
budget'' = budget'/(1.6*O + 1.22*B + A)
where budget' was calculated above and (O, B, A) comes from one column of the above table. That is, less can be completed in each iteration if a team comprises higher numbers of gamma- and beta-programmers.
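A sketch of this adjustment, with (A, B, O) expressed as fractions rather than percentages so the divisor stays near one (an assumption on our part; the source leaves the scaling implicit):

```python
# Sketch of the personnel adjustment: the budget shrinks as the share of
# slower (beta, gamma) programmers grows. (A, B, O) are fractions here.

def adjust_for_personnel(budget, A, B, O):
    return budget / (1.6 * O + 1.22 * B + A)

# Leftmost column of the table, as fractions: 45% alpha, 40% beta, 15% gamma.
print(round(adjust_for_personnel(100.0, 0.45, 0.40, 0.15), 1))  # about 84.9
```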
Note that the net effect of all the above is minimal. As shown in the "SUM:" row of the above table, the above calculations lead to adjustment factors ranging only from 118 to 121. Hence, for the current study, we do not apply the personnel adjustment.
Agile methods are well matched to small products and teams. Their reliance on tacit knowledge limits scalability. Plan-driven projects, on the other hand, use methods evolved to handle large products and teams that are hard to tailor down to small projects.
Size is measured in terms of the number of personnel and has the range 3 (best for agile), 10, 30, 100, 300 (best for plan-based). The size of a POM2 project is picked at random from this range and the number of tasks is then set to size * 25/10.
Once size is known, we build teams as follows.
Originally, we planned to apply the [Penharkar07] results, which report that the size of software development teams has (min, mean, sd) = (1, 8, 20). However, this leads to a large number of single-person teams. We therefore modified that result slightly, in consultation with some of our NASA colleagues: POM2 selects team sizes randomly from the following distribution until the total of the team sizes exceeds size.
1 ***
3 *********
6 *****************
9 ****************************
12 **********************
15 ******************
18 **************
21 **********
24 ******
27 ***
30 **
33 *
36 *
39 *
42 *
We use the following algorithm:

    Size = 300*rand()
    t = 0
    while Size > 0
        t++
        team[t] = max(1, N(8,20))
        Size = Size - team[t]
Each team of size team[t] is allocated team[t]/(total of all team sizes) * Tasks of the tasks.
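The loop above can be sketched in runnable form. Rounding N(8, 20) to whole people is our assumption; the source does not say how fractional draws are handled.

```python
import random

# Runnable sketch of the team-building loop: draw team sizes from
# N(mean=8, sd=20), clipped below at 1, until the project size is used up.

def build_teams(size, mean=8.0, sd=20.0):
    teams = []
    remaining = size
    while remaining > 0:
        t = max(1, round(random.gauss(mean, sd)))  # whole people, at least 1
        teams.append(t)
        remaining -= t
    return teams

random.seed(4)
teams = build_teams(300)
print(len(teams), sum(teams))  # sum is at least 300
```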