popularity is a near perfect predictor for defects

intro

- the intro should be positive: the great recent discovery is that it is not a module's internal structure that predicts defects, but its use
- social metrics often include detailed developer knowledge (e.g. status, where they eat, etc.). this data is often missing in, say, large OSS databases
- we find we can infer some social information from large OSS code repositories; in particular, which classes everyone is working on (we call these the most popular). a sketch of one way such popularity might be computed appears after the related-work notes below
- therefore, we assess here the value of this concept of popularity. at issue is: is it enough? should we be redesigning our OSS repositories to include more detailed developer knowledge?
- we therefore offer a mathematical characterization of one task: how one might use a metric for controlling a project. we then characterize the value of this notion of popularity in terms of its impact on those control decisions. our particular task: what to inspect next (see the formalization and scoring sketches after the related-work notes)
- we show that popularity is effective for that task (usually, it predicts within 0.4 of the actual number of defects, and it allows us to read less code while finding more bugs)
- we also show that popularity works within 4% of optimal for this one task. that is, not only is popularity effective, there is very little chance that anything else will be more effective for this one specific task
- therefore, at least for this one task, there is no call to redesign OSS repositories to collect more social metrics
- on a meta-note, we would also offer this paper as an example of a ``stage two'' analysis of mining software repositories. in stage one, various proof-of-concept studies showed numerous tools to be effective for the analysis of software repositories. based on that success, we propose that this research area is now mature enough to move to stage two. rather than focusing only on the analysis algorithms (as done in many papers, including many of our own~\cite{}), it is now time to consider the business context in which these tools will be used. here, we offer a simple mathematical characterization of one such business context (optimizing inspections). to be sure, there are probably very many more, and research in this area should not stop just because this paper shows that we have near-optimal metrics for the business needs of one specific context. rather, we strongly suggest that future work offer mathematical characterizations of other business contexts, then explore which metrics/algorithms work best in each

related work

- the usage hypothesis: dynamic aspects predict for defects (Musa, operational profiles)
- the structural hypothesis: static code features predict for defects, e.g. the infamous McCabe v(g) > 10, or Briand's exploration of OO metrics. the drawback here is that the parsing required to understand some measures is very complex (e.g. the call graph of a system that allows pointers to functions)
- the social hypothesis: social context predicts for defects
- the mixture hypothesis: some combination of the above matters
- this paper takes the unusual stand that structure can indicate the social. our pre-experimental intuition was that a mixture of structural and social metrics would be best (based on numerous recent papers). here, we explored whether simple structural measures can predict for the social. perhaps, we thought, they would be a weak partial indicator that would serve as a poor substitute for ``real'' social metrics. the results shown below were hence a surprise.
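(sketch flagged in the intro notes above) the notes do not pin down exactly how popularity is computed from a repository; one plausible reading, consistent with the claim that structure can indicate the social, is to count for each class how many other source files reference it (its fan-in). the python sketch below is only that assumed reading; the function name class_popularity, the import-based scan, and the repository path are all illustrative, not the paper's actual tooling.

\begin{verbatim}
import re
from collections import Counter
from pathlib import Path

# Assumed proxy for "popularity": fan-in of a Java class, measured as the
# number of other source files that import it. Wildcard imports
# (import foo.*;) are ignored by this simple pattern.
IMPORT_RE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+)\s*;", re.MULTILINE)

def class_popularity(repo_root: str) -> Counter:
    """Count, per fully qualified class name, how many files import it."""
    popularity = Counter()
    for java_file in Path(repo_root).rglob("*.java"):
        text = java_file.read_text(errors="ignore")
        # A file that imports a class counts as one "user" of that class.
        for imported in set(IMPORT_RE.findall(text)):
            popularity[imported] += 1
    return popularity

if __name__ == "__main__":
    for name, fan_in in class_popularity("path/to/oss/repo").most_common(10):
        print(f"{fan_in:4d}  {name}")
\end{verbatim}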
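the intro notes promise a mathematical characterization of the ``what to inspect next'' task but do not spell it out in this section; the following is a sketch of one standard effort-aware formalization, and every symbol here ($s_i$, $d_i$, $B$, $\pi$, $R$) is introduced by us only for illustration. let modules $m_1,\ldots,m_n$ have sizes $s_i$ (lines of code) and defect counts $d_i$, and let a metric induce an inspection order $\pi$. under an inspection budget $B$ (lines read), the defects found are
\[
  D_\pi(B) \;=\; \sum_{i=1}^{k} d_{\pi(i)},
  \qquad
  k \;=\; \max\Bigl\{\, j \;:\; \sum_{i=1}^{j} s_{\pi(i)} \le B \Bigr\}.
\]
an oracle order $\pi^\ast$ (e.g. sorting modules by descending defect density $d_i/s_i$) gives a near-upper bound, and a metric can then be scored by the ratio
\[
  R(B) \;=\; \frac{D_\pi(B)}{D_{\pi^\ast}(B)},
\]
so, under this reading, ``within 4\% of optimal'' would correspond to $R(B) \ge 0.96$ over the budgets of interest.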
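under the same assumptions, here is a small python sketch of how a metric's ordering might be scored against an oracle ordering for this task; the function names and the toy numbers are made up for illustration and are not the paper's data.

\begin{verbatim}
# Sketch of the "what to inspect next" scoring described above: rank classes
# by a metric, "read" them in that order until a LOC budget is spent, and
# compare defects found against an oracle ranked by defect density.

def defects_found(order, loc, defects, budget):
    """Defects found when inspecting classes in `order` within a LOC budget."""
    found, spent = 0, 0
    for i in order:
        # Stop at the first class that would exceed the budget, matching the
        # prefix definition of k in the formalization above.
        if spent + loc[i] > budget:
            break
        spent += loc[i]
        found += defects[i]
    return found

def score_metric(metric, loc, defects, budget):
    """Ratio of defects found by the metric's ordering vs. an oracle ordering."""
    idx = range(len(loc))
    metric_order = sorted(idx, key=lambda i: metric[i], reverse=True)
    oracle_order = sorted(idx, key=lambda i: defects[i] / loc[i], reverse=True)
    return (defects_found(metric_order, loc, defects, budget) /
            defects_found(oracle_order, loc, defects, budget))

# Toy usage with made-up numbers:
popularity = [12, 3, 7, 1]      # metric value per class
loc        = [200, 50, 120, 30] # size of each class
defects    = [5, 0, 3, 1]       # known defects per class
print(score_metric(popularity, loc, defects, budget=250))
\end{verbatim}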
not only are our popularity metrics adequate (they generate predictions very close to the actual values), but there is very little room for improvement (at least as measured by the one task explored in this paper).

future work

- java vs. other OO languages
- non-OO languages
- closed source, not just open source