\bi
\item ``Tag'' each record with its cluster identifier
\item Using feature subset selection, learn which SILAP attributes are most important
  \bi
  \item 10-way cross-validation, CFS, selected: {\color{red} HS2, EX3, US3, CL3, FR3, DT3, RM3}
  \ei
\item Learn a decision procedure that identifies each cluster
\item If SILAP performs differently for each cluster, then those clusters represent truly different project types.
\ei

{\tiny
\begin{verbatim}
DT3= use of defect tracking; CL3= CMM level; US3= use of standards;
EX3= experience; HS2= human safety

DT3 <= 1: cluster2 (150.0/4.0)
DT3 > 1
|   CL3 <= 4
|   |   US3 <= 2: cluster0 (170.0)
|   |   US3 > 2
|   |   |   DT3 <= 2
|   |   |   |   EX3 <= 2: cluster3 (25.0)
|   |   |   |   EX3 > 2
|   |   |   |   |   HS2 <= 3: cluster0 (14.0)
|   |   |   |   |   HS2 > 3: cluster3 (6.0)
|   |   |   DT3 > 2: cluster0 (42.0)
|   CL3 > 4: cluster1 (19.0)
\end{verbatim}
}

In a 10-way cross-validation, accuracy=99.061\% (!!).

~\\
``The variables in the tree are all about how much a project knows about itself and how much it is willing to share that knowledge with others.''
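
The learned tree can be read as a small decision procedure. The following is a minimal Python sketch that transcribes the branches shown in the tree, assuming each attribute is passed as a number on the same scale used in the tree output. The function name and argument names are illustrative and not part of the original analysis. FR3 and RM3 were selected by CFS but do not appear in the tree, so they are not arguments here.

```python
def classify(dt3, cl3, us3, ex3, hs2):
    """Return the cluster label for one project record.

    dt3 = use of defect tracking, cl3 = CMM level, us3 = use of standards,
    ex3 = experience, hs2 = human safety (legend taken from the tree output).
    This is a hand transcription of the tree, not the learner itself.
    """
    if dt3 <= 1:
        return "cluster2"
    # From here on, dt3 > 1.
    if cl3 > 4:
        return "cluster1"
    # From here on, cl3 <= 4.
    if us3 <= 2:
        return "cluster0"
    # From here on, us3 > 2.
    if dt3 > 2:
        return "cluster0"
    # From here on, 1 < dt3 <= 2.
    if ex3 <= 2:
        return "cluster3"
    # From here on, ex3 > 2; human safety decides the final split.
    return "cluster0" if hs2 <= 3 else "cluster3"


if __name__ == "__main__":
    # Hypothetical record, chosen only to exercise the deepest branch.
    print(classify(dt3=2, cl3=3, us3=3, ex3=1, hs2=4))  # cluster3
```

Writing the tree as early returns makes it easy to see that DT3 is tested twice: once at the root and once more below US3.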