tim@menzies.us
Cc: tse@computer.org
Subject: Decision Re: TSE-0037-0207
Body:

RE: TSE-0037-0207, "A Data Miner for Searching Model-Based Software"
Manuscript Type: Regular

Dear Dr. Menzies,

We have completed the review process of the above referenced paper that was submitted to the IEEE Transactions on Software Engineering. Enclosed are your reviews. We hope that you will find the editor's and reviewers' comments and suggestions helpful.

I regret to inform you that based on the reviewer feedback, Associate Editor Dr. Mark Harman could not recommend publishing your paper to our Editor-in-Chief.

We hope that this decision does not deter you from submitting to us again. Thank you for your interest in the IEEE Transactions on Software Engineering.

Sincerely,

Ms. Jennifer Carruth
on behalf of Dr. Mark Harman
Transactions on Software Engineering.
10662 Los Vaqueros Circle
Los Alamitos, CA 90720 USA
tse@computer.org
Phone: +714.821.8380
Fax: +714.821.9975

***********
Editor Comments

Editor: 1
Comments to the Author:
I found this to be an interesting paper and I hope that the work will continue. However, I am afraid that the referees' comments on the paper are very clear and unequivocal and so I have to recommend rejection of the paper. I hope that you find the referees' comments helpful in developing the work and, perhaps, in revising the paper for submission elsewhere.

***********************
Reviewer Comments

Reviewer: 1

Comments

A. Reader Interest

This article presented an interesting data miner algorithm, called the TAR2 minimal contrast set (or treatment) learner. It can be used to find a "satisfying" (just good enough) conclusion in a complex space (i.e., high-dimensionality, non-linear, non-continuous models built in domains with noise or other uncertainties). As demonstrated in the article, the algorithm can be applied to solving different software development/engineering problems.

B. Content

On Page 5, "in our experience, generating such a fitness function is usually possible," This is not obvious to me at all. I am particularly interested in knowing how to define such a fitness function for a given UML 2.0 model, where the inputs/outputs of the model are signals with data (a signal typically contains many different types of data fields).

The paper recommends an exponential scoring system (Page 13). Why? Is there any particular reason behind this? In addition, is there any rule for scoring different classes? Can I assign any unique score to any class? It seems that the way of scoring classes makes a big difference, since TAR2 seeks attribute ranges that occur frequently in the highly weighted classes and rare in the lower weighted classes.

On Page 18, "There utilities were discretized into four classes, of approximately equal frequency:" It seems arbitrary to divide the value of Equation 2 into different classes. Can I divide it into any number of classes rather than 4? Are there any rules or theory behind this?

C-1: I suggest changing the index term "model-driven" to "model-based" to be consistent with the title of the article. It is more appropriate to use "model-based" since the algorithm is generic, not particularly more suitable for model-driven software engineering.

C-2: More than 100 references were listed, but most of them were just referred to as indices without descriptions. For example, references were used as "[4]-[18]." I suggest cutting all the unnecessary references.

C-3: "model-driven" is very different from "model-based", but these two phrases were used interchangeably in the paper. This makes the paper more difficult to read. For example, on Page 4, "The utility of model-based SE is widely acknowledged. For example, the Object Management Group (OMG) has recently adopted a model-driven architecture ..." The paper seems to imply that the algorithm is suitable for model-driven engineering (say, OMG's MDA), but no good example is provided to demonstrate it.

Reviewer: 2

Comments

Summary
-------

This paper proposes the use of the TAR2 data mining algorithm to simplify the task of understanding the configuration possibilities within model-based software engineering.

The paper starts by outlining model-driven engineering (MDE) and then discusses the potential of search-based software engineering. Subsequently, it describes a specific class of "hard modeling problems", and how these can be understood in terms of "collars and clumps". Then it describes the TAR2 treatment learner, which is suitable for this class of problems. It discusses how TAR2 has been applied to two software process modeling cases.

Main comments
-------------

While there can be value in applying treatment learners to software engineering problems, this paper suffers from a number of (related) issues.

1. The paper spends ample space in explaining what MDE is. It fails, however, to make a connection between models (expressed in, say, UML) typically used in software development projects, and those that can be analyzed by means of TAR2.

2. The case studies discussed are in the area of software process modeling. While this can be useful, this is not representative of (UML) modeling. It is therefore unclear what we can learn from these case studies. The paper should discuss such limitations of the case studies.

3. The main claim in the introduction is: "For a certain class of hard modeling problems, the TAR2 treatment learner can dramatically simplify the task of understanding the configuration possibilities within model-based software engineering. Hence our conclusion will be that it is useful to augment standard modeling methods with treatment learning". While I do believe that TAR2 could help in dealing with the class of hard modeling problems, the paper doesn't show that doing so "dramatically simplifies ... within model-based software engineering".
Consequently, I cannot share the conclusion that, say, UML-based methods and tools should be extended with treatment learning.

4. The contributions of this particular paper are very unclear. They cannot be in sections I-IV, so the only possibility would be that the innovations are in sections V and VI. But section V suggests that TAR2 comes from [94], so I doubt that the contributions are here. This means that the actual innovations are in the case studies, described on pages 16-26. The paper should be clear about this. (Note that section VI.C just summarizes case studies published before, so the contributions cannot be in that section).

5. Lack of critical evaluation. While the paper includes two case studies, these focus on what is working well. What I miss is a critical account of the circumstances under which the proposed techniques work well, and under which they don't.

6. The paper includes over 100 (!) references, many of which are self citations. Overloading the reader with so many references is easy: it is harder to select precisely the most relevant ones. In particular, just citing [4-18] in one go, all self citations, isn't very helpful -- I didn't know where to start reading.

In short, I'd rather see a focused paper discussing the use of TAR2 treatment learning for the optimization of software processes, as discussed in section VI. If you do such a job properly, and convince readers that your case studies are representative, the result would be much better than the present paper, which includes over-general yet unsupported claims.

Other notes & observations.
---------------------------

I am somewhat confused over what exactly you want to express with this paper. When I just started reading the paper, so the title, abstract, indexing terms and the introduction section, I was thinking that this paper was about model-driven software engineering and how to optimize or reach certain design decisions.
However, after reading further, practically half-way through the paper, when I was confronted with the case studies, I was not so sure anymore. The case studies (and for that matter the running example that you are using) do not conform to my image of what is meant by model-driven software engineering. So, in essence that might not be a problem, as maybe my image of MDE is too strict, but you should prepare the reader for what is coming and in my opinion, you have not done so.

Another remark is that you should carefully re-read the paper, as a large number of typos are still present in the text.

While I enjoyed the topic, I was less happy with the paper's structure. I thought that the abstract and introduction were misleading. Maybe the title not so, but after reading the abstract and intro, I wasn't thinking of the title anymore actually.

I am also thinking that you are using a number of forward references, which are very annoying. You are also presuming that the reader has knowledge of a number of terms (sensor/actuator/design of experiments, noise), so maybe you should do something about that to attract a broader audience.

Also, is this the right journal to send this material to? The software engineering side of the paper seems a little bit flimsy... but that might also be influenced by the fact that I think that you should go for a complete rewrite of both the abstract and the introduction. Maybe I can add to that that another symptom of the bad abstract and intro is the fact that the conclusion seems to come from a totally different paper...

Detailed comments (page numbers refer to draft version)
-------------------------------------------------------

Introduction
----------------
- The introduction starts with a reference to the MDE paper by Schmidt, this pretty much sets the tone for this section (and the paper). Is this what you want?
- first paragraph, last line: the use *of* models
- you state that this paper synthesizes prior work and presents new case studies. Does this mean that the two case studies presented here are exclusive to this paper?
- p.3, 2nd line: a handful of key variables (remove one "of")
- p.3, a special case of hard models is discussed.
- p.3: maybe you want to add a reference to these hard models, I for one was unsure of the term. You do define it later, but maybe you can ease the pain with a reference here?
- p.3: I am unsure what sensors and actuators are, is this something from the world of neural nets? Still, maybe you can either remove these terms or explain them (add a reference). I suppose this is more of an AI term than an SE term. Please write your paper according to the target audience.

Section 2
--------------
- Is MDA still recent as you write? For me, MDE is not so recent anymore actually...
- You forgot a final stop "." after [30] (halfway p.4)
- p.4, bottom: Model*s* are useful
- p.4, bottom: Gray *et* al
- p.5, 2nd paragraph: ... successful*ly* applied to model-based SE

Section 3
--------------
- p.6, don't --> do not
- p.6: As argued in the *previous* section
- the first paragraph in section 3, is this your own kind of "definition" or do you borrow it from other people? I am only presuming here, but I think a reference is needed here...
- p.6: Further, as the dimensionality of *the* model increases...
- p.7: A reference to sweet spots maybe? Or is it your own? Unclear...
- p.7. A final stop "." after [75].
- p.8: ... it is usual practice in our method to expand... "your method" is at this point still not introduced. This makes it strange to read this...
- p.8: ... of the our method discussed... --> re-read sentence and correct please
- p.8: Mintzberg's [79] --> genitive form!
- p.8: On page 8 you finally provide a reference to the "hard problems".
  Maybe you either bring this definition forwards in your paper or use the reference you use here in one of the previous sections?

Section 4
-------------
- In general, I think this is a very hard section to explain, and you did a good job.
- I am confused however, because I am unsure what causes what. Do "clumps cause collars" (you write this just above the itemize at the bottom of p.9) or do collars restrict the behavior of a model such that their state space clumps (2nd line of section 4). Maybe it works in both directions, I am unsure, but after reading this section, I am a little confused.
- p.9: won't --> will not
- p.10: d/c >= 64 --> why, this is unclear to me... do I have all the info or am I missing something?
- p.10: ... will naturally select for small collars. (remove "for")
- p.12: ... was linked to every other statement). --> put "." outside brace

Section 5
-------------
- p.13, bottom: weight --> weigh
- p.15, at the bottom you mention an example of 250 000 rows and 100 attributes, why is this not in Table 4? (you call it figure 4)

Section 6
--------------
Here starts the real contribution! (we are at page 16)
- p.16: several case studies. Why not mention the exact number?
- p.17: what are injection rates?
- p.17: footnote 5: what is "mode"?
- p.18: at the bottom you mention that there is only one recommended change, namely hidesign_12 = F. What does mentioning this add for the reader and, more precisely, what do you want to tell the reader with this?
- p.19: Good that you tell something about the applicability here (just below halfway the page)
- p.21: Figures 8, 10, 9 --> you might want to order them
- p.22: An important feature of this second case study is that it analyzes ... --> is this a "feature" or a "characteristic"???
- p.22: Try to find a better solution for introducing the "rainy" operator. I found the "is discussed below" to be awkward, because I wanted to know what it was immediately.
- p.23: Fig 12: you have a closing brace, but no opening brace
- p.24: Apart from rany... mechanisms --> should be mechanism

Section 7
-------------
- Visualization can't --> cannot

References
----------------
- Reference 49: model refactoring
- Reference 50: model-driven
- Reference 60: testing

Reviewer: 3

Comments

The paper describes a data mining approach to model comprehension, and presents a learner that provides empirically better results than other approaches. The particular aim of the paper is to synthesise prior technical work -- particularly on the data mining approach -- and to summarise new case studies (in Section 6). The approach is generally classified as a contribution to model-based software engineering, and both quantitative and qualitative data are presented to support the improvements offered by the new approach.

The intent behind the paper is very suitable for IEEE TSE. In my view, the current paper is not ready for publication. I first make some general observations, and then more specific comments (predominantly about the first few sections, wherein substantial content and presentation improvements can be made).

General comments:

1) The paper attempts to align with model-driven/model-based software engineering, but this appears tacked on and not fully integrated with the data mining/search-based material. In particular, much of the early material in Section 2 is not properly motivated, and the relation to data mining is not made clear. Certainly, data mining is related to MDD, but this paper is focusing on a specific kind of model -- which is very different from the typical kinds of models we see in MDD. I think the paper could be contextualised much better, particularly by removing any attempts to link its contributions with MDD/MDSE.

2) The paper never comes out and says, from the start, what kinds of models are of interest. Instead, there is very general discourse about models.
I think the paper would be very substantially improved by stating, in the first couple of paragraphs, what models are interesting -- and provide an example.

3) The contribution of the paper is not made clear. The introduction makes a statement about synthesising previous work and presenting new case studies. While this may be sufficient for a journal paper, it's not clear that the *results* of the synthesis are actually new and novel. I expected to see new observations resulting from the synthesis (and presumably the new case studies), but I didn't find any. The paper really needs a deeper analysis of the results beyond what is already available in the literature.

4) The abstract and introduction are rather rambling and unfocused. This will be improved by addressing the previous three points, but I mention it separately as I think the authors would do well to spend quite a bit of effort making the introduction more focused as to the paper's novelty, differences from previously published work, etc.

Specific comments:

Abstract:
- model exploration is a critical part of MDD; I would remove the "not necessarily explore" as I'm unaware of any MDD approach that doesn't support and emphasise this.
- the abstract overall does not motivate the paper, explain its contributions in relation to MDD, and explain what is different from previous work. I think it needs rewriting.

Introduction:
- please clarify the scope of models in the paper.
- first paragraph is rather repetitive (e.g., "models are used throughout the software development process" appears twice).
- "- and that use models will only increase" -> "use *of* models".
- define what is a configuration possibility; an example would also be helpful.
- define precisely what is a "binary choice" in a model; choice amongst what? Different features? Different values of boolean variables? Configuration options? I think if you clarify what is a model, confusion might be reduced.
- the fourth paragraph really doesn't make the contribution of the paper clear. You need a precise statement of what the goal of the paper is, and how it differs from your previous work.
- "most improve model output" -- what does this mean? I know what you meant only after I read some of the following sections.
- define precisely what is a "model configuration possibility".
- you should better explain the relevance of the bullet points on page 2 to your overall objectives and goals. Otherwise, these bullet points seem to come out of nowhere.
- the second bullet point on random search doesn't seem to say much -- any technique can easily produce models that are too complex to understand if it is misused. Certainly, automated model transformation can do this as well.
- an example would help clarify what the paragraph at the end of page 2/start of page 3 is trying to say.
- when you discuss "differences" you might want to look at some of the work on difference descriptions, e.g., in aspect-oriented modelling, model versioning, etc.
- what are "standard methods" (second full paragraph on page 3) and "standard modelling methods"? For example, is MDA a standard method -- probably not, based on what you say later.
- explain what are the key variables that control a model (in general, I think a little example will help to clarify most of these undefined terms).

Section 2:
- When you discuss MDA it's worth explaining what a model is in this context; it's quite different from what you mean.
- you should probably refer to work on models of control laws, e.g., for avionics systems. Ursula Martin has done some elegant work on this.
- page 5, why do you note lightweight formal methods? Why Alloy and SCR, and not others, such as JML, Spec#, Eiffel, etc? There are also many UML/MDD tools that allow execution of life cycles, and you missed lots of work on process modelling (e.g., based on SPEM).
- I was surprised to not see any discussion of work on software clustering, e.g., by Holt, Tzerpos, et al.
- "Clarke et al" should be "Clark et al".
- first bullet point on page 5; such representations don't always exist in MBSE; consider Agile Modelling (Ambler) where models are blackboard sketches. Feature Driven Development also treats models as sketches.
- third bullet point: you may want to read work on model management to generalise your discussion on manipulation operations, see, e.g., proceedings of G@MMA'06 (ICSE workshop)
- I found it a bit of a leap going from Section 2 to Section 3. Some linking text would improve readability here.

Section 3:
- reference on "satisficing" would be useful here.
- I think an example problem would be helpful to explain hard modelling problems.
- page 8: at this point, I think the link to traditional MBSE/MDD has been lost. I think you'd do well to go back every now and then and link what you've said to your objectives.
- the example on page 8 suggests that you also need to mention KBSE and more recent techniques that don't require total knowledge, such as model checking in the face of uncertainty (Easterbrook et al).
- "state space is the set of options" -> what is an option with respect to a UML model?
- "Our contribution to hard modeling is to comment" -> this seems rather informal for a TSE paper. I think you want to rephrase this.

Section 4:
- "This research assumes..." -> I thought you were going to present evidence for this, rather than assume it? I think you should also give some examples of where this assumption holds. If it holds in a small set of cases then the results have little value.
- typos, paragraph 2, page 9: "These traces *can* reach.."; extraneous space prior to last period of the paragraph.
- "If all those hotels were designed by different architectures then their internal search spaces would be different."
  I don't really believe this, as it ignores external constraints (e.g., the size of the lot, planning restrictions, budget, etc).
- "Decades later, we can assert that deleting irrelevant variable*s* has proven to be a useful strategy..." -> You probably need to discuss refactoring, as there are several refactoring patterns that aim to accomplish this. Also, data refinement can do similar things.
- "...since they were written by humans" -> what evidence do you have for this? And what about auto-generated models (e.g., in MDA) which can be huge, on the order of thousands of classes with tens of thousands of variables.

Section 6:
- are all these case studies new?
- "In theory" -> "Theory predicts that we should".
- clarify the domain from which these case studies are taken.
- bad typesetting in "after" in last paragraph on page 18.
- did you want to discuss the differences in how data is structured between Figs. 8 and 10?
- page 24, "one other mechanisms" -> "one other mechanism"
- page 27, what evidence do you have for these observations (the bullet points).
- I don't see any overall analysis or assessment of the case studies with respect to your techniques.

Reviewer: 4

Comments

The title of the article is appealing. The abstract defines what the rest of the article is about and is not fully related to the title of the article. The keywords could be more specific.

Sections I, II and III are a very broad introduction to the main topic of the article. They deal with many aspects which are not directly related to the article. A clear example of this is Figure 1, which is not related to the rest of the article at all. In fact, by using many references the main focus of the article is lost. These sections should be much shorter and more centered on the article's goals.

Section IV introduces the concepts of "collars and clumps", which seem to be a property of the TAR2 method. Section V basically explains the TAR2 method and Section VI shows the case studies.
In sections IV and V few valuable new concepts are introduced, since they use the well-known TAR2 method. The reader expects more, since the stated purpose of the article is that it is specifically targeted to model-based software.

The application of the method to the case studies is interesting but not as general as the authors say. They are a simple application of the method to some general issues in software engineering. The authors draw some conclusions about the application of the TAR2 method and that is what the article is about. The impression is that the authors wanted to find something new and general by applying that method, but the conclusions are of minor importance.

The comparison of TAR2 to C4.5 and CART is not new and the results of their application to the software issues are not striking. The comparison of methods is not directly related to these Transactions, and the application of the method to the software inspections and to the CMM method does not provide new significant results, neither in the method nor in the case studies.

As an aside, it is obvious that the TAR2 method obtains simpler explanations than C4.5 and CART since it weights some variables by classifying them as good or bad. This property of the method was already known and it is valuable, but it isn't new.

The authors do not provide significant new results related to the purpose and the title of the article. I recommend that the authors write the article with the focus on new variations of the method which are specifically targeted to "some properties" of software. Otherwise, what we have now is a simple application to two sw.eng. issues (the authors also acknowledge this fact in their conclusions).

Date Sent: 27-Apr-2007
File 1: TSE-0037-0207.doc