Dear Mr. Tim Menzies:
I have received review reports on your manuscript, "Real-time
Optimization of Requirements Models", which you submitted to Automated
Software Engineering.
Based on the advice received, your manuscript could be reconsidered for
publication should you be prepared to incorporate major revisions.
When preparing your revised manuscript, you are asked to
carefully consider the reviewer comments which are attached, and submit
a list of responses to the comments. Your list of responses
should be uploaded as a file in addition to your revised manuscript.
PLEASE NOTE: YOUR REVISED VERSION CANNOT BE SUBMITTED IN .PS OR .PDF.
IN THE EVENT THAT YOUR REVISED VERSION IS ACCEPTED, YOUR PAPER
CAN BE SENT TO PRODUCTION WITHOUT DELAY ONLY IF WE HAVE THE SOURCE
FILES ON HAND. Submissions without source files will be returned
prior to final acceptance.
In order to submit your revised manuscript electronically, please access the following web site:
http://ause.edmgr.com/
Your login is: timmenzies
Your password is: menzies323
Please click "Author Login" to submit your revision.
I look forward to receiving your revised manuscript.
Sincerely,
Robert J. Hall, PhD, FASE
Editor-in-Chief
Automated Software Engineering
COMMENTS FOR THE AUTHOR:
Reviewer #1: The paper concerns real-time adaptation of ultra-lightweight
requirements models (within a 5-second tolerance). The paper uses search-based
optimization methods associated with the rapidly developing field
of SBSE to optimize models of requirements. The work is written in a
compelling and direct style (which occasionally is too direct and a
little grating), and the contributions are clearly motivated,
explained, and backed by some evidence in the form of empirical work. I
feel that this paper is definitely worthy of publication in JASE.
However, I do have some suggestions for improvement; I also noticed
some missing related work that should be included and some weaknesses
in presentation that should be addressed. I think that a moderate
revision will easily be sufficient to satisfy these change requests.
The revisions concern several broad aspects which need attention:
1. Claims about search-based techniques (these need to be reconsidered in the light of the No Free Lunch theorem).
2. Related work: some very closely related work on SBSE for requirements is missing.
3. A clear explanation of the fitness function and representation is needed.
4. A clear setting out of the empirical software engineering aspects is needed: research questions and answers to them.
These changes do not require any new experimental work, merely a
refactoring of the paper and the results, some more clarity in
places, and the inclusion of related work. I think it is a "major
revision" (in terms of the category of referee outcome), but I am sure it
can be achieved with relative ease and that it will make the impact of
the paper much greater.
Here are some details:-
The authors make their code available, which is helpful. It is not clear
whether the real-world requirements data from JPL are also included in the
online provision. I would like some clarification on this issue in the
revised version. Such provision would be a great help to follow-on
research, though I realize that for reasons of commercial
confidentiality this may not be possible.
The take on simulated annealing is an odd one. Clearly it is well
known that no search technique can outperform all others for an
arbitrary problem (consider the No Free Lunch theorem). This is, so far
as I know, well understood within the SBSE community and its
publications, so this rather special rendition (that SA cannot find the
best solution for all problems) seems rather curious: why attack SA in
this way, rather than any other technique, HC, SA, Tabu, GAs, ...? The
authors cite several applications that have used simulated annealing, but
these are different problems in SBSE, so obviously the mileage will differ
for each problem, and one cannot generalize in the way that the authors do.
They include work on testing and modularization as examples where SA is
sub-optimal. OK. But these are very different problems and have no
necessary connection to requirements models. In fact, Mancoridis and
colleagues found that hill climbing can outperform simulated annealing *for*
*modularization*; this
says *nothing* about how it would perform for your problem. It does not
make sense to cite examples from other problems here in the
context of showing that SA is not the best available technique.
So the authors' statement:-
"
This is an exciting
result since it means that current results from simulated annealing(e.g.[5,8,10,62,73])
could be greatly improved, just by switching to an alternate search engine.
"
really needs to be re-written. This statement is naive (sorry, but it
really is!). It shows that the authors did not really understand the No
Free Lunch (NFL) theorem, which is central to search-based
optimization. Just because technique X beats SA on problem Y does not
mean that it will beat it on all problems. The problems cited in the
list above by the authors are all very different. NFL means that we
cannot generalize from them, and we certainly cannot claim any carry-over
to the requirements modeling problem to which the authors apply their
techniques in this paper.
I am not saying that this invalidates the paper, but it does require a
careful toning down of claims here. I would strongly recommend that the
authors read about the NFL too in this context.
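For reference, and stated only roughly, the Wolpert and Macready (1997)
formulation of the theorem says that for any two search algorithms $a_1$ and
$a_2$, performance summed over all possible objective functions $f$ is
identical:

$$\sum_f P(d_m^y \mid f, m, a_1) = \sum_f P(d_m^y \mid f, m, a_2),$$

where $d_m^y$ is the sequence of $m$ cost values sampled so far. Any advantage
a technique enjoys on one class of problems is paid for on another, which is
exactly why claims of superiority must be tied to the specific problem class
under study.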
In general, I am not sure that SA is even the most widely used technique
in SBSE (though even if it were, this would not be relevant to the
authors' argument here, because different SBSE problems have different
formulations and therefore results will differ from the requirements
modeling problem). I believe that GAs are the most widely used SBSE
technique (it certainly seems so from the SBSE repository).
I recommend not using the phrase "search engine" when referring to
search-based optimization algorithms. It could easily be confused
with Google, which would not help!
In related work, some recent work by Finkelstein on multi-objective
requirements models for fairness is missing (this was published in RE
08) and is clearly relevant, since it involves a search-based
optimization model of requirements engineering. Also, there was a survey
by Finkelstein et al. on requirements optimization at REFSQ 2008. These
two citations should be included in the revised version of the paper
and worked into the related work section. The authors should also check for
other related work on search-based optimization for requirements. They
cite work on SBSE, which is fine, but they need to get the work
specifically on SBSE for requirements into their related work section.
For instance, other work on SBSE for requirements that is clearly missing and needs to be included is:
1. the paper by Gunther Ruhe at FSE 08,
2. the paper by Zhang et al. (GECCO 2007), and
3. the work by Bagnall (I&ST 2001).
The paper already has a lot of references, but these are clearly relevant and should be worked into the related work section.
Some of the citations are messed up too. For example, citation [5] reads "S M B and S. Mancoridis"; S M B is Brian Mitchell, I believe.
Clearly some more scholarly care is required here, both to chase up
relevant work and to ensure that the citations are presented correctly.
Does Fig 4 add sufficient value to justify a whole page?
The colourful writing style comes up again on p17; Uribe and Stickel
"struggled valiantly". I don't think that this style is suitable for an
academic journal. Maybe I am old-fashioned, but I think it should be toned
down and the style made more dispassionate. Did they really struggle
"valiantly" anyway? We are talking about research work, not a major
international conflict. I hope the authors understand why I raise this.
It may well put readers off, and that would be a pity.
Fig 10 is too small to read properly.
I missed a clear and unequivocal explanation of what the fitness
function and representation were for the simulated annealing. This is
standard for SBSE work; the representation and fitness function should
be very easy to locate in the paper and should be defined clearly and
formally, with some informal explanation, so that other authors can
find them and replicate the work with their own pet search-based
optimization technique.
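To illustrate the level of explicitness I have in mind, a skeleton along the
following lines would be enough; note that the bit-vector representation and
toy fitness below are purely hypothetical placeholders, and it is precisely
these two pieces that should be replaced by, and defined as, the paper's
actual encoding and scoring function:

import math
import random

# Hypothetical placeholders: the representation (how a candidate requirements
# configuration is encoded) and the fitness function (how a candidate is
# scored) are exactly the two things the paper needs to pin down.

def random_solution(n=20):
    # Representation: here, a fixed-length vector of n binary decisions.
    return [random.randint(0, 1) for _ in range(n)]

def mutate(solution):
    # Neighbour move: flip one randomly chosen decision.
    neighbour = solution[:]
    i = random.randrange(len(neighbour))
    neighbour[i] = 1 - neighbour[i]
    return neighbour

def fitness(solution):
    # Toy objective (to maximise): the number of enabled decisions.
    return sum(solution)

def simulated_annealing(iterations=10000, t0=1.0, cooling=0.999):
    current = random_solution()
    best = current
    temperature = t0
    for _ in range(iterations):
        candidate = mutate(current)
        delta = fitness(candidate) - fitness(current)
        # Always accept improvements; accept worsenings with Boltzmann probability.
        if delta >= 0 or random.random() < math.exp(delta / temperature):
            current = candidate
            if fitness(current) > fitness(best):
                best = current
        temperature *= cooling
    return best, fitness(best)

if __name__ == "__main__":
    print(simulated_annealing())

With the representation and fitness stated this plainly, other authors can
drop in their own search technique and replicate the comparison.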
I also felt that the preparatory sections built up lots of background,
but the meat in the results section was skimmed over by comparison. I
would like to see this section extended to explain clearly the research
questions, how the associated hypotheses were tested, and how the findings
map back to the research questions. This is fairly standard for
empirical software engineering research and makes it much easier to
understand the results.
I liked the novel style of the opening ("The room is crowded and everyone is
talking at once."). However, this may put off some readers. I expect that
this suggestion rather cramps the authors' style, but they should
consider boxing out the scenario as a figure. At present the paper is
a compelling read, but starting off like this really could
distract more traditional SE readers, who have views about how
scientific work should be presented. I would not make this a
requirement for major revisions, so if the authors want to overrule me
on this then that's fine. I expect they knew when they opened with this
provocative style that it would draw comment. If other referees comment
on it, then I think it definitely should be added to the list of "must
fixes".
Reviewer #2: This paper presents a comparative performance study of
several optimization algorithms that can play roles in automated
requirements engineering. The results suggest that the authors'
preferred algorithms have the potential to outperform some competing
approaches.
The novelty of the work remains unclear. It appears to be little beyond
an implementation and a very limited performance study of known
techniques. The paper presents little of real substance other than a
small performance-measurement experiment, the results of which are
interpreted as supporting the appropriateness of the algorithm for use
in automated requirements engineering.
The title and abstract and indeed the overall narrative of this paper
are misleading. This is a paper on a small-scale runtime performance
study *masquerading* as a paper on requirements analysis. The link
between the algorithms and any real advance in requirements engineering
is extremely tenuous, and, in any case, not tested or supported by
evidence.
The writing in this paper is somewhere between terrible and just plain
poor: at the level of sentences, paragraphs, and overall narrative
structure. The paper is rife with errors in grammar and usage. It's far
too long for the actual new material presented. It takes far too long
to get to the point. It never really explains the model at issue. The
paper doesn't adequately state the problem it's addressing, the
specific technical approach it's proposing to address the problem, the
novel claims being made for the approach, the experimental or
analytical support for these claims, or the overall conclusions one can
draw from this work. To the extent that the paper does explicitly state
its contributions, the forms of the contributions are not appropriate
for a research paper. The paper is really dominated by tutorial
background. The authors lose a good deal of credibility by making a
whole slew of writing errors "right out of the block," on pages one and
two.
The claims made explicitly or tacitly in this work are largely
unsupported by evidence. The scenarios the paper describes are not
credible. With due respect to the authors, this paper is nowhere near
being suitable for journal publication. It is really a disservice to the
reviewing community to submit such work for peer review. Authors should
at least proofread their own work before asking others to do so.
Some detailed comments:
Opening scenario isn't credible. A decision as substantial as a
redesign of the file system would not be made without documenting it in
a way that would make everyone with an interest in that aspect of the
system aware of the decision. If major architectural decisions were made in
this manner, well, the project has bigger problems to worry about than
tooling. I suggest that the scenario be reworked into something more credible.
There's a grammatical error at the very start of the intro, on page 1:
experts on one side the room can make decisions that one hinder
decisions and the group is unaware of this conflict.
There's another grammatical error on page 1: As our tools grow better,
and they will be used by larger groups who will build more complex
models.
The presence of two obvious grammatical errors on page one of a journal
submission is not a sign of care in preparation and not a sign of a
promising situation.
There's a grammatical error at the top of page 2: The problem of
co-ordinating group discussions is challenging in the 21st century
net-enabled world where participates communicate via via multiple
channels;
Numerous statements in this paper are speculative and without support, e.g., For example,
suppose a requirements analyzer finds a major problem or a novel better solution. Such
a result would command the attention of the whole group, in which case everyone in
the room would interrupt their current deliberations to focus on the new finding.
The writing in this paper has many problems. One problem is that
antecedents are not always clear, as in this sentence: "A premise of
this approach is that requirements analyzers offer feedback." To what
does "this approach" refer?
References are made to important works without citations, e.g.: "Based on current projections from JPL."
This paper states that the main contribution is a problem definition. A
problem definition is not really a contribution worth publishing
without substantiation of the importance of the problem and a solution.
"The specific contribution is to define the problem of real-time
requirements engineering."
The paper is very unclear, at least in the first three pages, on the
nature of the formalism being used to represent requirements. It's
clearly some kind of logic-based approach, but what is it precisely?
Moreover, the paper presents at most anecdotal evidence that the new
method is promising: At least for the models explored in this paper, we
can achieve optimizations in around 10^-2 seconds.
Nor is commenting on a search engine a research contribution: Another
important result from this paper is to comment on a standard search
engine, used widely in the field of search-based software engineering
(SBSE).
Spelling error: we are willing to trade off representational or constrain expressiveness for faster runtimes.
This paper makes strong claims for the proposed approach before the
approach is explained and certainly before supporting evidence is
offered, e.g.: It is trivial for our preferred method (KEYS and KEYS2)
to offer robust information around partial solutions.
Ambiguous antecedent: DDP provides a succinct ontology for representing this design process.
Use of undefined term: What assertions? The DDP tool supports a graphical interface for the rapid entry of the assertions.
Unexplained and unsubstantiated claim: Cost savings in at least two
sessions have exceeded $1 million, while savings of over $100,000 have
resulted in others.
Critical but unexplained and unsupported assumption: Our proposed
solution to real-time requirements optimization assumes that the
behavior of a large system is determined by a very small number of key
variables.
Reviewer #3:
I think there is a kernel of some interesting novel work here.
Unfortunately, the paper in its current form is not publishable
for several reasons.
1. It apparently has not been proofread even quickly. The errors here
are obvious and quite frequent.
2. The whole discussion of the "real time requirements
optimization" problem, while a nice idea and interesting,
falls flat by being bogus. It is interesting to try to calculate
how fast things have to go today so that, five years from now,
the tools can handle the load. That is good.
However, I cannot follow much of the actual quantitative reasoning.
Here are some questions:
- How do you know X is likely between 1 and 4? Do you have measurements
  or an argument to back this up? How do you know it isn't 200?
- You claim that KEYS2 meets the 0.01 target, but *none* of the
performance results in Figure 11 meet this, except for the
small model. It seems like 0.038 now could balloon into
(5sec)x3.8 or even (5 sec)^(3.8).
- Why is there one number (0.01) for all current-day models?
It seems like Model 5 is already 3 times bigger than Model 2,
so why is it necessary that both meet the 0.01 requirement?
Or are you claiming that 0.01 is the bound for the biggest model
of today? (If so, then KEYS2 is off by a factor of nearly 4.)
- In the appendix, you say machines in 2013 will be "560%" faster,
  but you actually mean "5.6 times faster". And this is wrong, because
  you have the Moore's Law calculation wrong: the doubling is every
  18 months, not every two years. So this number should be a factor of
  about 10.1, not 5.6 (see the arithmetic spelled out after this list).
  This presumably means we should shoot for 0.02 today, no?
- Why are models expected to grow by a factor of 8? Is it that we cannot
  type fast enough today? Will we be eight times smarter then?
  What is the basis for this claim?
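To spell out the arithmetic behind the machine-speed point above (assuming the
five-year, i.e. 60-month, horizon that the appendix appears to use): doubling
every 24 months gives

$$2^{60/24} = 2^{2.5} \approx 5.66$$

(the paper's "5.6 times"), whereas doubling every 18 months gives

$$2^{60/18} = 2^{10/3} \approx 10.08,$$

i.e. a factor of roughly 10.1.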
3. It is weird in Figure 12 that the x-axes go 1, 10, 20, 30, etc.
Why isn't this 0, 10, 20, 30, ...?
4. First paragraph of Sec 7: why is it "unfair" to compare
BDD-based against search-based approaches? Aren't they solving the same problem?
5. Sec 7.2: what is the variable "C"?
6. I'm not sure it is valid to make claims about KEYS2 performance
versus MaxWalkSAT when you are actually comparing to MaxFunWalk.
You have to do a better job showing how they are related. (E.g.
at what level are they the same algorithm?)
7. Sec 7.3 para 2: you claim the scoring function is "a Euclidean distance
measure". I'm not sure what you mean by that. Do you mean an
admissible metric? An L-p norm? What? Please say more, convince
us of this, and show why it is important.
8. Last para of Sec 9: "...some requirements models". This comes out of
the blue. For which requirements models does the KEYS approach fail?
9. In general, what are the limits of applying KEYS/KEYS2? When can
we expect them to fail? To be worse than A*?