RE: TKDE-0074-0305, "Incremental Discretization and Bayes
Classifiers Handles Concept Drift and Scales Very Well"
Manuscript Type: Concise

Dear Dr. Menzies,

We have completed the review process of the above referenced
paper that was submitted to the IEEE Transactions on Knowledge
and Data Engineering.  Enclosed are your reviews. We hope that
you will find the editor's and reviewers' comments and
suggestions helpful.

I regret to inform you that, based on the reviewer feedback,
the Associate Editor, Dr. Qiang Yang, could not recommend
publishing your paper to our Editor-in-Chief. Final decisions
on acceptance
are based on the referees' reviews and such factors as
restriction of space, topic, and the overall balance of
articles.   

We hope that this decision does not deter you from submitting to
us again. Thank you for your interest in the IEEE Transactions
on Knowledge and Data Engineering. 

Sincerely,

Ms. Susan Miller
Transactions Assistant
IEEE Computer Society
10662 Los Vaqueros Circle
Los Alamitos, CA 90720
USA
tkde@computer.org
Phone: +714.821.8380
Fax: +714.821.9975

***********
Editor Comments

Reviewer 2 raised serious concerns over the novelty of the work
and also provided many good suggestions (as did Reviewer 1).
On the basis of their reviews, I have to recommend rejection of
the paper.

***********************

Reviewer Comments

Please note that some reviewers may have included additional
comments in a separate file. If a review contains the note "see
the attached file" under Section III A - Public Comments, you
will need to log on to Manuscript Central to view the file.
After logging in to Manuscript Central, enter the Author Center.
Then, click on Submitted Manuscripts and find the correct paper
and click on "View Letter". Scroll down to the bottom of the
decision letter and click on the file attachment link.  This
will pop-up the file that the reviewer included for you along
with their review. 

***********************
Reviewer 1
				

Section I. Overview

A. Reader Interest

1. Which category describes this manuscript?

( ) Practice / Application / Case Study / Experience Report

(X) Research / Technology

( ) Survey / Tutorial / How-To


2. How relevant is this manuscript to the readers of this
periodical? Please explain  your rating under IIIA. Public
Comments.

( ) Very Relevant

(X) Relevant

( ) Interesting - but not very relevant

( ) Irrelevant


B. Content

1. Please explain how this manuscript  advances this field of
research and / or contributes something new to the literature.
Please explain your  answer under IIIA. Public Comments.

2. Is the manuscript technically sound? Please explain your
answer under IIIA. Public Comments. 

( ) Yes

( ) Appears to be - but didn't check completely

( ) Partially

(X) No


C. Presentation

1. Are the title, abstract, and keywords appropriate? Please
explain your answer under IIIA. Public Comments.

( ) Yes

(X) No


2. Does the manuscript contain sufficient and appropriate
references? Please explain your answer under IIIA. Public
Comments.

( ) References are sufficient and appropriate 

(X) Important references are missing; more references are
needed

( ) Number of references are excessive


3. Does the introduction state  the objectives of the manuscript
in terms that encourage the reader to read on? Please explain
your answer under IIIA. Public Comments.

(X) Yes

( ) Could be improved

( ) No


4. How would you rate the organization of the manuscript? Is it
focused? Is the length appropriate for the topic? Please explain
your answer under IIIA. Public Comments.

( ) Satisfactory

(X) Could be improved

( ) Poor


5. Please rate and comment on the readability of this
manuscript. Please explain your answer under IIIA. Public
Comments.

( ) Easy to read

(X) Readable - but requires some effort to understand

( ) Difficult to read and understand

( ) Unreadable


Section II. Summary and Recommendation


A. Evaluation

Please rate the manuscript. Please explain your answer under
IIIA. Public Comments.

( ) Award Quality

( ) Excellent

( ) Good

(X) Fair

( ) Poor


B. Recommendation 

Please make your recommendation.  Please explain your answer
under IIIA. Public Comments.

( ) Accept with no changes

( ) Author should prepare a minor revision

(X) Author should prepare a major revision for a second review

( ) Reject


Section III. Detailed Comments


A. Public Comments (these will be made available to the author)
 Incremental discretization is enchanting when put into the
context of concept drift. However, this interesting idea has not
been studied rigorously enough to justify its publication.

The paper title claims that incremental discretization and Bayes
classifiers handle concept drift very well. There exist a large
number of discretization methods for naive Bayes, as well as
many concept-drift learning algorithms. However, no empirical
results are presented that compare SPADE or SAWTOOTH against
their alternatives. This leaves readers wondering why they can
be claimed to work ``very well``.

In the conclusion section, the paper claims that one advantage
is that ``In Figure 3... This discretizer performed nearly as
well as other discretization methods without requiring multiple
passes through the data``. However, in Figure 3, SPADE is only
compared with naive Bayes with kernel estimation, which does not
involve discretization at all. Where, then, is this conclusion
drawn from?

The understanding of (naive) Bayes classifiers is far from
accurate. In the first paragraph on page 6, it is said that
``Bayes classifiers are called naive``. This expression is
misleading. Bayes classifiers form a very big family, and naive
Bayes is only one member of it. Nobody calls Bayes classifiers
naive except for naive Bayes. The paper then goes on to say
``since they assume that the frequencies of different attributes
are independent``. This statement is wrong. Instead, naive
Bayes's ``attribute independence assumption`` is: ``attributes
are independent of each other given the class``.
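For the record, the correct assumption (the standard formulation
from the literature, not quoted from the manuscript) can be
written as:

```latex
% Naive Bayes conditional independence assumption: given the
% class c, attributes x_1, ..., x_n are mutually independent.
P(x_1, \ldots, x_n \mid c) \;=\; \prod_{i=1}^{n} P(x_i \mid c),
% so classification selects
\hat{c} \;=\; \operatorname*{arg\,max}_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c).
```

Note that the attributes need not be unconditionally
independent; independence is assumed only conditional on the
class.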

SPADE is interesting since it does not need to scan the data
repeatedly. This will be useful in applications where one cannot
retain the whole historical data. However, there are two
potential pitfalls that the paper fails to address:

   >>> first, the merge mechanism. It produces new cut points
from the old cut points. For example, suppose the old
discretization of age is (..., [30, 39], [40, 49], ...). Merging
the two intervals will still retain old cut points like 30 and
49. But what if the appropriate cut points should instead be 35
and 45?

   >>> second, the lack of a split mechanism. Although the paper
explains this by saying it ``do[es] not know how to best divide
up a bin without keeping per-bin data`` and that ``experiments
suggested that adding SubBins=5 new bins between old ranges and
newly arrived out-of-range values was enough to adequately
divide the range``, those arguments cannot substitute for a
split operator. For example, suppose the instances are patients
coming into a clinic one after the other. The first one is an
infant while the second one is an old lady. In the first two
instances, one has seen the two far ends of the age attribute
[1, 90]. SPADE will produce 1+5 intervals by now and forever
(assuming the oldest patient is 90 years old). The reason behind
this sub-optimality is that attribute values do not necessarily
change gradually; they can shift abruptly.
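To make the pitfall concrete, here is a toy sketch of the
behavior described above. This is not the paper's actual SPADE
implementation; it only mimics what the paper states:
out-of-range values trigger SubBins=5 new bins, and in-range
values never split an existing bin.

```python
SUB_BINS = 5

def update_cuts(cuts, x):
    # Return updated cut points after observing value x.
    if not cuts:
        return [x]
    lo, hi = cuts[0], cuts[-1]
    if lo <= x <= hi:
        return cuts  # in range: with no split mechanism, bins never refine
    # out of range: interpolate SUB_BINS new cut points toward x
    anchor = lo if x < lo else hi
    step = (x - anchor) / (SUB_BINS + 1)
    new = [anchor + step * i for i in range(1, SUB_BINS + 1)] + [x]
    return sorted(cuts + new)

cuts = []
for age in [1, 90] + list(range(20, 70)):  # infant, old lady, then adults
    cuts = update_cuts(cuts, age)

print(len(cuts))  # 7 cut points (6 intervals), fixed forever
```

After the infant (age 1) and the old lady (age 90), the 50
adult patients between 20 and 69 never refine the coarse bins.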


In the second-to-last paragraph of Section V, the paper claims
that SPADE is good because it outperforms handling numeric
attributes by normal or kernel probability estimation. However,
the observation that discretization is better than probability
estimation has long been established. Mentioning it here only
proves once again that discretization is better, not that SPADE
itself is a good discretizer. A more convincing approach would
be to compare SPADE with peer discretization methods.

At the end of Section C in the experiments, it is said that
``SAWTOOTH can retain knowledge of old contexts and reuse that
knowledge when contexts re-occur``. But the paper never
describes any mechanism to retain old concepts or to identify
re-appearing concepts. How, then, was this achieved?


Other minor comments:

1. Is SAWTOOTH a method newly proposed in this paper, or is it
only reused by this paper? It does not hurt to clarify this
point. If it is new, it should be emphasized more; if not, a
reference should be given.

2. At the end of this paper, in the conclusion section, the term
``V & V`` agent is mentioned for the first time. What does it
mean?

3. The paper mentions that the MaxBins parameter is by default
set to the square root of the number of instances seen to date.
If the paper wants to justify this setting, it may help to cite
a relevant paper: Ying Yang and Geoff Webb, "Proportional
k-interval discretization for naive-Bayes classifiers," ECML
2001.
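For reference, the proportional k-interval idea from that paper
can be sketched as follows (a toy illustration; the function
name and details are ours, not the manuscript's or the cited
paper's code):

```python
import math

def pki_bins(values):
    # Proportional k-interval discretization, roughly: with n training
    # instances, use about sqrt(n) bins containing about sqrt(n)
    # instances each, so bin count grows with the data seen so far.
    n = len(values)
    k = max(1, round(math.sqrt(n)))
    ordered = sorted(values)
    size = math.ceil(n / k)
    return [ordered[i:i + size] for i in range(0, n, size)]

bins = pki_bins(list(range(100)))
print(len(bins), len(bins[0]))  # 10 bins of 10 instances each
```

This is the same sqrt-of-n scaling as the MaxBins default, which
is why the citation would support the paper's choice.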
***********************
Reviewer 2
				

Section I. Overview

A. Reader Interest

1. Which category describes this manuscript?

(X) Practice / Application / Case Study / Experience Report

( ) Research / Technology

( ) Survey / Tutorial / How-To


2. How relevant is this manuscript to the readers of this
periodical? Please explain  your rating under IIIA. Public
Comments.

( ) Very Relevant

(X) Relevant

( ) Interesting - but not very relevant

( ) Irrelevant


B. Content

1. Please explain how this manuscript  advances this field of
research and / or contributes something new to the literature.
Please explain your  answer under IIIA. Public Comments.

2. Is the manuscript technically sound? Please explain your
answer under IIIA. Public Comments. 

(X) Yes

( ) Appears to be - but didn't check completely

( ) Partially

( ) No


C. Presentation

1. Are the title, abstract, and keywords appropriate? Please
explain your answer under IIIA. Public Comments.

(X) Yes

( ) No


2. Does the manuscript contain sufficient and appropriate
references? Please explain your answer under IIIA. Public
Comments.

(X) References are sufficient and appropriate 

( ) Important references are missing; more references are
needed

( ) Number of references are excessive


3. Does the introduction state  the objectives of the manuscript
in terms that encourage the reader to read on? Please explain
your answer under IIIA. Public Comments.

(X) Yes

( ) Could be improved

( ) No


4. How would you rate the organization of the manuscript? Is it
focused? Is the length appropriate for the topic? Please explain
your answer under IIIA. Public Comments.

(X) Satisfactory

( ) Could be improved

( ) Poor


5. Please rate and comment on the readability of this
manuscript. Please explain your answer under IIIA. Public
Comments.

(X) Easy to read

( ) Readable - but requires some effort to understand

( ) Difficult to read and understand

( ) Unreadable


Section II. Summary and Recommendation


A. Evaluation

Please rate the manuscript. Please explain your answer under
IIIA. Public Comments.

( ) Award Quality

( ) Excellent

( ) Good

(X) Fair

( ) Poor


B. Recommendation 

Please make your recommendation.  Please explain your answer
under IIIA. Public Comments.

( ) Accept with no changes

( ) Author should prepare a minor revision

( ) Author should prepare a major revision for a second review

(X) Reject


Section III. Detailed Comments


A. Public Comments (these will be made available to the author)
 This paper describes SAWTOOTH and SPADE - the former is an
implementation of a Naive Bayes (NB) classifier that does
windowing on the input data, and the latter is a one-pass
discretization algorithm. It is a bit difficult to ascertain
the contribution of the paper. It could be, and the
introduction leads one to believe that the authors consider it
to be at least in part, the observation that simple systems can
perform well on large datasets (such as the 1999 KDDCUP
dataset). When Rob Holte made this observation over a decade
ago, it was surprising to many. However, we now know that,
roughly speaking, getting 90% of the best possible performance
is quite easy, but getting that last 10% can be quite hard.
Therefore, the results on the KDDCUP dataset presented in this
paper are not surprising. They are close to, but not as good
as, the results from the winning system, which was much more
complicated.

The observations in Section II on finding plateaus, and the
method used, do not seem to constitute a novel contribution. As
the authors acknowledge, the fact that relatively few instances
often suffice has been noticed by others before. Figure 1
confirms this observation yet again. Also, there is a relevant
KDD paper by Provost, Jensen, and Oates on progressive sampling
that discusses issues related to determining when learning
curves have plateaued. The problem is fairly difficult.

The use of sliding windows to deal with non-stationarity is not
new, though the use of Equation 1 to control window growth may
be. However, that equation is presented without discussion of
its derivation and appears to be ad hoc. That is not
necessarily a bad thing, but some discussion of why Equation 1
is expected to be useful is in order.

Section IV is just a review of NB, and Section V presents SPADE.
Figure 3 suggests that SPADE performs roughly as well as John
and Langley's method, which is true of a large number of other
discretization methods. There is nothing particularly new or
insightful about the approach.

Finally, the experiments are mostly done well, though there is a
complete lack of information about variance in the paper. Are
any of the results statistically significant? I suspect that in
the end the answer may not be relevant - some results will be,
and some won't, and SAWTOOTH/SPADE will enter the pack of other
algorithms/approaches that exhibit similar behavior, though on
different datasets. There is no free lunch in machine learning.

Section VI-C describes an experiment in which the ability of
SAWTOOTH to deal with concept drift is explored. However, very
little information about the simulator is provided and, more
significantly, the paper never says precisely how SAWTOOTH
"retain[s] knowledge of old contexts".

In summary, there's nothing really new in this paper.