| p. 32 | Problem 2.6 - In the first sentence, drop the word "prior." |
| p. 42 | In the middle of the page, change the two instances of 44% to 34%. |
| p. 45 | In the second line from the top, change 88% to 86%. |
| p. 52 | Problem 3.2 - After "...summarize the data as follows:" add (note that a few records contain missing values; since there are just a few, a simple solution is to remove them first. You can use the "missing data handling" utility in XLMiner) |
| p. 64 | In the middle of the page, change the sentence "A classifier that..." to "A classifier that misclassifies 2% of buying households as nonbuyers and 20% of the nonbuyers as buyers...". This is in agreement with the table below. |
| p. 70, 71 | Although it's not really wrong, the confusion tables on these pages have actuals along the top and predicteds along the side, the opposite of the way they're arranged in the rest of the book and in XLMiner. |
| p. 71 | In the line above the table, it should be "3.545 predicted 0's for every predicted 1". |
| p. 73 | In the last paragraph, the two instances of 1.7 should be more like 1.5 (according to the chart at the top of the page), and the 400 in the next-to-last line should be more like 600. |
| p. 86 | Problem 5.1 c - Take the number of rooms per house as 6, rather than 3. |
| p. 87 | Table 5.4 describes the variables to be used in the problem; the file has additional variables. Problem 5.2.c.i - The categorical variables are binary variables, so there is no need to create dummy variables from them. |
| p. 88 | Problem 5.3.c.i and ii - The dummy variables should be created before partitioning, not after. |
| p. 89 | Problem 5.3.c.ix - In the second part of the question, "predictive interval" can be ignored as it is not covered in the chapter. |
| p. 96 | In the last line of the second paragraph, parentheses are missing. It should be 0.05/0.18 = (50/180)/(180/230). |
| p. 109 | Problem 6.2.c - Ignore the suggested percentages (60%:40%), as XLMiner's limits will not permit that many training records. |
| p. 113 | In section 7.3, "the p-dimension6al" should be "p-dimensional" |
| p. 122 | This is the tree for the previous example, not the current example. |
| p. 125 | This page is mistakenly numbered as p. 25 |
| p. 132 | In section 7.9, "the are senstive to changes" should be "they are sensitive to changes" |
| p. 134 | Problem 7.1.g - The first sentence should read "...about the chances of an auction obtaining at least two bids..." instead of "...about the chances of an auction transacting..." |
| p. 135 | Problem 7.2.c.v - The first sentence should read "...about the chances of an auction obtaining at least two bids..." instead of "...about the chances of an auction transacting..." |
| p. 135 | Problem 7.2 (a) Add the following as the second and third sentences: Do not include DEP_TIME (actual departure time) in the model because it is unknown at the time of prediction (unless we are predicting delays after the plane takes off, which is unlikely). In the third step of the Classification Tree menu, choose "Maximum # levels to be displayed = 6". |
| p. 135 | Problem 7.2 - Add the following just before the sentence that begins "This will avoid treating...": After binning the DEPT_Time into 8 bins, this new variable should be broken down into 7 dummies (because the effect won't be linear, due to the morning and afternoon rush hours). |
| p. 136 | Problem 7.3 (b) Add the following sentence at the end of the paragraphs, before (i): Select "Normalize input data". |
| p. 145 | The left-hand side of the equation in the middle is upside down. The x1+1 term should be in the numerator, and the x1 term should be in the denominator. |
| p. 151 | In the last two lines of the second paragraph, parentheses are missing. They should be (D0-D)/D0 and D0=D/(1-R2). The same error appears in line 5 of page 158. |
| p. 158 | In the 12th line of the Variable Selection section, it should be "only 7 predictors" |
| p. 163 | Problem 8.2 - In the second paragraph ignore the first sentence ("Using these data, the consultant performs a discriminant analysis"). |
| p. 163 | Problem 8.2, parts (a) and (d) - The references should be to "Training," not "Education". |
| p. 219 | Paragraph 1, last sentence - "Mendeleeyev's" should be "Mendeleev's". |
| p. 225 | The right-hand side of the formula for r2 is actually the formula for the correlation, not its square. |
| p. 227 | A more precise specification for centroid distance is: distance (Xbar_A, Xbar_B). |
| p. 227 | Just prior to the two bullet points at the bottom of the page, at the end of the prior paragraph, add: "The distance measure used in the calculations that follow is Euclidean distance." |
| p. 237 | Problem 12.1.c - Should read "Compare the summary statistics for each cluster centroid ..." and the following should appear at the end: Hint: To obtain cluster centroids for hierarchical clustering, use Excel's pivot table on the "Predicted Clusters" sheet. |
| p. 238 | Problem 12.3.a - The second sentence should read "Compare the dendrograms from single linkage and complete linkage, and look at cluster centroids." The following should appear at the end: Hints: (1) To obtain cluster centroids for hierarchical clustering, use Excel's pivot table on the "Predicted Clusters" sheet. (2) Running hierarchical clustering in XLMiner is an iterative process: run it once with a guess at the right number of clusters, then run it again after looking at the dendrogram, adjusting the number of clusters if needed. |
| p. 238 | Problem 12.4.a - Should read "Apply hierarchical clustering with Euclidean distance and Ward's method." |
| p. 246, 247 | Table 13.2 - The "Rcode=" header for each sub-table needs to be renumbered, as follows: Top table on each page should be Rcode=all, second table is Rcode=1, third table is Rcode=2, fourth table is Rcode=3, fifth table is Rcode=4. |
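
The p. 227 entries above specify centroid distance as distance (Xbar_A, Xbar_B) and state that Euclidean distance is the measure used. As a minimal sketch of that calculation, assuming Python and made-up two-variable records (the function names `centroid` and `centroid_distance` and the data are illustrative only; they come from neither the book nor XLMiner):

```python
import math
from statistics import mean


def centroid(records):
    """Return the coordinate-wise mean of a list of equal-length records."""
    return [mean(column) for column in zip(*records)]


def centroid_distance(cluster_a, cluster_b):
    """Euclidean distance between the centroids of two clusters."""
    return math.dist(centroid(cluster_a), centroid(cluster_b))


# Hypothetical records, invented for illustration only.
cluster_a = [[1.0, 2.0], [3.0, 4.0]]  # centroid is (2.0, 3.0)
cluster_b = [[5.0, 7.0], [5.0, 7.0]]  # centroid is (5.0, 7.0)

print(centroid_distance(cluster_a, cluster_b))  # 5.0
```

Here the distance is taken between the two cluster means rather than between individual records, which is what separates centroid distance from single or complete linkage. `math.dist` requires Python 3.8 or later.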