Errata (3rd Edition)

p. 33: The URL for the dataset no longer works. Instead, go to https://data.boston.gov/dataset/property-assessment and choose Property Assessment FY2014.
p. 58, para 3: Text should read: "For example, in the left panel of Figure 3.2, there are about 20 tracts where the median value (MEDV) is between $5000 and $10,000."
Fig 3.16 caption: "blue and orange" should be "light pins"; "red" should be "dark pins".
Figures 4.8-4.11: The first print of the textbook had an error with these four figures. See this PDF file for the correct figures.
p. 98: Sentence should read: "High scores on principal component 1 mean that the cereal is low in calories and the amount per bowl, and high in protein and potassium."
p. 98 and 100: End of p. 98 should read "as we move from right (bran cereals) to left"; top of p. 100 should read "middle-left".
p. 102, Problem 4.2(a): Some print editions are missing the file name. It should be Wine.xls.
p. 103, Problem 4.4: Some print editions are missing the file name. It should be ToyotaCorolla.xls.
p. 119: One-variable tables in Excel: bullet 3 should be "A13 to A33" (instead of "B13 to B33").
p. 120: Text about ROC diagonal should read: "The comparison curve is the diagonal, which reflects the average performance of a guessing classifier that has no information about the predictors or outcome variable. This guessing classifier guesses that a proportion alpha of the records is 1's and therefore assigns each record an equal probability P(Y=1)=alpha. In this case, on average, a proportion alpha of the 1s will be correctly classified (sensitivity=alpha), and a proportion alpha of the 0s will be correctly classified (1-specificity=alpha). As we increase the cutoff value alpha from 0 to 1, we get the diagonal line Sensitivity = 1-Specificity. Note that the naive rule is one point on this diagonal line, where alpha=proportion of actual 1's. A common metric to summarize an ROC curve is area under the curve (AUC), which ranges from 1 (perfect discrimination between classes) to 0.5 (no better than random guessing)."
p. 121, box: In the box, replace "False-Positive Rate" with "False Discovery Rate", and replace "False-Negative Rate" with "False Omission Rate".
p. 132: Text about ROC diagonal should read: "The comparison curve is the diagonal, which reflects the average performance of a guessing classifier that has no information about the predictors or outcome variable. This guessing classifier guesses that a proportion alpha of the records is 1's and therefore assigns each record an equal probability P(Y=1)=alpha. In this case, on average, a proportion alpha of the 1s will be correctly classified (sensitivity=alpha), and a proportion alpha of the 0s will be correctly classified (1-specificity=alpha). As we increase the cutoff value alpha from 0 to 1, we get the diagonal line Sensitivity = 1-Specificity. Note that the naive rule is one point on this diagonal line, where alpha=proportion of actual 1's. A common metric to summarize an ROC curve is "area under the curve" (AUC), which ranges from 1 (perfect discrimination between classes) to 0.5 (no better than random guessing)."
p. 133: Table "Classification Matrix, Reweighted", row "Actual 1": the numerators should be 80 and 420 (not 19,180 and 5,420).
p. 151, para 3: Paragraph should read: "For the Toyota Corolla price example, forward selection yields exactly the same results as those found in an exhaustive search: For each number of predictors (up to 6 predictors) the same subset is chosen (it therefore gives a table identical to the one in Figure 6.4 for up to 7 coefficients)... In other words, it correctly identifies CC and Met_Color as the least useful predictors." [and delete last part "Backward elimination... Age and HP"]
p. 151, last para: The corrected paragraph should read: "The results for stepwise selection can be seen in Figure 6.6. It chooses the same subsets as exhaustive search for subset size of one to 9 predictors. R2-adj is largest at 9 predictors, and Cp also indicates the 9-predictor model is best."
p. 152: Delete the next-to-last sentence ("This example shows clearly that it is not always so").
p. 153, Problem 6.1 part (c): Ignore the final text "What is the prediction error?"
Table 8.1: (X=1) should be under Prior Legal, (X=0) should be under No Prior Legal, and Total should be in the last column.
Table 8.3: Last line should read: Weather - Coded as 1 if inclement, 0 otherwise.
p. 191: [This is a clarification] Addition: "As with k-nearest-neighbors, a predictor with m categories (m>2) should be factored into m dummies (not m-1). In addition, whether predictors are numerical or categorical, it does not make any difference whether they are standardized (normalized) or not."
Prob 9.3: In parts (a) and (b), replace the instruction "Keep the minimum... least restrictive." with "Set the parameters for the tree so as to produce as deep a tree as possible and obtain scores from this deep tree." In (a)(iv.) change "full tree" to "deep tree".
Prob 9.3(a)iii: Due to the software change, replace this problem with "How might we achieve better validation predictive performance at the expense of training performance?"
Prob 9.3(a)iv: Replace text with "Create a best pruned tree using the same data partitioning. Compared to the deeper tree, what is the predictive performance on the validation set? and on the training set?"
Ch 10, Sec. 10.2: Text right after eq. (10.5) should be: "a unit increase in predictor x_j is associated with an average increase of e^(β_j) × 100% in the odds".
p. 240: Corrupted word in title is "Profiling".
p. 252: The acceptance score for observation 4 should be "dislike".
p. 255: For Output6, the last term in the exponent should be (-0.02)(0.52).
p. 258: For output node 6 the error is 0.481(1-0.481)(0-0.481) = -0.120.
p. 283: -50.58 should be -51.58.
p. 285: Sentence should read: "For instance, the no-injury classification score for the first accident in the training set is -24.5+(1.95)(1)+(1.19)(0) +...+ (16.36)(1) = 31.42. The nonfatal score is similarly computed as 30.93, and the fatal score as 25.94."
p. 294, eq (13.2): On the right-hand side, replace 1/2 with 1/4. The last term should be 2 × (1/4) Cov(e_1i, e_2i).
p. 300: Paragraph after Table 13.2, last sentence should read "...the lift from the flyer is 5.8" (instead of 4.8).
p. 300, Table 13.3: Voter 1's values should be 0 for "Flyer" and 1 for "Moved_AD".
p. 304: Problem 13.2 part (a) should read "setting... terminal nodes to 50".
p. 311, Table 14.1: Row 7 should read "red, blue" instead of "white, orange".
p. 344 (top): In Distance Measures for Categorical Data, replace "x_ij's" with "p measurements", and replace n with p in the table and in the Matching coefficient formula.
p. 357, 2nd para: "cluster 1" should be "cluster 6"; "cluster 3" should be "cluster 4".
p. 383: In last sentence, replace "Month" with "Season".
Ch 15-17: Several of the time series datasets used in the problems (souvenir sales, shampoo sales, Australian wine sales) have a new source reference: Hyndman, R., and Yang, Y. Z. (2018). tsdl: Time Series Data Library. v0.1.0. https://pkg.yangzhourang.com/tsdl/
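
Illustration for the p. 120 and p. 132 corrections: the guessing classifier described there can be checked with a short simulation. The sketch below is not from the book. It assumes a synthetic outcome vector with roughly 30% 1's, and the function name, sample size, and seed are all made up for the illustration. It labels each record 1 with probability alpha, ignoring any predictors, and reports the resulting point (1 - specificity, sensitivity), which should fall close to the diagonal for every alpha.

```python
import random


def guessing_classifier_roc_point(actual, alpha, rng):
    """Return (1 - specificity, sensitivity) for a classifier that labels
    each record 1 with probability alpha, using no predictor information."""
    guesses = [1 if rng.random() < alpha else 0 for _ in actual]
    true_pos = sum(1 for a, g in zip(actual, guesses) if a == 1 and g == 1)
    false_pos = sum(1 for a, g in zip(actual, guesses) if a == 0 and g == 1)
    n_pos = sum(actual)
    n_neg = len(actual) - n_pos
    return false_pos / n_neg, true_pos / n_pos


# Hypothetical data: 20,000 records, about 30% of them actual 1's.
rng = random.Random(0)
actual = [1 if rng.random() < 0.3 else 0 for _ in range(20000)]

# One ROC point per guessing proportion alpha.
roc_points = {
    alpha: guessing_classifier_roc_point(actual, alpha, rng)
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0)
}

for alpha, (one_minus_spec, sens) in roc_points.items():
    print(f"alpha={alpha:.2f}  1-specificity={one_minus_spec:.3f}  sensitivity={sens:.3f}")
```

Both coordinates track alpha up to sampling noise, so the points trace the line Sensitivity = 1 - Specificity. The area under that line is 0.5, which matches the lower end of the AUC range quoted in the correction.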