Chap 2, Table 2.6 (code) | remove: "# use drop_first=True to drop the first dummy variable housing_df = pd.get_dummies(housing_df, prefix_sep='_', drop_first=True)" add: "# the missing values will create a third category |
Chap 3, Fig 3.6 | Right panel bar charts should not use multiple colors |
Chap 5, Fig 5.1 (code) | should be: boxdata_df = pd.concat([pred_error_train, pred_error_valid]) |
Chap 5, Fig 5.2 | code update: https://github.com/gedeck/dmba/issues/11 |
Chap 5, p. 140 | should be: "As we increase the cutoff value 1-alpha from 0 to 1..." |
Chap 5, mid p. 147 | should be: "we see that taking 10% of the records ... selection of 10% of the records." |
Chap 5, Fig 5.10, 5.11 | "Classify as 'x'" should be at bottom and "Classify as 'o'" should be at top |
Chap 7, Table 7.2 | should be: outcome = 'Ownership' |
Chap 8, Table 8.5 | should be: pd.set_option('display.precision', 4) |
Chap 10, after eq (10.5) | should be: "a unit increase in predictor x_j is associated with an average increase of e^(β_j) × 100% in the odds" |
Chap 10, eq. (10.8) | in denominator, the first term in exponent, 6.04892 should not have a minus sign |
Chap 10, Table 10.2 (code) | should be: "bank_df.Education.cat.rename_categories(new_categories) bank_df = pd.get_dummies(bank_df, prefix_sep='_', drop_first=True, dtype=int)" |
Chap 10, Fig 10.3 | code update for Fig 10.3: https://github.com/gedeck/dmba/issues/11 |
Chap 11, p. 290 | In Back Propagation of Error, the text should read: "in Figure 11.3, for a person with output class “like” we have y6 = 1)." |
Chap 11, p. 295 | MLPClassify() should be MLPClassifier() |
Chap 11, pp. 297-298 (twice) | MLPCRegressor() should be MLPRegressor() |
Chap 11, Table 11.2 | should be: hidden_layer_sizes=[3] |
Chap 11, Table 11.6 | should be: hidden_layer_sizes=[2] |
Chap 12, eq 12.2 | Formulas should have square-root |
Chap 12, Table 12.4 (code) | should be: "fct = pd.concat([ pd.DataFrame([lda_reg.intercept_], columns=lda_reg.classes_, index=['constant']), pd.DataFrame(lda_reg.coef_.transpose(), columns=lda_reg.classes_, index=list(accidents_df.columns)[:-1])])" |
Chap 12, problem 12.3 d+e | should be "Compute the intercept of the classification function" |
Chap 14, Table 14.12 | should be: "print('Top-4 recommended items for each user')" |
Chap 14, problem 14.3 | replace "You will get a Null matrix." with "All recommendations will be 1." |
Chap 17, Fig 17.1 (code) | replace "# shorter and longer time series" with "# plot the time series" |
Chap 17, Fig 17.7 | add to end of caption: "Autocorrelation plot for lags 1--12 (for first 24 months of Amtrak ridership, 95% confidence region is blue shaded)" |
Chap 17, Table 17.8 (code) | should be: "train_res_arima = ARIMA(train_lm_trendseason.resid, order=(1, 0, 0), freq='MS', trend='c').fit() forecast = train_res_arima.get_forecast(1) conf_int = forecast.conf_int()" |
Chap 17, Table 17.9 (code) | should be: sp500_arima = ARIMA(sp500_ts, order=(1, 0, 0)).fit() |
Chap 17, p. 435 | add clarification after "is at lag 6 and is negative (exceeding the 95% confidence interval). Autocorrelations that fall outside the confidence interval point to possible model improvement." |
Chap 19, mid p. 481 | should be "by the number of all possible shortest paths between the other nodes (n-1)(n-2)/2" |
Chap 21, case 21.1, p. 521 | should be: "The full set of 16 predictors in the dataset" |
Chap 21, case 21.2 | should be: "Use this vector to create a cumulative gains chart for the validation set that incorporates the net profit." |
Chap 21, case 21.5, p. 536 | In second bullet, delete sentence "The data file contains... date/time field" |
Chap 21, case 21.5, p. 536 | first line: NULL should be NaN |
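
The Chap 10 correction after eq (10.5) concerns how a logistic regression coefficient relates to the odds. As a rough numerical check, the sketch below shows that raising predictor x_j by one unit multiplies the odds by e^(β_j), whatever the starting value of x_j. The intercept, coefficient, and starting value are hypothetical and are not taken from the book.

```python
import math

# Hypothetical values for illustration only (not from the book).
beta_0 = -1.0   # assumed intercept
beta_j = 0.5    # assumed coefficient of predictor x_j
x_j = 2.0       # assumed starting value of the predictor


def odds(logit):
    """Odds implied by a logit value: odds = exp(logit)."""
    return math.exp(logit)


odds_before = odds(beta_0 + beta_j * x_j)
odds_after = odds(beta_0 + beta_j * (x_j + 1))  # unit increase in x_j

# The ratio of the two odds depends only on beta_j.
odds_ratio = odds_after / odds_before
print(f"odds ratio: {odds_ratio:.4f}, exp(beta_j): {math.exp(beta_j):.4f}")
```

Changing `x_j` or `beta_0` leaves the ratio unchanged, which is why the coefficient can be read as a multiplicative effect on the odds.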
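
The Chap 19 correction on p. 481 uses the quantity (n-1)(n-2)/2. A minimal sketch, assuming an undirected network with n nodes, confirms that this equals the number of unordered pairs that can be formed from the nodes other than the one being scored. The function name and the choice of n = 6 are illustrative assumptions.

```python
from itertools import combinations


def pairs_among_other_nodes(n):
    """(n-1)(n-2)/2: unordered pairs among the n-1 nodes other than the focal node."""
    return (n - 1) * (n - 2) // 2


n = 6                      # hypothetical network size
focal = 0                  # node whose centrality would be normalized
others = [v for v in range(n) if v != focal]

# Enumerate the pairs directly and compare with the closed-form count.
enumerated = sum(1 for _ in combinations(others, 2))
formula = pairs_among_other_nodes(n)
print(enumerated, formula)
```

For a directed network the count of ordered pairs would be twice as large, so this check applies only under the undirected assumption stated above.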