## Discussion questions
A model with 95% predictive accuracy (on test data) identifies which students will fail a class, but none of its predictors are interpretable (e.g., a neural network trained on Canvas click behavior). A second model explains performance using sleep, study time, and prior GPA, but predicts worse. Which model is "better science," and why? What are the caveats and implications of choosing either?
Is finding the "optimal" point between minimizing error and maximizing generalizability always the goal (i.e., finding the model complexity at which the gap between training and test MSE is smallest)? Are there cases where you would want a more generalizable but noisier model or, conversely, a model that is overfit to the training data? Give some examples.
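As a starting point for the second question, the training/test MSE gap can be made concrete with a small sketch (not from the text; the data-generating function, seed, and degree choices are illustrative assumptions): polynomial fits of increasing degree to noisy data, where training error keeps shrinking while test error eventually worsens.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical data: a smooth signal plus noise
    x = rng.uniform(-3, 3, n)
    y = np.sin(x) + rng.normal(0, 0.3, n)
    return x, y

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

def mse(y, yhat):
    return float(np.mean((y - yhat) ** 2))

results = {}
for degree in [1, 3, 5, 9]:
    coefs = np.polyfit(x_train, y_train, degree)  # fit on training data only
    results[degree] = (
        mse(y_train, np.polyval(coefs, x_train)),  # training MSE
        mse(y_test, np.polyval(coefs, x_test)),    # test MSE
    )
    print(f"degree {degree}: train MSE {results[degree][0]:.3f}, "
          f"test MSE {results[degree][1]:.3f}")
```

Because higher-degree polynomials nest lower-degree ones, training MSE can only decrease as degree grows; the test MSE, by contrast, typically traces a U-shape, and the degree where it bottoms out (not where training error is lowest) is the usual "optimal" point the question asks you to interrogate.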