Common statistical tests are linear models (or how to teach stats)

Most of the common statistical models (t-test, correlation, ANOVA, chi-square, etc.) are special cases of linear models or a very close approximation. This beautiful simplicity means that there is less to learn. In particular, it all comes down to \(y = a \cdot x + b\), which most students know from high school. Unfortunately, intro stats courses are usually taught as if each test is an independent tool, needlessly making life more complicated for students and teachers alike.

For this reason, I think that teaching linear models first and foremost, and then name-dropping the special cases along the way, makes for an excellent teaching strategy, emphasizing understanding over rote learning. Since linear models are the same across frequentist, Bayesian, and permutation-based inferences, I’d argue that it’s better to start with modeling than with p-values, type-1 errors, Bayes factors, or other inferences.

Concerning the teaching of “non-parametric” tests in intro courses, I think that we can justify lying-to-children and teach “non-parametric” tests as if they are merely ranked versions of the corresponding parametric tests. It is much better for students to think “ranks!” than to believe that you can magically throw away assumptions. Indeed, the Bayesian equivalents of “non-parametric” tests implemented in JASP literally just do (latent) ranking and that’s it. For the frequentist “non-parametric” tests considered here, this approach is highly accurate for N > 15.
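To make the “ranks!” idea concrete, here is a minimal Python sketch of the two transformations involved: plain ranking and signed ranking. The helper names `rank` and `signed_rank` and the example data are my own for illustration, and giving tied values their average rank is an assumption about what you would want in class.

```python
def rank(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average
        start = end + 1
    return ranks


def signed_rank(values):
    """Rank the absolute values, then put the original signs back on."""
    magnitude_ranks = rank([abs(v) for v in values])
    signs = [(v > 0) - (v < 0) for v in values]
    return [s * r for s, r in zip(signs, magnitude_ranks)]


y = [0.3, -0.1, 0.8, 1.2, 0.5, 0.9]  # made-up data
ranked_y = rank(y)
signed_ranked_y = signed_rank(y)
```

Feeding `ranked_y` or `signed_ranked_y` into an ordinary linear model in place of `y` is the whole trick.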

Recall from high school: \(y = a \cdot x + b\), and getting a really good intuition about slopes and intercepts. Understanding that this can be written using all variable names, e.g., money = profit * time + starting_money, or \(y = \beta_1 x + \beta_2 \cdot 1\), or, suppressing the coefficients, as y ~ x + 1. If the audience is receptive, convey the idea of these models as a solution to differential equations, specifying how \(y\) changes with \(x\).
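As a small illustration of the money example, the sketch below fits \(y = a \cdot x + b\) by ordinary least squares using only the Python standard library. The function name `fit_line` and the numbers (a starting balance of 10 and a profit of 3 per time unit) are made up for the example.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    a = sxy / sxx            # slope
    b = y_bar - a * x_bar    # intercept
    return a, b


# money = profit * time + starting_money, with noise-free made-up data
time = [0, 1, 2, 3, 4]
money = [10 + 3 * t for t in time]

profit, starting_money = fit_line(time, money)
```

Because the data are noise-free, the fit recovers the slope (`profit`) and the intercept (`starting_money`) that generated them.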

One mean: when there is only one x-value, the regression model simplifies to \(y = b\). If \(y\) is non-metric, you can rank-transform it. Apply the assumptions (homoscedasticity doesn’t apply since there is only one \(x\)). Mention in passing that these intercept-only models are called the one-sample t-test and the Wilcoxon signed-rank test, respectively.
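A minimal sketch of the intercept-only model, assuming made-up data: the least-squares estimate of \(b\) is just the sample mean, and dividing it by its standard error gives the one-sample t statistic for the null \(b = 0\). The function name `intercept_only` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def intercept_only(y):
    """Fit y = b by least squares; return (b, standard error of b, t for b = 0)."""
    n = len(y)
    b = mean(y)                # the least-squares solution of y = b
    se = stdev(y) / sqrt(n)    # sample standard deviation over sqrt(n)
    return b, se, b / se


y = [0.3, -0.1, 0.8, 1.2, 0.5, 0.9]  # made-up data
b, se, t_stat = intercept_only(y)
degrees_of_freedom = len(y) - 1
```

The t statistic is compared against a t distribution with `degrees_of_freedom` degrees of freedom. Running the same function on signed ranks of `y` instead of `y` gives the rank-based version.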

Two means: if we put two groups 1 apart on the x-axis, the difference between the means is the slope. Great! It is accessible to our Swiss army knife called linear modeling. Apply the assumption checks to see that homoscedasticity reduces to equal variance between groups. This is called an independent t-test. Do a few worked examples and exercises, maybe adding Welch’s test, and do the rank-transformed version, called Mann-Whitney U.
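The sketch below checks this claim on made-up data: coding the two groups as \(x = 0\) and \(x = 1\), the fitted intercept is the first group's mean, the slope is the difference between the means, and the t statistic for the slope matches the classic pooled-variance (equal variance) t-test. The function names `slope_t` and `pooled_t` are my own.

```python
from math import sqrt
from statistics import mean, variance


def slope_t(x, y):
    """OLS fit of y = a * x + b; returns (a, b, t statistic for a = 0)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    a = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b = y_bar - a * x_bar
    rss = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    se_a = sqrt(rss / (n - 2) / sxx)  # residual variance over spread of x
    return a, b, a / se_a


def pooled_t(g0, g1):
    """Textbook independent t-test statistic assuming equal variances."""
    n0, n1 = len(g0), len(g1)
    sp2 = ((n0 - 1) * variance(g0) + (n1 - 1) * variance(g1)) / (n0 + n1 - 2)
    return (mean(g1) - mean(g0)) / sqrt(sp2 * (1 / n0 + 1 / n1))


group_a = [4.1, 5.0, 4.6, 5.3]  # made-up data, coded x = 0
group_b = [6.2, 5.8, 6.9, 6.3]  # made-up data, coded x = 1
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

slope, intercept, t_from_model = slope_t(x, y)
t_from_textbook = pooled_t(group_a, group_b)
```

Both routes give the same t statistic, which is the point: the independent t-test is the linear model with a 0/1 predictor.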

Logarithmic transformation: making multiplicative models linear using logarithms, thus modeling proportions. See this excellent introduction to the equivalence of log-linear models and chi-square tests as models of proportions. Also needs to introduce (log-)odds ratios. When the multiplicative model is made additive using logarithms, we just add the dummy-coding trick from 3.1 and see that the models are identical to the ANOVA models in 3.2 and 3.3; only the interpretation of the coefficients has changed.
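To see the multiplicative-to-additive step in numbers, here is a small sketch on a made-up 2x2 table of counts. Under independence the expected count is a product (row total times column total over N), so its logarithm is a sum of a row term and a column term with no interaction. The interaction on the log scale of the observed counts is the log odds ratio. The helper name `log_interaction` is hypothetical.

```python
from math import log

observed = [[30, 10],
            [20, 40]]  # made-up counts

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)

# Multiplicative independence model: expected = row total * column total / N
expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
            for i in range(2)]


def log_interaction(table):
    """Interaction contrast on the log scale; for a 2x2 table, the log odds ratio."""
    return (log(table[0][0]) - log(table[0][1])
            - log(table[1][0]) + log(table[1][1]))


no_interaction = log_interaction(expected)   # additive model: no interaction term
log_odds_ratio = log_interaction(observed)   # what the interaction term captures

# Pearson chi-square statistic: how far the counts are from the additive model
chi_square = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
                 for i in range(2) for j in range(2))
```

The chi-square test asks whether that interaction term is needed, which is the same question a two-way ANOVA asks about its interaction, only on log counts.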

Hypothesis testing as model comparisons: hypothesis testing is the act of choosing between a full model and one where a parameter is fixed to a particular value (often zero, i.e., effectively excluded from the model) instead of being estimated. For example, when fixing the difference between the two means to zero in the t-test, we study how well a single mean (a one-sample t-test) explains all the data from both groups. If it does a good job, we prefer this model over the two-mean model because it is simpler. So hypothesis testing is just comparing linear models to make more qualitative statements than the truly quantitative statements which were covered in bullets 1-4 above. As tests of single parameters, hypothesis testing is therefore less informative. However, when testing multiple parameters at the same time (e.g., a factor in ANOVA), model comparison becomes invaluable.
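A minimal sketch of such a comparison, assuming made-up two-group data: fit the reduced model (one mean for everyone) and the full model (one mean per group), then compare their residual sums of squares with the usual F ratio. The variable names are my own, and I stop at the F statistic because turning it into a p-value needs the F distribution, which is not in the Python standard library.

```python
from statistics import mean

group_a = [4.1, 5.0, 4.6, 5.3]  # made-up data
group_b = [6.2, 5.8, 6.9, 6.3]  # made-up data
y = group_a + group_b
n = len(y)


def rss(observed, predicted):
    """Residual sum of squares of a model's predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Reduced model: the group difference is fixed to zero, so one mean fits all.
rss_reduced = rss(y, [mean(y)] * n)
params_reduced = 1

# Full model: each group gets its own mean.
predictions_full = ([mean(group_a)] * len(group_a)
                    + [mean(group_b)] * len(group_b))
rss_full = rss(y, predictions_full)
params_full = 2

# F ratio: improvement per extra parameter, relative to leftover noise.
f_stat = (((rss_reduced - rss_full) / (params_full - params_reduced))
          / (rss_full / (n - params_full)))
```

With a single extra parameter, `f_stat` equals the square of the t statistic for that parameter. The same recipe works unchanged when the full model adds several parameters at once, as with a factor in ANOVA.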
