Culturomics and linguistic analysis

A neat paper based on Google Books: “Quantitative Analysis of Culture Using Millions of Digitized Books” https://dl.dropboxusercontent.com/u/254940/Science-2011-Michel-176-82.pdf

False positives/negatives

ROC curves, specificity, sensitivity, accuracy, error rates, etc.: http://en.wikipedia.org/wiki/Receiver_operating_characteristic
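
As a quick illustration of the terms above, here is a minimal Python sketch that computes them from a confusion matrix and traces ROC points by sweeping a threshold. The function names, counts, scores and labels are all made up for the example.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Common summaries of a binary test from its confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),  # 1 - specificity
        "false_negative_rate": fn / (fn + tp),  # 1 - sensitivity
    }


def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical counts: 100 true cases, 900 non-cases.
metrics = confusion_metrics(tp=80, fp=30, tn=870, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Hypothetical classifier scores with their true labels (1 = positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print(points)
```

Each ROC point is just the (1 - specificity, sensitivity) pair you get at one particular cut-off, which is why the curve always runs from (0, 0) to (1, 1).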

A positive test for breast cancer still indicates only about an 8% probability of having the disease, because true cases are rare relative to false positives: “Visualizing Uncertainty About the Future” https://dl.dropboxusercontent.com/u/254940/spiegelhalter2011.pdf
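
The arithmetic behind a figure like this is Bayes' rule. The sketch below uses illustrative inputs that are my assumptions, not numbers taken from the paper: 1% prevalence, 80% sensitivity and a 9.6% false positive rate. The function name is also just for this example.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Assumed, illustrative inputs (not from the linked paper).
ppv = positive_predictive_value(
    prevalence=0.01,
    sensitivity=0.80,
    false_positive_rate=0.096,
)
print(f"P(cancer | positive test) = {ppv:.1%}")  # prints 7.8%
```

With these inputs, out of 1,000 women about 8 true cases test positive alongside roughly 95 false positives, so a positive result is far more likely to be a false alarm than a real case.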

Andrew Gelman on why we should focus more on Type S (sign) and Type M (magnitude) errors, which become more likely in underpowered analyses, than on Type 1 and Type 2 errors: http://www.stat.columbia.edu/~gelman/research/published/retropower20.pdf http://andrewgelman.com/2004/12/29/type_1_type_2_t/
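
To make Type S and Type M errors concrete, here is a rough Python sketch of a design calculation in that spirit. It assumes the estimate is normally distributed around a positive true effect with a known standard error; the function name `design_errors` and the input numbers are hypothetical and not taken from the linked paper.

```python
import math
import random


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def design_errors(true_effect, std_error, z_crit=1.96, n_sims=20000, seed=1):
    """Power, Type S rate and exaggeration ratio for a positive true effect."""
    shift = true_effect / std_error
    p_right_sign = 1.0 - normal_cdf(z_crit - shift)  # significant, correct sign
    p_wrong_sign = normal_cdf(-z_crit - shift)       # significant, wrong sign
    power = p_right_sign + p_wrong_sign
    type_s = p_wrong_sign / power

    # Type M: average size of significant estimates relative to the truth,
    # approximated by simulating replications of the study.
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > z_crit * std_error:
            significant.append(abs(estimate))
    if significant:
        exaggeration = sum(significant) / len(significant) / true_effect
    else:
        exaggeration = float("nan")  # no simulated study reached significance
    return power, type_s, exaggeration


# Hypothetical underpowered study: true effect 1, standard error 2.
power, type_s, exaggeration = design_errors(true_effect=1.0, std_error=2.0)
print(f"power: {power:.3f}")
print(f"Type S rate: {type_s:.3f}")
print(f"exaggeration ratio: {exaggeration:.2f}")
```

In this setup any estimate that reaches significance must be at least 3.92 in absolute value, so a "significant" result overstates a true effect of 1 several times over, and a noticeable share of such results have the wrong sign.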

GAMs

R packages for GAMs:

mgcv is the main package: http://cran.r-project.org/web/packages/mgcv/index.html

gamm4 fits GAMs using lme4 as the random-effects backend: http://cran.r-project.org/web/packages/gamm4/index.html

Simon Wood’s book “Generalized Additive Models: An Introduction with R”: http://books.google.ca/books/about/Generalized_Additive_Models.html?id=hr17lZC-3jQC