Notes

Learnings, musings, and questions starting May 2020.

Summer Reading

Done:

The White Tiger, by Aravind Adiga

City of Djinns, by William Dalrymple

Going on:

India After Gandhi, by Ram Guha

Our Moon Has Blood Clots, by Rahul Pandita

Coming up:

Zero: The Biography of a Dangerous Idea, by Charles Seife

Brave New World, by Aldous Huxley

Statistical Learning: Prediction vs Inference

References:

Intro to Statistical Learning (Springer)

Stats stack-exchange

Datasci blog

The difference is subtle. Both involve developing a 'model'. Taking a supervised approach, a model can simply be viewed as a function:

f: X \to Y

Here, X is the set of independent (predictor) variables. Y is the outcome codomain: a number if we're doing regression, the probability of a class if we're doing classification, and so on.

Inference, in the statistical sense used here, refers to using the model to uncover the structure of the data: the relationships, ideally causal ones, between a set or subset of the independent variables and the dependent variable. E.g., we can infer how a student's performance in a math test (the outcome) depends on...the amount of chocolate milk he drinks per day (the predictor).

Prediction here means predicting the outcome Y when a given input of the X variables is fed into the model. E.g., given that a student drinks 42 glasses of chocolate milk per day, what will be his performance in the math test?
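A rough sketch of the distinction, assuming a simple linear model fit by ordinary least squares. The data below is entirely made up for illustration (the baseline score of 60, the effect of 3 points per glass, and the noise level are invented, not real measurements), and the helper names `fit_line` and `predict` are hypothetical. The same fitted model is used both ways: reading off the slope is inference, plugging in a new X is prediction.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    # Design matrix with a column of ones for the intercept term.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]


def predict(intercept, slope, x_new):
    """Evaluate the fitted line at a new predictor value."""
    return intercept + slope * x_new


# Synthetic data (an assumption, not real observations):
# glasses of chocolate milk per day vs. math test score.
rng = np.random.default_rng(0)
glasses = rng.uniform(0.0, 5.0, size=200)
score = 60.0 + 3.0 * glasses + rng.normal(0.0, 2.0, size=200)

intercept, slope = fit_line(glasses, score)

# Inference: what does the model say about the relationship?
print(f"Estimated change in score per extra glass: {slope:.2f}")

# Prediction: what outcome do we expect for a new input?
predicted_score = predict(intercept, slope, 4.0)
print(f"Predicted score for 4 glasses per day: {predicted_score:.1f}")
```

For inference we care about whether `slope` is trustworthy and interpretable; for prediction we only care about whether `predicted_score` is close to the true outcome, even if the model itself is a black box. Note that the slope here describes an association in the fitted data; reading it as a causal effect needs further assumptions.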