# Notes

Learnings, musings, and questions starting May 2020.

## Summer Reading

Done:

- The White Tiger, by Aravind Adiga

- City of Djinns, by William Dalrymple

Going on:

- India After Gandhi, by Ram Guha

- Our Moon Has Blood Clots, by Rahul Pandita

Coming up:

- Zero: The Biography of a Dangerous Idea, by Charles Seife

- Brave New World, by Aldous Huxley

## Statistical Learning: Prediction vs Inference

References:

- Intro to Statistical Learning (Springer)

The difference is a nuanced one. Both involve developing a 'model'. Taking a supervised approach, a model can be viewed simply as a function from inputs to outcomes:

*f* : *X* → *Y*

Here, *X* is the set of independent (predictor) variables. *Y* is the outcome codomain; this can be a number if we're doing regression, the probability of a class if we're doing classification, and so on.

*Inference*, in the statistical sense used here, refers to using the model to uncover the structure of the data: the (possibly causal) relationships between a set or subset of the independent variables and the dependent variable. E.g., we can *infer* how a student's performance in a math test (the outcome) depends on...the amount of chocolate milk he drinks per day (the predictor).

*Prediction* here means predicting the outcome, or the *Y*, when an input of the *X* variables is entered into the model. E.g., given that a student drinks 42 glasses of chocolate milk per day, what will his performance in the math test be?
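
To make the distinction concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data points are made up, the model is taken to be a straight line fit by ordinary least squares, and `fit_line` and `predict` are hypothetical helper names. The same fitted model answers both kinds of question, just by looking at different things.

```python
# Hypothetical, made-up data (not real measurements):
# glasses of chocolate milk per day (X) and math test score (Y).
glasses = [0, 1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70, 73]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Evaluate the fitted line at a new input x."""
    intercept, slope = model
    return intercept + slope * x


model = fit_line(glasses, scores)
intercept, slope = model

# Inference: inspect the fitted parameters to describe the relationship.
print(f"Each extra glass is associated with {slope:.2f} more points.")

# Prediction: plug in a new input and read off the outcome.
# 42 glasses is far outside the observed range (0 to 5), so this is an
# extrapolation and the number should not be trusted.
print(f"Predicted score at 42 glasses: {predict(model, 42):.1f}")
```

In the inference step we care about `slope` itself (its sign and size); in the prediction step we only care about the number that comes out, and the parameters could be a black box.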