# Statistics: A Pre-Course Introduction

Updated: Sep 16, 2020

__Aamir Khan__'s Phunsukh Wangdu, a.k.a. 'Rancho', from __Rajkumar Hirani__'s movie *3 Idiots* (2009), says to the Dean of his institution, "Mere paas kuchh statistics hain, Sir," which literally translates to "I have certain statistics with me, Sir." Clearly, by 'statistics' he means data. Britannica, however, says __Statistics__ is "the science of collecting, analyzing, presenting, and interpreting data."

So, what is Statistics?

Well, there is no denying Britannica. Statistics is indeed a science or a study, or more accurately, a scientific tool for collecting, presenting, analyzing and interpreting data. When the word simply means 'data', as in Rancho's line, it is a plural noun, and a single such figure is a '*statistic*'. There! Now we are done with the formal definitions. Is that all there is to know about Statistics without actually delving into the subject?

Well, certainly not. For starters, we have all heard the quip "Lies, damned lies, and statistics", and yet not many of us understand why it is said. We have also heard that Mathematics and Statistics are very much alike, but few appreciate either the similarity or the difference. And most importantly, there is the question most of you didn't ask: "Why study Statistics? What does interpreting data mean, or help with?"

Let's begin with the easiest answer. Yes, Mathematics and Statistics are a lot alike, but only as much as Physics and Mathematics, or Economics and Statistics, are. We use mathematical theories to work through the problems that physical and statistical studies encounter. Mathematics is only a language for communicating the ideas of Statistics. To be even clearer, Mathematics is a __deductive__ study: it derives conclusions that must follow from stated premises. Statistics is both __inductive__ and __abductive__ in nature: it generalizes from observed data and looks for the most plausible explanation of them.

Keeping the first question's answer for last, let us talk about what interpreting data means and yields. In a deterministic study, the outcome is the result of phenomena that occurred earlier, so if those phenomena are known, the outcome can be predicted. For example, if the exact point where force is applied, the amount of force used and the resistance of the medium are known, it should be possible to determine the exact outcome of a coin toss. It's just a matter of mind-bending physics. However, when a person tosses a coin, we have no way of measuring those variables. But we can watch that person make quite a few tosses and start to notice a pattern of sorts. You can never guess the outcome with certainty, but you can start to bet. That is exactly what Statistics does, only in a more professional and logical manner, so that even when we are wrong, we know the chances of the prediction being wrong.
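To make the coin-toss idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions that are mine, not part of the story: the tosses are simulated rather than observed, the coin is taken to be fair (`p_heads=0.5`), and the function names, the seed and the toss counts are made up for the example. It estimates the chance of heads from the observed tosses and attaches the usual standard error of a proportion, which is one common way of saying how wrong the estimate might be.

```python
import math
import random


def estimate_heads_probability(tosses):
    """Return (estimated P(heads), standard error) from a list of 'H'/'T' outcomes."""
    n = len(tosses)
    if n == 0:
        raise ValueError("need at least one toss")
    p_hat = sum(1 for t in tosses if t == "H") / n
    # Standard error of a sample proportion: sqrt(p(1 - p) / n).
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, std_err


def simulate_tosses(n, p_heads=0.5, seed=0):
    """Simulate n tosses. The fair coin and the fixed seed are assumptions."""
    rng = random.Random(seed)
    return ["H" if rng.random() < p_heads else "T" for _ in range(n)]


if __name__ == "__main__":
    # Watching more tosses does not remove the uncertainty, but it shrinks it.
    for n in (10, 100, 10_000):
        p_hat, std_err = estimate_heads_probability(simulate_tosses(n))
        print(f"{n:>6} tosses: estimated P(heads) = {p_hat:.3f} +/- {std_err:.3f}")
```

The point of the second number is the one made above: we still cannot call the next toss, but we can say roughly how far off our bet is likely to be.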

And that brings us to the answer to the first question. If we can state even the chances of a prediction being wrong, we can almost surely agree that we would never be lying. At worst, we would be making a prediction with a high chance of being false. As long as you pay attention to that little detail, no one can fool you. Statistics only seems worse than a 'damned lie' when those little details about the error in a prediction are overlooked! It all depends on how we interpret the data, and interpretations can be subjective, even erroneous. And to be honest, the media and policy makers depend on the general public's tendency to ignore those details. Who am I kidding? We hardly ever read an article properly. We just form opinions based on headlines!

But, hey, do we still know the answer to the question "Why study Statistics?"

As I mentioned earlier, Statistics is a scientific tool. It deals extensively with data, and that data can come from any phenomenon. If determining the physical laws of a natural phenomenon is a herculean task, statisticians can look into the relevant data and estimate the relationship between the causal factors and the event itself. Economists use Statistics to do the same on a regular basis. Also, in case you didn't know, __Mendel__'s experiments with *Pisum sativum* (garden pea), which revealed Mendel's Laws of Inheritance, were actually a statistical study of data from his own garden.

As one final example of what Statistics is capable of, I would like to conclude with an incident quite famous among statisticians in India. Due to the communal riots in 1947, a huge number of refugees from a certain minority community took refuge in the Red Fort and the Humayun's Tomb complex. The government had the responsibility to feed them, and contractors had been hired for the job. Soon, the government realized that the price quotations made by the contractors were extremely high. However, the government had no way to ascertain the exact number of people to be fed, since the community did not trust 'outsiders' in their safe refuge. Well, who could be better trusted to solve the issue than statisticians? ('Cause, duh, they count numbers!)

The statisticians entrusted with the task, including the Father of Statistics in India, __P.C. Mahalanobis__, and the late J.M. Sengupta, who was associated with the __Indian Statistical Institute__ for a long time, couldn't afford not to find a solution, as that would count as a failure of Statistics and its community. They did have access to all the bills that the contractors submitted. On dividing the billed quantity of each item by the estimated per-capita consumption of that item, they discovered that different items gave different estimates of the number of people to be fed. The estimate was fairly high when it came from rice, but quite low when it came from salt. Rice was expensive and salt was cheap, so a contractor had much to gain by exaggerating the amount of rice and very little by exaggerating the amount of salt. The salt-based estimate of the number of people was submitted to the government and, months later, it was verified to be correct.

I know, pretty common sense, right? Statistics really is just that, only a bit more sophisticated.
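For the curious, the arithmetic behind the salt story can be sketched in a few lines of Python. Every number below is hypothetical and invented purely for illustration (the actual bills, consumption rates and prices are not given here), and the function names are my own. The sketch divides each billed quantity by an assumed per-capita consumption to get an implied headcount, then trusts the estimate from the cheapest item, since that is the one a contractor gains least by inflating.

```python
def implied_headcounts(billed_qty, per_capita_qty):
    """Implied number of people per item: billed quantity / per-capita consumption."""
    return {item: billed_qty[item] / per_capita_qty[item] for item in billed_qty}


def least_inflated_item(price_per_kg):
    """Pick the cheapest item, assuming cheap items are the least worth inflating."""
    return min(price_per_kg, key=price_per_kg.get)


# All figures are hypothetical, chosen only to mimic the pattern in the story.
billed_kg_per_day = {"rice": 9000.0, "salt": 100.0}
per_capita_kg_per_day = {"rice": 0.45, "salt": 0.01}
price_per_kg = {"rice": 0.60, "salt": 0.05}

estimates = implied_headcounts(billed_kg_per_day, per_capita_kg_per_day)
trusted_item = least_inflated_item(price_per_kg)

for item, people in estimates.items():
    print(f"{item}: implies about {round(people)} people")
print(f"Most trustworthy estimate comes from: {trusted_item}")
```

With these made-up figures the rice bill implies roughly twice as many people as the salt bill, which is the kind of gap that gave the contractors away.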

***The Salt in Statistics story can be read in detail in "Statistics and Truth" by C.R. Rao, 1989.***

Doodles by __Nishant Choksi__ and __Bene Rohlmann__.