The world runs on data today: you see it used everywhere. Data is colloquially called the new ‘oil’ that keeps the world running. Most of us, if we choose to get placed after our degree, will enter the world of data analytics and strategy. Statistics plays a primary role in this field, and I cannot emphasise enough how pivotal it is for analysis. But let’s take a different approach and see how it applies to our favourite sports!
Pick any game you know: football, basketball, cricket, baseball, archery and so on. You will realise that statistics, probability and strategy work their way into each one of them. How, you ask?
In the spirit of Cricket World Cup fever, let’s take a look at an example from there.
Source: Times of India (09/10/2023)
The above picture shows a comparison between the two countries in anticipation of the World Cup. Internal statistics like average run rate, ICC rank, past clashes, player form, and external conditions like weather, pitch, lights have all been taken into account to determine the probability and odds of a team winning or losing the game. This is a classic example of conditional probability being used to predict the outcome of the match.
According to the picture, the probability of New Zealand winning is 9/10, while the probability of the Netherlands winning is 1/10.
Let’s try and calculate the probability of New Zealand winning once we account for the chance of rain. For the sake of simplicity, let’s assume that rain is the only factor that can change the outcome of the match. We shall therefore calculate the probability of New Zealand winning given that it rains.
P(E1): probability that New Zealand wins the match = 9/10
P(E2): probability that the Netherlands win the match = 1/10
P(R): probability that it rains
P(R | E1) = 3/5 (chance of rain given that New Zealand wins)
P(R | E2) = 1/5 (chance of rain given that the Netherlands win)
We want to know the chances of New Zealand winning given that it WILL rain, so we want to find P(E1 | R).
Using Bayes’ theorem (conditional probability):
P(E1 | R) = [P(E1) × P(R | E1)] / [P(E1) × P(R | E1) + P(E2) × P(R | E2)]
= (9/10 × 3/5) / [(9/10 × 3/5) + (1/10 × 1/5)]
= (27/50) / (28/50) = 27/28 ≈ 0.964
This means that there is a 96.4% chance that New Zealand wins given that it rains.
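If you would like to check the arithmetic yourself, here is a minimal Python sketch of the same calculation. The function and variable names are my own, chosen purely for illustration, and the numbers are the ones assumed above. It also works out P(R), which we never needed to state separately, using the law of total probability.

```python
from fractions import Fraction


def posterior(prior_a, likelihood_a, prior_b, likelihood_b):
    """Bayes' theorem for two mutually exclusive, exhaustive events A and B.

    Returns P(A | evidence), given the priors P(A), P(B) and the
    likelihoods P(evidence | A), P(evidence | B).
    """
    evidence = prior_a * likelihood_a + prior_b * likelihood_b
    return prior_a * likelihood_a / evidence


# Values assumed in the example above (illustrative, not real match data)
p_nz = Fraction(9, 10)             # P(E1): New Zealand wins
p_ned = Fraction(1, 10)            # P(E2): the Netherlands win
p_rain_given_nz = Fraction(3, 5)   # chance of rain given New Zealand wins
p_rain_given_ned = Fraction(1, 5)  # chance of rain given the Netherlands win

# Law of total probability: overall chance of rain, P(R)
p_rain = p_nz * p_rain_given_nz + p_ned * p_rain_given_ned

# Bayes' theorem: chance New Zealand wins given that it rains
p_nz_given_rain = posterior(p_nz, p_rain_given_nz, p_ned, p_rain_given_ned)

print(f"P(R) = {p_rain}")
print(f"P(NZ wins | rain) = {p_nz_given_rain} = {float(p_nz_given_rain):.1%}")
```

Using exact fractions instead of decimals keeps the result as 27/28, so the 96.4% is only rounded at the very end.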
Now, there are some obvious flaws in the way I have found this probability:
- I’ve assumed that rain and player form are the only factors that affect the outcome of the match (no importance is given to pitch conditions, such as whether it’s a batting or bowling wicket).
- I’ve assumed that the match will carry on even if it rains, but that will obviously not happen. We all know that the game will be stopped and the DLS (Duckworth-Lewis-Stern) method will apply. This method factors in the ‘resources’ (overs left and wickets remaining) for a team and estimates the target that makes the game fair and allows the deserving team to win.
- It is difficult for statistics in general to measure and predict qualitative factors. The factors enumerated before (pitch, weather, athlete form, team rank) can be quantified and calculated. However, other vital factors like audience cheers, player confidence and mental health are challenging and complicated to estimate. Thus, it becomes difficult to predict outcomes when qualitative factors are crucial to performance.
Now you may be wondering: if so many assumptions have to be made, why use statistics at all? Well, the answer is exactly why the field of data analytics and sports statistics is booming today. People are working hard on minimising assumptions and making models as close to reality as possible. Models get refined over time, qualitative variables are taken into consideration, and statistical software like SPSS (Statistical Package for the Social Sciences) and R is used to predict outcomes as accurately as possible.
The only drawback to this kind of modelling and statistics is that one can never say something with 100% certainty. Statistics always leave room for the chance of an error or the chance of a miracle! That’s what makes us so invested in sports and games. Humans love to root for the ‘underdog’.
Why, you ask? Well, once we know the statistics and the chances of a team winning, we as humans secretly want the unthinkable. We hope that against all odds, the losing team will outshine their opponents and unexpectedly win our hearts, just like a ‘masaledar’ Bollywood film!
Even though this example was only for cricket, you can apply the same idea to pretty much any other sport. It can be used in F1 to estimate speed (lap time, tire degradation, engine, humidity), in basketball to predict the score (time, conversion percentage, zone or man-to-man strategy) and even in baseball (cue Moneyball).
Statistics and prediction are used in EVERY sport, and their accuracy has been increasing day by day!
Beyond this, regression analysis is also used to analyse and make strategy decisions in various sports, but let’s tackle that in another article once I completely understand econometrics!