(Basic Statistics for Citizen Data Scientist)
Runs
Example 1: What is the probability that there will be a run of at least 6 heads in 20 tosses of a fair coin?
We solve this problem by recursion. Let p be the probability of heads on any single toss, r the length of the run we are looking for, and n the total number of tosses. For Example 1, p = .5, r = 6 and n = 20. Now for i = 0, 1, …, n and j = 0, 1, …, r define
f(i, j) = the probability of getting a run of at least r heads in n tosses, given that i tosses remain, the last j tosses have all been heads, and no run of r heads was completed before those j tosses.
Clearly for all j < r
f(0, j) = 0
And for all i
f(i, r) = 1
For all i > 0 and j < r, we have the following recursive formula:
f(i, j) = p ∙ f(i – 1, j + 1) + (1 – p ) ∙ f(i – 1, 0)
Here the leftmost term on the right side of the equation corresponds to getting a heads when i tosses remain and the rightmost term corresponds to getting tails when i tosses remain.
Thus the probability that there will be a run of at least r heads in n tosses of a coin with probability p that a heads will occur on any toss is given by f(n, 0).
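The recursion above can also be written directly in code. The following is a minimal Python sketch, not part of the original worksheet approach; the function name prob_run_of_heads is a hypothetical choice, and memoization via functools.lru_cache is used only so that each f(i, j) is computed once.

```python
from functools import lru_cache


def prob_run_of_heads(n: int, r: int, p: float = 0.5) -> float:
    """Probability of at least one run of r or more heads in n tosses."""

    @lru_cache(maxsize=None)
    def f(i: int, j: int) -> float:
        # i = tosses remaining, j = length of the current streak of heads
        if j == r:   # a run of r heads has been completed: f(i, r) = 1
            return 1.0
        if i == 0:   # no tosses left and no run completed: f(0, j) = 0
            return 0.0
        # heads extends the streak to j + 1; tails resets it to 0
        return p * f(i - 1, j + 1) + (1 - p) * f(i - 1, 0)

    return f(n, 0)


print(round(prob_run_of_heads(20, 6), 6))  # 0.122315
```

Calling prob_run_of_heads(n, r, p) with a different p covers the case of a biased coin, since the recursion does not assume p = .5.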
We solve Example 1 by building an Excel worksheet as shown in Figure 1.
Figure 1 – Run of at least 6 heads in 20 tosses
Here range B4:G4 consists of all 0’s (the values f(0, j) for j < 6) and range H4:H24 consists of all 1’s (the values f(i, 6)). Cell B5 contains the formula =0.5*C4+0.5*$B4. We then copy this formula into the rest of the table by highlighting the range B5:G24 and pressing Ctrl-D and Ctrl-R.
We see (cell B24) that the value of f(20, 0) = .122315. This is the result we are looking for. The probability of getting a run of at least 6 heads in 20 tosses of a fair coin is .122315.
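For readers without Excel, the same table can be filled in row by row in Python. This is a sketch under the assumption that worksheet rows correspond to i = 0, …, 20 and columns to j = 0, …, 6, as the description of Figure 1 suggests; the name run_table is hypothetical.

```python
def run_table(n: int, r: int, p: float = 0.5) -> list[list[float]]:
    """Return table[i][j] = f(i, j) for i = 0..n and j = 0..r."""
    # Row i = 0 starts as all zeros, matching f(0, j) = 0 for j < r
    table = [[0.0] * (r + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][r] = 1.0  # f(i, r) = 1: the column of 1's
    for i in range(1, n + 1):  # each row depends only on the row before it
        for j in range(r):
            table[i][j] = p * table[i - 1][j + 1] + (1 - p) * table[i - 1][0]
    return table


table = run_table(20, 6)
for i in (0, 6, 20):
    print(i, [round(x, 6) for x in table[i]])
```

The entry table[20][0] plays the role of cell B24 in the worksheet.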
If we want to know the probability that the longest run of heads in 20 tosses is 6 heads, then we need to first calculate the probability of a run of at least 7 heads in 20 tosses, as shown in Figure 2.
Figure 2 – Run of at least 7 heads in 20 tosses
Figure 2 shows that the probability of a run of at least 7 heads is .058182. Thus the probability that the longest run of heads is exactly 6 heads is .122315 – .058182 = .064133. We can’t say that the probability of a run of exactly 6 heads is .064133 since we can have situations where there are runs of 6 heads as well as 7 or more heads (e.g. HHHHHHTHHHHHHHTTTTTT).
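The subtraction can be checked with exact arithmetic. The sketch below is an illustration only: it uses Python's fractions module to avoid rounding, and the function names prob_run_at_least and prob_longest_run_exactly are hypothetical. Because the figures quoted above are rounded to six decimals, the exact result may differ from their difference in the last digit.

```python
from fractions import Fraction


def prob_run_at_least(n: int, r: int, p: Fraction = Fraction(1, 2)) -> Fraction:
    """P(at least one run of r or more heads in n tosses), computed exactly."""
    # row[j] holds f(i, j) for the current i; start with i = 0
    row = [Fraction(0)] * r + [Fraction(1)]
    for _ in range(n):
        # heads moves from streak j to j + 1; tails resets the streak to 0
        row = [p * row[j + 1] + (1 - p) * row[0] for j in range(r)] + [Fraction(1)]
    return row[0]


def prob_longest_run_exactly(n: int, r: int) -> Fraction:
    # longest run equals r  <=>  a run of at least r occurs, but none of at least r + 1
    return prob_run_at_least(n, r) - prob_run_at_least(n, r + 1)


exact = prob_longest_run_exactly(20, 6)
print(exact, float(exact))
```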
Example 2: What is the probability that there will be a run of at least 6 in 20 tosses of a fair coin?
Here the run can be of either heads or tails. Once again we solve this problem by recursion. This time for i = 0, 1, …, n – 1, and j = 1, 2, …, r define
g(i, j) = the probability of getting a run of at least r during n tosses assuming there are i tosses remaining and the last j tosses have all been the same and so far there has not been a run of r heads or tails.
This time to keep things simple, we will assume that we have a fair coin and so p = .5.
Clearly for all 0 < j < r
g(0, j) = 0
And for all i < n
g(i, r) = 1
For all 0 < i < n and j < r, we have the following recursive formula:
g(i, j) = .5 ∙ g(i – 1, j + 1) + .5 ∙ g(i – 1, 1)
Here the leftmost term on the right side of the equation corresponds to getting the same outcome as on the previous toss when i tosses remain and the rightmost term corresponds to getting a different outcome from the previous toss when i tosses remain.
Thus the probability that there will be a run of at least r in n tosses of a fair coin is given by g(n – 1, 1).
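The recursion for g can be evaluated the same way, one value of i at a time. The following Python sketch assumes a fair coin, as in the text; the function name prob_run_either is hypothetical.

```python
def prob_run_either(n: int, r: int) -> float:
    """P(run of at least r equal outcomes, heads or tails, in n fair tosses)."""
    if n < 1 or r < 1:
        raise ValueError("this sketch assumes n >= 1 and r >= 1")
    # g[j] holds g(i, j) for j = 1..r; index 0 is unused padding
    g = [0.0] * r + [1.0]  # i = 0: g(0, j) = 0 for j < r and g(0, r) = 1
    for _ in range(n - 1):  # advance i = 1, ..., n - 1
        # same outcome extends the streak to j + 1; a different outcome restarts it at 1
        g = [0.0] + [0.5 * g[j + 1] + 0.5 * g[1] for j in range(1, r)] + [1.0]
    return g[1]  # g(n - 1, 1): the first toss always starts a streak of length 1


print(prob_run_either(20, 6))
```

Only n - 1 steps are needed because the first toss fixes the outcome being tracked, which is why the answer is g(n - 1, 1) rather than a value at i = n.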
We solve Example 2 by building the Excel worksheet shown in Figure 3.
Figure 3 – Run of at least 6 in 20 tosses
The probability of getting a run of at least 6 heads or tails in 20 tosses of a fair coin is .23688 (cell U23).