(Basic Statistics for Citizen Data Scientist)
Continuous Probability Distributions
We now extend the definition of probability distribution from discrete to continuous random variables. Such variables take on an infinite range of values even in a finite interval (weight of rice, room temperature, etc.).
For a discrete random variable x, the probability that x assumes a value between a and b (exclusive) is given by
$$P(a < x < b) = \sum_{a < t < b} f(t)$$
The frequency function f of a continuous random variable, by contrast, is defined at an infinite number of points (even in a finite interval), so we can’t simply sum up its values in the ordinary way. For continuous variables, the equivalent formulation is that the probability that x assumes a value between a and b is given by
$$P(a \le x \le b) = \int_a^b f(x)\,dx$$
i.e. the area under the graph of y = f(x) bounded by the x-axis and the lines x = a and x = b.
Figure 1 – Area under the curve
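To make the area interpretation concrete, here is a minimal Python sketch that approximates the area under a density curve with the trapezoidal rule. The standard normal density is used purely as an illustrative choice (the text does not single out any distribution), and the helper names `normal_pdf` and `area_under_curve` are hypothetical.

```python
import math

def normal_pdf(x):
    """Standard normal density; an illustrative choice of f, not taken from the text."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def area_under_curve(f, a, b, n=10_000):
    """Approximate the area under y = f(x) between x = a and x = b (trapezoidal rule)."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Probability that x falls between a = -1 and b = 1
p = area_under_curve(normal_pdf, -1.0, 1.0)

# Closed-form value for the standard normal, via the error function, for comparison
exact = math.erf(1.0 / math.sqrt(2.0))

print(f"trapezoid: {p:.6f}, exact: {exact:.6f}")
```

The numerical area and the closed-form value should agree to several decimal places, which is all that "probability equals area" is claiming.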
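Definition 1: For a continuous random variable x, f is a frequency function, also called the probability density function (pdf), provided:
$$f(x) \ge 0 \ \text{for all } x, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1, \qquad P(a \le x \le b) = \int_a^b f(x)\,dx$$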
The corresponding (cumulative) distribution function F(x) is defined by
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
Property 2: For any continuous random variable x with distribution function F(x) and any a ≤ b,
$$P(a \le x \le b) = F(b) - F(a)$$
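The link between a density and its distribution function can be checked numerically. The sketch below assumes an exponential density with rate `lam = 0.5` (an arbitrary illustrative choice, not from the text), because its distribution function has the simple closed form F(x) = 1 - exp(-lam * x) for x ≥ 0. It compares a difference of distribution-function values with the area under the density over the same interval. All function names are hypothetical.

```python
import math

lam = 0.5  # assumed rate parameter, chosen only for illustration

def pdf(x):
    """Exponential density f(x) = lam * exp(-lam * x) for x >= 0, else 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def cdf(x):
    """Closed-form distribution function F(x) = 1 - exp(-lam * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 1.0, 3.0
via_cdf = cdf(b) - cdf(a)        # difference of distribution-function values
via_area = integrate(pdf, a, b)  # area under f between a and b

print(f"F(b) - F(a) = {via_cdf:.6f}")
print(f"area        = {via_area:.6f}")
```

When a closed-form F is available, the difference F(b) - F(a) is usually preferable to numerical integration, since it is both exact and cheaper to evaluate.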
Observation: f is a valid probability density function provided that f always takes non-negative values and the area between the curve and the x-axis is 1. f is the probability density function for a particular random variable x provided the area of the region indicated in Figure 1 represents the probability that x assumes a value between a and b inclusive. Note that the probability that x takes any particular value a is not f(a). In fact, for any specific value a, the probability that x takes the value a is 0, since the region above a single point has no width and hence no area.
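Both points in this observation can be illustrated with a short sketch: checking that a candidate function is a valid density, and seeing why a single value has probability 0. The density f(x) = 2x on [0, 1] is an assumed example, and `is_valid_pdf` is a hypothetical helper that only spot-checks on a grid (it assumes the density is zero outside the interval it is given) rather than proving anything.

```python
def f(x):
    """Assumed example density: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def is_valid_pdf(g, lo, hi, tol=1e-6, n=10_000):
    """Grid-based spot check: g is non-negative on [lo, hi] and its area is about 1."""
    h = (hi - lo) / n
    non_negative = all(g(lo + i * h) >= 0.0 for i in range(n + 1))
    total_area = integrate(g, lo, hi, n)
    return non_negative and abs(total_area - 1.0) < tol

print(is_valid_pdf(f, 0.0, 1.0))

# The density value at 0.9 is larger than 1, so it cannot itself be a probability.
print(f(0.9))

# The probability of a shrinking interval centred on a = 0.9 tends to 0.
a = 0.9
for width in (0.1, 0.01, 0.001):
    p = integrate(f, a - width / 2, a + width / 2)
    print(width, p)
```

The loop shows the interval probability shrinking in proportion to the interval's width, which is the sense in which a single point carries probability 0 even though f is positive there.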
Essentially, the area under a curve is a way of summing when dealing with an infinite range of values in a continuum. For those of you familiar with calculus, this area is the definite integral
$$P(a \le x \le b) = \int_a^b f(x)\,dx$$
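As a worked illustration of the calculus view, assume the density f(x) = 3x² on [0, 1] and 0 elsewhere (an example chosen here, not taken from the text). Both the total area and an interval probability follow from elementary integration:

```latex
\int_{-\infty}^{\infty} f(x)\,dx = \int_{0}^{1} 3x^{2}\,dx = \left[x^{3}\right]_{0}^{1} = 1,
\qquad
P(0.5 \le x \le 1) = \int_{0.5}^{1} 3x^{2}\,dx = 1^{3} - 0.5^{3} = 0.875 .
```

The first result confirms that this f is a valid density; the second says that, under this assumed density, x lands in the upper half of the interval 87.5% of the time.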