AIML Resident @ Apple
Image Source: https://pbs.twimg.com/media/DBQttG0XYAAK8_M.jpg
· · ·

Introduction

The normal distribution is easier to understand when it is derived from a game of darts. Consider a game in which the aim is to hit the origin of a Cartesian plane. Each throw is subject to random errors, so the dart lands at a different point in each trial. The following assumptions can be made about these errors:

  • The errors do not depend on the orientation of the Cartesian plane.
  • Errors in perpendicular directions are independent, i.e. a dart hitting too high does not alter the probability of it being off to the right.
  • Large errors are less likely to occur than small errors.

Determining the Shape of the Distribution

The probability of the dart falling in the vertical strip from \(x\) to \(x + \Delta x\), where \(p\) denotes the probability density of the error along one axis, is given by,

\[ p(x)\,\Delta x \]

Similarly, the probability of the dart landing in the horizontal strip from \(y\) to \(y + \Delta y\) is given by,

\[ p(y)\,\Delta y \]

Figure: The Dart Game

Because the two events are assumed to be independent, the probability of the dart falling in the shaded region, where the two strips intersect, is given by

\[ p(x)\,\Delta x \cdot p(y)\,\Delta y \]

Also, since orientation does not alter the probability of an error, any region \(r\) units from the origin with area \(\Delta x \cdot \Delta y\) has the same probability, which can therefore be expressed through some function \(g\) of the distance \(r\) alone,

\[ g(r)\,\Delta x\,\Delta y \]

Equating the two expressions for the probability of the same region results in the inference

\[ g(r) = p(x)\,p(y) \]

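Differentiating both sides with respect to the polar angle \(\theta\) of the point \((x, y)\), and noting that \(g(r)\) does not depend on \(\theta\),

\[ 0 = p(x)\,\frac{dp(y)}{d\theta} + p(y)\,\frac{dp(x)}{d\theta} \tag{1} \]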

From the figure above,

\[ x = r\cos\theta \tag{2} \]

\[ y = r\sin\theta \tag{3} \]

So the derivative in (1) can be expressed as,

\[ p(x)\,p'(y)\,r\cos\theta - p(y)\,p'(x)\,r\sin\theta = 0 \tag{4} \]

Using (2) and (3), (4) can be rewritten as,

\[ p(x)\,p'(y)\,x - p(y)\,p'(x)\,y = 0 \tag{5} \]

This differential equation can be solved by separating the variables, so (5) becomes,

\[ \frac{p'(x)}{x\,p(x)} = \frac{p'(y)}{y\,p(y)} \tag{6} \]

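Differential equation (6) holds for any \(x\) and \(y\), and \(x\) and \(y\) are independent. Each side depends on a different variable, so both sides must equal the same constant \(C\), i.e.,

\[ \frac{p'(x)}{x\,p(x)} = \frac{p'(y)}{y\,p(y)} = C \]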

So,

\[ \frac{p'(x)}{p(x)} = C\,x \tag{7} \]

Integrating (7), with \(c\) as the constant of integration,

\[ \ln p(x) = \frac{C x^2}{2} + c \]

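So, writing \(A\) for the exponential of the constant of integration,

\[ p(x) = A\,e^{\frac{C}{2} x^2} \]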

Since large errors are less likely than small errors, \(C\) must be negative. Writing \(C = -k\),

\[ p(x) = A\,e^{-\frac{k}{2} x^2} \]

where k is positive.
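
As a quick numerical sanity check of this form, the sketch below evaluates the product \(p(x)\,p(y)\) with \(p(x) = A e^{-\frac{k}{2}x^2}\) at several points on a circle of radius \(r\) and confirms that the product depends only on \(r\), as the orientation assumption requires. The values of `A`, `k`, and `r` and the helper names are arbitrary choices for illustration, not part of the derivation.

```python
import math

def p(x, A=1.0, k=2.0):
    """Error density p(x) = A * exp(-k * x**2 / 2); A and k are illustrative values."""
    return A * math.exp(-k * x * x / 2.0)

def joint_on_circle(r, n_angles=8):
    """Evaluate p(x) * p(y) at n_angles equally spaced points on the circle of radius r."""
    values = []
    for i in range(n_angles):
        theta = 2.0 * math.pi * i / n_angles
        x, y = r * math.cos(theta), r * math.sin(theta)
        values.append(p(x) * p(y))
    return values

values = joint_on_circle(r=1.5)
spread = max(values) - min(values)
print(spread)  # expected to be zero up to floating-point rounding
```

The product equals \(A^2 e^{-\frac{k}{2} r^2}\) at every angle, which is why the spread is zero apart from rounding.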

Determining the Coefficient A

If \(p\) is the probability density function of a random variable following the normal distribution, then the total area under the curve must be 1, so the value of \(A\) must be chosen to satisfy this property. The equation to be evaluated is,

\[ \int_{-\infty}^{\infty} A\,e^{-\frac{k}{2} x^2}\,dx = 1 \]

Dividing both sides by \(A\),

\[ \int_{-\infty}^{\infty} e^{-\frac{k}{2} x^2}\,dx = \frac{1}{A} \]

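Since the distribution is symmetric about zero, the limits of integration can be changed to cover only the positive half,

\[ \int_{0}^{\infty} e^{-\frac{k}{2} x^2}\,dx = \frac{1}{2A} \]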

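Then, multiplying this integral by an identical one in \(y\),

\[ \left( \int_{0}^{\infty} e^{-\frac{k}{2} x^2}\,dx \right) \left( \int_{0}^{\infty} e^{-\frac{k}{2} y^2}\,dy \right) = \frac{1}{4A^2} \tag{8} \]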

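Since \(x\) and \(y\) are independent, (8) can be rewritten as a double integral,

\[ \int_{0}^{\infty} \int_{0}^{\infty} e^{-\frac{k}{2} (x^2 + y^2)}\,dy\,dx = \frac{1}{4A^2} \tag{9} \]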

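The double integral in (9) can be evaluated in polar coordinates, where the first quadrant corresponds to \(0 \le \theta \le \pi/2\),

\[ \int_{0}^{\infty} \int_{0}^{\infty} e^{-\frac{k}{2} (x^2 + y^2)}\,dy\,dx = \int_{0}^{\pi/2} \int_{0}^{\infty} e^{-\frac{k}{2} r^2}\,r\,dr\,d\theta \tag{10} \]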

Applying u-substitution to (10),

\[ u = -\frac{k}{2} r^2 \tag{11} \]

Differentiating w.r.t. \(r\),

\[ \frac{du}{dr} = -k\,r \]

So,

\[ r\,dr = -\frac{du}{k} \tag{12} \]

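Substituting (11) and (12) in (10),

\[ \int_{0}^{\pi/2} \int_{0}^{\infty} e^{-\frac{k}{2} r^2}\,r\,dr\,d\theta = \int_{0}^{\pi/2} \int_{0}^{-\infty} -\frac{1}{k}\,e^{u}\,du\,d\theta = \int_{0}^{\pi/2} \frac{1}{k}\,d\theta = \frac{\pi}{2k} \]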

Using (9),

\[ \frac{1}{4A^2} = \frac{\pi}{2k} \]

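So finally, taking the positive root because a density cannot be negative,

\[ A = \sqrt{\frac{k}{2\pi}} \tag{13} \]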

Substituting \(A\) in the expression for \(p(x)\),

\[ p(x) = \sqrt{\frac{k}{2\pi}}\,e^{-\frac{k}{2} x^2} \]
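
The normalisation can be checked numerically. The sketch below integrates \(p(x) = \sqrt{k/(2\pi)}\,e^{-\frac{k}{2}x^2}\) with a simple trapezoidal rule for a few values of \(k\) and confirms that the area is close to 1. The helper `integrate`, the sample values of `k`, and the cutoff of 20 on either side are assumptions made for illustration only.

```python
import math

def p(x, k):
    """Density p(x) = sqrt(k / (2*pi)) * exp(-k * x**2 / 2)."""
    return math.sqrt(k / (2.0 * math.pi)) * math.exp(-k * x * x / 2.0)

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

areas = {}
for k in (0.5, 1.0, 4.0):
    # [-20, 20] is an assumed cutoff; the tails beyond it are negligible for these k.
    areas[k] = integrate(lambda x: p(x, k), -20.0, 20.0)
    print(k, areas[k])
```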

Determining the value of k

The value of \(k\) can be determined using the formulae for the mean and the variance.

The mean, \(\mu\), is defined as the following integral,

\[ \mu = \int_{-\infty}^{\infty} x\,p(x)\,dx \]

Since \(x\,p(x)\) is an odd function, the integral over the whole real line vanishes and \(\mu\) is zero.

The variance, \(\sigma^2\), is given by the following integral,

\[ \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2\,p(x)\,dx \]

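Since the mean is zero, the above equation becomes,

\[ \sigma^2 = \sqrt{\frac{k}{2\pi}} \int_{-\infty}^{\infty} x^2\,e^{-\frac{k}{2} x^2}\,dx \]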

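Changing the limits of the integral, since the integrand is even,

\[ \sigma^2 = 2\sqrt{\frac{k}{2\pi}} \int_{0}^{\infty} x^2\,e^{-\frac{k}{2} x^2}\,dx \tag{14} \]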

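Evaluating the integral in (14) by parts,

\[ \int u\,dv = u\,v - \int v\,du \tag{15} \]

where \(u\) and \(dv\) are given by,

\[ u = x, \qquad dv = x\,e^{-\frac{k}{2} x^2}\,dx \tag{16} \]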

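Then \(v\) can be evaluated by substitution, with \(w = -\frac{k}{2} x^2\) so that \(x\,dx = -\frac{dw}{k}\),

\[ v = \int x\,e^{-\frac{k}{2} x^2}\,dx = -\frac{1}{k} \int e^{w}\,dw \]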

So,

\[ v = -\frac{1}{k}\,e^{-\frac{k}{2} x^2} \tag{17} \]

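Substituting (16) and (17) in (15),

\[ \int_{0}^{\infty} x^2\,e^{-\frac{k}{2} x^2}\,dx = \left[ -\frac{x}{k}\,e^{-\frac{k}{2} x^2} \right]_{0}^{\infty} + \frac{1}{k} \int_{0}^{\infty} e^{-\frac{k}{2} x^2}\,dx \]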

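Applying the integration by parts to (14),

\[ \sigma^2 = 2\sqrt{\frac{k}{2\pi}} \left( \left[ -\frac{x}{k}\,e^{-\frac{k}{2} x^2} \right]_{0}^{\infty} + \frac{1}{k} \int_{0}^{\infty} e^{-\frac{k}{2} x^2}\,dx \right) \]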

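Simplifying it further,

\[ \sigma^2 = 2\sqrt{\frac{k}{2\pi}} \left[ -\frac{x}{k}\,e^{-\frac{k}{2} x^2} \right]_{0}^{\infty} + \frac{2}{k}\sqrt{\frac{k}{2\pi}} \int_{0}^{\infty} e^{-\frac{k}{2} x^2}\,dx \tag{18} \]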

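The first part of (18) vanishes, because \(x\,e^{-\frac{k}{2} x^2}\) is zero at \(x = 0\) and tends to zero as \(x \to \infty\),

\[ \left[ -\frac{x}{k}\,e^{-\frac{k}{2} x^2} \right]_{0}^{\infty} = 0 \tag{19} \]

Now consider the second part of the integral in (18),

\[ \int_{0}^{\infty} e^{-\frac{k}{2} x^2}\,dx \]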

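Let \(k_1 = {k \over 2}\) and \(u = x \sqrt{k_1}\), which means \(du = \sqrt{k_1}\,dx\). Using the Gaussian integral \(\int_{0}^{\infty} e^{-u^2}\,du = \frac{\sqrt{\pi}}{2}\),

\[ \int_{0}^{\infty} e^{-k_1 x^2}\,dx = \frac{1}{\sqrt{k_1}} \int_{0}^{\infty} e^{-u^2}\,du = \frac{1}{\sqrt{k_1}} \cdot \frac{\sqrt{\pi}}{2} = \sqrt{\frac{\pi}{2k}} \tag{20} \]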

Substituting (19) and (20) in (18),

\[ \sigma^2 = 0 + \frac{2}{k}\sqrt{\frac{k}{2\pi}} \cdot \sqrt{\frac{\pi}{2k}} = \frac{1}{k} \]

So,

\[ k = \frac{1}{\sigma^2} \tag{21} \]
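
The relation between \(k\) and the variance can also be checked numerically. The sketch below computes the mean and variance of \(p(x) = \sqrt{k/(2\pi)}\,e^{-\frac{k}{2}x^2}\) by trapezoidal integration and compares the variance with \(1/k\). The helper names, the sample values of `k`, and the integration cutoff are illustrative assumptions.

```python
import math

def p(x, k):
    """Density p(x) = sqrt(k / (2*pi)) * exp(-k * x**2 / 2)."""
    return math.sqrt(k / (2.0 * math.pi)) * math.exp(-k * x * x / 2.0)

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def moments(k, cutoff=20.0):
    """Return (mean, variance) of p(., k), integrating over the assumed range [-cutoff, cutoff]."""
    mean = integrate(lambda x: x * p(x, k), -cutoff, cutoff)
    variance = integrate(lambda x: (x - mean) ** 2 * p(x, k), -cutoff, cutoff)
    return mean, variance

results = {k: moments(k) for k in (0.5, 1.0, 4.0)}
for k, (mean, variance) in results.items():
    print(k, mean, variance, 1.0 / k)  # the last two columns should agree closely
```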

Substituting \(A\) and \(k\) from (13) and (21) in the basic equation,

\[ p(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{x^2}{2\sigma^2}} \]

The general equation for the normal distribution with mean \(\mu\) and standard deviation \(\sigma\) is a simple horizontal shift of this basic distribution,

\[ p(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x - \mu)^2}{2\sigma^2}} \]
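
For reference, the general density \(p(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) can be written as a short function and compared against `statistics.NormalDist` from the Python standard library (available from Python 3.8). The function name `normal_pdf` and the sample values of `mu`, `sigma`, and `x` are illustrative assumptions.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma (sigma > 0)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

mu, sigma = 1.0, 2.5
reference = NormalDist(mu, sigma)

# Largest absolute difference between the derived formula and the library density.
max_diff = max(
    abs(normal_pdf(x, mu, sigma) - reference.pdf(x))
    for x in (-3.0, 0.0, 1.0, 4.2)
)
print(max_diff)  # expected to be zero up to floating-point rounding
```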

REFERENCES:

The Normal Distribution: A derivation from basic principles
Derivation of univariate normal distribution

· · ·