Normal Distribution in Python

Even if you’re not a statistician, you’ve probably heard of the term “Normal Distribution” somewhere.

In probability theory, a probability distribution describes the possible values of a random variable and how likely each of them is. In other words, it tells us which values we can expect to observe in a random sample, and how often.

A probability distribution can be either discrete or continuous.

Assume that in a city, the heights of adults in the 20-30 year old demographic range between 4.5 and 7 feet.

If we were to randomly select one adult and ask for his/her height (assuming gender has no bearing on height), what would we discover? It’s impossible to predict the exact height. However, if we know the height distribution of adults in the city, we can make a more informed bet on the most likely outcome.

What is Normal Distribution?

A normal distribution is also known as a Gaussian distribution or a Bell Curve. These terms are used interchangeably and mean the same thing. It is a continuous probability distribution that is symmetric around its mean, with values near the mean being the most likely.

The normal distribution’s probability density function (pdf) is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

where σ = standard deviation, μ = mean, x = input value.
  • Mean: The ordinary average. It is calculated by dividing the sum of all points by the total number of points.
  • Standard Deviation: A measure of how “spread out” the data is. It summarizes how far the observed values typically deviate from the mean.
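
As a minimal sketch of the two definitions above, the snippet below computes both quantities by hand and compares them with NumPy's built-in functions. The `heights` values are made up purely for illustration, and the population form of the standard deviation is assumed (this is what `np.std` computes by default).

```python
import numpy as np

# Hypothetical sample of adult heights in feet (made-up values for illustration)
heights = np.array([5.1, 5.4, 5.6, 5.9, 6.0])

# Mean: the sum of all points divided by the number of points
mean = heights.sum() / heights.size

# Standard deviation (population form): square root of the
# average squared distance between each point and the mean
sd = np.sqrt(((heights - mean) ** 2).mean())

print(f"mean = {mean:.3f}, sd = {sd:.3f}")
print("matches NumPy:", np.isclose(mean, np.mean(heights)), np.isclose(sd, np.std(heights)))
```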

1. Example Implementation of Normal Distribution

Let’s take a look at the code in the following section. For this demonstration, we’ll make use of NumPy and matplotlib:

# Importing NumPy and Matplotlib
 
import numpy as np
import matplotlib.pyplot as graph
 
# Creating 200 evenly spaced data points from 1 to 100
x = np.linspace(1, 100, 200)
 
# Defining the normal distribution's probability density function
def normalDist(x, mean, sd):
    prob_density = 1 / (sd * np.sqrt(2 * np.pi)) * np.exp(-0.5 * ((x - mean) / sd) ** 2)
    return prob_density
 
# Calculating the mean and standard deviation of the data
mean = np.mean(x)
sd = np.std(x)
 
# Calling the normal distribution function
pdf = normalDist(x, mean, sd)
 
# Plotting the results
graph.plot(x, pdf, color='blue')
graph.xlabel('Data points')
graph.ylabel('Probability Density')

graph.show()

Output:

[Figure: Example Implementation of Normal Distribution]

2. Properties Of Normal Distribution

Given a data point, a mean, and a standard deviation, the normal distribution's density function returns the probability density at that point.

We can change the position and shape of the bell curve by changing the mean and the standard deviation. Changing the mean shifts the curve toward the new mean value, so we can move the curve’s position while keeping its original shape.

The standard deviation has a significant impact on the curve’s appearance. A smaller standard deviation produces a narrower, taller curve, while a larger standard deviation produces a wider, flatter one.
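
The effect of each parameter can be checked numerically. The sketch below uses `scipy.stats.norm.pdf` with arbitrarily chosen means and standard deviations (the specific numbers are assumptions for illustration only) and compares where each curve peaks and how tall the peak is.

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-10, 10, 2001)

# Same standard deviation, different means: the peak moves, its height does not
shifted_a = norm.pdf(x, loc=0, scale=1)
shifted_b = norm.pdf(x, loc=3, scale=1)

# Same mean, different standard deviations: the peak stays put, its height changes
narrow = norm.pdf(x, loc=0, scale=0.5)
wide = norm.pdf(x, loc=0, scale=2)

print("peak positions:", x[shifted_a.argmax()], x[shifted_b.argmax()])
print("peak heights (mean 0 vs mean 3):", shifted_a.max(), shifted_b.max())
print("peak heights (sd 0.5 vs sd 2):", narrow.max(), wide.max())
```

Passing the same arrays to a plotting call would show the same behaviour visually.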

[Figure: Properties Of Normal Distribution]

The following are the Normal Distribution properties to remember:

  • The mean, median, and mode are all equal.
  • The total area under the curve is equal to 1.
  • The curve is symmetric around the mean.
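
These three properties can be verified numerically. The following is a rough sketch, assuming arbitrary example parameters (`mu = 2.0`, `sigma = 1.5`); it uses `scipy.integrate.quad` for the area and locates the mode on a fine grid, so the results are approximate rather than exact.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

mu, sigma = 2.0, 1.5  # arbitrary example parameters (an assumption)

# Total area under the density curve, integrated over the whole real line
area, _ = quad(norm.pdf, -np.inf, np.inf, args=(mu, sigma))

# Mean and median as reported by SciPy for this distribution
mean = norm.mean(loc=mu, scale=sigma)
median = norm.median(loc=mu, scale=sigma)

# Mode: the x value where the density peaks, located on a fine grid
grid = np.linspace(mu - 5 * sigma, mu + 5 * sigma, 10001)
mode = grid[norm.pdf(grid, loc=mu, scale=sigma).argmax()]

# Symmetry: equal density at equal distances on either side of the mean
left = norm.pdf(mu - 1.7, loc=mu, scale=sigma)
right = norm.pdf(mu + 1.7, loc=mu, scale=sigma)

print("area:", area)
print("mean, median, mode:", mean, median, mode)
print("symmetric:", np.isclose(left, right))
```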

Empirical rule For Normal Distribution

  • About 68% of the data falls within one standard deviation of the mean.
  • About 95% of the data falls within two standard deviations of the mean.
  • About 99.7% of the data falls within three standard deviations of the mean.
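
The percentages in the empirical rule can be reproduced with the cumulative distribution function of the standard normal distribution. The sketch below uses `scipy.stats.norm.cdf`; the `shares` dictionary is just an illustrative name.

```python
from scipy.stats import norm

# Share of the data within k standard deviations of the mean, for k = 1, 2, 3
shares = {k: norm.cdf(k) - norm.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")

# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```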

It’s one of the most important distributions in all of statistics. What makes the normal distribution remarkable is that many natural phenomena approximately follow it. Blood pressure, IQ scores, and heights, for example, are all roughly normally distributed.

Using the Normal Distribution to Calculate Probabilities

The area under a normal distribution’s curve can be used to calculate the probability that a value falls within a specific range.

To get that probability, we integrate the density function over the range, because in a continuous distribution the area under the curve represents probability. Before doing so, we need to know what a Standard Normal Distribution is.
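
As a sketch of what "integrating the density" means in practice, the snippet below integrates a hand-written normal pdf between two bounds with `scipy.integrate.quad` and compares the result with the difference of two cumulative probabilities from `scipy.stats.norm.cdf`. The parameters and bounds are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def normal_pdf(x, mean, sd):
    """Probability density of a normal distribution at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

mean, sd = 10.0, 2.0   # arbitrary example parameters
a, b = 9.0, 13.0       # range whose probability we want

# Probability of a value landing in [a, b] = area under the pdf between a and b
area, _ = quad(normal_pdf, a, b, args=(mean, sd))

# The same probability via the cumulative distribution function
via_cdf = norm.cdf(b, loc=mean, scale=sd) - norm.cdf(a, loc=mean, scale=sd)

print("numerical integration:", area)
print("cdf difference:       ", via_cdf)
```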

A standard normal distribution is a normal distribution whose mean is 0 and whose standard deviation is 1. Any value x from a normal distribution can be converted to the standard normal scale with:

z = (x - μ) / σ

The value z is called a z-score. A z-score tells you how many standard deviations a data point lies from the mean.
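
A short sketch of the z-score formula follows. The mean of 70, standard deviation of 10, and data point of 85 are hypothetical values, and the sample is randomly generated for illustration. Standardizing a whole sample with its own mean and standard deviation yields z-scores with mean 0 and standard deviation 1.

```python
import numpy as np

# Hypothetical distribution parameters and a single data point
mu, sigma = 70.0, 10.0
x = 85.0

# z-score: distance from the mean, measured in standard deviations
z = (x - mu) / sigma
print("z-score of a single point:", z)  # 1.5

# Standardizing a whole (randomly generated) sample
rng = np.random.default_rng(0)
sample = rng.normal(loc=mu, scale=sigma, size=1000)
z_scores = (sample - sample.mean()) / sample.std()

print("mean of z-scores:", z_scores.mean())
print("standard deviation of z-scores:", z_scores.std())
```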

If we perform the probability calculations by hand, we must look up the cumulative probability for our z-value in a z-table. Python modules take care of all of this for us. Let us now begin.

Create Normal Curve

To calculate probabilities based on the normal distribution, we will use the norm object from the scipy.stats module.

Consider the following scenario: we have data on the heights of adults in a town that follows a normal distribution; we have a sufficient sample size with a mean of 5.3 feet and a standard deviation of 1 foot.

The code example below implements the normal curve for this scenario.

# Importing SciPy, NumPy, Matplotlib and seaborn
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as graph
import seaborn as sb
 
# Creating the normal distribution (mean 5.3, standard deviation 1)
data = np.arange(1, 10, 0.01)
pdf = norm.pdf(data, loc=5.3, scale=1)
 
# Plotting the above data (recent seaborn versions require x and y as keyword arguments)
sb.set_style('whitegrid')
sb.lineplot(x=data, y=pdf, color='red')
graph.xlabel('Height (feet)')
graph.ylabel('Probability Density')

# Visualizing the data plotted above
graph.show()

Output:

[Figure: Create Normal Curve]

The norm.pdf() method returns the probability density value. It takes the data as input, along with the optional loc and scale parameters (which default to 0 and 1).

The loc parameter is simply the mean of the distribution, and scale is its standard deviation. The code is very similar to what we wrote in the previous section, but it is significantly shorter.
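
With the same height scenario (mean 5.3, standard deviation 1), probabilities for specific ranges can be read off the cumulative distribution function instead of a z-table. The sketch below uses `norm.cdf` and `norm.sf` from `scipy.stats`; the cut-off heights of 5, 6, and 6.5 feet are arbitrary choices for illustration.

```python
from scipy.stats import norm

# "Frozen" normal distribution for the height scenario
heights = norm(loc=5.3, scale=1)

# P(height <= 5.0): cumulative probability up to 5 feet
p_below_5 = heights.cdf(5.0)

# P(5.0 < height <= 6.0): difference of two cumulative probabilities
p_between_5_and_6 = heights.cdf(6.0) - heights.cdf(5.0)

# P(height > 6.5): survival function, i.e. 1 - cdf
p_above_6_5 = heights.sf(6.5)

print(f"P(height <= 5.0)       = {p_below_5:.4f}")
print(f"P(5.0 < height <= 6.0) = {p_between_5_and_6:.4f}")
print(f"P(height > 6.5)        = {p_above_6_5:.4f}")
```

Each of these values is the area under the curve plotted above, taken over the corresponding range.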

Wrap Up

In this article, we learned about the Normal Distribution, what a normal Curve looks like, and, most importantly, how to implement it in the Python programming environment.

Please let me know if you have any problems or issues with Normal Distribution in the comments section. I would be delighted to assist you as soon as possible.
