# Continuous Distributions

## Definition

A random variable is continuous if possible values comprise either a single interval on the number line or a union of disjoint intervals. X = f(x) is the probability density function of the continues random variable X.

We model a continuous random variable with a curve f(x), called a probability density function (pdf).

### Applications

In the study of the ecology of a lake, a rv X could be the depth measurements at randomly chosen locations.

In a study of a chemical reaction, Y could be the concentration level of a particular chemical in solution.

In a study of customer service, W could be the time a customer waits for service.

f(x) represents the height of the curve at point x.

For continuous random variables probabilities are areas under the curve.

Attention

We can’t model continuous random variable using discrete rv method.

### Properties

The probability density function \(f:(-\infty, \infty) \rightarrow[0, \infty) \text{ so } f (x) \geq 0\).

\(P(-\infty<X<\infty)=\int_{-\infty}^{\infty} f(x) d x=1=P(S)\)

\(P(a \leq X \leq b)=\int_{a}^{b} f(x) d x\)

Note

\(P(X=a)=\int_{a}^{a} f(x) d x=0 \text { for all real numbers } a\)

## Uniform rv

Random variable \(X \sim U[a,b]\) has the uniform distribution on the interval [a, b] if its density function is

```
import torch
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import uniform
sns.set_theme(style="darkgrid")
# random numbers from uniform distribution
n = 10000
start = 10
width = 20
data_uniform = uniform.rvs(size=n, loc = start, scale=width)
ax = sns.displot(data_uniform,
bins=100,
kde=True)
ax.set(xlabel='Uniform Distribution ', ylabel='Frequency')
plt.show()
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
if pd.api.types.is_categorical_dtype(vector):
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
with pd.option_context('mode.use_inf_as_na', True):
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/axisgrid.py:118: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
```

### CDF

### Expected Value and Variance

### Example

For random variable \(X \sim U(0,23)\). Find P(2 < X < 18)

\(P(2 < X < 18) = (18-2)\cdot \frac 1 {23-0} = \frac {16}{23}\)

## Exponential Distribution

The exponential distribution is a continuous probability distribution that often concerns the amount of time until some specific event happens. It is a process in which events happen continuously and independently at a constant average rate. The exponential distribution has the key property of being memoryless.

### Applications

The family of exponential distributions provides probability models that are widely used in engineering and science
disciplines to describe **time-to-event** data.

Time until birth

Time until a light bulb fails

Waiting time in a queue

Length of service time

Time between customer arrivals

the amount of money spent by the customer

Calculating the time until the radioactive particle decays

### PDF

The continuous random variable, say X is said to have an exponential distribution, if it has the following probability density function:

λ is called the distribution rate.

```
import torch
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import expon
sns.set_theme(style="darkgrid")
data_expon = expon.rvs(scale=1,loc=0,size=1000)
ax = sns.displot(data_expon,
kde=True,
bins=100)
ax.set(xlabel='Exponential Distribution', ylabel='Frequency')
plt.show()
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
if pd.api.types.is_categorical_dtype(vector):
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
with pd.option_context('mode.use_inf_as_na', True):
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/axisgrid.py:118: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
```

### Expected Value

The mean of the exponential distribution is calculated using the integration by parts.

### Variance

To find the variance of the exponential distribution, we need to find the second moment of the exponential distribution

### Properties

The most important property of the exponential distribution is the memoryless property. This property is also applicable to the geometric distribution.

## Normal (Gaussian) Distribution

It is often called **Gaussian distribution**, in honor of Carl Friedrich Gauss (1777-1855), an eminent German
mathematician who gave important contributions towards a better understanding of the normal distribution.

### Normal Random Variable

A continuous random variable \(X \sim N(\mu,\sigma^2)\) has the normal distribution with parameters \(\mu\) and \(\sigma^2\) if its density is given by

A normal distribution is a distribution that is solely dependent on two parameters of the data set: mean and the standard deviation of the sample.

Attention

This characteristic of the distribution makes it extremely simple for statisticians and hence any variable that exhibits normal distribution is feasible to be forecasted with higher accuracy. Essentially, it can help in simplying the model.

#### Parameters

**Mu**is a location parameter. If you change the value of Mu, the entire bell curve is going to slide around.If you increase

**Sigma squared**, it’s going to get fatter and therefore shorter because the total area is one, So if it gets fatter, it has to come down. If Sigma squared gets smaller, it’s going to get really tall and thin.

### Properties

Normal distribution is simple to explain: Why?

The reasons are:

The mean, mode, and median of the distribution are equal.

We only need to use the mean and standard deviation to explain the entire distribution.

f(x) is symmetric around \(x=\mu\) as a consequence, deviations from the mean having the same magnitude.

f(x) > 0 for all \(x\) and \(\int_{-\infty}^{\infty} f(x) dx = 1\).

\(\mu + \sigma\) and \(\mu - \sigma\) are inflection points on f(x).

Mean and median are equal; both are located at the center of the distribution.

### Why is it so important

The normal distribution is extremely important because:

many real-world phenomena involve random quantities that are approximately normal

it plays a crucial role in the Central Limit Theorem, one of the fundamental results in statistics;

its great analytical tractability makes it very popular in statistical modelling.

The following variables are close to normally distributed variables:

Height of a population

Blood pressure of adult human

Position of a particle that experiences diffusion

Measurement errors

Residuals in regression

Shoe size of a population

Amount of time it takes for employees to reach home

A large number of educational measures

#### But how are so many variables approximately normally distributed?

Let’s consider that the height of a population is a random variable. We can take a sample of heights, plot its distribution and calculate the sample mean. When we repeat this experiment whilst we increase the number of samples then the mean of the samples will end up being very close to normality.

**This is known as the Central Limit Theorem.**

### Probability Density Function

If we plot the normal distribution density function, it’s curve has the following characteristics:

Mean is the center of the curve. This is the highest point of the curve as most of the points are at the mean.

There is an equal number of points on each side of the curve. The center of the curve has the most number of points.

The total area under the curve is the total probability of all of the values that the variable can take.

The total curve area is therefore 100%

Approximately 68.2% of all of the points are within the range -1 to 1 standard deviation.

About 95.5% of all of the points are within the range -2 to 2 standard deviations.

About 99.7% of all of the points are within the range -3 to 3 standard deviations.

This allows us to easily estimate how volatile a variable is and given a confidence level, what its likely value is going to be. As an instance, in the grey bell-shaped curve above, there is a 68.2% chance that the value of the variable will be within 101–99.

#### Normal Probability Distribution Function

```
import torch
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm
sns.set_theme(style="darkgrid")
sample = torch.normal(mean = 8, std = 16, size=(1,1000))
sns.displot(sample[0], kde=True, stat = 'density',)
plt.axvline(torch.mean(sample[0]), color='red', label='mean')
plt.show()
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
if pd.api.types.is_categorical_dtype(vector):
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
with pd.option_context('mode.use_inf_as_na', True):
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/axisgrid.py:118: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
```

**norm.pdf** returns a PDF value. The following is the PDF value when 𝑥=1, 𝜇=0, 𝜎=1.

We graph a PDF of the normal distribution using scipy, numpy and matplotlib. We use the domain of −4<𝑥<4, the range of 0<𝑓(𝑥)<0.45, the default values 𝜇=0 and 𝜎=1. plot(x-values,y-values) produces the graph.

```
print(norm.pdf(x=1.0, loc=0, scale=1))
x = torch.arange(-4,4,0.001)
fig, ax = plt.subplots()
ax.set_title('N(0,$1^2$)')
ax.set_xlabel('x')
ax.set_ylabel('f(x)')
ax.plot(x, norm.pdf(x))
ax.set_ylim(0,0.45)
plt.show()
```

```
0.24197072451914337
```

A normal curve is smooth bell-shaped. It is symmetrical about the 𝑥=𝜇 and has a maximum point at 𝑥=𝜇.

#### Normal distribution PDF with different standard deviations

Let’s plot the probability distribution functions of a normal distribution where the mean has different standard deviations.

scipy.norm.pdf has keywords, loc and scale. The location (loc) keyword specifies the mean and the scale (scale) keyword specifies the standard deviation.

```
fig, ax = plt.subplots()
x = torch.arange(-4,4,0.001)
stdvs = [1.0, 2.0, 3.0, 4.0]
for s in stdvs:
ax.plot(x, norm.pdf(x,scale=s), label='stdv=%.1f' % s)
ax.set_xlabel('x')
ax.set_ylabel('pdf(x)')
ax.set_title('Normal Distribution')
ax.legend(loc='best', frameon=True)
ax.set_ylim(0,0.45)
ax.grid(True)
```

#### Normal distribution PDF with different means

Let’s plot the probability distribution functions of a normal distribution where the mean has different values.

```
fig, ax = plt.subplots()
x = torch.linspace(-10,10,100)
means = [0.0, 1.0, 2.0, 5.0]
for mean in means:
ax.plot(x, norm.pdf(x,loc=mean), label='mean=%.1f' % mean)
ax.set_xlabel('x')
ax.set_ylabel('pdf(x)')
ax.set_title('Normal Distribution')
ax.legend(loc='best', frameon=True)
ax.set_ylim(0,0.45)
ax.grid(True)
```

The mean of the distribution determines the location of the center of the graph. As you can see in the above graph, the shape of the graph does not change by changing the mean, but the graph is translated horizontally.

#### A cumulative normal distribution function

The cumulative distribution function of a random variable X, evaluated at x, is the probability that X will take a value less than or equal to x. Since the normal distribution is a continuous distribution, the shaded area of the curve represents the probability that X is less or equal than x.

```
fig, ax = plt.subplots()
# for distribution curve
x= torch.arange(-4,4,0.001)
ax.plot(x, norm.pdf(x))
ax.set_title("Cumulative normal distribution")
ax.set_xlabel('x')
ax.set_ylabel('pdf(x)')
ax.grid(True)
# for fill_between
px=torch.arange(-4,1,0.01)
ax.set_ylim(0,0.5)
ax.fill_between(px,norm.pdf(px),alpha=0.5, color='g')
# for text
ax.text(-1,0.1,"cdf(x)", fontsize=20)
plt.show()
```

### Expected Value and Variance

\(E(X) = \mu\)

\(V(X) = \sigma^2\)

### Problems With Normality

Assuming normality has its own flaws. As an instance, we cannot assume that the stock price follows normal distribution as the price cannot be negative. Therefore the stock price potentially follows a log of the normal distribution to ensure it is never below zero.

We know that the daily returns can be negative, therefore the returns can at times follow a normal distribution. It is not wise to assume that the variable follows a normal distribution without any analysis.

A variable can follow Poisson, Student-t, or Binomial distribution as an instance and falsely assuming that a variable follows normal distribution can lead to inaccurate results.

### Identify the Normal RV?

Many methods exist for testing whether a variable has a normal distribution

#### 1. Histogram

The histogram is a data visualization that shows the distribution of a variable. It is a bar graph that shows the frequency of each value in the variable. The histogram is a graphical representation of the distribution of a variable.

```
sample = torch.normal(mean = 8, std = 16, size=(1,1000))
sample2 = torch.distributions.uniform.Uniform(2,3).sample([1,1000])
sns.displot(sample[0], kde=True,).set(title='Normal Distribution')
plt.axvline(torch.mean(sample[0]), color='red', label='mean')
sns.displot(sample2[0], kde=True,).set(title='Uniform Distribution')
plt.show()
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
if pd.api.types.is_categorical_dtype(vector):
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
with pd.option_context('mode.use_inf_as_na', True):
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/axisgrid.py:118: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
if pd.api.types.is_categorical_dtype(vector):
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
with pd.option_context('mode.use_inf_as_na', True):
```

```
/home/docs/checkouts/readthedocs.org/user_builds/ml-math/envs/latest/lib/python3.10/site-packages/seaborn/axisgrid.py:118: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
```

#### 2. Box Plot

The Box Plot is another visualization technique that can be used for detecting non-normal samples.

The box plot is a graphical representation of the distribution of a variable. It is a graphical representation of the five-number summary of a variable. The five-number summary is the minimum, first quartile, median, third quartile, and maximum of a variable.

#### 3. QQ Plot

QQ Plot stands for Quantile vs Quantile Plot, which is exactly what it does: plotting theoretical quantiles against the actual quantiles of our variable.

The QQ Plot allows us to see deviation of a normal distribution much better than in a Histogram or Box Plot.

## Standard Normal Distribution

The normal distribution with parameter values \(\mu\) = 0 and \(\sigma^2\) = 1 is called the standard normal distribution.

A rv with the standard normal distribution is denoted by \(Z \sim N(0, 1)\)

If \(X \sim N\left(\mu, \sigma^2\right)\) then

If \(Z \sim N(0,1)\) then

### PDF

\(f_{Z}(x)=\frac{1}{\sqrt{2 \pi}} e^{-x^{2} / 2} \text { for }-\infty<x<\infty\)

### Cumulative distribution function

We use special notation to denote the cdf of the standard normal curve

\(F(z)=\Phi(z)=P(Z \leq z)=\int_{-\infty}^{z} \frac{1}{\sqrt{2 \pi}} e^{-x^{2} / 2} d x\)

### Properties

The standard normal density function is symmetric about the y axis.

The standard normal distribution rarely occurs naturally.

Instead, it is a reference distribution from which information about other normal distributions can be obtained via a simple formula.

The cdf of the standard normal, \(\Phi(z)\) can be found in tables and it can also be computed with a single command in R.

As we’ll see, sums of standard normal random variables play a large role in statistical analysis.

### Proposition

\(\text { If } X \sim N\left(\mu, \sigma^{2}\right), \text { then } \frac{X-\mu}{\sigma} \sim N(0,1)\)

\(\frac{X-\mu}{\sigma}\) Shifted by \(\mu\) or (Centered at zero) and scaled by \(\frac{1}{\sigma}\) that will give us variance of 1.

\(\mathrm{Z} \sim \mathrm{N}(0,1) \Rightarrow \sigma \mathrm{Z}+\mu \sim \mathrm{N}\left(\mu, \sigma^2\right)\)

#### Example

Let \(X \sim N(2,3)\)

### Proving this proposition

For any continuous random variable. Suppose we have Y rv, with Desnity function \(f_{Y}(y)\)

We know

\(P(y \leqslant a)=\int_{-\infty}^{a} f_{y}(y) d y\)

What if

\(P(2y \leqslant a)\)

= Can’t really use the density function until we isolate y =

\(P\left(y \leq \frac{a}{2}\right) = \int_{-\infty}^{a / 2} f_{y}(y) d y\)

This true for all transformation of Y.

With \(P\left(\frac{x-\mu}{\sigma} \leq a\right)=P(x \leq a \sigma+\mu) = \int_{x}^{a \sigma+\mu} \frac{1}{\sqrt{2 \pi} \sigma} e^{-(x-\mu)^{2} / 2 \sigma^{2}}\)

**U subsitution**

Let

\(u=\frac{x-\mu}{\sigma}\)

\(d u=\frac{1}{\sigma} d x\)

SO \(= \int_{-\infty}^{a} \frac{1}{\sqrt{2 \pi}} e^{-u^{2} / 2} d u\) This is density function for N(0,1).

### Examples

#### Find P(X<2) when N(3, 2)?

In norm.cdf, the location (loc) keyword specifies the mean and the scale (scale) keyword specifies the standard deviation.

```
from scipy.stats import norm
val = norm.cdf(x=2, loc=3, scale=2)
print(f"P(X<2) = {val}")
fig, ax = plt.subplots()
x= torch.arange(-4,10,0.001)
ax.plot(x, norm.pdf(x,loc=3,scale=2))
ax.set_title("N(3,$2^2$)")
ax.set_xlabel('x')
ax.set_ylabel('pdf(x)')
ax.grid(True)
# for fill_between
px=torch.arange(-4,2,0.01)
ax.set_ylim(0,0.25)
ax.fill_between(px,norm.pdf(px,loc=3,scale=2),alpha=0.5, color='g')
# for text
ax.text(-0.5,0.02,round(val,2), fontsize=20)
plt.show()
```

```
P(X<2) = 0.3085375387259869
```

R code: pnorm(1.2)

#### Find P(X<4.1) when N(2, 3)?

Let \(X \sim N(2,3)\). Then

R Code: pnorm(1.21)

```
z_score <- (4.1 - 2) / sqrt(3)
pnorm(z_score)
```

#### Interval between variables

To find the probability of an interval between certain variables, you need to subtract cdf from another cdf.

Let’s find 𝑃(0.5<𝑋<2) with a mean of 1 and a standard deviation of 2.

```
print(norm(1, 2).cdf(2) - norm(1,2).cdf(0.5))
```

```
0.2901687869569368
```

```
fig, ax = plt.subplots()
# for distribution curve
x= torch.arange(-6,8,0.001)
ax.plot(x, norm.pdf(x,loc=1,scale=2))
ax.set_title("N(1,$2^2$)")
ax.set_xlabel('x')
ax.set_ylabel('pdf(x)')
ax.grid(True)
px=torch.arange(0.5,2,0.01)
ax.set_ylim(0,0.25)
ax.fill_between(px,norm.pdf(px,loc=1,scale=2),alpha=0.5, color='g')
pro=norm(1, 2).cdf(2) - norm(1,2).cdf(0.5)
ax.text(0.2,0.02,round(pro,2), fontsize=20)
plt.show()
```

#### P(Z > 1.25) ?

Let’s plot a graph.

```
fig, ax = plt.subplots()
x= torch.arange(-4,4,0.01)
gr4sf=norm.sf(x=1.25, loc=0, scale=1)
print(gr4sf)
ax.plot(x, norm.pdf(x,loc=0,scale=1))
ax.set_title("N(0,$1^2$)")
ax.set_xlabel('x')
ax.set_ylabel('pdf(x)')
ax.grid(True)
px=torch.arange(1.25,4,0.01)
#ax.set_ylim(0,0.15)
ax.fill_between(px,norm.pdf(px,loc=0,scale=1),alpha=0.5, color='g')
ax.text(1.25,0.02,"sf(x) %.2f" %(gr4sf), fontsize=20)
plt.show()
```

```
0.10564977366685535
```

we can use sf which is called the survival function and it returns 1-cdf.

#### If X = N(1, 4), find P(0 < X < 3.2)?

```
print(norm(1, 2).cdf(3.2) - norm(1,2).cdf(0))
```

```
0.5557964003276304
```

```
fig, ax = plt.subplots()
# for distribution curve
x= torch.arange(-6,8,0.001)
ax.plot(x, norm.pdf(x,loc=1,scale=2))
ax.set_title("N(1,$2^2$)")
ax.set_xlabel('x')
ax.set_ylabel('pdf(x)')
ax.grid(True)
px=torch.arange(0.5,2,0.01)
ax.set_ylim(0,0.25)
ax.fill_between(px,norm.pdf(px,loc=1,scale=2),alpha=0.5, color='g')
pro=norm(1, 2).cdf(3.5) - norm(1,2).cdf(0)
ax.text(0.2,0.02,round(pro,2), fontsize=20)
plt.show()
```

## Gamma Distribution

The gamma distribution term is mostly used as a distribution which is defined as two parameters – shape parameter and inverse scale parameter, having continuous probability distributions. Its importance is largely due to its relation to exponential and normal distributions.

Gamma distributions have two free parameters, named as alpha (α) and beta (β), where;

α = Shape parameter

β = Rate parameter (the reciprocal of the scale parameter)

The scale parameter β is used only to scale the distribution. This can be understood by remarking that wherever the random variable x appears in the probability density, then it is divided by β. Since the scale parameter provides the dimensional data, it is seldom useful to work with the “standard” gamma distribution, i.e., with β = 1.

### Gamma function:

The gamma function \([10]\), shown by \(\Gamma( x )\), is an extension of the factorial function to real (and complex) numbers. Specifically, if \(n \in\{1,2,3, \ldots\}\), then

### Probability Density Function

### Mean

### Variance

## Chi-squared Distribution

One Parameter:

degrees of freedom: \(n \geq 1\) ( \(n\) is an integer) \(X \sim \chi^2(n)\) is defined as \(\Gamma\left(\frac{n}{2}, \frac{1}{2}\right)\)

### mean

### variance

## T-distribution

Let \(Z \sim N(0,1)\) and \(W \sim \chi^2(n)\) be independent random variables. Define

the t-distribution Write \(X \sim t ( n )\) One Parameter: degrees of freedom: \(n \geq 1\) ( \(n\) is an integer) The pdf: