Moment Generating Functions

We are still, believe it or not, trying to estimate features of a larger population based on a sample. Any function of your data, for example the sample mean or the sum of the values in the sample, is known as a statistic. We're going to use statistics to estimate other things, and in order to figure out how well we're doing, we will often need to know the distributions of some of these statistics.

Distributions of sums

A lot of statistics depend on sums, so we're going to start out by talking about the distribution of sums of random variables.

Suppose that

\[\begin{split}X_{1}, X_{2}, \ldots, X_{n} \stackrel{\text { iid }}{\sim} Bernoulli(p) \\ \text { What is the distribution of } Y=\sum_{i=1}^{n} X_{i} ? \\ \text { The sum of } n \text { iid Bernoulli}(p) \text { random variables is } bin(n, p): \\ Y=\sum_{i=1}^{n} X_{i} \sim bin(n, p)\end{split}\]

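Each \(X_i\) takes the value 1 (success) with probability \(p\) and the value 0 (failure) with probability \(1-p\). Summing the \(X_i\) therefore counts the number of successes in \(n\) independent trials, and that count is exactly what a binomial random variable describes.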
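The claim can be checked numerically. The sketch below is only an illustration, not part of the original argument: it builds the pmf of the sum by repeatedly convolving the Bernoulli pmf with itself and compares the result to the binomial formula. The function names and the values n = 5, p = 0.3 are arbitrary choices made for this example.

```python
from math import comb


def convolve(pmf_a, pmf_b):
    """pmf of the sum of two independent variables supported on 0, 1, 2, ..."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, pa in enumerate(pmf_a):
        for j, pb in enumerate(pmf_b):
            out[i + j] += pa * pb
    return out


def sum_of_bernoullis_pmf(n, p):
    """pmf of X_1 + ... + X_n for iid Bernoulli(p), built one term at a time."""
    bernoulli = [1.0 - p, p]  # P(X = 0), P(X = 1)
    total = [1.0]             # pmf of the empty sum, which is always 0
    for _ in range(n):
        total = convolve(total, bernoulli)
    return total


def binomial_pmf(n, p):
    """Closed-form bin(n, p) pmf for x = 0, ..., n."""
    return [comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(n + 1)]


n, p = 5, 0.3  # illustrative values only
by_convolution = sum_of_bernoullis_pmf(n, p)
closed_form = binomial_pmf(n, p)
max_gap = max(abs(a - b) for a, b in zip(by_convolution, closed_form))
print(max_gap)  # should be at the level of floating-point rounding
```

Convolution is used here because it is the direct definition of the distribution of a sum of independent discrete variables, so the agreement does not rely on the binomial formula itself.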


Not every sum of random variables can be identified so easily by a direct argument like this one. So we need a tool.

Moment generating functions

Moment generating functions are functions that generate the moments of a random variable. The expected values \(E(X), E\left(X^{2}\right), E\left(X^{3}\right), \ldots, E\left(X^{r}\right), \ldots\) are called moments.

  • Mean \(\mu=E(X)\)

  • Variance \(\sigma^{2}=Var(X)=E\left(X^{2}\right)-\mu^{2}\)

Both the mean and the variance are functions of moments, so moment generating functions can sometimes make finding the mean and variance of a random variable simpler.

Let X be a random variable. Its moment generating function (mgf) is denoted \(M_{X}(t)\) and defined as follows.

Continuous Random Variables:

\(M_{X}(t)=E\left[e^{t X}\right]=\int_{-\infty}^{\infty} e^{t x} f_{X}(x) d x\)

Discrete Random Variables:

\(M_{X}(t)=E\left[e^{t X}\right]=\sum_{x} e^{t x} f_{X}(x)\)

where \(f_{X}(x)\) is the probability density function (continuous case) or probability mass function (discrete case) of X.
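To see why the name fits, one can expand the exponential as a power series. The sketch below assumes that the mgf is finite in a neighborhood of \(t=0\), so that the expectation and the infinite sum may be interchanged:

\[\begin{split}M_{X}(t)=E\left[e^{t X}\right]=E\left[\sum_{r=0}^{\infty} \frac{(t X)^{r}}{r !}\right]=\sum_{r=0}^{\infty} \frac{t^{r}}{r !} E\left(X^{r}\right) \\ \left.\frac{d^{r}}{d t^{r}} M_{X}(t)\right|_{t=0}=E\left(X^{r}\right)\end{split}\]

So differentiating the mgf r times and evaluating at \(t=0\) gives the rth moment. In particular, \(M_{X}^{\prime}(0)=E(X)\) and \(M_{X}^{\prime \prime}(0)=E\left(X^{2}\right)\).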


  • Moment generating functions also uniquely identify distributions.

MGFs of Famous Distributions


\(X \sim Bernoulli(p)\)

\[ \begin{align}\begin{aligned}M_{X}(t)=E\left[e^{t X}\right]=\sum_{x} e^{t x} f_{X}(x)=\sum_{x} e^{t x} P(X=x)\\=e^{t \cdot 0} P(X=0)+e^{t \cdot 1} P(X=1)\\=1 \cdot(1-p)+e^{t} \cdot p\\=1-p+p e^{t}\end{aligned}\end{align} \]


\(X \sim bin(n, p)\)

\[\begin{split}M_{X}(t)=\sum_{x=0}^{n}e^{tx}\binom{n}{x}p^x(1-p)^{n-x} \\ =\sum_{x=0}^{n}\binom{n}{x}(pe^t)^x(1-p)^{n-x}\end{split}\]

Binomial Theorem:

\((a + b)^n =\sum_{k=0}^{n}\binom{n}{k}a^k b^{n-k}\)

Taking \(a = pe^t\) and \(b = 1-p\) gives

\[M_{X}(t)=(1-p+p e^{t})^n\]
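As a sanity check, the closed form can be compared with the defining sum, and its derivatives at \(t=0\) can be approximated numerically. This relies on the standard fact that the rth derivative of an mgf at \(t=0\) equals \(E(X^r)\). The code below is a rough sketch: the function names, the step size h, and the values n = 10, p = 0.3 are assumptions made for illustration.

```python
from math import comb, exp


def binomial_mgf_sum(t, n, p):
    """mgf of bin(n, p) computed directly from the definition E[e^{tX}]."""
    return sum(
        exp(t * x) * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)
    )


def binomial_mgf_closed(t, n, p):
    """Closed form (1 - p + p e^t)^n."""
    return (1 - p + p * exp(t)) ** n


def first_two_moments(mgf, h=1e-4):
    """Approximate M'(0) and M''(0) with central finite differences."""
    m1 = (mgf(h) - mgf(-h)) / (2 * h)
    m2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2
    return m1, m2


n, p = 10, 0.3  # illustrative values only
m1, m2 = first_two_moments(lambda t: binomial_mgf_closed(t, n, p))
mean = m1                # approximates E(X) = n p
variance = m2 - m1**2    # approximates E(X^2) - mu^2 = n p (1 - p)
print(mean, variance)
```

Finite differences are used only to keep the example free of symbolic algebra; differentiating \((1-p+pe^t)^n\) by hand gives the same mean and variance exactly.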

Finding Distributions

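A moment generating function (when it exists in a neighborhood of \(t=0\)) uniquely determines the probability distribution of a random variable: if two random variables have the same moment generating function, then they must have the same probability distribution.

Suppose \(X_{1}, X_{2}, \ldots, X_{n}\) are iid from some distribution with mgf \(M_{X}(t)\), and let \(Y=\sum_{i=1}^{n} X_{i}\). Because the \(X_i\) are independent, the expectation of the product factors:

\[M_{Y}(t)=E\left[e^{t\left(X_{1}+X_{2}+\cdots+X_{n}\right)}\right]=\prod_{i=1}^{n} E\left[e^{t X_{i}}\right]=\left[M_{X}(t)\right]^{n}\]

So the moment generating function of the sum is the moment generating function of one of them raised to the nth power. The binomial mgf \((1-p+p e^{t})^n\) is exactly the mgf \(1-p+p e^{t}\) of a single Bernoulli(p) raised to the nth power.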

Key points

  • sum of n iid Bernoulli(p) random variables is bin(n, p)

  • sum of n iid \(\exp (\text { rate }=\lambda)\) random variables is \(\Gamma(n, \lambda)\)

  • sum of m iid bin(n,p) is bin(nm,p)

  • sum of n iid \(\Gamma(\alpha, \beta)\) is \(\Gamma(n \alpha, \beta)\)

  • sum of n iid \(N\left(\mu, \sigma^{2}\right)\) is \(N\left(n \mu, n \sigma^{2}\right)\)

  • sum of n independent normal random variables with \(X_{i} \sim N\left(\mu_{i}, \sigma_{i}^{2}\right)\) is \(N\left(\sum_{i=1}^{n} \mu_{i}, \sum_{i=1}^{n} \sigma_{i}^{2}\right)\)
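As a worked illustration of the exponential key point, here is a sketch of the mgf argument. It assumes the rate parameterization used in these notes (so that \(\Gamma(\alpha, \beta)\) has mean \(\alpha / \beta\)) and uses the standard Gamma mgf \(\left(\frac{\beta}{\beta-t}\right)^{\alpha}\) for \(t<\beta\), which is not derived here:

\[\begin{split}M_{X}(t)=\int_{0}^{\infty} e^{t x} \lambda e^{-\lambda x} d x=\frac{\lambda}{\lambda-t}, \quad t<\lambda \\ M_{Y}(t)=\left[M_{X}(t)\right]^{n}=\left(\frac{\lambda}{\lambda-t}\right)^{n}\end{split}\]

This is the mgf of \(\Gamma(n, \lambda)\), and since mgfs uniquely identify distributions, \(Y \sim \Gamma(n, \lambda)\).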

Method of Moments Estimators (MMEs)

Method of moments means you set sample moments equal to population/theoretical moments.

It totally makes sense that if you're trying to estimate the mean or average out there in the entire population, you should use the sample mean, the average of the values in the sample. But what about parameters without such an obvious interpretation?

Idea: Equate population and sample moments and solve for the unknown parameters.

Suppose that \(X_{1}, X_{2}, \ldots, X_{n} \stackrel{\text { iid }}{\sim} \Gamma(\alpha, \beta)\). How can we estimate \(\alpha\)?

We could estimate the true mean \(\alpha / \beta\) with the sample mean \(\bar{X}\), but that gives only one equation in two unknowns, so we still can't get at \(\alpha\) if we don't know \(\beta\).


Recall that the "moments" of a distribution are defined as \(E[X], E\left[X^{2}\right], E\left[X^{3}\right], \ldots\) These are distribution or "population" moments.

  • \(\mu=E[X]\) is a probability weighted average of the values in the population.

  • \(\bar{X}\) is the average of the values in the sample.

It was natural for us to think about estimating \(\mu\) with the average in our sample.

  • \(\mathrm{E}\left[\mathrm{X}^{2}\right]\) is a probability weighted average of the squares of the values in the population.

It is intuitively nice to estimate it with the average of the squared values in the sample:

\[\frac{1}{n} \sum_{i=1}^{n} X_{i}^{2}\]

\[ \begin{align}\begin{aligned}\text{The kth population moment:}\\\mu_{k}=E\left[X^{k}\right] \quad k=1,2,3, \ldots\\\text{The kth sample moment:}\\M_{k}=\frac{1}{n} \sum_{i=1}^{n} X_{i}^{k} \quad k=1,2,3, \ldots\end{aligned}\end{align} \]


\[ \begin{align}\begin{aligned}\text{Example: } X_{1}, X_{2}, \ldots, X_{n} \stackrel{\text { iid }}{\sim} \exp (\text { rate }=\lambda)\\\text{First population moment:}\\\mu_{1}=\mu=E[X]=\frac{1}{\lambda}\\\text{First sample moment:}\\M_{1}=\frac{1}{n} \sum_{i=1}^{n} X_{i}=\bar{X}\\\text{Equate: } \frac{1}{\lambda}=\bar{X}\\\text{Solve for the unknown parameter: } \lambda=\frac{1}{\bar{X}}\\\text{The MME is } \hat{\lambda}=\frac{1}{\bar{X}}\end{aligned}\end{align} \]
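The same recipe can be sketched in code, including the two-parameter Gamma case. For \(\Gamma(\alpha, \beta)\) with mean \(\alpha / \beta\), the second moment is \(E\left[X^{2}\right]=\alpha(\alpha+1) / \beta^{2}\) (a standard fact, not derived in these notes). Equating \(M_1\) and \(M_2\) to the first two population moments and solving gives \(\hat{\alpha}=M_{1}^{2} /\left(M_{2}-M_{1}^{2}\right)\) and \(\hat{\beta}=M_{1} /\left(M_{2}-M_{1}^{2}\right)\). The function names, seed, sample size, and true parameter values below are assumptions chosen for illustration.

```python
import random


def sample_moment(xs, k):
    """kth sample moment M_k = (1/n) * sum of x_i^k."""
    return sum(x**k for x in xs) / len(xs)


def mme_exponential(xs):
    """MME of the rate lambda for exp(rate = lambda): solve 1/lambda = M_1."""
    return 1.0 / sample_moment(xs, 1)


def mme_gamma(xs):
    """MMEs of (alpha, beta) for Gamma with mean alpha/beta, using M_1 and M_2."""
    m1 = sample_moment(xs, 1)
    m2 = sample_moment(xs, 2)
    spread = m2 - m1**2  # matches alpha / beta^2 at the population level
    return m1**2 / spread, m1 / spread


rng = random.Random(0)  # fixed seed so the run is repeatable
exp_sample = [rng.expovariate(2.0) for _ in range(20000)]  # true rate 2
# random.gammavariate takes (shape, scale); scale = 1 / rate, so rate 2 -> 0.5
gamma_sample = [rng.gammavariate(3.0, 0.5) for _ in range(20000)]

lambda_hat = mme_exponential(exp_sample)
alpha_hat, beta_hat = mme_gamma(gamma_sample)
print(lambda_hat, alpha_hat, beta_hat)  # expected to land near 2, 3 and 2
```

With a large sample the estimates should be close to the true values, but they are random quantities, so the printed numbers will not match the parameters exactly.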