Statistics
Mean, Variance and Standard Deviation
Mean
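The mean of a vector, usually denoted \(\mu\) or \(\bar{x}\), is the average of its elements: the sum of the components divided by the number of components.
\[\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i\]
with \(n\) the number of components and \(x_i\) the \(i\)th component.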
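As a minimal sketch of this definition, the snippet below computes the mean by hand and compares it with NumPy's built-in `np.mean`. The vector `x` is a made-up example, not data from the text.

```python
import numpy as np

# Hypothetical example vector (illustration only)
x = np.array([2.0, 4.0, 6.0, 8.0])

# Sum of the components divided by the number of components
mean_manual = x.sum() / x.size

# NumPy's built-in equivalent
mean_numpy = np.mean(x)

print(mean_manual, mean_numpy)
```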
Variance
The variance is the mean of the squared differences between each observation and the mean:
\[var(x) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2\]
with \(var(x)\) being the variance of the variable \(x\), \(n\) the number of data samples, \(x_i\) the \(i\)th data sample and \(\bar{x}\) the mean of \(x\).
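The following sketch spells out the variance step by step on a made-up vector (an assumption for illustration) and checks it against `np.var`, which divides by \(n\) by default and so matches the "mean of the squared differences" definition.

```python
import numpy as np

# Hypothetical example vector (illustration only)
x = np.array([2.0, 4.0, 6.0, 8.0])

# Mean of x
x_bar = x.mean()

# Mean of the squared differences to the mean
var_manual = np.mean((x - x_bar) ** 2)

# np.var uses ddof=0 by default, i.e. it divides by n
var_numpy = np.var(x)

print(var_manual, var_numpy)
```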
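Standard Deviation
The standard deviation is simply the square root of the variance. It is usually denoted as \(\sigma\):
\[\sigma(x) = \sqrt{var(x)} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}\]
Taking the square root of the variance brings the value back to the units of the observations.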
Both the variance and the standard deviation are dispersion indicators: they tell you how tightly the observations are clustered around the mean.
Note also that the variance and the standard deviation are never negative. Like a distance, they measure how far the data points are from the mean, and they are zero only when all observations are identical.
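To make the idea of dispersion concrete, here is a small sketch with three made-up datasets (assumptions for illustration): two share the same mean but differ in spread, and one is constant. It uses `np.std`, which, like `np.var`, divides by \(n\) by default.

```python
import numpy as np

# Hypothetical datasets (illustration only)
tight = np.array([9.0, 10.0, 11.0])     # clustered around the mean
spread = np.array([0.0, 10.0, 20.0])    # same mean, far more dispersed
constant = np.array([5.0, 5.0, 5.0])    # no dispersion at all

# Standard deviation as the square root of the variance
std_tight = np.sqrt(np.var(tight))

# np.std computes the same quantity directly
std_spread = np.std(spread)
std_constant = np.std(constant)

print(tight.mean(), spread.mean())          # identical means
print(std_tight, std_spread, std_constant)  # different dispersions, none negative
```

The two first datasets cannot be told apart by their mean alone; the standard deviation is what separates them.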
Covariance and Correlation
Correlation
The correlation, usually referring to Pearson's correlation coefficient, is a normalized version of the covariance. It is scaled between -1 and 1.
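The sketch below illustrates this normalization. It assumes the usual definitions, which are not spelled out in this section: the covariance is taken as the mean of the products of the deviations from each mean, and it is divided by the product of the two standard deviations. The helper `pearson` and the vectors are hypothetical; `np.corrcoef` is NumPy's built-in equivalent.

```python
import numpy as np

# Hypothetical example vectors (illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0   # increases perfectly with x
z = -x              # decreases perfectly with x


def pearson(a, b):
    """Covariance of a and b divided by the product of their standard deviations."""
    # Covariance, dividing by n (assumed definition)
    cov = np.mean((a - a.mean()) * (b - b.mean()))
    # a.std() and b.std() also divide by n, so the normalization is consistent
    return cov / (a.std() * b.std())


r_xy = pearson(x, y)
r_xz = pearson(x, z)

# Off-diagonal entry of the 2x2 correlation matrix returned by NumPy
r_numpy = np.corrcoef(x, y)[0, 1]

print(r_xy, r_xz, r_numpy)
```

A perfectly increasing linear relationship gives a coefficient of 1 and a perfectly decreasing one gives -1, up to floating-point rounding.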