ML Notation / Equations
Notation
| Symbol | Formula | Explained |
| --- | --- | --- |
| \(\mu\) | \(\sum_{x} x P(X=x)\) (discrete) or \(\int_{-\infty}^{\infty} x f(x)\,dx\) (continuous) | Expected value (mean) of X |
| \(V(X)\) or \(\sigma^2\) | \(E[(X - E[X])^2] = E[(X - \mu)^2] = E[X^2] - E[X]^2\) | Variance |
| \(\sigma\) | \(\sqrt{V(X)}\) | Standard deviation |
| \(Cov(X,Y)\) | \(E[(X - E[X])(Y - E[Y])]\) | Covariance of X and Y |
| \(\bar{X}\) | \(\frac{1}{n}\sum_{i=1}^{n} X_i\) | The sample mean is the average value of the sample |
| \(\delta\) | \(\delta(v)\) | Activation functions: sigmoid, ReLU, etc. |
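The definitions of the mean, variance, standard deviation, covariance and sample mean can be checked numerically. The following is a minimal sketch using only the standard library; the distribution (`values`, `probs`), the paired samples (`xs`, `ys`) and the helper `expectation` are made-up names and numbers chosen for illustration. The covariance here divides by n, as the empirical analogue of the expectation, rather than by n - 1.

```python
import math

# Hypothetical discrete distribution: outcomes of X and their probabilities.
values = [1.0, 2.0, 3.0]
probs = [0.2, 0.5, 0.3]


def expectation(g, outcomes, weights):
    """E[g(X)] for a discrete X with the given outcomes and probabilities."""
    return sum(g(x) * p for x, p in zip(outcomes, weights))


mu = expectation(lambda x: x, values, probs)                      # E[X]
var_def = expectation(lambda x: (x - mu) ** 2, values, probs)     # E[(X - mu)^2]
var_alt = expectation(lambda x: x ** 2, values, probs) - mu ** 2  # E[X^2] - E[X]^2
sigma = math.sqrt(var_def)                                        # sqrt(V(X))

# Hypothetical paired samples for the sample mean and covariance.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
# Empirical covariance (divides by n, not n - 1).
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

print(mu, var_def, var_alt, sigma)
print(x_bar, cov_xy)
```

Up to floating-point rounding, both variance expressions give 0.49 for this distribution, with mean 2.1 and standard deviation 0.7.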
Equations
Cosine Similarity
Cosine similarity measures how similar two nonzero vectors in an n-dimensional space are by taking the cosine of the angle between them.
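Formula: the dot product of the two vectors divided by the product of their Euclidean norms, \(\cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^2} \, \sqrt{\sum_{i} B_i^2}}\)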
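As a worked example, take \(A = (0, 0, 1)\) and \(B = (0, 1, 1)\), the same values used for `v1` and `v2` in the code further down. Dividing the dot product by the product of the Euclidean norms gives:

\[
A \cdot B = 0 \cdot 0 + 0 \cdot 1 + 1 \cdot 1 = 1, \qquad \|A\| = \sqrt{0^2 + 0^2 + 1^2} = 1, \qquad \|B\| = \sqrt{0^2 + 1^2 + 1^2} = \sqrt{2}
\]

\[
\cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{1}{1 \cdot \sqrt{2}} \approx 0.7071
\]

This corresponds to an angle of 45 degrees between the two vectors.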
Properties
Scale invariance: cosine similarity is scale-invariant, meaning it is not affected by the magnitude of the vectors, only by their orientation.
One-hot and multi-hot vectors: it works directly on one-hot and multi-hot encoded vectors.
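The scale-invariance property can be checked with a small dependency-free sketch. The helper `cosine_similarity` and the scaling factors below are illustrative assumptions, not part of any library. Note that the factors are positive; a negative factor would flip the sign of the result.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two nonzero vectors given as sequences of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


one_hot = [0.0, 0.0, 1.0]
multi_hot = [0.0, 1.0, 1.0]

base = cosine_similarity(one_hot, multi_hot)
# Rescale each vector by a different positive factor; the angle does not change.
scaled = cosine_similarity([10.0 * x for x in one_hot], [0.5 * y for y in multi_hot])

print(round(base, 4))
print(round(scaled, 4))
```

Both calls give the same value up to floating-point rounding, because the scaling factors cancel between the numerator and the denominator.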
import torch
from torch.nn import functional as F
v1 = torch.tensor([0, 0, 1], dtype=torch.float32)
v2 = torch.tensor([0, 1, 1], dtype=torch.float32)
# Built-in cosine similarity along dimension 0
print(F.cosine_similarity(v1, v2, dim=0))
# Dot product of the L2-normalized vectors
print(F.normalize(v1, dim=0) @ F.normalize(v2, dim=0))
# Ratio of norms, not a cosine similarity; it matches here only because v1 · v2 = ||v1||^2 = 1
print(torch.norm(v1) / torch.norm(v2))
# Explicit formula; torch.dot avoids the deprecated .T on a 1-D tensor
print(torch.dot(v1, v2) / (torch.sqrt(torch.sum(v1 ** 2)) * torch.sqrt(torch.sum(v2 ** 2))))
tensor(0.7071)
tensor(0.7071)
tensor(0.7071)
tensor(0.7071)