Vectors, Matrices, and Tensors

Scalars, vectors, matrices, and tensors are the fundamental data structures of deep learning. In this section, we will briefly review these concepts.

Scalar

rank-0 tensor
\(x \in \mathbb{R}\)
x = 1.23
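
In PyTorch, a scalar corresponds to a rank-0 tensor. A brief illustrative snippet (not part of the original example):

import torch

x = torch.tensor(1.23)  # rank-0 tensor (scalar)

print(x.ndim)   # 0
print(x.shape)  # torch.Size([])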

Vector

rank-1 tensor
\( \mathbf{x} \in \mathbb{R}^{n \times 1} \)

\[\begin{split} \mathbf{x}=\left[\begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array}\right] \end{split}\]
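A vector can likewise be represented as a rank-1 tensor (illustrative snippet):

import torch

x = torch.tensor([1., 2., 3.])  # rank-1 tensor (vector)

print(x.ndim)   # 1
print(x.shape)  # torch.Size([3])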

Matrix

rank-2 tensor
\( \mathbf{X} \in \mathbb{R}^{m \times n} \)

\[\begin{split} \mathbf{X}=\left[\begin{array}{cccc} x_{1,1} & x_{1,2} & \ldots & x_{1, n} \\ x_{2,1} & x_{2,2} & \ldots & x_{2, n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m, 1} & x_{m, 2} & \ldots & x_{m, n} \end{array}\right] \end{split}\]
Tensors

import torch

t = torch.tensor([[1, 2, 3, 4], [6, 7, 8, 9]])

print(t)
print(t.shape)
print(t.ndim)
print(t.dtype)
tensor([[1, 2, 3, 4],
        [6, 7, 8, 9]])
torch.Size([2, 4])
2
torch.int64
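
Higher-rank tensors follow the same pattern; for example, a rank-3 tensor can be viewed as a stack of matrices. A small illustrative snippet (not part of the original example above):

import torch

t3 = torch.zeros(2, 3, 4)  # e.g., a stack of two 3x4 matrices

print(t3.ndim)   # 3
print(t3.shape)  # torch.Size([2, 3, 4])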

Moving Data onto the GPU

print(torch.cuda.is_available())
print(torch.backends.mps.is_available())

if torch.cuda.is_available():
    t = t.to(torch.device('cuda:0'))
    print(t)
False
False
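
The outputs above show that neither a CUDA nor an MPS device was available in this particular environment. On Apple Silicon hardware, the analogous transfer would use the "mps" device; a hedged sketch, assuming the MPS backend is actually available:

import torch

t = torch.tensor([[1, 2, 3, 4], [6, 7, 8, 9]])

if torch.backends.mps.is_available():
    t = t.to(torch.device('mps'))
    print(t.device)  # e.g., mps:0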

Broadcasting

Making Vector and Matrix computations more convenient

Computing the Output From Multiple Training Examples at Once

  • The perceptron algorithm is typically considered an “online” algorithm (i.e., it updates the weights after each training example)

  • However, during prediction (e.g., test set evaluation), we could pass all data points at once, so that we can get rid of the for loop (see the sketch after this list)

  • Two opportunities for parallelism:

    1. computing the dot product in parallel

    2. computing multiple dot products at once
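
To make the contrast concrete, here is a small sketch of the per-example loop that the batched matrix multiplication below replaces (illustrative code and variable names, not from the original):

import torch

X = torch.arange(6, dtype=torch.float).view(2, 3)  # 2 examples, 3 features each
w = torch.tensor([1., 2., 3.])

# "online"-style: one dot product per training example
outputs = torch.stack([torch.dot(x_i, w) for x_i in X])

print(outputs)  # tensor([ 8., 26.])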

import torch

X = torch.arange(6).view(2, 3)

print(X)

w = torch.tensor([1, 2, 3])

print(w)

print(X.matmul(w))

w = w.view(-1, 1)

print(X.matmul(w))
tensor([[0, 1, 2],
        [3, 4, 5]])
tensor([1, 2, 3])
tensor([ 8, 26])
tensor([[ 8],
        [26]])

This more general feature, where PyTorch implicitly expands tensors to compatible shapes during elementwise operations, is called "broadcasting":

print(torch.tensor([1, 2, 3]) + 1)

t = torch.tensor([[4, 5, 6], [7, 8, 9]])

print(t)

print(t + torch.tensor([1, 2, 3]))
tensor([2, 3, 4])
tensor([[4, 5, 6],
        [7, 8, 9]])
tensor([[ 5,  7,  9],
        [ 8, 10, 12]])
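
Broadcasting also works across multiple dimensions; for instance, adding a column vector to a row vector implicitly expands both to a matrix (illustrative example):

import torch

col = torch.tensor([[0], [10], [20]])  # shape (3, 1)
row = torch.tensor([1, 2, 3])          # shape (3,)

print(col + row)
# tensor([[ 1,  2,  3],
#         [11, 12, 13],
#         [21, 22, 23]])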

Notational Linear Algebra

X = torch.arange(50, dtype=torch.float).view(10, 5)

print(X)

fc = torch.nn.Linear(in_features=5, out_features=3)

print(fc.weight)

print(fc.bias)

print(f"X dim: {X.size()}")
print(f"Weights dim: {fc.weight.size()}")
print(f"bias dim: {fc.bias.size()}")

A = fc(X)

print(f"A dim: {A.size()}")
tensor([[ 0.,  1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.,  9.],
        [10., 11., 12., 13., 14.],
        [15., 16., 17., 18., 19.],
        [20., 21., 22., 23., 24.],
        [25., 26., 27., 28., 29.],
        [30., 31., 32., 33., 34.],
        [35., 36., 37., 38., 39.],
        [40., 41., 42., 43., 44.],
        [45., 46., 47., 48., 49.]])
Parameter containing:
tensor([[ 0.0283, -0.2716, -0.1834, -0.4373,  0.2282],
        [ 0.2731, -0.2637, -0.3609, -0.0048,  0.0700],
        [ 0.2773, -0.2305, -0.1061, -0.4259, -0.3791]], requires_grad=True)
Parameter containing:
tensor([-0.3188, -0.0365,  0.1801], requires_grad=True)
X dim: torch.Size([10, 5])
Weights dim: torch.Size([3, 5])
bias dim: torch.Size([3])
A dim: torch.Size([10, 3])
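
For reference, the fully connected (linear) layer above computes \( \mathbf{A} = \mathbf{X}\mathbf{W}^\top + \mathbf{b} \), which explains the resulting 10x3 shape. A quick check, continuing from the code above, confirms this:

A_manual = X.matmul(fc.weight.t()) + fc.bias

print(torch.allclose(A, A_manual))  # True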