Vectors, Matrices, and Tensors¶
Scalars, vectors, matrices, and tensors are the fundamental data structures of deep learning. In this section, we will briefly review these concepts.

Scalar (rank-0 tensor): \( x \in \mathbb{R} \), e.g., x = 1.23
Vector (rank-1 tensor): \( \mathbf{x} \in \mathbb{R}^{n \times 1} \)
Matrix (rank-2 tensor): \( \mathbf{X} \in \mathbb{R}^{n \times m} \)
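In PyTorch, the rank corresponds to a tensor's number of dimensions (ndim); here is a minimal sketch (the variable names are illustrative, not from the original code):

import torch

scalar = torch.tensor(1.23)                   # rank-0 tensor
vector = torch.tensor([1., 2., 3.])           # rank-1 tensor
matrix = torch.tensor([[1., 2.], [3., 4.]])   # rank-2 tensor

print(scalar.ndim, vector.ndim, matrix.ndim)  # prints: 0 1 2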


import torch

# Construct a rank-2 tensor (a matrix) from a nested Python list
t = torch.tensor([[1, 2, 3, 4], [6, 7, 8, 9]])
print(t)
print(t.shape)  # torch.Size([2, 4])
print(t.ndim)   # number of dimensions (rank): 2
print(t.dtype)  # torch.int64, inferred from the Python integers
tensor([[1, 2, 3, 4],
        [6, 7, 8, 9]])
torch.Size([2, 4])
2
torch.int64
Moving Data onto the GPU¶
# Check whether an NVIDIA GPU (CUDA) or Apple Silicon GPU (MPS) is available
print(torch.cuda.is_available())
print(torch.backends.mps.is_available())

if torch.cuda.is_available():
    # Move the tensor onto the first CUDA device
    t = t.to(torch.device('cuda:0'))
    print(t)
False
False
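Since neither GPU backend is available here, the tensor stays on the CPU. A common device-agnostic pattern (a sketch, not part of the original code) is to pick the best available device once and move tensors to it:

# Pick CUDA if available, then MPS (Apple Silicon), otherwise fall back to CPU
if torch.cuda.is_available():
    device = torch.device('cuda:0')
elif torch.backends.mps.is_available():
    device = torch.device('mps')
else:
    device = torch.device('cpu')

t = t.to(device)
print(t.device)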
Broadcasting¶
Making vector and matrix computations more convenient
Computing the Output From Multiple Training Examples at Once¶
The perceptron algorithm is typically considered an “online” algorithm (i.e., it updates the weights after each training example)
However, during prediction (e.g., test-set evaluation), we can pass all data points at once, so that we can get rid of the explicit “for-loop”
Two opportunities for parallelism:
computing the dot product in parallel
computing multiple dot products at once
import torch

# Two training examples (rows) with three features each
X = torch.arange(6).view(2, 3)
print(X)

# Weight vector
w = torch.tensor([1, 2, 3])
print(w)

# One matrix-vector product computes both dot products at once
print(X.matmul(w))

# As a 3x1 column vector, the result becomes a 2x1 matrix instead
w = w.view(-1, 1)
print(X.matmul(w))
tensor([[0, 1, 2],
        [3, 4, 5]])
tensor([1, 2, 3])
tensor([ 8, 26])
tensor([[ 8],
        [26]])
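To make the two parallelism opportunities concrete, here is a small sketch (added for illustration) contrasting an explicit Python for-loop over the training examples with the single matmul call:

# Explicit "online"-style loop: one dot product per training example
w = torch.tensor([1, 2, 3])
outputs = []
for x_i in X:
    outputs.append(torch.dot(x_i, w))
print(torch.stack(outputs))  # tensor([ 8, 26])

# Vectorized: all dot products in a single matrix-vector product
print(X.matmul(w))           # tensor([ 8, 26])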


This (general) feature is called “broadcasting”: the smaller operand is implicitly expanded so that the shapes match.
# The scalar is broadcast across all elements of the vector
print(torch.tensor([1, 2, 3]) + 1)

t = torch.tensor([[4, 5, 6], [7, 8, 9]])
print(t)

# The vector is broadcast across (added to) each row of the matrix
print(t + torch.tensor([1, 2, 3]))
tensor([2, 3, 4])
tensor([[4, 5, 6],
        [7, 8, 9]])
tensor([[ 5,  7,  9],
        [ 8, 10, 12]])
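Broadcasting generalizes beyond adding a scalar or a vector; for instance, a column vector and a row vector broadcast to a full matrix. A short illustrative sketch (not in the original notes) following PyTorch's broadcasting rules:

# A 2x1 column vector and a 1x3 row vector broadcast to a 2x3 matrix
col = torch.tensor([[0], [10]])  # shape (2, 1)
row = torch.tensor([[1, 2, 3]])  # shape (1, 3)
print(col + row)
# tensor([[ 1,  2,  3],
#         [11, 12, 13]])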
Notational Linear Algebra¶

# 10 training examples (rows) with 5 features each
X = torch.arange(50, dtype=torch.float).view(10, 5)
print(X)

# Fully connected (linear) layer mapping 5 inputs to 3 outputs
fc = torch.nn.Linear(in_features=5, out_features=3)
print(fc.weight)
print(fc.bias)

print(f"X dim: {X.size()}")
print(f"Weights dim: {fc.weight.size()}")
print(f"bias dim: {fc.bias.size()}")

# Forward pass: A = X W^T + b
A = fc(X)
print(f"A dim: {A.size()}")
tensor([[ 0.,  1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.,  9.],
        [10., 11., 12., 13., 14.],
        [15., 16., 17., 18., 19.],
        [20., 21., 22., 23., 24.],
        [25., 26., 27., 28., 29.],
        [30., 31., 32., 33., 34.],
        [35., 36., 37., 38., 39.],
        [40., 41., 42., 43., 44.],
        [45., 46., 47., 48., 49.]])
Parameter containing:
tensor([[ 0.1588,  0.1991, -0.1828,  0.3645,  0.0632],
        [-0.3881,  0.3027, -0.1874, -0.2637, -0.0837],
        [-0.1106, -0.1106,  0.1551, -0.4373,  0.3108]], requires_grad=True)
Parameter containing:
tensor([-0.4462,  0.1830,  0.2864], requires_grad=True)
X dim: torch.Size([10, 5])
Weights dim: torch.Size([3, 5])
bias dim: torch.Size([3])
A dim: torch.Size([10, 3])
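torch.nn.Linear computes \( A = X W^\top + b \), so we can verify the layer's output by hand (a quick check added for illustration):

# Manual forward pass; should match the output of fc(X)
A_manual = X.matmul(fc.weight.T) + fc.bias
print(torch.allclose(A, A_manual))  # True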