Lecture 09 Softmax Classifier


PyTorch Tutorial

09. Softmax Classifier

Revision: Diabetes dataset

[Diagram: the diabetes classifier from the previous lecture. Eight input features x1 … x8 pass through alternating Linear and Sigmoid layers that narrow from 8 to 6, 4, and finally 1 unit, producing a single output ŷ.]
Revision: MNIST Dataset

There are 10 labels in the MNIST dataset.

How to design the neural network?
Design 10 outputs using Sigmoid?
[Diagram: Input Layer → Linear Layer → Sigmoid Layer, giving ten independent outputs o1 … o10 mapped to ŷ1 … ŷ10.]

What is wrong? Ten independent Sigmoid outputs treat each digit as a separate binary decision. We want the outputs to compete with one another: the network should output a distribution over the ten classes.
Output a Distribution of Predictions with Softmax

[Diagram: Input Layer → Linear Layer → Softmax Layer, mapping the ten outputs o1 … o10 to the class probabilities P(y = 0), …, P(y = 9)]

such that

$$P(y = i) \ge 0, \qquad \sum_{i=0}^{9} P(y = i) = 1$$
Softmax Layer

Suppose $Z^{l} \in \mathbb{R}^{K}$ is the output of the last linear layer; the Softmax function is

$$P(y = i) = \frac{e^{z_i}}{\sum_{j=0}^{K-1} e^{z_j}}, \qquad i \in \{0, \dots, K-1\}$$
Softmax Layer - Example

Start from the linear outputs $z = (0.2, 0.1, -0.1)$:

Exponent: $e^{0.2} \approx 1.22$, $e^{0.1} \approx 1.11$, $e^{-0.1} \approx 0.90$

Sum: $1.22 + 1.11 + 0.90 = 3.23$

Divide: $1.22 / 3.23 \approx 0.38$, $1.11 / 3.23 \approx 0.34$, $0.90 / 3.23 \approx 0.28$

So $\mathrm{Softmax}(0.2, 0.1, -0.1) \approx (0.38, 0.34, 0.28)$: nonnegative values that sum to 1.
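For reference, a minimal NumPy sketch of the same computation (the max-subtraction is a standard numerical-stability trick, not shown on the slide):

import numpy as np

def softmax(z):
    # subtracting the maximum before exponentiating avoids overflow for large logits
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([0.2, 0.1, -0.1])))  # ~[0.38 0.34 0.28]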
Loss function - Cross Entropy

[Diagram: the linear outputs z = (0.2, 0.1, -0.1) go through Softmax to give the predicted distribution Ŷ = (0.38, 0.34, 0.28); the label is one-hot encoded as Y = (1, 0, 0); the loss compares Ŷ with Y.]

$$\mathrm{Loss}(\hat{Y}, Y) = -Y \log \hat{Y}$$

The $-Y \log \hat{Y}$ step, applied to the log of the Softmax output, is exactly PyTorch's NLLLoss (Negative Log Likelihood Loss).
Cross Entropy in Numpy

import numpy as np

y = np.array([1, 0, 0])               # one-hot encoded target (class 0)
z = np.array([0.2, 0.1, -0.1])        # outputs of the last linear layer
y_pred = np.exp(z) / np.exp(z).sum()  # softmax
loss = (- y * np.log(y_pred)).sum()   # cross entropy: -Y log Y_hat
print(loss)                           # ~0.97
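Because Y is one-hot, the sum keeps only a single term: the loss is simply the negative log of the probability assigned to the true class. A one-line check, reusing the arrays above:

print(-np.log(y_pred[0]))  # same value as the loss printed above (~0.97)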
Cross Entropy in PyTorch


torch.nn.CrossEntropyLoss() takes the raw outputs of the last linear layer (the logits) together with the target class index; it applies LogSoftmax and NLLLoss internally, so the model needs no Softmax layer and the target needs no one-hot encoding:

import torch

y = torch.LongTensor([0])             # target class index, not one-hot
z = torch.Tensor([[0.2, 0.1, -0.1]])  # logits, shape (batch_size, num_classes)
criterion = torch.nn.CrossEntropyLoss()
loss = criterion(z, y)
print(loss)                           # tensor(0.9729), same value as the NumPy version
Mini-Batch: batch_size=3

import torch

criterion = torch.nn.CrossEntropyLoss()
Y = torch.LongTensor([2, 0, 1])           # target class indices for a batch of 3

Y_pred1 = torch.Tensor([[0.1, 0.2, 0.9],  # largest logit matches the target in every row
                        [1.1, 0.1, 0.2],
                        [0.2, 2.1, 0.1]])
Y_pred2 = torch.Tensor([[0.8, 0.2, 0.3],  # largest logit misses the target in every row
                        [0.2, 0.3, 0.5],
                        [0.2, 0.2, 0.5]])

l1 = criterion(Y_pred1, Y)
l2 = criterion(Y_pred2, Y)
print("Batch Loss1 = ", l1.data, "\nBatch Loss2 = ", l2.data)

Output:

Batch Loss1 = tensor(0.4966)
Batch Loss2 = tensor(1.2389)
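By default CrossEntropyLoss returns the mean loss over the mini-batch. A small sketch, reusing the tensors above, that one could use to inspect the per-sample losses via the reduction='none' option:

per_sample = torch.nn.CrossEntropyLoss(reduction='none')
print(per_sample(Y_pred1, Y))  # three individual losses; their mean is Batch Loss1
print(per_sample(Y_pred2, Y))  # three individual losses; their mean is Batch Loss2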
Exercise 9-1: CrossEntropyLoss vs NLLLoss

• What are the differences?
• Read the documentation:
  • https://pytorch.org/docs/stable/nn.html#crossentropyloss
  • https://pytorch.org/docs/stable/nn.html#nllloss
• Try to understand why:
  • CrossEntropyLoss <==> LogSoftmax + NLLLoss (a small sketch for checking this follows below)
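A minimal sketch, not part of the original slides, that one could run to check the claimed equivalence on a random batch:

import torch

z = torch.randn(4, 10)              # random logits for a batch of 4
y = torch.LongTensor([1, 0, 7, 3])  # arbitrary target classes

ce = torch.nn.CrossEntropyLoss()(z, y)
nll = torch.nn.NLLLoss()(torch.nn.LogSoftmax(dim=1)(z), y)
print(ce, nll)                      # the two values should be identical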
Back to MNIST Dataset

There are 10 labels in the MNIST dataset.

How to design the neural network?
MNIST Dataset

[Figure: a single MNIST digit shown as its 28 × 28 grid of grayscale pixel values in [0, 1].]

28 × 28 = 784 input values per image.
Implementation of a classifier for the MNIST dataset

1. Prepare dataset: Dataset and DataLoader
2. Design model using Class: inherit from nn.Module
3. Construct loss and optimizer: using the PyTorch API
4. Training cycle + Test: forward, backward, update
Implementation – 0. Import Package

import torch
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader  # for constructing the DataLoader
import torch.nn.functional as F          # for the function relu()
import torch.optim as optim              # for constructing the optimizer
Implementation – 1. Prepare Dataset

batch_size = 64
transform = transforms.Compose([
    transforms.ToTensor(),                        # convert the PIL Image to a Tensor
    transforms.Normalize((0.1307, ), (0.3081, ))  # normalize with the MNIST mean and std
])

train_dataset = datasets.MNIST(root='../dataset/mnist/',
                               train=True,
                               download=True,
                               transform=transform)
train_loader = DataLoader(train_dataset,
                          shuffle=True,
                          batch_size=batch_size)

test_dataset = datasets.MNIST(root='../dataset/mnist/',
                              train=False,
                              download=True,
                              transform=transform)
test_loader = DataLoader(test_dataset,
                         shuffle=False,
                         batch_size=batch_size)

ToTensor converts the PIL Image, an integer array in $\mathbb{Z}^{28 \times 28}$ with pixels in $\{0, \dots, 255\}$, into a PyTorch Tensor in $\mathbb{R}^{1 \times 28 \times 28}$ with pixels in $[0, 1]$.

The parameters of Normalize are the mean and the std respectively; it applies

$$\mathrm{Pixel}_{\mathrm{norm}} = \frac{\mathrm{Pixel}_{\mathrm{origin}} - \mathrm{mean}}{\mathrm{std}}$$
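Not on the original slides, but a quick sanity check of the tensor shapes coming out of the loader defined above:

inputs, target = next(iter(train_loader))
print(inputs.shape)  # torch.Size([64, 1, 28, 28]) - a mini-batch of normalized images
print(target.shape)  # torch.Size([64])            - the matching digit labels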
Implementation – 2. Design Model

Each (N, 1, 28, 28) batch of images is flattened with view into an (N, 784) matrix and passed through fully connected layers 784 → 512 → 256 → 128 → 64 → 10 with ReLU activations in between. The last layer returns raw scores without an activation, because CrossEntropyLoss applies the Softmax itself.

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.l1 = torch.nn.Linear(784, 512)
        self.l2 = torch.nn.Linear(512, 256)
        self.l3 = torch.nn.Linear(256, 128)
        self.l4 = torch.nn.Linear(128, 64)
        self.l5 = torch.nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(-1, 784)     # (N, 1, 28, 28) -> (N, 784)
        x = F.relu(self.l1(x))  # (N, 784) -> (N, 512)
        x = F.relu(self.l2(x))  # (N, 512) -> (N, 256)
        x = F.relu(self.l3(x))  # (N, 256) -> (N, 128)
        x = F.relu(self.l4(x))  # (N, 128) -> (N, 64)
        return self.l5(x)       # (N, 64)  -> (N, 10), raw logits

model = Net()
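As an aside (not on the slides), the same stack could be written with nn.Sequential; it is shown under a different name so it does not clash with the Net instance above, and it assumes a PyTorch version that provides nn.Flatten:

seq_model = torch.nn.Sequential(
    torch.nn.Flatten(),                         # (N, 1, 28, 28) -> (N, 784)
    torch.nn.Linear(784, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 128), torch.nn.ReLU(),
    torch.nn.Linear(128, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 10),                    # raw logits, no Softmax
)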
Implementation – 3. Construct Loss and Optimizer

criterion = torch.nn.CrossEntropyLoss()                           # LogSoftmax + NLLLoss in one step
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)  # SGD with momentum

As before, the loss takes the raw network outputs and the target class indices; the Softmax, the log, and the one-hot comparison all happen inside CrossEntropyLoss().
Implementation – 4. Train and Test

def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data
        optimizer.zero_grad()

        # forward + backward + update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:  # report the average loss every 300 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0
def test():
    correct = 0
    total = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for data in test_loader:
            images, labels = data
            outputs = model(images)
            _, predicted = torch.max(outputs.data, dim=1)  # index of the largest logit = predicted class
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print('Accuracy on test set: %d %%' % (100 * correct / total))
if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        test()

Typical output:

[1, 300] loss: 0.335
[1, 600] loss: 0.154
[1, 900] loss: 0.067
Accuracy on test set: 90 %
[2, 300] loss: 0.048
[2, 600] loss: 0.040
[2, 900] loss: 0.035
Accuracy on test set: 93 %
...
[9, 300] loss: 0.005
[9, 600] loss: 0.006
[9, 900] loss: 0.007
Accuracy on test set: 97 %
[10, 300] loss: 0.005
[10, 600] loss: 0.005
[10, 900] loss: 0.005
Accuracy on test set: 97 %
Softmax and CrossEntropyLoss

To summarize: torch.nn.CrossEntropyLoss() takes the raw linear outputs and the integer class labels; internally it applies Softmax (as LogSoftmax), matches the result against the one-hot encoded target, and computes $-Y \log \hat{Y}$. The network itself therefore ends with a plain Linear layer and contains no Softmax.
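If actual class probabilities are wanted at inference time (the model only outputs logits), a small sketch, assuming the model and test_loader defined earlier:

with torch.no_grad():
    images, labels = next(iter(test_loader))
    probs = F.softmax(model(images), dim=1)  # turn logits into a distribution per image
    print(probs[0], probs[0].sum())          # probabilities for the first image; they sum to 1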
Exercise 9-2: Classifier Implementation

• Try to implement a classifier for:
  • Otto Group Product Classification Challenge
  • Dataset: https://www.kaggle.com/c/otto-group-product-classification-challenge/data
  • A possible starting point for reading the data is sketched below.
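Not part of the slides: a hedged starting-point sketch for the exercise, assuming the challenge's usual train.csv layout (an id column, feature columns feat_1 … feat_93, and a target column with labels Class_1 … Class_9); adjust the column handling to whatever the downloaded file actually contains:

import numpy as np
import torch
from torch.utils.data import Dataset

class OttoDataset(Dataset):
    def __init__(self, filepath):
        # skip the header row; drop the id column; the last column holds the class label
        raw = np.loadtxt(filepath, delimiter=',', skiprows=1, dtype=str)
        self.x_data = torch.from_numpy(raw[:, 1:-1].astype(np.float32))
        # map 'Class_1' ... 'Class_9' to integer indices 0 ... 8 for CrossEntropyLoss
        self.y_data = torch.LongTensor([int(label.split('_')[1]) - 1 for label in raw[:, -1]])

    def __len__(self):
        return self.x_data.shape[0]

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]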