Classic PyTorch Introductory Examples

Basic PyTorch Usage

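Import the torch library, then use torch.empty(5, 3) to create a 5x3 tensor.
The values in this tensor are uninitialized: they are whatever happened to be in the allocated memory, not samples from a random distribution.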
torch.rand(5, 3) creates a tensor whose values are drawn uniformly from the interval [0, 1).
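A minimal sketch of the two constructors described above. The variable names x and r are illustrative and not taken from the original post.

```python
import torch

# torch.empty only allocates memory; the contents are arbitrary.
x = torch.empty(5, 3)
print(x.shape, x.dtype)

# torch.rand samples each element uniformly from [0, 1).
r = torch.rand(5, 3)
print(r)
```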
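A tensor filled with zeros (for example via torch.zeros) has dtype torch.float32 by default.
You can also specify the dtype yourself.
Alternatively, call the long() method to convert an existing tensor to torch.long (64-bit integers).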
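The three variants above might look like this. This is a sketch and the variable names are hypothetical.

```python
import torch

z = torch.zeros(5, 3)                          # default dtype: torch.float32
z_long = torch.zeros(5, 3, dtype=torch.long)   # dtype chosen at creation time
z_conv = torch.zeros(5, 3).long()              # converted after creation

print(z.dtype, z_long.dtype, z_conv.dtype)
```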
You can also create a tensor directly from existing data.
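For instance, torch.tensor accepts a Python list. The data values here are only an example.

```python
import torch

# The dtype is inferred from the data; Python floats become torch.float32.
x = torch.tensor([5.5, 3])
print(x, x.dtype)
```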
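You can also build a new tensor from an existing one; the new tensor inherits properties such as the dtype from the original unless you override them. For example, x.new_ones(5, 3) creates an all-ones tensor with the same dtype as x, and you can pass an explicit dtype such as torch.double to choose a different one.
torch.randn_like(x) produces a tensor with the same shape as x, filled with samples from the standard normal distribution.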
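A short sketch of both calls. It assumes x is an integer tensor, which is why randn_like is given a floating-point dtype explicitly (normal samples cannot be stored in an integer tensor).

```python
import torch

x = torch.zeros(5, 3, dtype=torch.long)

ones_same = x.new_ones(5, 3)                         # inherits dtype from x
ones_double = x.new_ones(5, 3, dtype=torch.double)   # dtype overridden explicitly

# randn_like keeps the shape of x; the dtype is overridden to a float type.
noise = torch.randn_like(x, dtype=torch.float)

print(ones_same.dtype, ones_double.dtype, noise.shape)
```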

The shape attribute gives the tensor's dimensions.
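As a quick illustration (the tensor itself is arbitrary):

```python
import torch

x = torch.rand(5, 3)
print(x.shape)          # torch.Size([5, 3])
print(x.size())         # size() returns the same information

# torch.Size behaves like a tuple, so it can be unpacked.
rows, cols = x.shape
```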
Tensors can be added element-wise.
There is also an in-place form of addition: methods whose names end with an underscore (such as add_) modify the tensor they are called on.
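The difference between the out-of-place and in-place forms can be sketched as follows, using small example tensors chosen for readability:

```python
import torch

x = torch.ones(2, 3)
y = torch.full((2, 3), 2.0)

z = x + y              # out-of-place: x and y are unchanged
z2 = torch.add(x, y)   # function form, same result

y.add_(x)              # in-place: y itself now holds the sum
print(z)
print(y)
```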
NumPy-style indexing and slicing operations work on tensors as well.
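A few indexing patterns on a small example tensor (the values are chosen so the result is easy to read):

```python
import torch

x = torch.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]

col = x[:, 1]       # second column -> tensor([1, 5, 9])
row = x[0]          # first row     -> tensor([0, 1, 2, 3])
block = x[1:, :2]   # rows 1 and 2, first two columns

print(col, row, block)
```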
view plays the role of NumPy's reshape: it returns a tensor with a different shape over the same underlying data.
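A sketch of typical view calls. The -1 argument, which asks PyTorch to infer one dimension, is an addition not mentioned in the text above.

```python
import torch

x = torch.randn(4, 4)
y = x.view(16)       # flatten to one dimension
z = x.view(-1, 8)    # -1 is inferred from the other dimension (here: 2)

print(x.shape, y.shape, z.shape)
```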
If a tensor contains exactly one element, .item() converts it to a plain Python number.
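For example:

```python
import torch

x = torch.tensor([3.5])
value = x.item()     # a Python float, no longer a tensor
print(value, type(value))
```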
A 2-D tensor (matrix) can be transposed, for example with t().
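A small sketch of transposition, assuming a 2x3 example matrix. transpose(0, 1) is shown as the more general form that swaps two chosen dimensions.

```python
import torch

x = torch.arange(6).reshape(2, 3)
xt = x.t()                    # 2-D transpose, shape (3, 2)
xt2 = x.transpose(0, 1)       # same result via the general form

print(xt.shape)
print(xt)
```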

Converting between NumPy and Tensor

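Tensors and NumPy arrays can be converted into each other, and a CPU tensor shares its memory with the corresponding array: if a and y refer to the same data, changing the values of either one changes the other as well.
Converting a tensor to a NumPy array works too.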
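The shared-memory behaviour can be checked in both directions. The variable names below are illustrative and differ from the a and y mentioned above.

```python
import numpy as np
import torch

a = torch.ones(5)
b = a.numpy()              # tensor -> NumPy array, sharing memory
a.add_(1)                  # in-place change on the tensor...
print(b)                   # ...is visible through the array (all 2.0)

c = np.ones(5)
d = torch.from_numpy(c)    # NumPy array -> tensor, also sharing memory
np.add(c, 1, out=c)        # in-place change on the array
print(d)                   # the tensor sees it too (all 2.0)
```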

Training on the GPU

NumPy is a CPU-only library, so a tensor on the GPU must first be moved to the CPU before it can be converted to a NumPy array.
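A device-agnostic sketch of moving tensors to the GPU and back. It falls back to the CPU when CUDA is not available, so the same code runs on either kind of machine.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(2, 2, device=device)   # create directly on the device
y = torch.ones(2, 2).to(device)       # or move an existing tensor
z = x + y                             # computed on `device`

# Move the result back to the CPU before converting it to NumPy.
arr = z.cpu().numpy()
print(arr)
```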

Implementing a two-layer neural network by hand with NumPy

The code ignores the bias terms for now.

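import numpy as np

# N: number of input samples; D_in: dimension of each input;
# H: dimension of the hidden layer; D_out: dimension of the output
N, D_in, H, D_out = 64, 1000, 100, 10

# Random input and target data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)

# Randomly initialized weights (no bias terms)
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

learning_rate = 1e-6
for it in range(500):
    # Forward pass: compute the prediction and the loss
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)
    loss = np.square(y_pred - y).sum()
    print(it, loss)

    # Backward pass: compute the gradients of the loss w.r.t. w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # Gradient descent step
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2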

The printed loss does decrease over the iterations.
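The gradient lines in the NumPy code follow from the chain rule. A sketch of the derivation in matrix form, using the same names as the code, where \odot denotes element-wise multiplication and \mathbb{1}[\cdot] is the indicator function:

```latex
\[
\begin{aligned}
h &= x W_1, \qquad h_{\mathrm{relu}} = \max(h, 0), \qquad
\hat{y} = h_{\mathrm{relu}} W_2, \qquad
L = \sum_{i,j} \left(\hat{y}_{ij} - y_{ij}\right)^2 \\
\frac{\partial L}{\partial \hat{y}} &= 2\,(\hat{y} - y) \\
\frac{\partial L}{\partial W_2} &= h_{\mathrm{relu}}^{\top}\, \frac{\partial L}{\partial \hat{y}} \\
\frac{\partial L}{\partial h_{\mathrm{relu}}} &= \frac{\partial L}{\partial \hat{y}}\, W_2^{\top} \\
\frac{\partial L}{\partial h} &= \frac{\partial L}{\partial h_{\mathrm{relu}}} \odot \mathbb{1}[h \ge 0] \\
\frac{\partial L}{\partial W_1} &= x^{\top}\, \frac{\partial L}{\partial h}
\end{aligned}
\]
```

Each line corresponds to one assignment in the backward pass: grad_y_pred, grad_w2, grad_h_relu, grad_h and grad_w1, in that order.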

Implementing a two-layer neural network by hand with PyTorch

import torch

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)
learning_rate = 1e-6
for it in range(500):
    # Forward pass
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    # item() converts the one-element loss tensor to a Python number
    loss = (y_pred - y).pow(2).sum().item()
    print(it, loss)

    # Backward pass: gradients computed by hand
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Gradient descent step
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

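The printed loss clearly decreases.
PyTorch can also compute the gradients automatically with its built-in backward(). Note that the tensors involved must hold floating-point values: only tensors with a floating-point (or complex) dtype can require gradients.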
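A minimal autograd sketch on scalar tensors. The function y = w * x + b and its values are chosen only for illustration.

```python
import torch

# Floating-point tensors; an integer tensor such as torch.tensor(1)
# cannot be created with requires_grad=True.
x = torch.tensor(1.0, requires_grad=True)
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)

y = w * x + b
y.backward()    # fills in .grad for every tensor that requires gradients

# dy/dw = x = 1, dy/dx = w = 2, dy/db = 1
print(w.grad, x.grad, b.grad)
```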
The same two-layer network, with the hand-written gradient computation replaced by PyTorch's autograd API:

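import torch

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# requires_grad=True tells autograd to track operations on the weights
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)
learning_rate = 1e-6
for it in range(500):
    # Forward pass
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    # Keep loss as a tensor so that backward() can be called on it;
    # item() is only used to print its scalar value
    loss = (y_pred - y).pow(2).sum()
    print(it, loss.item())

    # Backward pass: autograd fills in w1.grad and w2.grad
    loss.backward()
    # Update the weights inside torch.no_grad() so that the update itself
    # is not recorded in the computation graph (which would cost memory)
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # backward() accumulates gradients into .grad instead of replacing
        # them. Gradients from different batches should not be mixed, so
        # reset them to zero with grad.zero_() after every update.
        w1.grad.zero_()
        w2.grad.zero_()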

PyTorch's nn (neural network) package

Implementing the two-layer neural network with the nn package

import torch

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out)
)
# Initialize the weights of the first and third layers from a normal distribution
torch.nn.init.normal_(model[0].weight)
torch.nn.init.normal_(model[2].weight)
# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
# reduction='sum' sums the squared errors instead of averaging them.
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-6

for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)
    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())
    # Zero the gradients before running the backward pass.
    model.zero_grad()
    loss.backward()
    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

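“This time we no longer update the model's weights by hand. Instead we use the optim package to update the parameters for us. The optim package provides a variety of optimization methods, including SGD with momentum, RMSProp, Adam, and so on.”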

import torch

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out)
)
# The normal-distribution initialization is deliberately disabled here
# torch.nn.init.normal_(model[0].weight)
# torch.nn.init.normal_(model[2].weight)
# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)
    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())
    # Zero the gradients before running the backward pass.
    optimizer.zero_grad()
    loss.backward()
    # Calling step() on the optimizer updates the parameters it was given.
    optimizer.step()

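[Note] Sometimes the parameters need explicit initialization and sometimes they do not. With Adam in this example, there is no need to initialize the weights from a normal distribution; doing so actually makes the results worse. Run the code yourself both ways to get a feel for the difference.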

Custom models

“We can define a model as a class that inherits from nn.Module. Whenever a model more complex than a Sequential model is needed, define it as an nn.Module subclass.”

import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        # In the constructor we instantiate two nn.Linear modules and assign
        # them as member variables.
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        # In the forward function we accept a Tensor of input data and we must
        # return a Tensor of output data. We can use Modules defined in the
        # constructor as well as arbitrary operators on Tensors.
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred


# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)
# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model
    y_pred = model(x)
    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())
    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

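Playing a game with a neural network

FizzBuzz is a simple game. The rules are: count upward from 1; say "fizz" for a multiple of 3, "buzz" for a multiple of 5, and "fizzbuzz" for a multiple of 15 (that is, of both 3 and 5); otherwise say the number itself.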
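For reference, the rules can be written directly in plain Python. This fizz_buzz helper is a hypothetical addition for illustration only; it is the behaviour the network is later trained to imitate and is not part of the original code.

```python
def fizz_buzz(i):
    # Check 15 first, since multiples of 15 are also multiples of 3 and 5.
    if i % 15 == 0:
        return "fizzbuzz"
    if i % 5 == 0:
        return "buzz"
    if i % 3 == 0:
        return "fizz"
    return str(i)


first_15 = [fizz_buzz(i) for i in range(1, 16)]
print(first_15)
```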

import numpy as np
import torch


# Encode the desired outputs as class indices:
# 0 = the number itself, 1 = "fizz", 2 = "buzz", 3 = "fizzbuzz"
def fizz_buzz_encode(i):
    if i % 15 == 0:
        return 3
    elif i % 5 == 0:
        return 2
    elif i % 3 == 0:
        return 1
    else:
        return 0


def fizz_buzz_decode(i, prediction):
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]


# First define the model's inputs and outputs (the training data)
NUM_DIGITS = 10


# Represent each input by an array of its binary digits.
def binary_encode(i, num_digits):
    return np.array([i >> d & 1 for d in range(num_digits)])


# Train on 101..1023 so that 1..100 stay unseen for testing
trX = torch.tensor(np.array([binary_encode(i, NUM_DIGITS) for i in range(101, 2 ** NUM_DIGITS)]),
                   dtype=torch.float32)
trY = torch.LongTensor([fizz_buzz_encode(i) for i in range(101, 2 ** NUM_DIGITS)])

# Then define the model with PyTorch
NUM_HIDDEN = 100
model = torch.nn.Sequential(
    torch.nn.Linear(NUM_DIGITS, NUM_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(NUM_HIDDEN, 4)
)

# - To let the model learn FizzBuzz we need a loss function and an optimization algorithm.
# - The optimizer keeps lowering the loss, so that the model reaches as low a loss as possible on this task.
# - A low loss usually means the model performs well; a high loss means it performs poorly.
# - FizzBuzz is essentially a classification problem, so we use the Cross Entropy Loss.
# - As the optimizer we use Stochastic Gradient Descent.
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

# Start training
BATCH_SIZE = 128
for epoch in range(10000):
    for start in range(0, len(trX), BATCH_SIZE):
        end = start + BATCH_SIZE
        batchX = trX[start:end]
        batchY = trY[start:end]
        y_pred = model(batchX)
        loss = loss_fn(y_pred, batchY)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # Find the loss on the whole training set once per epoch
    with torch.no_grad():
        loss = loss_fn(model(trX), trY).item()
    print('Epoch:', epoch, 'Loss:', loss)

# Finally, let the trained model play FizzBuzz on the numbers 1 to 100
testX = torch.tensor(np.array([binary_encode(i, NUM_DIGITS) for i in range(1, 101)]),
                     dtype=torch.float32)
with torch.no_grad():
    testY = model(testX)
predicted = testY.max(1)[1]  # index of the highest score in each row
expected = np.array([fizz_buzz_encode(i) for i in range(1, 101)])
predictions = zip(range(1, 101), predicted.tolist())
print([fizz_buzz_decode(i, x) for (i, x) in predictions])
# Number of correct predictions out of 100
print(np.sum(predicted.numpy() == expected))
# Per-number correctness
print(predicted.numpy() == expected)

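Running the script prints the decoded predictions for the numbers 1 to 100, followed by the number of correct predictions.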
