Basic Usage of the nn Library

Principles and Basic Usage of the nn Library

qq_58286779

Published 2024-06-04 10:03:47

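PyTorch's nn library

torch.nn is one of PyTorch's core packages for building and training neural networks. It provides layers, loss functions, activation functions, containers, and other building blocks that make deep learning models simpler to assemble and more modular. Optimizers are not part of torch.nn; they live in torch.optim.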

import torch
import torch.nn as nn

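nn.Module

nn.Module is the base class for all neural network modules. Your model should subclass it and implement the forward method. The built-in layers and losses are themselves nn.Module subclasses defined under torch.nn.modules: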

nn.MSELoss,nn.BatchNorm2d,nn.GRU,nn.Linear,nn.ReLU

(torch.nn.modules.loss.MSELoss,
 torch.nn.modules.batchnorm.BatchNorm2d,
 torch.nn.modules.rnn.GRU,
 torch.nn.modules.linear.Linear,
 torch.nn.modules.activation.ReLU)

nn.functional

nn.functional.conv2d,nn.functional.relu,nn.functional.mse_loss,nn.functional.batch_norm

(<function torch._VariableFunctionsClass.conv2d>,
 <function torch.nn.functional.relu(input: torch.Tensor, inplace: bool = False) -> torch.Tensor>,
 <function torch.nn.functional.mse_loss(input: torch.Tensor, target: torch.Tensor, size_average: Optional[bool] = None, reduce: Optional[bool] = None, reduction: str = 'mean') -> torch.Tensor>,
 <function torch.nn.functional.batch_norm(input: torch.Tensor, running_mean: Optional[torch.Tensor], running_var: Optional[torch.Tensor], weight: Optional[torch.Tensor] = None, bias: Optional[torch.Tensor] = None, training: bool = False, momentum: float = 0.1, eps: float = 1e-05) -> torch.Tensor>)

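Comparison

State:
torch.nn.Module: holds state (weights, biases) and can contain submodules.
torch.nn.functional: holds no state; any weights are passed in as arguments. It is typically used to write custom forward-computation logic.

Use cases:
torch.nn.Module: defining a whole network, including its layers and its forward computation.
torch.nn.functional: implementing the forward computation of a specific layer, especially when custom logic is needed.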
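
To make the stateful versus stateless distinction concrete, here is a minimal sketch that computes the same linear transform both ways. The shapes and the seed are arbitrary choices for illustration, not values from the article.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only for repeatability
x = torch.randn(4, 10)

# Module form: the layer object owns its weight and bias.
layer = nn.Linear(10, 3)
out_module = layer(x)

# Functional form: nothing is stored, so weight and bias are passed explicitly.
out_functional = F.linear(x, layer.weight, layer.bias)

same = torch.allclose(out_module, out_functional)
num_params = len(list(layer.parameters()))  # weight and bias
print(same, num_params)
```

Both calls run the same computation; the only difference is who keeps track of the weights.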

Example comparison

The examples below show how to use torch.nn.Module and torch.nn.functional to implement a simple two-layer fully connected network:

import torch

import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 50)
        self.relu = nn.ReLU()  # activation as a module, registered as a submodule
        self.fc2 = nn.Linear(50, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)  # module call; torch.relu(x) would compute the same thing
        x = self.fc2(x)
        return x

model = SimpleModel()


import torch.nn.functional as F

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.fc1 = nn.Linear(10, 50)
        self.fc2 = nn.Linear(50, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = F.relu(x)  # use F.relu instead of an nn.ReLU() module
        x = self.fc2(x)
        return x

model_v1 = SimpleModel()
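
As a quick check that the two styles are interchangeable, the sketch below copies the weights from a module-style network into a functional-style one and compares their outputs. The class names ModuleStyle and FunctionalStyle are hypothetical, introduced here only so that both variants can coexist in one script.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModuleStyle(nn.Module):  # hypothetical name: activation held as a submodule
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 50)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(50, 1)

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))

class FunctionalStyle(nn.Module):  # hypothetical name: activation called as a function
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 50)
        self.fc2 = nn.Linear(50, 1)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))

torch.manual_seed(0)  # arbitrary seed, only for repeatability
a = ModuleStyle()
b = FunctionalStyle()

# nn.ReLU has no parameters, so both models expose the same state_dict keys.
b.load_state_dict(a.state_dict())

x = torch.randn(3, 10)
outputs_match = torch.allclose(a(x), b(x))
print(outputs_match)
```

Because ReLU carries no learnable state, choosing between nn.ReLU and F.relu is mostly a matter of style and of whether the activation should show up when the model is printed.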

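nn.Parameter

Defining trainable parameters
Main purpose
When building a custom PyTorch model, you sometimes need to define model parameters (such as weights and biases) that are updated during training. nn.Parameter does this: a tensor wrapped in nn.Parameter and assigned as a module attribute is registered automatically, so it appears in model.parameters() and is visible to the optimizer.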

bias=torch.nn.Parameter(torch.ones(5))

bias

Parameter containing:
tensor([1., 1., 1., 1., 1.], requires_grad=True)

params=nn.ParameterList([nn.Parameter(torch.randn(5,10)) for i in range(5)])
params

ParameterList(
    (0): Parameter containing: [torch.float32 of size 5x10]
    (1): Parameter containing: [torch.float32 of size 5x10]
    (2): Parameter containing: [torch.float32 of size 5x10]
    (3): Parameter containing: [torch.float32 of size 5x10]
    (4): Parameter containing: [torch.float32 of size 5x10]
)
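
The following sketch shows one way an nn.ParameterList might be used inside a module, and checks that a single optimizer step updates every entry. The class name ScaleStack, the learning rate, and the toy loss are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ScaleStack(nn.Module):  # hypothetical module: applies several learnable scales in turn
    def __init__(self, num_scales=3, dim=4):
        super().__init__()
        # Parameters held in a ParameterList are registered with the module.
        self.scales = nn.ParameterList(
            [nn.Parameter(torch.ones(dim)) for _ in range(num_scales)]
        )

    def forward(self, x):
        for scale in self.scales:
            x = x * scale
        return x

torch.manual_seed(0)  # arbitrary seed, only for repeatability
model = ScaleStack()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Snapshot the parameters before the update.
before = [p.detach().clone() for p in model.parameters()]

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()  # toy loss, chosen only to produce gradients
optimizer.zero_grad()
loss.backward()
optimizer.step()

changed = [not torch.equal(b, p.detach()) for b, p in zip(before, model.parameters())]
print(len(before), changed)
```

A plain Python list of nn.Parameter objects would not be registered with the module, which is why the container type matters here.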

Example

Here is a simple example showing how to define a custom model parameter with nn.Parameter:

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain tensor attribute: not registered as a parameter or a buffer
        self.some_tensor = torch.tensor([1.0, 2.0, 3.0])
        # Wrapping a tensor in nn.Parameter registers it as a trainable model parameter
        self.some_param = nn.Parameter(torch.tensor([4.0, 5.0, 6.0]))

    def forward(self, x):
        return x * self.some_param

# Create a model instance
model = MyModel()

# Print the model parameters
print("Model parameters:")
for name, param in model.named_parameters():
    print(f"{name}: {param}")

# Print the registered buffers. some_tensor is a plain attribute rather than
# a buffer, so this loop prints nothing.
print("\nNon-parameter attributes:")
for name, buf in model.named_buffers():
    print(f"{name}: {buf}")

# Input tensor
input_tensor = torch.tensor([1.0, 2.0, 3.0])

# Forward pass
output = model(input_tensor)
print("\nOutput:", output)

Model parameters:
some_param: Parameter containing:
tensor([4., 5., 6.], requires_grad=True)

Non-parameter attributes:

Output: tensor([ 4., 10., 18.], grad_fn=<MulBackward0>)
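
The "Non-parameter attributes" section of the output is empty because named_buffers() only reports tensors registered through register_buffer, and some_tensor is an ordinary Python attribute. A minimal sketch of the difference follows; the class name ModelWithBuffer and the attribute names are hypothetical.

```python
import torch
import torch.nn as nn

class ModelWithBuffer(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        # Plain attribute: invisible to named_parameters, named_buffers, and state_dict.
        self.plain_tensor = torch.tensor([1.0, 2.0, 3.0])
        # Buffer: non-trainable state that is saved in state_dict and moved by .to(device).
        self.register_buffer("offset", torch.tensor([1.0, 2.0, 3.0]))
        # Parameter: trainable state.
        self.scale = nn.Parameter(torch.tensor([4.0, 5.0, 6.0]))

    def forward(self, x):
        return x * self.scale + self.offset

model = ModelWithBuffer()
param_names = [name for name, _ in model.named_parameters()]
buffer_names = [name for name, _ in model.named_buffers()]
state_keys = sorted(model.state_dict().keys())

print(param_names)   # ['scale']
print(buffer_names)  # ['offset']
print(state_keys)    # ['offset', 'scale']
```

Buffers are a reasonable fit for values such as running statistics that should travel with the model but should not receive gradients.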

How to use nn.Parameter

nn.Parameter is especially useful when building custom layers. For example, we can define a simple linear layer:

class CustomLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # Define the weight and bias parameters
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.randn(out_features))

    def forward(self, x):
        return torch.matmul(x, self.weight.t()) + self.bias

# Use the GPU when available (the output below was produced on cuda:0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create an instance of the custom linear layer
linear = CustomLinear(5, 3).to(device)

# Input tensor
input_tensor = torch.randn(2, 5).to(device)

# Forward pass
output = linear(input_tensor)
print(output)

tensor([[ 1.8776, -1.5554, -2.6914],
        [ 3.2729, -1.3633,  0.5271]], device='cuda:0', grad_fn=<AddBackward0>)

nn.Sequential

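nn.Sequential is a container module in PyTorch that chains multiple layers (or modules) in order, which makes building a network more concise. It suits models with a simple structure whose forward pass is a plain sequence of operations.
Main purpose
nn.Sequential creates an ordered arrangement of layers and applies them one after another. This keeps the code short and readable, especially for typical feed-forward networks.
Combining with other modules
OrderedDict and nn.Sequential
You can also pass an OrderedDict to give each layer a name, which makes the printed model easier to read: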

from collections import OrderedDict

layers = OrderedDict([
    ("fc1", nn.Linear(2, 4)),
    ("relu1", nn.ReLU()),
    ("fc2", nn.Linear(4, 1)),
    ("sigmoid", nn.Sigmoid())
])
model = nn.Sequential(layers)
print(model)

# Input tensor
input_tensor = torch.tensor([1.0, 2.0])
# Output tensor
output = model(input_tensor)
print(output)

Sequential(
  (fc1): Linear(in_features=2, out_features=4, bias=True)
  (relu1): ReLU()
  (fc2): Linear(in_features=4, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
tensor([0.5363], grad_fn=<SigmoidBackward0>)

# Define a simple feed-forward network with nn.Sequential
model = nn.Sequential(
    nn.Linear(2, 4),  # input layer to hidden layer
    nn.ReLU(),        # activation function
    nn.Linear(4, 1),  # hidden layer to output layer
    nn.Sigmoid()      # output activation
)

# Print the model structure
print(model)

# Input tensor
input_tensor = torch.tensor([[1.0, 2.0]])

# Forward pass
output = model(input_tensor)
print(output)

Sequential(
  (0): Linear(in_features=2, out_features=4, bias=True)
  (1): ReLU()
  (2): Linear(in_features=4, out_features=1, bias=True)
  (3): Sigmoid()
)
tensor([[0.3993]], grad_fn=<SigmoidBackward0>)
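
Once a model is wrapped in nn.Sequential, its layers can be reached by index, by name (when an OrderedDict was used), or by slice. The sketch below illustrates this on the same 2-4-1 network; the variable names are illustrative only.

```python
from collections import OrderedDict

import torch
import torch.nn as nn

model = nn.Sequential(OrderedDict([
    ("fc1", nn.Linear(2, 4)),
    ("relu1", nn.ReLU()),
    ("fc2", nn.Linear(4, 1)),
    ("sigmoid", nn.Sigmoid()),
]))

first_by_index = model[0]   # positional access
first_by_name = model.fc1   # attribute access through the OrderedDict key
head = model[:2]            # slicing returns a new nn.Sequential over the same layers

x = torch.tensor([[1.0, 2.0]])
hidden = head(x)            # activations after fc1 and relu1
output = model(x)           # full forward pass

print(first_by_index is first_by_name, hidden.shape, output.shape)
```

Slicing is a convenient way to inspect intermediate activations without writing a custom forward method.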
