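A Hands-On Guide to Building Neural Networks with the Keras Sequential Model: From Zero to Image Classification

After years of working in AI, the question I hear most often is: "I want to get into deep learning quickly. Where should I start?" My answer is simple: the Keras Sequential model. Its API stacks layers like building blocks, so assembling a neural network feels about as intuitive as snapping LEGO bricks together. In this post I'll walk through the whole workflow in plain terms, from setting up the environment to finishing a complete image classification project.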

Coderabo · Published 2025-04-15 11:37:52

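1. A Quick Keras Primer for Newcomers

2. Environment Setup and Tool Installation

Before writing any code, let's get the environment in place. I recommend using Anaconda to create an isolated Python environment: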

conda create -n keras_env python=3.8
conda activate keras_env
pip install tensorflow==2.9.0 matplotlib numpy

Once the installation finishes, import the libraries we need in your Python file:

import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt

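3. Practical Data Preprocessing

We'll use the classic MNIST handwritten digit dataset as our example. Keras ships with a loader for it, but in real engineering work the data usually needs more processing than this: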

# Load the dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

# Add a channel axis and scale pixel values to [0, 1] (important!)
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32') / 255

# One-hot encode the labels
train_labels = tf.keras.utils.to_categorical(train_labels)
test_labels = tf.keras.utils.to_categorical(test_labels)
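
To make the two preprocessing steps concrete, here is a minimal NumPy sketch that does the same thing on a tiny made-up array. The names `fake_images` and `fake_labels` are hypothetical stand-ins for the MNIST arrays, and the identity-matrix trick is just one common way to build one-hot rows, not necessarily how `to_categorical` is implemented internally.

```python
import numpy as np

# Hypothetical stand-in data: two 2x2 "images" with uint8 pixels, plus integer labels.
fake_images = np.array([[[0, 255], [128, 64]],
                        [[255, 255], [0, 0]]], dtype=np.uint8)
fake_labels = np.array([3, 0])

# Scale pixels from [0, 255] to [0, 1], then add a trailing channel axis.
scaled = fake_images.astype("float32") / 255
scaled = scaled.reshape((-1, 2, 2, 1))

# One-hot encode: row i of the identity matrix is the encoding of class i.
num_classes = 10
one_hot = np.eye(num_classes, dtype="float32")[fake_labels]

print(scaled.shape)   # (2, 2, 2, 1)
print(one_hot.shape)  # (2, 10)
```

Each label becomes a row of ten values with a single 1 at the index of its class, which is the format `categorical_crossentropy` expects.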

4. Building the Model, Step by Step

A Sequential model is like a sandwich: we just stack it up one layer at a time:

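model = models.Sequential([
    # First convolution block: 32 filters of size 3x3
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    # Pooling layer halves the spatial size of the feature maps
    layers.MaxPooling2D((2,2)),

    # Second convolution block: 64 filters of size 3x3
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),

    # Flatten layer: turn the stacked feature maps into a 1-D vector
    layers.Flatten(),

    # Fully connected layer (the classic Dense-Dropout-Dense sandwich)
    layers.Dense(128, activation='relu'),
    # Dropout helps prevent overfitting
    layers.Dropout(0.5),

    # Output layer: one unit per digit class (10 in total)
    layers.Dense(10, activation='softmax')
])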
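
It helps to know what shape each layer produces before you run anything. The sketch below redoes the arithmetic by hand in plain Python, assuming the Keras defaults used in the model above (stride 1 and 'valid' padding for `Conv2D`, stride equal to the pool size for `MaxPooling2D`). The helper names `conv2d_valid`, `max_pool` and `dense_params` are mine, not part of Keras.

```python
def conv2d_valid(height, width, in_channels, filters, kernel=3):
    """Shape and parameter count for a stride-1 Conv2D with 'valid' padding."""
    out_shape = (height - kernel + 1, width - kernel + 1, filters)
    params = kernel * kernel * in_channels * filters + filters  # weights + biases
    return out_shape, params


def max_pool(height, width, channels, pool=2):
    """Shape after MaxPooling2D whose stride equals the pool size."""
    return (height // pool, width // pool, channels)


def dense_params(in_units, out_units):
    """Parameter count for a Dense layer: weight matrix plus bias vector."""
    return in_units * out_units + out_units


shape = (28, 28, 1)                                      # input_shape of the model
shape, conv1_params = conv2d_valid(*shape, filters=32)   # -> (26, 26, 32)
shape = max_pool(*shape)                                 # -> (13, 13, 32)
shape, conv2_params = conv2d_valid(*shape, filters=64)   # -> (11, 11, 64)
shape = max_pool(*shape)                                 # -> (5, 5, 64)

flat_units = shape[0] * shape[1] * shape[2]              # size of the Flatten output
total_params = (conv1_params + conv2_params
                + dense_params(flat_units, 128)
                + dense_params(128, 10))                 # Flatten and Dropout add none

print(shape, flat_units, total_params)  # (5, 5, 64) 1600 225034
```

You can compare these numbers with what `model.summary()` prints; the Dense(128) layer accounts for most of the parameters.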

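5. Compiling the Model: More to It Than You'd Think

Compiling the model is like fitting a steering wheel to a car: it decides which way training will go: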

model.compile(
    optimizer='adam',  # optimizer with adaptive learning rates
    loss='categorical_crossentropy',  # standard loss for one-hot multi-class labels
    metrics=['accuracy',  # main evaluation metric
             tf.keras.metrics.Precision(),  # also track precision
             tf.keras.metrics.Recall()]  # and recall
)
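
If the loss function feels like a black box, here is a small pure-Python sketch of what categorical cross-entropy computes for a single sample. It is a simplified illustration under my own assumptions (a plain `eps` floor to avoid `log(0)`), not the exact Keras implementation, and the function name is only borrowed for readability.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    # Only the true class contributes, because every other y_true entry is 0.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = [0.0, 1.0, 0.0]          # one-hot label: the true class is index 1
confident = [0.05, 0.90, 0.05]    # model is fairly sure of the right class
unsure = [0.25, 0.50, 0.25]       # model hedges its bet

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)

print(f"confident: {loss_confident:.4f}, unsure: {loss_unsure:.4f}")
```

The more probability the model puts on the correct class, the smaller the loss, which is exactly the behavior we want the optimizer to push toward.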

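6. Training the Model the Right Way

Three settings matter most during training: the batch size, the number of epochs, and the validation split: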

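history = model.fit(
    train_images, 
    train_labels,
    epochs=15,  # number of full passes over the training data
    batch_size=256,  # samples per gradient update
    validation_split=0.2,  # hold out the last 20% of the training data for validation
    verbose=1  # show a progress bar
)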
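
To see how these three settings interact, the sketch below works out how many samples and gradient updates the call above implies. It is plain arithmetic under the assumption that the split point is computed as `int(n_samples * (1 - validation_split))`; the variable names are my own.

```python
import math

n_samples = 60000          # MNIST training set size
validation_split = 0.2
batch_size = 256
epochs = 15

# The validation samples are taken from the end of the arrays.
split_at = int(n_samples * (1.0 - validation_split))
n_train = split_at
n_val = n_samples - split_at

# The last batch of each epoch may be smaller than batch_size, hence the ceiling.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(n_train, n_val, steps_per_epoch, total_updates)  # 48000 12000 188 2820
```

So the progress bar should show 188 steps per epoch, and halving `batch_size` would roughly double the number of weight updates per epoch.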

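7. Model Evaluation and Visual Analysis

Once training is done, we need to give the model the full "look, listen, ask, and feel the pulse" examination of a traditional Chinese medicine doctor: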

# Plot the training curves
plt.figure(figsize=(12,5))
plt.subplot(1,2,1)
plt.plot(history.history['accuracy'], label='Training accuracy')
plt.plot(history.history['val_accuracy'], label='Validation accuracy')
plt.title('Accuracy over epochs')
plt.legend()

plt.subplot(1,2,2)
plt.plot(history.history['loss'], label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.title('Loss over epochs')
plt.legend()
plt.show()

# Final evaluation on the test set (returns the loss plus the three compiled metrics)
test_loss, test_acc, test_precision, test_recall = model.evaluate(test_images, test_labels)
print(f'\nTest accuracy: {test_acc:.4f}')
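
The accuracy number is easier to trust once you see where it comes from. With the real model, `model.predict(test_images)` returns one row of 10 probabilities per image, and `np.argmax(..., axis=1)` is the usual way to turn those rows into predicted digits. The pure-Python sketch below does the same on a few made-up rows; `fake_probs` and `fake_true` are hypothetical values, not real model output.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical softmax outputs for four samples over three classes.
fake_probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],
]
fake_true = [1, 0, 2, 1]   # integer class labels (not one-hot)

# The predicted class is the one with the highest probability.
predicted = [argmax(row) for row in fake_probs]

# Accuracy is the fraction of samples where prediction and label agree.
accuracy = sum(p == t for p, t in zip(predicted, fake_true)) / len(fake_true)

print(predicted, accuracy)  # [1, 0, 2, 0] 0.75
```

Looking at the individual mismatches (here the last sample) often tells you more than the single accuracy figure.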

8. Saving and Deploying the Model

A trained model needs to be saved properly. Here are three common approaches:

# Save the full model (architecture + weights + optimizer state)
model.save('mnist_model.h5')

# Save the weights only
model.save_weights('model_weights.h5')

# TensorFlow SavedModel format (well suited to deployment)
tf.saved_model.save(model, 'saved_model/')

9. Complete Working Example

Here are the previous steps combined into one complete example you can run directly:

# Imports
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt

# Data preparation
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32') / 255
train_labels = tf.keras.utils.to_categorical(train_labels)
test_labels = tf.keras.utils.to_categorical(test_labels)

# Model definition
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])

# Model compilation
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Model training
history = model.fit(train_images, train_labels,
                    epochs=15,
                    batch_size=256,
                    validation_split=0.2)

# Visualize the results
plt.plot(history.history['accuracy'], label='Training accuracy')
plt.plot(history.history['val_accuracy'], label='Validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

# Model evaluation
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f'Test accuracy: {test_acc:.4f}')

# Save the model
model.save('mnist_cnn_model.h5')

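10. Common Problems and Tuning Strategies

Here are a few typical problems, drawn from years of learning things the hard way:

What if the model overfits?

Add Dropout layers (a rate of 0.3 to 0.5 is a reasonable starting point)
Add L2 regularization
Use data augmentation (rotating or shifting the images)

# Data augmentation example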
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1)
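
Of the remedies listed above, L2 regularization is the easiest to work through by hand. The sketch below shows the penalty it adds to the loss: a factor times the sum of squared weights. The function name `l2_penalty` and the sample weights are made up for illustration. In Keras you would normally attach the regularizer to a layer instead, for example with `kernel_regularizer=tf.keras.regularizers.l2(0.01)`.

```python
def l2_penalty(weights, factor):
    """L2 penalty: factor times the sum of squared weights."""
    return factor * sum(w * w for w in weights)


# Hypothetical weight vectors: one with small values, one with large values.
small_weights = [0.1, -0.2, 0.05]
large_weights = [3.0, -4.0, 0.5]

factor = 0.01
small_penalty = l2_penalty(small_weights, factor)
large_penalty = l2_penalty(large_weights, factor)

print(f"small: {small_penalty:.6f}, large: {large_penalty:.6f}")
```

Because large weights are punished much more heavily than small ones, the optimizer is nudged toward smoother, less extreme solutions, which tends to reduce overfitting.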

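What if the model does not converge?

Check that the input data has been normalized
Lower the learning rate a little
Try a different optimizer (such as RMSprop)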
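
To see why the learning rate matters so much, here is a toy sketch: plain gradient descent on f(x) = x^2, which has its minimum at 0. This is a deliberately simplified stand-in for real training (one parameter, no Adam, no mini-batches), and the function `gradient_descent` is my own helper.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2 * x        # derivative of x**2
        x -= lr * grad      # each step multiplies x by (1 - 2 * lr)
    return x


too_high = gradient_descent(lr=1.1)     # |1 - 2.2| = 1.2 > 1, so x grows every step
reasonable = gradient_descent(lr=0.1)   # |1 - 0.2| = 0.8 < 1, so x shrinks every step

print(f"lr=1.1 -> x={too_high:.2f}, lr=0.1 -> x={reasonable:.4f}")
```

With the oversized step the iterate bounces further from the minimum each time, while the smaller step settles toward 0. If your loss oscillates or explodes, reducing the learning rate is usually the first thing to try.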

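How can I improve accuracy?

Add more convolutional layers
Use a more sophisticated architecture (such as ResNet blocks)
Tune the hyperparameters (for batch_size, powers of two are the usual choice)

With this tutorial behind you, you should now have a solid grasp of the core workflow for the Keras Sequential model. Remember, deep learning is like learning to ride a bike: no amount of theory beats getting on and practicing a few times. So go change some parameters in the code and watch how the model's performance responds!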
