【LLaMa-Factory】监督微调训练方法

训练阶段，可选: rm(reward modeling), pt(pretrain), sft(Supervised Fine-Tuning), PPO, DPO, KTO, ORPO。提供了微调时的配置示例。该配置指定了模型参数、微调方法参数、数据集参数以及评估参数等。您需要根据自身需求自行配置。可选: freeze, lora, full。数据集模板，请保证数据集模板与模型相对应。true用于

weixin_41784510

2130人浏览 · 2024-11-06 17:20:08

weixin_41784510 · 2024-11-06 17:20:08 发布

命令行

您可以使用以下命令进行微调：

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml

examples/train_lora/llama3_lora_sft.yaml 提供了微调时的配置示例。该配置指定了模型参数、微调方法参数、数据集参数以及评估参数等。您需要根据自身需求自行配置。

### examples/train_lora/llama3_lora_sft.yaml
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

stage: sft
do_train: true
finetuning_type: lora
lora_target: all

dataset: identity,alpaca_en_demo
template: llama3
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

output_dir: saves/llama3-8b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

备注

模型 model_name_or_path 、数据集 dateset 需要存在且与 template 相对应。

重要训练参数
名称	描述
model_name_or_path	模型名称或路径
stage	训练阶段，可选: rm(reward modeling), pt(pretrain), sft(Supervised Fine-Tuning), PPO, DPO, KTO, ORPO
do_train	true用于训练, false用于评估
finetuning_type	微调方式。可选: freeze, lora, full
lora_target	采取LoRA方法的目标模块，默认值为 `all`。
dataset	使用的数据集，使用”,”分隔多个数据集
template	数据集模板，请保证数据集模板与模型相对应。
output_dir	输出路径
logging_steps	日志输出步数间隔
save_steps	模型断点保存间隔
overwrite_output_dir	是否允许覆盖输出目录
per_device_train_batch_size	每个设备上训练的批次大小
gradient_accumulation_steps	梯度积累步数
max_grad_norm	梯度裁剪阈值
learning_rate	学习率
lr_scheduler_type	学习率曲线，可选 `linear`, `cosine`, `polynomial`, `constant` 等。
num_train_epochs	训练周期数
bf16	是否使用 bf16 格式
warmup_ratio	学习率预热比例
warmup_steps	学习率预热步数
push_to_hub	是否推送模型到 Huggingface

摘自SFT 训练 - LLaMA Factory 方面后面查找阅读。

2048 AI社区

有“AI”的1024 = 2048，欢迎大家加入2048 AI社区

更多推荐

【AI&开发者】开发者与AI高效协作交流案例

拆解问题、明确目标（SMART）、列出约束。

2048 AI社区

InsCodeAI全解

InsCodeAI的出现，标志着一个新时代的开启。它将开发者从记忆API、编写样板代码、繁琐调试的泥潭中托举出来，让我们得以站在一个更高的维度上去思考软件的本质、架构的艺术和创新的边界。让我们携手InsCodeAI，共同探索人机协同编程的无限可能，将更多天马行空的创意，更快、更可靠地变为现实。面对一段复杂难懂的遗留代码，或者自己很久以前写的“魔法代码”，InsCodeAI的代码解释功能堪称神器。2