vLLM startup error: ValueError: The model's max seq len (19008) is larger than the maximum number of tokens tha

h1773655323 · Published 2024-04-24 13:00:13

The error encountered:

ValueError: The model's max seq len (19008) is larger than the maximum number of tokens that can be stored in KV cache (3840). Try increasing `gpu_memory_utilization` or decreasing `max_model_len` when initializing the engine.

Cause:

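GPU memory limits can force you to lower the model's maximum sequence length. This error means the model's configured maximum sequence length (19008 tokens) exceeds the number of tokens that fit in the KV cache on the GPU (3840 tokens).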
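
To make the memory limit concrete, here is a rough back-of-the-envelope sketch of how KV cache capacity relates to free GPU memory. It is not vLLM's actual accounting: the layer count, head count, head size, dtype width, and memory budget are all hypothetical placeholders (the post does not say which model or GPU was used), and the 16-token block size is an assumption about the cache layout. The memory budget was picked only so that the result lands on the 3840 figure from the error.

```python
# Rough estimate of how many tokens fit in a KV cache.
# Every number below is a hypothetical placeholder, not a value from the post.

def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       dtype_bytes: int = 2) -> int:
    # Each layer stores one key vector and one value vector per token.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


def kv_cache_capacity(free_bytes: int, bytes_per_token: int,
                      block_size: int = 16) -> int:
    # The cache is assumed to be carved into fixed-size blocks of tokens,
    # so the capacity is rounded down to a whole number of blocks.
    num_blocks = free_bytes // (bytes_per_token * block_size)
    return num_blocks * block_size


# Hypothetical model: 32 layers, 32 KV heads, head size 128, 2-byte dtype.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=32, head_dim=128)

# Hypothetical memory left for the cache after loading weights: 1920 MiB.
capacity = kv_cache_capacity(free_bytes=1920 * 1024**2, bytes_per_token=per_token)

requested = 19008  # the model's max seq len reported in the error
print(f"{per_token} bytes per token, room for {capacity} tokens")
print("fits" if requested <= capacity else "does not fit")
```

Under these assumed numbers each token costs 512 KiB of cache, so a 19008-token sequence cannot fit. Either the memory budget has to grow or the maximum sequence length has to shrink.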

Solution:

The error reports a KV cache capacity of 3840 tokens, so I appended the following to the end of my launch command:

--max-model-len 3840
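
If this needs to be done repeatedly, the value for the flag can be read straight out of the error text. The helper below is a small illustrative sketch: `suggest_max_model_len` is a hypothetical name, not part of vLLM, and the regular expression assumes the message keeps the exact wording quoted in this post.

```python
import re

# The error message as quoted in the post.
ERROR = (
    "The model's max seq len (19008) is larger than the maximum number of "
    "tokens that can be stored in KV cache (3840). Try increasing "
    "`gpu_memory_utilization` or decreasing `max_model_len` when "
    "initializing the engine."
)


def suggest_max_model_len(message: str) -> int:
    """Return the largest max_model_len the reported KV cache can hold."""
    match = re.search(r"max seq len \((\d+)\).*KV cache \((\d+)\)", message)
    if match is None:
        raise ValueError("message does not look like the KV cache error")
    model_len, cache_tokens = int(match.group(1)), int(match.group(2))
    # Never suggest more than the model itself supports.
    return min(model_len, cache_tokens)


limit = suggest_max_model_len(ERROR)
flag = f"--max-model-len {limit}"
print(flag)
```

Note the trade-off: capping the length at 3840 means requests whose prompt plus output exceed 3840 tokens will no longer fit. The other remedy the message names, raising `gpu_memory_utilization`, enlarges the cache instead, though whether it can reach the full 19008 tokens depends on how much GPU memory is actually free.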
