Implementing a multi-path recall retrieval-augmentation advisor with Spring AI
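This article explores how to optimize RAG (retrieval-augmented generation) retrieval in the Spring AI framework. The existing retrieval-augmentation advisors each do one thing and are awkward to combine, so we implement a more complete and efficient hybrid retrieval-augmentation approach. We define an abstract base retriever, implement several retrievers on top of it (BM25 keyword retrieval, semantic vector similarity retrieval, and so on), and use a retriever aggregation factory to build the retriever list dynamically. The core piece is HybridRetrievalAdvisor, which supports multi-path parallel retrieval, document deduplication and merging, and reranking, and ultimately produces a concise and efficient prompt.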
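Implementing RAG retrieval augmentation usually means using multi-path recall to raise document recall, together with techniques such as reranking to raise document precision. spring-ai ships several retrieval-augmentation advisors, but each is tied to a single kind of retrieval source, and the ones that accept a custom retriever come with few built-in retrievers, so complex combinations are out of reach. A common requirement is that the retrieval sources include local documents, a database, and a vector store, and that the retrieved documents are reranked or sorted in a specific way before the final document list is used as prompt context.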
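The imperfect options already in the spring-ai framework
Given the requirement above, couldn't we just chain several retrieval-augmentation advisors together?
The answer is no. In Spring AI each advisor carries its own prompt template, and the rendered fragments are concatenated in order. Multi-path retrieval can therefore produce a prompt like the example below. Even if you adjust the order of each advisor, the prompt still contains many repeated fragments. In other words, the existing implementations do not lend themselves to gathering documents in one place or to a DocumentJoiner implementation, and the generated prompt is very redundant.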
System prompt ...... (omitted)
# Example of the combined retrieval-augmentation prompt
# part1
Context information is below.
---------------------
Document 1 found by semantic similarity matching
Document 2 found by semantic similarity matching
Document 3 found by semantic similarity matching
---------------------
Given the context information and no prior knowledge, answer the query.
Follow these rules:
1. If the answer is not in the context, just say that you don't know.
2. Avoid statements like "Based on the context..." or "The provided information...".
Query: user question
# part2
Context information is below.
---------------------
Document 1 found by the rerank advisor
Document 2 found by the rerank advisor
---------------------
Given the context information and no prior knowledge, answer the query.
Follow these rules:
1. If the answer is not in the context, just say that you don't know.
2. Avoid statements like "Based on the context..." or "The provided information...".
Query: user question
# part3
Query: user question
Read the question again: user question
Context information is below, surrounded by ---------------------
---------------------
Document 1 retrieved by another advisor
Document 2 retrieved by another advisor
---------------------
Given the context and provided history information and not prior knowledge,
reply to the user comment. If the answer is not in the context, inform
the user that you can't answer the question.
# part4
Query: user question
Read the question again: user question
Context information is below, surrounded by ---------------------
---------------------
Document 1 found by BM25
Document 2 found by BM25
---------------------
Given the context and provided history information and not prior knowledge,
reply to the user comment. If the answer is not in the context, inform
the user that you can't answer the question.
Shortcomings of the framework implementation
The description of spring-ai and the extensions in this article are based on spring-ai 1.1.0-M4 and spring-ai-alibaba 1.0.0.4.
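In spring-ai, DocumentRetriever can be implemented by users, so we do not have to build a dedicated retrieval advisor for every algorithm and can retrieve documents through one unified advisor. Among the existing implementations, CompositeDocumentRetriever paired with a DocumentRetrievalAdvisor looks like a good choice, but reading the source shows that the approach is not perfect. The advisor supports neither reranking nor custom document joining or document post-processing, and the retriever itself has these problems:
- The implementation is fairly simple, and customizing its parameters is difficult
- It runs strictly serially, so performance is low
- Its document joining does not support retrieval over expanded queries, that is, it cannot split a query into several queries and match them across multiple retrievers in several passes
- Document ordering and weighting logic is missing, and so on
So we still need to implement a more complete hybrid retrieval augmentation ourselves.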
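To make the goal concrete, here is a minimal sketch of the fan-out we are after: every retriever is run against every expanded query concurrently instead of one after another. It deliberately avoids Spring AI types. `Retriever`, `retrieveAll`, and the plain-string documents are hypothetical stand-ins chosen for illustration, not framework APIs, and the code assumes Java 17.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class FanOutSketch {

    /** Hypothetical stand-in for a retriever: maps a query text to matching document texts. */
    public interface Retriever extends Function<String, List<String>> {}

    /** Submits one task per (retriever, query) pair, then waits for all of them. */
    public static List<String> retrieveAll(List<Retriever> retrievers,
                                           List<String> queries,
                                           ExecutorService pool) {
        // Materialize the futures first so every task is submitted before the first join().
        List<CompletableFuture<List<String>>> futures = retrievers.stream()
                .flatMap(r -> queries.stream()
                        .map(q -> CompletableFuture.supplyAsync(() -> r.apply(q), pool)))
                .toList();
        // Results come back in submission order, regardless of which task finished first.
        return futures.stream()
                .flatMap(f -> f.join().stream())
                .toList();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Retriever keyword = q -> List.of("keyword hit for " + q);
            Retriever vector = q -> List.of("vector hit for " + q);
            System.out.println(retrieveAll(List.of(keyword, vector), List.of("q1", "q2"), pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool is created once and passed in, so the caller decides how many retrieval tasks may run at the same time and when the threads are released.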
Building multiple retrievers and multi-path retrieval augmentation
Define an abstract retriever based on DocumentRetriever
@RequiredArgsConstructor
public abstract class BaseDocumentRetriever implements DocumentRetriever {

    protected final DocumentQueryContext documentQueryContext;
    protected final ChatRagProperties chatRagProperties;

    /**
     * Query expansion: retrieves documents for every expanded query and concatenates the results.
     */
    public List<Document> retrieve(@NotNull List<Query> queries) {
        // The queries run in parallel, so the results are gathered with collect() rather than
        // forEach + addAll on a shared ArrayList, which is not thread-safe.
        return queries.parallelStream()
                .filter(item -> item != null && StrUtil.isNotBlank(item.text()))
                .flatMap(item -> retrieve(item).stream())
                .collect(Collectors.toList());
    }

    protected Filter.Expression computeRequestFilterExpression(Query query, String filterExpression) {
        // A filter passed in through the query context takes precedence
        var contextFilterExpression = query.context().get(filterExpression);
        if (contextFilterExpression != null) {
            if (contextFilterExpression instanceof Filter.Expression) {
                return (Filter.Expression) contextFilterExpression;
            } else if (StringUtils.hasText(contextFilterExpression.toString())) {
                return new FilterExpressionTextParser().parse(contextFilterExpression.toString());
            }
        }
        // Otherwise build a composite document filter from the metadata conditions
        Map<String, Object> expressionParams = DocumentQueryUtils.convertDocumentQueryMap(documentQueryContext);
        return VectorStoreFilterUtil.buildFilterExpression(expressionParams);
    }
}
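One detail of the multi-query overload is worth isolating: when the queries run on a parallel stream, the results should be collected by the stream itself instead of being appended to a shared `ArrayList`, which is not thread-safe. A minimal sketch, assuming a hypothetical `SimpleQuery` record in place of Spring AI's `Query` and plain strings in place of `Document`:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

class MultiQuerySketch {

    /** Hypothetical stand-in for Spring AI's Query; only the text is modelled. */
    public record SimpleQuery(String text) {}

    /**
     * Runs the single-query retrieval for every non-blank query in parallel.
     * toList() on an ordered parallel stream keeps the encounter order of the input.
     */
    public static List<String> retrieve(List<SimpleQuery> queries,
                                        Function<SimpleQuery, List<String>> single) {
        return queries.parallelStream()
                .filter(Objects::nonNull)
                .filter(q -> q.text() != null && !q.text().isBlank())
                .flatMap(q -> single.apply(q).stream())
                .toList();
    }

    public static void main(String[] args) {
        List<SimpleQuery> queries = Arrays.asList(
                new SimpleQuery("refund policy"), null, new SimpleQuery("  "));
        // Null and blank queries are skipped; only "refund policy" is searched.
        System.out.println(retrieve(queries, q -> List.of("doc about " + q.text())));
    }
}
```

Because no state is shared between the worker threads, the same code is correct with `stream()` and with `parallelStream()`.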
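Implement retrievers for each document source as needed
Typical examples are BM25 keyword retrieval, semantic vector similarity retrieval over database tables, vector retrieval over local documents, and cloud document retrieval.
For example:
BM25 keyword + metadata retrieval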
/**
 * Retrieves related documents using BM25 keyword + metadata retrieval
 *
 * @author endcy
 * @date 2025/12/4
 */
@Slf4j
public class Bm25DocumentRetriever extends BaseDocumentRetriever {

    private final VectorStoreService vectorStoreService;

    public Bm25DocumentRetriever(VectorStoreService vectorStoreService,
                                 ChatRagProperties chatRagProperties,
                                 DocumentQueryContext documentQueryContext) {
        super(documentQueryContext, chatRagProperties);
        this.vectorStoreService = vectorStoreService;
    }

    @NotNull
    @Override
    public List<Document> retrieve(@NotNull Query query) {
        String question = documentQueryContext.getReReadingQuestion();
        if (StrUtil.isBlank(question) || StrUtil.isBlank(documentQueryContext.getOriginalQuestion())) {
            return List.of();
        }
        // BM25 retrieval; the search text comes from the query context, not from query.text()
        List<VectorDocument> bm25Documents = vectorStoreService.retrieveWithTsQuery(documentQueryContext,
                chatRagProperties.getBm25TopK(), chatRagProperties.getBm25SimilarityThreshold());
        return DocumentConvertUtils.vectorConvertDocument(bm25Documents);
    }
}
Semantic vector similarity retrieval
/**
 * Retrieves related documents using semantic vector similarity
 *
 * @author endcy
 * @date 2025/12/4
 */
@Slf4j
public class VectorDocumentRetriever extends BaseDocumentRetriever {

    private final VectorStore vectorStore;

    public VectorDocumentRetriever(VectorStore vectorStore,
                                   ChatRagProperties chatRagProperties,
                                   DocumentQueryContext documentQueryContext) {
        super(documentQueryContext, chatRagProperties);
        this.vectorStore = vectorStore;
    }

    @NotNull
    @Override
    public List<Document> retrieve(@NotNull Query query) {
        Assert.notNull(query, "query cannot be null");
        var requestFilterExpression = computeRequestFilterExpression(query, VectorStoreDocumentRetriever.FILTER_EXPRESSION);
        // The search text is the rewritten question from the query context, not query.text()
        SearchRequest.Builder searchRequest = SearchRequest.builder()
                .query(documentQueryContext.getReReadingQuestion())
                .similarityThreshold(chatRagProperties.getSimilarityThreshold())
                .topK(chatRagProperties.getSimilarityTopK());
        // Apply the metadata filter only when one could be built
        if (requestFilterExpression != null) {
            searchRequest.filterExpression(requestFilterExpression);
        }
        return this.vectorStore.similaritySearch(searchRequest.build());
    }
}
Local document metadata + similarity retrieval
/**
 * Retrieves related local documents using metadata + similarity
 *
 * @author endcy
 * @date 2025/12/4
 */
@Slf4j
public class LocalDocumentRetriever extends BaseDocumentRetriever {

    public static final String LOCAL_FILTER_EXPRESSION = "qa_filter_expression";

    private final VectorStore vectorStore;

    public LocalDocumentRetriever(VectorStore vectorStore,
                                  ChatRagProperties chatRagProperties,
                                  DocumentQueryContext documentQueryContext) {
        super(documentQueryContext, chatRagProperties);
        this.vectorStore = vectorStore;
    }

    @NotNull
    @Override
    public List<Document> retrieve(@NotNull Query query) {
        Assert.notNull(query, "query cannot be null");
        var requestFilterExpression = computeRequestFilterExpression(query, LOCAL_FILTER_EXPRESSION);
        SearchRequest.Builder searchRequest = SearchRequest.builder()
                .query(documentQueryContext.getReReadingQuestion())
                .similarityThreshold(chatRagProperties.getSimilarityThreshold())
                .topK(chatRagProperties.getSimilarityTopK());
        // Apply the metadata filter only when one could be built
        if (requestFilterExpression != null) {
            searchRequest.filterExpression(requestFilterExpression);
        }
        return this.vectorStore.similaritySearch(searchRequest.build());
    }
}
Alibaba Cloud Bailian workspace cloud document retrieval
/**
 * Alibaba Cloud Bailian workspace cloud document retrieval
 *
 * @author endcy
 * @date 2025/12/4
 */
public class AliDocumentRetriever extends BaseDocumentRetriever {

    private final DashScopeConnectionProperties dashScopeConnectionProperties;

    public AliDocumentRetriever(DashScopeConnectionProperties dashScopeConnectionProperties,
                                ChatRagProperties chatRagProperties,
                                DocumentQueryContext documentQueryContext) {
        super(documentQueryContext, chatRagProperties);
        this.dashScopeConnectionProperties = dashScopeConnectionProperties;
    }

    /**
     * Alibaba Cloud Bailian workspace cloud documents.
     * Token consumption is a risk here: a question does not need query expansion
     * and can be sent to Alibaba Cloud as is.
     *
     * @param query the query to use for retrieving documents
     * @return the documents retrieved from the cloud index
     */
    @NotNull
    @Override
    public List<Document> retrieve(@NotNull Query query) {
        if (!BooleanUtil.isTrue(chatRagProperties.getEnableAliDashScopeIndex())) {
            return CollUtil.newArrayList();
        }
        DashScopeApi dashScopeApi = DashScopeApi.builder()
                .apiKey(dashScopeConnectionProperties.getApiKey())
                .build();
        DashScopeDocumentRetrieverOptions options = DashScopeDocumentRetrieverOptions.builder()
                .withIndexName(chatRagProperties.getAliDashScopeKnowledgeIndex())
                .build();
        // Resolve the knowledge index name to its pipeline id
        String pipelineId = dashScopeApi.getPipelineIdByName(options.getIndexName());
        if (pipelineId == null) {
            throw new DashScopeException("Index:" + options.getIndexName() + " NotExist");
        }
        return dashScopeApi.retriever(pipelineId, query.text(), options);
    }
}
Retriever aggregation factory
/**
 * Retriever aggregation factory
 *
 * @author endcy
 * @date 2025/12/4
 */
@Slf4j
@RequiredArgsConstructor
@Component
public class AdvisorRetrieverFactory {

    private final PgVectorStore pgVectorVectorStore;
    private final SimpleVectorStore localVectorStore;
    private final ChatRagProperties chatRagProperties;
    private final VectorStoreService vectorStoreService;
    private final DashScopeConnectionProperties dashScopeConnectionProperties;

    /**
     * Builds the retriever list from the data sources suggested by intent analysis.
     */
    public List<BaseDocumentRetriever> dynamicCreateRetrievers(DocumentQueryContext documentParams, IntentResult intentResult) {
        List<BaseDocumentRetriever> documentRetrievers = CollUtil.newArrayList();
        List<PossibleSourceTypeEnum> dataScopeList = intentResult.getDataScopeList();
        if (dataScopeList == null) {
            dataScopeList = CollUtil.newArrayList();
        }
        for (PossibleSourceTypeEnum dataScope : dataScopeList) {
            switch (dataScope) {
                case UNKNOWN -> log.debug("no reference data");
                case LOCAL -> documentRetrievers.add(new LocalDocumentRetriever(localVectorStore, chatRagProperties, documentParams));
                // The vector source is searched two ways: semantic similarity and BM25 keywords
                case VECTOR -> {
                    documentRetrievers.add(new VectorDocumentRetriever(pgVectorVectorStore, chatRagProperties, documentParams));
                    documentRetrievers.add(new Bm25DocumentRetriever(vectorStoreService, chatRagProperties, documentParams));
                }
                case CLOUD -> documentRetrievers.add(new AliDocumentRetriever(dashScopeConnectionProperties, chatRagProperties, documentParams));
                default -> log.info("{} no reference data", dataScope);
            }
        }
        return documentRetrievers;
    }
}
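If the intent analysis returns the same source type twice, a plain loop over the list registers the same retrievers twice. The sketch below shows one possible guard: collapse the list into an `EnumSet` before creating anything. `SourceType` only mirrors the switch labels above, and the returned strings are placeholders for retriever instances; both are assumptions made for illustration, not part of the project.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

class RetrieverSelectionSketch {

    /** Mirrors the switch labels of the factory; declared here to keep the sketch self-contained. */
    public enum SourceType { UNKNOWN, LOCAL, VECTOR, CLOUD }

    /** Returns placeholder retriever names for the given scopes, visiting each scope once. */
    public static List<String> select(List<SourceType> scopes) {
        List<String> retrievers = new ArrayList<>();
        if (scopes == null || scopes.isEmpty()) {
            return retrievers; // EnumSet.copyOf rejects an empty plain collection
        }
        // EnumSet drops duplicates and iterates in declaration order of the enum constants.
        for (SourceType scope : EnumSet.copyOf(scopes)) {
            switch (scope) {
                case LOCAL -> retrievers.add("local-vector");
                case VECTOR -> {
                    retrievers.add("pg-vector");
                    retrievers.add("bm25");
                }
                case CLOUD -> retrievers.add("cloud");
                default -> { /* UNKNOWN: nothing to retrieve from */ }
            }
        }
        return retrievers;
    }

    public static void main(String[] args) {
        System.out.println(select(List.of(SourceType.VECTOR, SourceType.LOCAL, SourceType.VECTOR)));
    }
}
```

One trade-off of this variant is that the retrievers follow the declaration order of the enum rather than the order returned by the intent analysis. Since the documents are merged and reranked afterwards, that order should rarely matter.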
Implementing the hybrid retrieval-augmentation advisor
/**
 * Retrieval-augmentation advisor with customizable document source retrievers.
 * Covers locally vectorized documents, BM25 keyword retrieval over pg tables,
 * semantic vector similarity retrieval over pg tables, and reranking.
 * Multi-path retrieval with the original CompositeDocumentRetriever needs an extra
 * reranking step, and its serial retrieval is inefficient.
 *
 * @author endcy
 * @date 2025/12/4
 * @see com.alibaba.cloud.ai.advisor.RetrievalRerankAdvisor
 * @see com.alibaba.cloud.ai.advisor.CompositeDocumentRetriever
 */
@Slf4j
public class HybridRetrievalAdvisor implements BaseAdvisor {

    private final RerankModel rerankModel;
    private final ChatRagProperties chatRagProperties;
    private final List<BaseDocumentRetriever> documentRetrievers;
    private final QueryExpander queryExpander;

    public static final String RETRIEVED_DOCUMENTS = "rag_retrieved_documents";

    public HybridRetrievalAdvisor(RerankModel rerankModel,
                                  ChatRagProperties chatRagProperties,
                                  List<BaseDocumentRetriever> documentRetrievers,
                                  QueryExpander queryExpander) {
        Assert.notNull(rerankModel, "The rerankModel must not be null!");
        Assert.notNull(chatRagProperties, "The chatRagProperties must not be null!");
        this.documentRetrievers = documentRetrievers;
        this.queryExpander = queryExpander;
        this.rerankModel = rerankModel;
        this.chatRagProperties = chatRagProperties;
    }

    @Override
    public int getOrder() {
        return EnergyAiConstant.HYBRID_ADVISOR_ORDER;
    }
    /**
     * Reranks the merged documents against the user question and drops low-scoring ones.
     */
    protected List<Document> doRerank(ChatClientRequest request, List<Document> documents) {
        if (CollectionUtils.isEmpty(documents)) {
            return documents;
        }
        var rerankRequest = new RerankRequest(request.prompt().getUserMessage().getText(), documents);
        RerankResponse response = rerankModel.call(rerankRequest);
        log.debug("reranked documents: {}", response);
        // Fall back to the merged list when the rerank model returns nothing
        if (CollUtil.isEmpty(response.getResults())) {
            return documents;
        }
        return response.getResults()
                .stream()
                .filter(doc -> doc != null && doc.getScore() >= chatRagProperties.getRerankMinScore())
                .sorted(Comparator.comparingDouble(DocumentWithScore::getScore).reversed())
                .map(DocumentWithScore::getOutput)
                .collect(Collectors.toList());
    }
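    /**
     * Merges documents.
     * The id of every generated Document must not be null.
     *
     * @param documentsList the document lists returned by each retriever
     * @return a single list with duplicate ids removed
     */
    private static List<Document> mergeDocuments(List<List<Document>> documentsList) {
        if (CollUtil.isEmpty(documentsList)) {
            return CollUtil.newArrayList();
        }
        // Flatten documentsList into one list and deduplicate by id, keeping the first occurrence.
        // LinkedHashMap keeps the retrieval order, which matters when reranking falls back to this list.
        return new ArrayList<>(documentsList.stream()
                .flatMap(List::stream)
                .collect(Collectors.toMap(
                        Document::getId,
                        Function.identity(),
                        (existing, replacement) -> existing,
                        LinkedHashMap::new
                ))
                .values());
    }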
    /**
     * Shared pool for all retrieval tasks. Building a new executor for every retriever call
     * would leak its threads, because those executors are never shut down.
     */
    private static final TaskExecutor RETRIEVAL_EXECUTOR = buildDefaultTaskExecutor();

    private static TaskExecutor buildDefaultTaskExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setThreadNamePrefix("ai-advisor-");
        taskExecutor.setCorePoolSize(4);
        taskExecutor.setMaxPoolSize(8);
        taskExecutor.setTaskDecorator(new ContextPropagatingTaskDecorator());
        taskExecutor.initialize();
        return taskExecutor;
    }

    @NotNull
    @Override
    public ChatClientRequest before(ChatClientRequest request, @NotNull AdvisorChain advisorChain) {
        // Copy the context so the incoming request is not mutated
        Map<String, Object> context = new HashMap<>(request.context());
        String userQuery = request.prompt().getUserMessage().getText();
        Query originalQuery = Query.builder()
                .text(userQuery)
                .history(request.prompt().getInstructions())
                .context(context)
                .build();
        // Expand only long questions, and only when an expander is configured
        boolean enableSplits = queryExpander != null && StrUtil.length(userQuery) > chatRagProperties.getQuerySplitsWordNum();
        List<Query> querySplits = enableSplits ? queryExpander.expand(originalQuery) : List.of(originalQuery);
        String augmentedUserText;
        if (CollUtil.isNotEmpty(documentRetrievers)) {
            // One task per retriever: submit them all first, then wait for every result
            List<CompletableFuture<List<Document>>> futures = documentRetrievers.stream()
                    .map(retriever -> CompletableFuture.supplyAsync(() -> retriever.retrieve(querySplits), RETRIEVAL_EXECUTOR))
                    .toList();
            List<List<Document>> documentsList = futures.stream()
                    .map(CompletableFuture::join)
                    .toList();
            // Merge documents
            List<Document> mergedDocuments = mergeDocuments(documentsList);
            context.put(RETRIEVED_DOCUMENTS, mergedDocuments);
            // Rerank
            List<Document> rerankedDocuments = doRerank(request, mergedDocuments);
            String documentContext = rerankedDocuments.stream()
                    .map(Document::getText)
                    .collect(Collectors.joining(System.lineSeparator()));
            augmentedUserText = EnergyAiConstant.DEFAULT_PROMPT_TEMPLATE.render(Map.of("query", userQuery, "question_answer_context", documentContext));
        } else {
            augmentedUserText = EnergyAiConstant.EMPTY_PROMPT_TEMPLATE.render(Map.of("query", userQuery));
        }
        // Augment the prompt
        return request.mutate()
                .prompt(request.prompt().augmentUserMessage(augmentedUserText))
                .context(context)
                .build();
    }
    @NotNull
    @Override
    public ChatClientResponse after(ChatClientResponse chatClientResponse, @NotNull AdvisorChain advisorChain) {
        ChatResponse.Builder chatResponseBuilder;
        if (chatClientResponse.chatResponse() == null) {
            chatResponseBuilder = ChatResponse.builder();
        } else {
            chatResponseBuilder = ChatResponse.builder().from(chatClientResponse.chatResponse());
        }
        // Expose the merged documents to the caller through the response metadata
        chatResponseBuilder.metadata(RETRIEVED_DOCUMENTS, chatClientResponse.context().get(RETRIEVED_DOCUMENTS));
        return ChatClientResponse.builder()
                .chatResponse(chatResponseBuilder.build())
                .context(chatClientResponse.context())
                .build();
    }
}
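The two post-retrieval steps of the advisor, merging with deduplication and score-based filtering after reranking, can be tried in isolation. The following is a minimal sketch under stated assumptions: `Doc` and `ScoredDoc` are hypothetical records standing in for `Document` and `DocumentWithScore`, and the scores are supplied by hand rather than by a rerank model.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MergeRerankSketch {

    /** Hypothetical stand-in for a retrieved document. */
    public record Doc(String id, String text) {}

    /** Hypothetical stand-in for a reranked document with its relevance score. */
    public record ScoredDoc(Doc doc, double score) {}

    /** Flattens the per-retriever lists, keeping the first document seen for each id. */
    public static List<Doc> merge(List<List<Doc>> lists) {
        Map<String, Doc> byId = new LinkedHashMap<>(); // preserves first-seen order
        for (List<Doc> list : lists) {
            for (Doc doc : list) {
                byId.putIfAbsent(doc.id(), doc);
            }
        }
        return List.copyOf(byId.values());
    }

    /** Keeps documents scoring at least minScore, highest score first. */
    public static List<Doc> keepRelevant(List<ScoredDoc> scored, double minScore) {
        return scored.stream()
                .filter(s -> s.score() >= minScore)
                .sorted(Comparator.comparingDouble(ScoredDoc::score).reversed())
                .map(ScoredDoc::doc)
                .toList();
    }

    public static void main(String[] args) {
        Doc a = new Doc("1", "from the vector path");
        Doc b = new Doc("2", "from both paths");
        List<Doc> merged = merge(List.of(List.of(a, b), List.of(new Doc("2", "duplicate id"))));
        System.out.println(merged.size()); // two documents remain, the duplicate id is dropped
    }
}
```

Deduplicating before reranking means the rerank model scores each document once, even when several retrievers return it.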
Defining the method that creates the hybrid retrieval advisor
public HybridRetrievalAdvisor createHybridRetrievalAdvisor(DocumentQueryContext documentParams, IntentResult intentResult) {
    // The retriever list depends on the data sources suggested by intent analysis
    List<BaseDocumentRetriever> documentRetrievers = advisorRetrieverFactory.dynamicCreateRetrievers(documentParams, intentResult);
    return new HybridRetrievalAdvisor(rerankModel, chatRagProperties, documentRetrievers, multiQueryExpander);
}
Base data definitions
DocumentQueryContext documentParams holds the data associated with the user's question, including the document metadata filters defined through API parameters or obtained from intent analysis.
/**
 * Knowledge base document metadata query conditions
 *
 * @author endcy
 * @date 2025/12/4
 */
@Getter
@Setter
@NoArgsConstructor
@JsonInclude(JsonInclude.Include.NON_NULL)
public class DocumentQueryContext implements Serializable {

    @Serial
    private static final long serialVersionUID = 1183263090210064149L;

    private Long id;

    /**
     * Knowledge domain type
     *
     * @see KnowledgeScopeTypeEnum
     */
    private String scopeType;

    /**
     * Knowledge business module
     *
     * @see KnowledgeBusinessTypeEnum
     */
    private String businessType;

    /**
     * Content group id, for example a tenant id
     */
    private Long groupId;

    /**
     * Content source
     *
     * @see DocSourceTypeEnum
     */
    private String sourceType;

    /**
     * Whether the content is public
     */
    private Boolean enablePublic;

    /**
     * Source path
     */
    private String sourcePath;

    /**
     * Original question
     */
    private String originalQuestion;

    /**
     * Rewritten question
     */
    private String reReadingQuestion;
}
IntentResult intentResult holds the result of intent analysis.
/**
 * Intent result
 *
 * @author endcy
 * @date 2025/12/4
 */
@Data
public class IntentResult {

    /**
     * Knowledge domain type. Reserved; passed in by the caller.
     *
     * @see KnowledgeScopeTypeEnum
     */
    private String scopeType;

    /**
     * Business domain type
     *
     * @see KnowledgeBusinessTypeEnum
     */
    private String businessType;

    private Long chatId;

    private String userMessage;

    /**
     * Data sources inferred by intent analysis
     *
     * @see PossibleSourceTypeEnum
     */
    private List<PossibleSourceTypeEnum> dataScopeList;
}
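The individual properties are not described in detail here, and neither is the intent analysis itself. In short, a fine-tuned model or a general-purpose LLM performs a preliminary analysis of the question to produce the intent result, such as the domain type and the likely data sources, which are then used to build the retriever list on demand.
How to invoke the retrieval augmentation
The implementations of question re-reading, logging enhancement, query rewriting, intent recognition, in-memory chat storage, and the default ChatClient commonChatClient are not covered in detail here.
See the open-source project https://github.com/endcy/base-ai-assistant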
/**
 * Chat with the RAG knowledge base.
 * scopeType corresponds to the knowledge base document scope. In theory the best practice is a
 * locally fine-tuned model that classifies the user question, so that a different knowledge base
 * is chosen for each scenario.
 */
public String doChatWithRag(String scopeType, Long groupId, String message, @NonNull Long chatId) {
    // Query rewriting
    String rewrittenMessage = queryRewriter.doQueryRewrite(message);
    DocumentQueryContext documentParams = new DocumentQueryContext();
    documentParams.setOriginalQuestion(message);
    documentParams.setReReadingQuestion(rewrittenMessage);
    // Default document scope (the literal is a scope value meaning "customer service")
    scopeType = StrUtil.blankToDefault(scopeType, "用户客服");
    documentParams.setScopeType(scopeType);
    if (groupId != null) {
        documentParams.setGroupId(groupId);
    }
    // Intent recognition: query the knowledge base or call tools depending on the intent
    IntentResult intentResult = intentAnalysisAgent.analyzeQuestion(chatId, scopeType, rewrittenMessage);
    if (BooleanUtil.isTrue(chatRagProperties.getEnableIntentAnalysis())) {
        KnowledgeBusinessTypeEnum businessType = KnowledgeBusinessTypeEnum.create(intentResult.getBusinessType());
        if (!KnowledgeBusinessTypeEnum.UNKNOWN.equals(businessType)) {
            documentParams.setBusinessType(businessType.getType());
        }
    }
    // Record the question before calling the model
    ContextUserRecordDTO userRecord = ContextUserRecordDTO.builder()
            .chatId(chatId)
            .groupId(groupId)
            .scopeType(scopeType)
            .businessType(documentParams.getBusinessType())
            .question(message)
            .build();
    userRecordService.insert(userRecord);
    List<Message> existingMessages = messageWindowChatMemory.get(chatId.toString());
    log.info("###### Chat memory for {}: {} messages size", chatId, existingMessages.size());
    // A single hybrid advisor replaces the chain of per-source retrieval advisors
    List<Advisor> dataResourceAdvisors = CollUtil.newArrayList();
    dataResourceAdvisors.add(chatClientAdvisorFactory.createHybridRetrievalAdvisor(documentParams, intentResult));
    ChatResponse chatResponse = commonChatClient
            .prompt()
            .user(rewrittenMessage)
            .messages(existingMessages)
            .toolCallbacks(mcpToolCallbacks.getToolCallbacks())
            .toolCallbacks(ragTools)
            .advisors(getAdvisorSpecConsumer(chatId))
            .advisors(dataResourceAdvisors)
            .call()
            .chatResponse();
    String content = null;
    if (chatResponse != null) {
        content = chatResponse.getResult().getOutput().getText();
        userRecordService.updateAnswerById(userRecord.getId(), content);
    }
    if (log.isDebugEnabled()) {
        // This is the most frequent call, so it is logged at debug level
        log.debug("content: {}", content);
    }
    return content;
}
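With this in place, the prompt is more concise and clear, and retrieval augmentation becomes more efficient and more accurate.
The generated prompt looks roughly like this:
System prompt......
User Request: rewritten user question
Read the question again: rewritten user question
Context information is below, surrounded by ---------------------
---------------------
Multi-path retrieval, path 1, document 1
Multi-path retrieval, path 1, document 2
Multi-path retrieval, path 2, document 1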
---------------------
Given the context if exists and provided history information and not prior knowledge, reply to the user comment.
If the answer belongs to the professional field of this system but is not in the context,
inform the user that you can't answer the question.
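For illustration, the single prompt above can be reproduced with a small rendering sketch. The template text is reconstructed from the sample output and is not the project's actual `EnergyAiConstant.DEFAULT_PROMPT_TEMPLATE`; only the placeholder names `query` and `question_answer_context` come from the advisor code. Plain `String.replace` is used here as a stand-in for the framework's template rendering.

```java
import java.util.List;

class PromptRenderSketch {

    // Assumed template, reconstructed from the sample prompt; the real constant may differ.
    static final String TEMPLATE = """
            User Request: {query}
            Read the question again: {query}
            Context information is below, surrounded by ---------------------
            ---------------------
            {question_answer_context}
            ---------------------
            Given the context if exists and provided history information and not prior knowledge, reply to the user comment.
            """;

    /** Renders the template once for all documents, however many retrievers produced them. */
    public static String render(String query, List<String> documents) {
        String context = String.join("\n", documents);
        // Substitute the query first so that braces inside document text are left untouched.
        return TEMPLATE.replace("{query}", query)
                .replace("{question_answer_context}", context);
    }

    public static void main(String[] args) {
        System.out.println(render("How do I reset my password?",
                List.of("path 1, document 1", "path 2, document 1")));
    }
}
```

However many retrieval paths contribute documents, the instructions and the question appear only once, which is where the savings over the stacked-advisor prompt come from.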
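Open-source project base-ai-assistant
A basic service framework application for RAG, MCP, and agents built on spring-boot, spring-ai, and spring-ai-alibaba. It is a base application architecture for intelligent customer service, intelligent operations, intelligent assistants, and simple workflow or vertical-domain agents, to be extended as needed.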