LangChain4j Prompt Chatbot


Introduction

Previously, Spring AI was used to connect to a large language model and build a chatbot (see: Spring AI实现一个简单的对话机器人). The same functionality can be achieved by integrating Spring Boot with LangChain4j.

Spring Boot can integrate with LangChain4j either through the low-level API (popular integrations) or through the high-level API (declarative AI Services). Below, both styles are wired up and tested in turn.

1. Chat via the Low-Level API

Import spring-boot-starter-parent 3.5.4 and the langchain4j-bom. As of this writing, the latest langchain4j-bom release on the official site is 1.8.0; both Spring Boot 3.x and LangChain4j 1.x require JDK 17+. (The snapshot repository below is only needed when trying -SNAPSHOT builds; the 1.8.0 release resolves from Maven Central.)

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.5.4</version>
</parent>

<properties>
    <maven.compiler.source>21</maven.compiler.source>
    <maven.compiler.target>21</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-bom</artifactId>
            <version>1.8.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<repositories>
    <repository>
        <name>Central Portal Snapshots</name>
        <id>central-portal-snapshots</id>
        <url>https://central.sonatype.com/repository/maven-snapshots/</url>
        <releases>
            <enabled>false</enabled>
        </releases>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>

Taking OpenAI, and any model that speaks the OpenAI protocol, as the example, add the low-level API starter langchain4j-open-ai-spring-boot-starter:

<dependencies>

    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>

    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <scope>provided</scope>
    </dependency>

</dependencies>

1.1 Blocking ChatModel

Here the OpenAI protocol is used to connect to the DeepSeek model; a more detailed introduction to model parameters is at: https://docs.langchain4j.dev/tutorials/model-parameters

langchain4j:
  open-ai:
    chat-model:
      base-url: https://api.deepseek.com
      api-key: ${OPEN_API_KEY}
      model-name: deepseek-reasoner
      log-requests: true
      log-responses: true
      return-thinking: true


server:
  port: 8080

logging:
  level:
    dev.langchain4j: debug # this log level must be set for request/response logs to appear
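
Other generation parameters can be set under the same prefix. A hedged sketch (these property names mirror the OpenAiChatModel builder options in the starter; whether a value is honored depends on the provider, and reasoning models such as deepseek-reasoner may ignore temperature):

langchain4j:
  open-ai:
    chat-model:
      temperature: 0.7   # sampling temperature (assumed property; may be ignored by reasoning models)
      max-tokens: 2048   # upper bound on generated tokens per reply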

Some options cannot be expressed in the configuration file, so a configuration class can be used as well:

package org.example.config;

import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.openai.OpenAiChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class LangChainConfig {

    @Bean
    public ChatModel chatModel() {
        return OpenAiChatModel.builder()
                .baseUrl("https://api.deepseek.com")
                .apiKey(System.getProperty("OPEN_API_KEY"))
                .modelName("deepseek-reasoner")
                .maxRetries(3)
                .logRequests(true)
                .logResponses(true)
                .returnThinking(true)
                .build();
    }
}

ChatModel can then be used directly for prompt-based chat, and it also reports the number of tokens consumed. ChatModel is a blocking API: it waits until the model has finished replying and returns the whole result at once.

package org.example.controller;

import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.output.TokenUsage;
import jakarta.annotation.Resource;
import lombok.extern.slf4j.Slf4j;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.Arrays;
import java.util.List;

@RestController
@RequestMapping("chat")
@Slf4j
public class ChatController {

    @Resource
    private ChatModel chatModel;

    @GetMapping("chat")
    public String chat(String msg) {

        List<ChatMessage> messages = Arrays.asList(
                SystemMessage.from("你是一个数学老师,用简单易懂的方式解释数学概念。"),
                UserMessage.from(msg)
        );

        ChatResponse chatResponse = chatModel.chat(messages);
        TokenUsage tokenUsage = chatResponse.tokenUsage();
        log.info("token usage: {}", tokenUsage);

        return chatResponse.aiMessage().text();

    }
}
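
A quick smoke test of the blocking endpoint (assuming the port 8080 configured above and a valid OPEN_API_KEY in the environment):

curl "http://localhost:8080/chat/chat?msg=What%20is%20the%20Pythagorean%20theorem"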

1.2 Streaming with StreamingChatModel

StreamingChatModel is a non-blocking API: instead of waiting for the model to finish and returning everything at once, it returns the fragments the model generates in real time, until the response is complete.

Add the streaming-support dependency to pom.xml:

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-reactor</artifactId>
</dependency>

application.yml needs a new streaming-chat-model section for streaming:

langchain4j:
  open-ai:
    streaming-chat-model:
      base-url: https://api.deepseek.com
      api-key: ${OPEN_API_KEY}
      model-name: deepseek-reasoner
      log-requests: true
      log-responses: true
      return-thinking: true

It can likewise be configured through a configuration class:

package org.example.config;

import dev.langchain4j.model.openai.OpenAiStreamingChatModel;
import dev.langchain4j.model.chat.StreamingChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class LangChainConfig {

    @Bean
    public StreamingChatModel streamingChatModel() {
        return OpenAiStreamingChatModel.builder()
                .baseUrl("https://api.deepseek.com")
                .apiKey(System.getProperty("OPEN_API_KEY"))
                .modelName("deepseek-reasoner")
                .logRequests(true)
                .logResponses(true)
                .returnThinking(true)
                .build();
    }
}

The streaming API is implemented by StreamingChatModel. In a web environment it is paired with Project Reactor's Flux: when the callbacks below fire, call the corresponding methods on the Flux sink, and return the Flux object just as with Spring AI.

  • onPartialResponse: fires with each generated fragment; call sink.next() to push it to the browser in real time
  • onPartialThinking: fires with each fragment of the model's reasoning; call sink.next() to push it to the browser in real time
  • onCompleteResponse: fires when generation completes; call sink.complete() to end the stream, and the consumed tokens can be logged here
  • onError: fires on error; log the error and call sink.complete() to end the stream

package org.example.controller;

import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.data.message.UserMessage;

import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.response.*;
import dev.langchain4j.model.output.TokenUsage;
import jakarta.annotation.Resource;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

import java.util.Arrays;
import java.util.List;

@RestController
@RequestMapping("chat")
@Slf4j
public class StreamController {

    @Resource
    private StreamingChatModel streamingChatModel;

    @GetMapping(value = "streaming", produces = "text/html; charset=utf-8")
    public Flux<String> streaming(String msg) {

        List<ChatMessage> messages = Arrays.asList(
                SystemMessage.from("你是一个数学老师,用简单易懂的方式解释数学概念。"),
                UserMessage.from(msg)
        );

        return Flux.create(sink -> {
            streamingChatModel.chat(messages, new StreamingChatResponseHandler() {

                @Override
                public void onPartialResponse(PartialResponse partialResponse, PartialResponseContext context) {
                    sink.next(partialResponse.text());
                }

                @Override
                public void onPartialThinking(PartialThinking partialThinking) {
                    sink.next("<span style='color:red;'>" + partialThinking.text() + "</span>");
                }

                @Override
                public void onCompleteResponse(ChatResponse completeResponse) {
                    TokenUsage tokenUsage = completeResponse.tokenUsage();
                    log.info("token usage: {}", tokenUsage);
                    sink.complete();
                }

                @Override
                public void onError(Throwable error) {
                    log.error("streaming chat failed", error);
                    sink.complete();
                }
            });
        });

    }
}
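
Streaming output can be watched directly in a browser (the text/html content type above lets most browsers render fragments as they arrive), or with curl, where -N disables output buffering (URL assumes the port 8080 configured earlier):

curl -N "http://localhost:8080/chat/streaming?msg=Explain%20derivatives%20simply"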

2. Chat via the High-Level API

To use the high-level API, add this dependency on top of the low-level setup:

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
</dependency>

2.1 Blocking Chat

Create an interface and declare the model-calling methods in it:

package org.example.ai;

public interface AiAssistant {

    String chat(String prompt);

}

On top of registering one or more ChatModel beans in the container, the configuration class uses AiServices to register the AiAssistant defined above as a bean, wiring the ChatModel into it:

package org.example.config;

import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.service.AiServices;
import org.example.ai.AiAssistant;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class LangChainConfig {

    @Bean
    public AiAssistant aiAssistant(ChatModel chatModel) {
        return AiServices.builder(AiAssistant.class)
                .chatModel(chatModel)
                .build();
    }

}


Then simply inject AiAssistant into the target class and call its methods:

package org.example.controller;

import jakarta.annotation.Resource;
import org.example.ai.AiAssistant;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("high-chat")
public class HighChatController {

    @Resource
    private AiAssistant aiAssistant;

    @GetMapping("chat")
    public String chat(String msg) {
        return aiAssistant.chat(msg);

    }
}

In fact, the high-level API can be configured with just an annotated interface: marking it with @AiService designates it as an interface for operating the model and makes it directly instantiable, so there is no need to build it with AiServices.builder in a configuration class.

package org.example.ai;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.spring.AiService;
import dev.langchain4j.service.spring.AiServiceWiringMode;

@AiService(
        // to wire the model manually, wiringMode must be set to AiServiceWiringMode.EXPLICIT
        wiringMode = AiServiceWiringMode.EXPLICIT,
        // when wiring manually, name the exact model bean to use, e.g. chatModel = "deepseek"
        chatModel = "chatModel"
)
public interface AiAssistant {

    //@SystemMessage("") sets the system prompt
    String chat(String prompt);

}
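
The commented-out @SystemMessage above hints at the declarative side of AI Services: prompts can be templated directly on the interface. A hedged sketch (the MathAssistant interface and the {{concept}} variable are invented for illustration; @SystemMessage, @UserMessage and @V are the annotations LangChain4j documents for this):

package org.example.ai;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.V;
import dev.langchain4j.service.spring.AiService;
import dev.langchain4j.service.spring.AiServiceWiringMode;

@AiService(wiringMode = AiServiceWiringMode.EXPLICIT, chatModel = "chatModel")
public interface MathAssistant {

    // the system prompt is fixed; {{concept}} is filled in from the @V-annotated argument
    @SystemMessage("You are a math teacher; explain math concepts in a simple, easy-to-understand way.")
    @UserMessage("Explain {{concept}} and give one concrete example.")
    String explain(@V("concept") String concept);
}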

2.2 Streaming Chat

  1. As with low-level streaming, the langchain4j-reactor dependency is required
  2. Likewise, a StreamingChatModel bean must first be registered in the container
  3. In the @AiService annotation, use the streamingChatModel attribute for the model bean name; then call the StreamAssistant method, and the Controller can return the Flux object directly, as in the snippet after the interface below

package org.example.ai;

import dev.langchain4j.service.spring.AiService;
import dev.langchain4j.service.spring.AiServiceWiringMode;
import reactor.core.publisher.Flux;

@AiService(
        wiringMode = AiServiceWiringMode.EXPLICIT,
        streamingChatModel = "streamingChatModel"
)
public interface StreamAssistant {
    //@SystemMessage("")
    Flux<String> chat(String prompt);
}

@Resource
private StreamAssistant streamAssistant;

@GetMapping(value = "chat", produces = "text/html; charset=utf-8")
public Flux<String> chat(String msg) {
    return streamAssistant.chat(msg);

}

3. Conversation Memory with ChatMemory

The concept of conversation memory was already covered in Spring AI实现一个简单的对话机器人.

First, distinguish two LangChain4j concepts: memory and history.

  • History keeps all messages between the user and the AI intact. History is what the user sees in the UI; it represents everything that was actually said.

  • Memory keeps some information that is presented to the LLM so that it behaves as though it "remembers" the conversation. Memory is quite different from history. Depending on the memory algorithm in use, it can modify history in various ways: evicting some messages, summarizing multiple messages or a single message, removing unimportant details, injecting extra information (for RAG) or instructions (for structured output) into messages, and so on.

LangChain4j currently provides only memory management, not history management. If a complete history needs to be kept, it must be done manually.

LangChain4j implements the memory cache through ChatMemory. A long conversation carries a lot of information; without pruning it accumulates redundancy and can even exceed the token limit of a single request. LangChain4j therefore ships two ChatMemory implementations:

  • MessageWindowChatMemory: a fairly simple implementation; a sliding window that keeps only the N most recent messages
  • TokenWindowChatMemory: keeps the most recent N tokens, using a TokenCountEstimator to count the conversation's tokens (see the sketch below)
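
A hedged sketch of the token-window variant (OpenAiTokenCountEstimator comes from the OpenAI module; the model name given to it is an assumption made purely for counting, since DeepSeek's exact tokenizer is not available, so counts are approximations):

import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.TokenWindowChatMemory;
import dev.langchain4j.model.openai.OpenAiTokenCountEstimator;

// keep roughly the most recent 1000 tokens of the conversation;
// once the budget is exceeded, the oldest messages are evicted whole
ChatMemory memory = TokenWindowChatMemory.withMaxTokens(
        1000, new OpenAiTokenCountEstimator("gpt-4o-mini"));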

3.1 Conversation Memory via the Low-Level API

Taking MessageWindowChatMemory as the example, add the following to the configuration class:

package org.example.config;

import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.ChatMemoryProvider;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.store.memory.chat.ChatMemoryStore;
import dev.langchain4j.store.memory.chat.InMemoryChatMemoryStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class LangChainConfig {

    /**
     * In-memory storage
     */
    @Bean
    public ChatMemoryStore chatMemoryStore() {
        return new InMemoryChatMemoryStore();
    }

    /**
     * A ChatMemoryProvider creates a dedicated ChatMemory for each conversation ID
     */
    @Bean
    public ChatMemoryProvider chatMemoryProvider() {
        return new ChatMemoryProvider() {
            @Override
            public ChatMemory get(Object id) {
                return MessageWindowChatMemory.builder()
                        .id(id)
                        .maxMessages(1000)
                        .chatMemoryStore(chatMemoryStore())
                        .build();
            }
        };
    }

}
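
InMemoryChatMemoryStore keeps everything on the JVM heap, so memory is lost on restart. To persist it, implement the three methods of ChatMemoryStore yourself. A minimal sketch that serializes messages to JSON (the JsonChatMemoryStore name and the backing map are illustrative; in practice the map would be replaced by Redis or a database):

package org.example.config;

import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.ChatMessageDeserializer;
import dev.langchain4j.data.message.ChatMessageSerializer;
import dev.langchain4j.store.memory.chat.ChatMemoryStore;

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class JsonChatMemoryStore implements ChatMemoryStore {

    // stand-in for a real persistence layer (Redis, a database table, ...)
    private final Map<Object, String> storage = new ConcurrentHashMap<>();

    @Override
    public List<ChatMessage> getMessages(Object memoryId) {
        String json = storage.get(memoryId);
        return json == null ? List.of() : ChatMessageDeserializer.messagesFromJson(json);
    }

    @Override
    public void updateMessages(Object memoryId, List<ChatMessage> messages) {
        storage.put(memoryId, ChatMessageSerializer.messagesToJson(messages));
    }

    @Override
    public void deleteMessages(Object memoryId) {
        storage.remove(memoryId);
    }
}

Registering it is then just a matter of returning new JsonChatMemoryStore() from the chatMemoryStore() bean method above.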

In the Controller, inject the ChatMemoryProvider and upgrade the conversation with the model to support memory.

On every exchange, both the user's question and the model's answer are saved, keyed to the same conversation ID.

package org.example.controller;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.ChatMemoryProvider;
import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.response.*;
import dev.langchain4j.model.output.TokenUsage;
import jakarta.annotation.Resource;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
@RequestMapping("memory-chat")
@Slf4j
public class MemoryController {


    @Resource
    private StreamingChatModel streamingChatModel;

    @Resource
    private ChatMemoryProvider chatMemoryProvider;

    @GetMapping(value = "streaming", produces = "text/html; charset=utf-8")
    public Flux<String> streaming(String msg, String msgId) {

        // save the question into the current conversation's memory
        ChatMemory chatMemory = chatMemoryProvider.get(msgId);
        chatMemory.add(UserMessage.from(msg));

        return Flux.create(sink -> {
            streamingChatModel.chat(chatMemory.messages(), new StreamingChatResponseHandler() {

                @Override
                public void onPartialResponse(PartialResponse partialResponse, PartialResponseContext context) {
                    sink.next(partialResponse.text());
                }

                @Override
                public void onPartialThinking(PartialThinking partialThinking) {
                    sink.next("<span style='color:red;'>" + partialThinking.text() + "</span>");
                }

                @Override
                public void onCompleteResponse(ChatResponse completeResponse) {
                    TokenUsage tokenUsage = completeResponse.tokenUsage();
                    log.info("token usage: {}", tokenUsage);

                    // the model has finished; add its answer to the conversation memory as well
                    AiMessage aiMessage = completeResponse.aiMessage();
                    chatMemory.add(aiMessage);

                    sink.complete();
                }

                @Override
                public void onError(Throwable error) {
                    log.error("streaming chat failed", error);
                    sink.complete();
                }
            });
        });

    }

}
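
To verify the memory, issue two requests that share one msgId (the ID value 42 is arbitrary; URLs assume port 8080). The second answer should draw on context from the first:

curl -N "http://localhost:8080/memory-chat/streaming?msgId=42&msg=My%20name%20is%20Alice"
curl -N "http://localhost:8080/memory-chat/streaming?msgId=42&msg=What%20is%20my%20name%3F"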

"如果文章对您有帮助,可以请作者喝杯咖啡吗?"

微信二维码

微信支付

支付宝二维码

支付宝


LangChain4j Prompt对话机器人
https://blog.liuzijian.com/post/langchain4j/2025/11/04/langchain4j-prompt/
作者
Liu Zijian
发布于
2025年11月4日
许可协议