# OpenAI-Compatible Language Models
Many services and tools expose OpenAI-compatible APIs. The general approach to using them with LangChain4j is:
1. Identify the Base URL: Find the API endpoint for the service. This often ends in `/v1`.
2. Obtain an API Key: If the service requires authentication, get an API key. If the service is local and doesn't require a key, pass a placeholder as the `apiKey` parameter.
3. Specify the Model Name: Determine the correct model name to use for the service. This is often required.
4. Configure `OpenAiChatModel` or `OpenAiStreamingChatModel`:

```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("YOUR_API_BASE_URL") // e.g., "http://localhost:8000/v1"
        .apiKey("YOUR_API_KEY_OR_PLACEHOLDER") // e.g., "sk-yourkey" or "none"
        .modelName("MODEL_NAME_AS_PER_PROVIDER_DOCS") // e.g., "gpt-3.5-turbo" or a custom name
        // Add other configuration such as temperature, timeout, etc. as needed
        .logRequests(true)
        .logResponses(true)
        .build();
```
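Once built, the model can be used to chat like any other `ChatModel`. A minimal usage sketch, assuming the `model` configured above:

```java
String answer = model.chat("Why is the sky blue?");
System.out.println(answer);
```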
Below we provide specific examples for popular OpenAI-compatible APIs, including Groq, Docker Model Runner, GPT4All, Ollama, and LM Studio.
Contents:
- Prerequisites for Using OpenAI-Compatible Language Models
- Groq
- Docker Model Runner
- GPT4All
- Ollama
- LM Studio
## Prerequisites for Using OpenAI-Compatible Language Models
LangChain4j's OpenAI module can be used with various OpenAI-compatible APIs, including local and cloud-based solutions. For each of the models below, we show how to create a `ChatModel` that you can then use to chat with the model, just like in the standard OpenAI examples.

First, make sure you have the OpenAI module in your `pom.xml` or Gradle build file:
Plain Java

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>1.2.0</version>
</dependency>
```
Spring Boot

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
    <version>1.2.0-beta8</version>
</dependency>
```
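With the Spring Boot starter, the model can be auto-configured from application properties and injected as a bean. A minimal sketch, assuming property names as documented for the starter; the controller and endpoint here are illustrative, not part of any API:

```java
import dev.langchain4j.model.chat.ChatModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Assumes application.properties contains entries such as:
//   langchain4j.open-ai.chat-model.base-url=YOUR_API_BASE_URL
//   langchain4j.open-ai.chat-model.api-key=YOUR_API_KEY_OR_PLACEHOLDER
//   langchain4j.open-ai.chat-model.model-name=MODEL_NAME_AS_PER_PROVIDER_DOCS
@RestController
class ChatController {

    private final ChatModel chatModel;

    ChatController(ChatModel chatModel) { // auto-configured by the starter
        this.chatModel = chatModel;
    }

    @GetMapping("/chat")
    String chat(@RequestParam String message) {
        return chatModel.chat(message);
    }
}
```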
## Groq
Deployment: SaaS (Key Required)
Description: Groq offers very fast inference for LLMs.
Setup: To use Groq, you'll need an API key from GroqCloud.
Configure LangChain4j's `OpenAiChatModel` or `OpenAiStreamingChatModel`:

```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("https://api.groq.com/openai/v1")
        .apiKey(System.getenv("GROQ_API_KEY")) // Or your actual key
        .modelName("llama3-8b-8192") // Or any other model offered by Groq, e.g., mixtral-8x7b-32768, llama3-70b-8192
        .temperature(0.0)
        .build();
```
You can find available model names on the Groq models page.
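Streaming works analogously through `OpenAiStreamingChatModel`. A minimal sketch, assuming the same Groq base URL and model name as above:

```java
import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.openai.OpenAiStreamingChatModel;

StreamingChatModel streamingModel = OpenAiStreamingChatModel.builder()
        .baseUrl("https://api.groq.com/openai/v1")
        .apiKey(System.getenv("GROQ_API_KEY"))
        .modelName("llama3-8b-8192")
        .build();

streamingModel.chat("Tell me a joke", new StreamingChatResponseHandler() {

    @Override
    public void onPartialResponse(String partialResponse) {
        System.out.print(partialResponse); // print tokens as they arrive
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        System.out.println(); // the full response has been received
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});
```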
## Docker Model Runner
Deployment: Local
Description: Docker Model Runner allows you to run LLMs locally using Docker Desktop (it uses `llama.cpp` under the hood and can use your CPU). This is useful for development, testing, or offline use. Works on Mac and Windows.
Setup:
- Have Docker Desktop installed.
- Enable the Docker Model Runner feature in Docker Desktop (Settings > Experimental Features > Enable Docker Model Runner).
- Just below that, check "Enable host-side TCP support".
- Pull a model using the Docker Model Runner CLI, e.g., `docker model pull ai/qwen3`, or any other model from this list.
Example for `ai/qwen3` (more info about the model here):

```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:12434/engines/llama.cpp/v1")
        .modelName("ai/qwen3")
        .build();
```
Some models support tool calling; see details on the Docker model page.
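For models that do support it, tools can be wired up through LangChain4j's `AiServices`. A minimal sketch, assuming the core `langchain4j` dependency is on the classpath and reusing the `model` built above; `Assistant` and `Calculator` are illustrative names, not part of any API:

```java
import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.service.AiServices;

interface Assistant {
    String chat(String userMessage);
}

class Calculator {

    @Tool("Adds two numbers")
    double add(double a, double b) {
        return a + b;
    }
}

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(model) // the Docker Model Runner model built above
        .tools(new Calculator())
        .build();

System.out.println(assistant.chat("What is 2.3 + 4.5?"));
```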
## GPT4All
Deployment: Local
Description: GPT4All provides a desktop application to run open-source LLMs locally on your machine. It can also expose an OpenAI-compatible API.
Setup:
- Download and install GPT4All from https://gpt4all.io/.
- Launch GPT4All and download the desired model(s) through its UI, e.g., `llama-3.2-1b-instruct`.
- Enable the "Web Server" mode in GPT4All settings ("Settings" > "Application" > under Advanced: "Enable Local API Server").
- Note the IP address and port displayed in GPT4All (typically `http://localhost:4891/v1`).
- Configure LangChain4j:
```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:4891/v1")
        .modelName("llama-3.2-1b-instruct") // The model name might be derived from the model loaded in the GPT4All UI or configurable. Check the GPT4All docs.
        .build();
```
## Ollama
While LangChain4j has a dedicated `langchain4j-ollama` module (see the Ollama docs), you can also use the OpenAI module to connect to Ollama's OpenAI-compatible endpoint as shown above.
Deployment: Local
Description: Ollama allows you to run open-source large language models, such as Llama 3, Mistral, and others, locally. It provides an OpenAI-compatible API endpoint.
Setup:
- Install Ollama from https://ollama.ai/.
- Pull a model using the command line: `ollama pull <model_name>` (e.g., `ollama pull gemma3`).
- Ensure Ollama is running. It serves an OpenAI-compatible API at `http://localhost:11434/v1/`.
- Configure LangChain4j:
```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:11434/v1/")
        .modelName("gemma3")
        .build();
```
Examples:
- For OpenAI-compatible endpoint usage, adapt the general OpenAI examples.
- Using the dedicated Ollama module: langchain4j-examples/.../OllamaChatModelExamples.java (a sketch of the dedicated builder follows below).
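For comparison, the dedicated module exposes its own builder. A minimal sketch, assuming the `langchain4j-ollama` dependency is on the classpath; note that the native Ollama endpoint takes no `/v1` suffix:

```java
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.ollama.OllamaChatModel;

ChatModel model = OllamaChatModel.builder()
        .baseUrl("http://localhost:11434") // native Ollama API, no /v1 suffix
        .modelName("gemma3")
        .build();
```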
## LM Studio
Deployment: Local
Description: LM Studio provides a UI to discover, download, and run local LLMs. It also features an OpenAI-compatible local server.
Setup:
- Download and install LM Studio from https://lmstudio.ai/.
- Download your desired model(s) through the LM Studio UI (Search tab), for example `smollm2-135m-instruct`.
- Go to the "Developer" tab (the `>_` icon on the left) and toggle the server status to "Running".
- When the server is running, its address is shown at the top right (e.g., `http://127.0.0.1:1234`). Alternatively, the cURL example will give you the full URL.
- LM Studio does not currently support HTTP/2, so we need to enforce HTTP/1.1. For that, add the correct Maven or Gradle dependency:
```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-http-client-jdk</artifactId>
    <version>1.2.0</version>
</dependency>
```
- Configure LangChain4j and specify the `httpClientBuilder`:

```java
import java.net.http.HttpClient;

import dev.langchain4j.http.client.jdk.JdkHttpClient;
import dev.langchain4j.http.client.jdk.JdkHttpClientBuilder;

// ...

HttpClient.Builder httpClientBuilder = HttpClient.newBuilder()
        .version(HttpClient.Version.HTTP_1_1);

JdkHttpClientBuilder jdkHttpClientBuilder = JdkHttpClient.builder()
        .httpClientBuilder(httpClientBuilder);

ChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://127.0.0.1:1234/v1")
        .modelName("smollm2-135m-instruct")
        .httpClientBuilder(jdkHttpClientBuilder)
        .build();
```