# OpenAI-Compatible Language Models
Many services and tools expose OpenAI-compatible APIs. The general approach to using them with LangChain4j is:
1. Identify the Base URL: Find the API endpoint for the service. This often ends in `/v1`.
2. Obtain an API Key: If the service requires authentication, get an API key. If the service is local and doesn't require a key, pass a placeholder as the `apiKey` parameter.
3. Specify the Model Name: Determine the correct model name to use for the service. This is often required.
4. Configure `OpenAiChatModel` or `OpenAiStreamingChatModel`:

```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("YOUR_API_BASE_URL") // e.g., "http://localhost:8000/v1"
        .apiKey("YOUR_API_KEY_OR_PLACEHOLDER") // e.g., "sk-yourkey" or "none"
        .modelName("MODEL_NAME_AS_PER_PROVIDER_DOCS") // e.g., "gpt-3.5-turbo" or a custom name
        // Add other configuration such as temperature, timeout, etc. as needed
        .logRequests(true)
        .logResponses(true)
        .build();
```
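Once built, the model can be used to chat like any other `ChatModel`. A minimal usage sketch, assuming the `model` configured above:

```java
String answer = model.chat("Why is the sky blue?");
System.out.println(answer);
```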
Below we provide specific examples for popular OpenAI-compatible APIs, including Groq, Docker Model Runner, GPT4All, Ollama, and LM Studio.
Contents:
- Prerequisites for Using OpenAI-Compatible Language Models
- Groq
- Docker Model Runner
- GPT4All
- Ollama
- LM Studio
## Prerequisites for Using OpenAI-Compatible Language Models
LangChain4j's OpenAI module can be used with various OpenAI-compatible APIs, including local and cloud-based solutions. For each of the models below, we show how to create a `ChatModel` that you can then use to chat with the model, just like in the standard OpenAI examples.

First, make sure you have the OpenAI module in your `pom.xml` or Gradle build file:
Plain Java

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>1.2.0</version>
</dependency>
```
Spring Boot

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
    <version>1.2.0-beta8</version>
</dependency>
```
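With the Spring Boot starter, the model can be auto-configured from application properties and injected as a bean. A minimal sketch, assuming property names as documented for the starter; the controller and endpoint here are illustrative, not part of any API:

```java
import dev.langchain4j.model.chat.ChatModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Assumes application.properties contains entries such as:
//   langchain4j.open-ai.chat-model.base-url=YOUR_API_BASE_URL
//   langchain4j.open-ai.chat-model.api-key=YOUR_API_KEY_OR_PLACEHOLDER
//   langchain4j.open-ai.chat-model.model-name=MODEL_NAME_AS_PER_PROVIDER_DOCS
@RestController
class ChatController {

    private final ChatModel chatModel;

    ChatController(ChatModel chatModel) { // auto-configured by the starter
        this.chatModel = chatModel;
    }

    @GetMapping("/chat")
    String chat(@RequestParam String message) {
        return chatModel.chat(message);
    }
}
```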
## Groq
Deployment: SaaS (Key Required)
Description: Groq offers very fast inference for LLMs.
Setup: To use Groq, you'll need an API key from GroqCloud.
Configure LangChain4j's `OpenAiChatModel` or `OpenAiStreamingChatModel`:

```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("https://api.groq.com/openai/v1")
        .apiKey(System.getenv("GROQ_API_KEY")) // Or your actual key
        .modelName("llama3-8b-8192") // Or any other model offered by Groq, e.g., mixtral-8x7b-32768, llama3-70b-8192
        .temperature(0.0)
        .build();
```
You can find available model names on the Groq models page.
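Streaming works analogously through `OpenAiStreamingChatModel`. A minimal sketch, assuming the same Groq base URL and model name as above:

```java
import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.openai.OpenAiStreamingChatModel;

StreamingChatModel streamingModel = OpenAiStreamingChatModel.builder()
        .baseUrl("https://api.groq.com/openai/v1")
        .apiKey(System.getenv("GROQ_API_KEY"))
        .modelName("llama3-8b-8192")
        .build();

streamingModel.chat("Tell me a joke", new StreamingChatResponseHandler() {

    @Override
    public void onPartialResponse(String partialResponse) {
        System.out.print(partialResponse); // print tokens as they arrive
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        System.out.println(); // the full response has been received
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});
```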
## Docker Model Runner
Deployment: Local
Description: Docker Model Runner allows you to run LLMs locally using Docker Desktop (it uses `llama.cpp` under the hood and can use your CPU). This is useful for development, testing, or offline use. Works on Mac and Windows.
Setup:
- Have Docker Desktop installed.
- Enable the Docker Model Runner feature in Docker Desktop (Settings > Experimental Features > Enable Docker Model Runner).
- Just below that, check "Enable host-side TCP support".
- Pull a model using the Docker Model Runner CLI, e.g., `docker model pull ai/qwen3`, or any other model from this list.
Example for `ai/qwen3` (more info about the model here):

```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:12434/engines/llama.cpp/v1")
        .modelName("ai/qwen3")
        .build();
```
Some models support tool calling; see details on the Docker model page.
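For models that do support it, tools can be wired up through LangChain4j's `AiServices`. A minimal sketch, assuming the core `langchain4j` dependency is on the classpath and reusing the `model` built above; `Assistant` and `Calculator` are illustrative names, not part of any API:

```java
import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.service.AiServices;

interface Assistant {
    String chat(String userMessage);
}

class Calculator {

    @Tool("Adds two numbers")
    double add(double a, double b) {
        return a + b;
    }
}

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(model) // the Docker Model Runner model built above
        .tools(new Calculator())
        .build();

System.out.println(assistant.chat("What is 2.3 + 4.5?"));
```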
## GPT4All
Deployment: Local
Description: GPT4All provides a desktop application to run open-source LLMs locally on your machine. It can also expose an OpenAI-compatible API.
Setup:
- Download and install GPT4All from https://gpt4all.io/.
- Launch GPT4All and download the desired model(s) through its UI, e.g., `llama-3.2-1b-instruct`.
- Enable the "Web Server" mode in GPT4All settings ("Settings" > "Application" > under Advanced: "Enable Local API Server").
- Note the IP address and port displayed in GPT4All (typically `http://localhost:4891/v1`).
- Configure LangChain4j:
```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:4891/v1")
        .modelName("llama-3.2-1b-instruct") // The model name might be derived from the model loaded in the GPT4All UI or configurable. Check the GPT4All docs.
        .build();
```
## Ollama
While LangChain4j has a dedicated `langchain4j-ollama` module (see the Ollama docs), you can also use the OpenAI module to connect to Ollama's OpenAI-compatible endpoint as shown above.
Deployment: Local
Description: Ollama allows you to run open-source large language models, such as Llama 3, Mistral, and others, locally. It provides an OpenAI-compatible API endpoint.
Setup:
- Install Ollama from https://ollama.ai/.
- Pull a model using the command line: `ollama pull <model_name>` (e.g., `ollama pull gemma3`).
- Ensure Ollama is running. It serves an OpenAI-compatible API at `http://localhost:11434/v1/`.
- Configure LangChain4j:
```java
ChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:11434/v1/")
        .modelName("gemma3")
        .build();
```
Examples:
- For OpenAI-compatible endpoint usage, adapt the general OpenAI examples.
- Using the dedicated Ollama module: langchain4j-examples/.../OllamaChatModelExamples.java (a sketch of the dedicated builder follows below).
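For comparison, the dedicated module exposes its own builder. A minimal sketch, assuming the `langchain4j-ollama` dependency is on the classpath; note that the native Ollama endpoint takes no `/v1` suffix:

```java
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.ollama.OllamaChatModel;

ChatModel model = OllamaChatModel.builder()
        .baseUrl("http://localhost:11434") // native Ollama API, no /v1 suffix
        .modelName("gemma3")
        .build();
```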
## LM Studio
Deployment: Local
Description: LM Studio provides a UI to discover, download, and run local LLMs. It also features an OpenAI-compatible local server.
Setup:
- Download and install LM Studio from https://lmstudio.ai/.
- Download your desired model(s) through the LM Studio UI (Search tab), for example `smollm2-135m-instruct`.
- Go to the "Developer" tab (the `>_` icon on the left) and toggle the server status to "Running".
- When the server is running, its address is shown at the top right (e.g., `http://127.0.0.1:1234`). Alternatively, the cURL example will give you the full URL.
- LM Studio does not currently support HTTP/2, so we need to enforce HTTP/1.1. For that, add the correct Maven or Gradle dependency:
```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-http-client-jdk</artifactId>
    <version>1.2.0</version>
</dependency>
```
- Configure LangChain4j and specify the `httpClientBuilder`:

```java
import java.net.http.HttpClient;

import dev.langchain4j.http.client.jdk.JdkHttpClient;
import dev.langchain4j.http.client.jdk.JdkHttpClientBuilder;

// ...

HttpClient.Builder httpClientBuilder = HttpClient.newBuilder()
        .version(HttpClient.Version.HTTP_1_1);

JdkHttpClientBuilder jdkHttpClientBuilder = JdkHttpClient.builder()
        .httpClientBuilder(httpClientBuilder);

ChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://127.0.0.1:1234/v1")
        .modelName("smollm2-135m-instruct")
        .httpClientBuilder(jdkHttpClientBuilder)
        .build();
```