Guardrails

note

Guardrails is an experimental feature. Its API and behavior might change in future versions.

Guardrails are mechanisms that let you validate the input and output of the LLM to ensure it meets your expectations. You can do some of the following things with guardrails:

Verify the user input is not out of scope
Ensure the input meets some criteria before calling the LLM (e.g. guard against a prompt injection attack)
Ensure the output format is correct (e.g. it is a JSON document with the correct schema)
Ensure the LLM output is coherent with business rules and constraints (e.g. if this is a chatbot of company X, the response should not contain any reference to a competitor Y)
Detect hallucinations

Those are just examples. You can do many other things with guardrails.

note

Guardrails are only available when using AI Services. They are a higher-level construct that cannot be applied to a ChatModel or StreamingChatModel.

The implementation was originally done in the Quarkus LangChain4j extension and was backported here.

Implementing Guardrails

Ideally, guardrail implementations should follow the single responsibility principle, meaning that each guardrail class should validate one thing. Then, chain guardrails together to guard against multiple things.

The order of guardrails in the chain is important. The first guardrail in the chain to fail will trigger the overall failure. Ensure guardrails that catch the most failures are early in the chain, whereas more specific guardrails that may fail very infrequently are towards the end of the chain.

Also keep in mind that guardrails can themselves call other services or even invoke other LLM interactions. If these kinds of guardrails have an execution penalty or monetary cost associated with them, make sure you take that into account. You might want to put more expensive guardrails towards the end of the chain.

note

The term expensive can mean that something takes a long time to execute or has a monetary cost associated with it.
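The ordering advice above can be illustrated with a small plain-Java model. The `CostedCheck` record and its `estimatedCostMillis` field are hypothetical stand-ins, not LangChain4j types. The sketch only shows sorting checks so that cheap ones run first and stopping at the first failure, so an expensive check is skipped when a cheap one already rejects the input.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a guardrail: a name, a rough cost estimate, and a pass/fail test.
record CostedCheck(String name, long estimatedCostMillis, Predicate<String> passes) {}

final class ChainOrdering {
    // Returns a copy of the checks sorted from cheapest to most expensive.
    static List<CostedCheck> cheapestFirst(List<CostedCheck> checks) {
        List<CostedCheck> sorted = new ArrayList<>(checks);
        sorted.sort(Comparator.comparingLong(CostedCheck::estimatedCostMillis));
        return sorted;
    }

    // Runs the checks in order and returns the names of those that actually executed,
    // stopping as soon as one of them fails.
    static List<String> runUntilFirstFailure(List<CostedCheck> checks, String input) {
        List<String> executed = new ArrayList<>();
        for (CostedCheck check : checks) {
            executed.add(check.name());
            if (!check.passes().test(input)) {
                break;
            }
        }
        return executed;
    }
}
```

In a real chain the same idea applies to the order in which guardrail classes are listed: the list order is the execution order.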

Input Guardrails

Input guardrails are functions invoked before the LLM is called. Failing an input guardrail prevents the LLM from being called. Input guardrails are the last step prior to calling the LLM. They are invoked after any RAG operations have happened.

Implementing Input Guardrails

To create an input guardrail, implement the InputGuardrail interface. It has two variants of the validate method, at least one of which needs to be implemented:

InputGuardrailResult validate(UserMessage userMessage);
InputGuardrailResult validate(InputGuardrailRequest params);

The first variant is used for simple guardrails, or when the guardrail only needs access to the UserMessage.

The second variant is for more complex guardrails that need more information, such as the chat memory/history, user message template, augmentation results, or variables that were passed to the template. See InputGuardrailRequest for more information.

Some examples of things you could do:

Check that there are enough documents in the augmentation results
Ensure the user is not asking the same question multiple times
Mitigate potential prompt injection attacks
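As an illustration of the last item, here is a minimal sketch of detection logic that a `validate(UserMessage)` implementation might delegate to. The class name and the phrase list are hypothetical, and simple phrase matching is only a placeholder for whatever detector you actually trust.

```java
import java.util.List;
import java.util.Locale;

final class PromptInjectionHeuristic {
    // Hypothetical phrase list; a real deployment would need a far more robust detector.
    private static final List<String> SUSPICIOUS_PHRASES = List.of(
            "ignore previous instructions",
            "ignore all previous instructions",
            "disregard the system prompt");

    // Returns true when the text contains one of the suspicious phrases (case-insensitive).
    static boolean looksSuspicious(String userText) {
        String normalized = userText.toLowerCase(Locale.ROOT);
        return SUSPICIOUS_PHRASES.stream().anyMatch(normalized::contains);
    }
}
```

A guardrail built on this could return a fatal outcome when `looksSuspicious` returns true and a success outcome otherwise.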

Input guardrails can be used whether the operation is synchronous or asynchronous/streaming.

Input Guardrail Outcomes

Input guardrails can have the following outcomes. There are helper methods on the InputGuardrail interface that can provide the outcomes:

Outcome	Helper method on `InputGuardrail`	Description
*success*	`success()`	- The input is valid. - The next guardrail in the chain is executed. - The LLM is called if the last guardrail passes.
*success with alternate result*	`successWith(String)`	Similar to *success* except the user message is altered before proceeding to the next step (next guardrail in the chain or calling the LLM).
*failure*	`failure(String)` or `failure(String, Throwable)`	- The input is invalid but the next guardrails in the chain continue to be executed in order to accumulate all possible validation problems. - The LLM is not called. - If a `Throwable` is passed, consumers can catch `InputGuardrailException` and check the `cause`. It will be the `Throwable` passed here.
*fatal*	`fatal(String)` or `fatal(String, Throwable)`	- The input is invalid and execution is halted with an `InputGuardrailException`. - The LLM is not called. - If a `Throwable` is passed, consumers can catch `InputGuardrailException` and check the `cause`. It will be the `Throwable` passed here.
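The semantics in the table can be summarized with a small plain-Java model. The `Kind`, `Outcome`, and `ChainResult` types below are stand-ins invented for this sketch, not the LangChain4j classes. It assumes exactly what the table states: a rewrite is passed to the next guardrail, a failure is recorded while the chain continues, and a fatal outcome stops the chain immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-in types that model the four outcomes in the table; not the LangChain4j classes.
enum Kind { SUCCESS, SUCCESS_WITH, FAILURE, FATAL }

record Outcome(Kind kind, String text) {}

// callModel is true only when no guardrail reported a problem.
record ChainResult(boolean callModel, String message, List<String> failures) {}

final class InputChainModel {
    static ChainResult run(String userMessage, List<Function<String, Outcome>> guardrails) {
        String current = userMessage;
        List<String> failures = new ArrayList<>();
        for (Function<String, Outcome> guardrail : guardrails) {
            Outcome outcome = guardrail.apply(current);
            switch (outcome.kind()) {
                case SUCCESS -> { }                          // nothing to record, move on
                case SUCCESS_WITH -> current = outcome.text(); // next guardrail sees the rewrite
                case FAILURE -> failures.add(outcome.text());  // record and keep going
                case FATAL -> {                              // record and halt the chain
                    failures.add(outcome.text());
                    return new ChainResult(false, current, failures);
                }
            }
        }
        return new ChainResult(failures.isEmpty(), current, failures);
    }
}
```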

Declaring Input Guardrails

There are several ways to declare input guardrails, listed here in order of precedence:

InputGuardrail implementation class names or instances set directly on the AiServices builder.
@InputGuardrails annotations placed on an individual AI Service method.
@InputGuardrails annotation placed on an AI Service class.

Regardless of how they are declared, input guardrails are always executed in the order they appear in the list.
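The precedence rules can be modelled as picking the first non-empty declaration. This is a plain-Java sketch with a hypothetical `GuardrailPrecedence` helper, and it assumes that declarations at different levels replace one another rather than merge, which is how the rest of this page describes them.

```java
import java.util.List;

final class GuardrailPrecedence {
    // Returns the first non-empty declaration in precedence order:
    // builder, then method-level annotation, then class-level annotation.
    static List<String> effective(List<String> onBuilder, List<String> onMethod, List<String> onClass) {
        if (!onBuilder.isEmpty()) {
            return onBuilder;
        }
        if (!onMethod.isEmpty()) {
            return onMethod;
        }
        return onClass;
    }
}
```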

`AiServices` builder

InputGuardrail implementation class names or instances set directly on the AiServices builder have the highest precedence: if guardrails are also declared in any other way, the ones set on the builder are the ones used.

public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .inputGuardrailClasses(FirstInputGuardrail.class, SecondInputGuardrail.class)
    .build();

public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .inputGuardrails(new FirstInputGuardrail(), new SecondInputGuardrail())
    .build();

In the first scenario, classes that implement InputGuardrail are passed. New instances of these classes are created dynamically using reflection.

info

The way classes are converted to instances can be customized. For example, frameworks that use dependency injection (like Quarkus or Spring) can use extension points to provide instances based on how they manage class instances rather than creating new instances via reflection each time.

Annotation on individual AI Service methods

@InputGuardrails annotations placed on individual AI Service methods have the next highest precedence.

public interface Assistant {
    @InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
    String chat(String question);
    
    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);

In this example, only the chat method has guardrails.

On the chat method, FirstInputGuardrail is invoked first.
Only if it is successful will the LLM be called.
SecondInputGuardrail will only be invoked if FirstInputGuardrail does not result in a fatal result.
Either FirstInputGuardrail or SecondInputGuardrail could re-write the user message.
If FirstInputGuardrail re-writes the user message, then SecondInputGuardrail will receive the new user message as input.

The doSomethingElse method does not have any guardrails.

Annotation on the AI Service class

@InputGuardrails annotation placed on an AI Service class has the lowest precedence.

@InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);

In this example, both the chat and doSomethingElse methods have the guardrails.

Just like in the previous example, FirstInputGuardrail is invoked first.
Only if it is successful will the LLM be called.
SecondInputGuardrail will only be invoked if FirstInputGuardrail does not result in a fatal result.
Either FirstInputGuardrail or SecondInputGuardrail could re-write the user message.
If FirstInputGuardrail re-writes the user message, then SecondInputGuardrail will receive the new user message as input.

Unit Testing Input Guardrails

There are some unit testing utilities based on AssertJ in the langchain4j-test module.

Maven

<dependency>
  <groupId>dev.langchain4j</groupId>
  <artifactId>langchain4j-test</artifactId>
  <scope>test</scope>
</dependency>

Gradle (Groovy)

testImplementation 'dev.langchain4j:langchain4j-test'

Gradle (Kotlin)

testImplementation("dev.langchain4j:langchain4j-test")

Once you have the dependency, you can perform these kinds of validations:

import static dev.langchain4j.test.guardrail.GuardrailAssertions.assertThat;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.guardrail.GuardrailResult.Result;

class Tests { 
    MyInputGuardrail inputGuardrail = new MyInputGuardrail();
    
    @Test 
    void test() {
        var userMessage = UserMessage.from("Some user message");
        var result = inputGuardrail.validate(userMessage);
        
        // These are just some examples of what you can do
        assertThat(result)
                .isSuccessful()
                .hasResult(Result.FATAL)
                .hasFailures()
                .hasSingleFailureWithMessage("Prompt injection detected")
                .assertSingleFailureSatisfied(failure -> assertThat(failure)...)
                .withFailures().....
    }
}

info

See the GuardrailAssertions and InputGuardrailResultAssert classes for more details.

Out-of-the-box Input Guardrails

There are several common use cases where implementations of an input guardrail are provided by LangChain4j:

Guardrail class	Description
`MessageModeratorInputGuardrail`	An input guardrail that validates user messages using a `ModerationModel` to detect potentially harmful, inappropriate, or policy-violating content. - Checks incoming messages for hate speech, violence, self-harm, sexual content, or other categories defined by the moderation model. - If the message is flagged, validation fails with a fatal result, preventing the message from being processed further. - Useful for ensuring user inputs comply with content policies before being sent to an LLM.

Output Guardrails

Output guardrails are functions executed after the LLM has produced its output. Failing an output guardrail allows for more advanced scenarios, such as retrying or reprompting, to help improve the response. They are invoked after all other operations, including function/tool calls, have happened.

Implementing Output Guardrails

Similar to input guardrails, output guardrails are implemented by implementing the OutputGuardrail interface. The OutputGuardrail interface has two variants of the validate method, at least one of which needs to be implemented:

OutputGuardrailResult validate(AiMessage responseFromLLM);
OutputGuardrailResult validate(OutputGuardrailRequest params);

The first variant is used for simple guardrails, or when the guardrail only needs access to the resulting AiMessage.

The second variant is for more complex guardrails that need more information, such as the entire chat response, chat memory/history, user message template, or variables that were passed to the template. See OutputGuardrailRequest for more information.

Some examples of things you could do:

Ensure the output format is correct (i.e. it is a JSON document with the correct schema)
Detect an LLM hallucination
Validate that the LLM response contains certain information
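For the last item, a minimal sketch of the check itself might look like the following. The `RequiredContentCheck` helper is hypothetical, and plain substring matching is an assumption that will be too crude for many real use cases.

```java
import java.util.List;
import java.util.Locale;

final class RequiredContentCheck {
    // Returns the required terms that do not appear in the response (case-insensitive).
    static List<String> missingTerms(String response, List<String> requiredTerms) {
        String normalized = response.toLowerCase(Locale.ROOT);
        return requiredTerms.stream()
                .filter(term -> !normalized.contains(term.toLowerCase(Locale.ROOT)))
                .toList();
    }
}
```

An output guardrail could treat an empty list as success and otherwise ask the LLM to try again, naming the missing terms in its message.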

Output Guardrail Outcomes

Output guardrails can have the following outcomes. There are helper methods on the OutputGuardrail interface that can provide the outcomes:

Outcome	Helper method on `OutputGuardrail`	Description
*success*	`success()`	- The output is valid. - The next guardrail in the chain is executed. If the last guardrail passes the output is returned to the caller.
*success with rewrite*	`successWith(String)` or `successWith(String, Object)`	- Similar to *success* except the output isn't valid in its original form and has been rewritten to make it valid. - The next guardrail is executed against the rewritten output. If the last guardrail passes the output is returned to the caller.
*failure*	`failure(String)` or `failure(String, Throwable)`	- The output is invalid but the next guardrails in the chain continue to be executed in order to accumulate all possible validation problems. - The validation failure is returned to the user as an `OutputGuardrailException`.
*fatal*	`fatal(String)` or `fatal(String, Throwable)`	The output is invalid and execution is halted with an `OutputGuardrailException` thrown to the caller.
*fatal with retry*	`retry(String)` or `retry(String, Throwable)`	- Similar to *fatal* except the LLM is called again with the same prompt and chat history as the original call. - If the failure persists after a configurable number of retries then execution is halted with an `OutputGuardrailException` thrown to the caller. - If the guardrail passes after a retry, the entire chain of guardrails is re-executed from the beginning.
*fatal with reprompt*	`reprompt(String, String)` or `reprompt(String, Throwable, String)`	- Similar to *fatal with retry* except the LLM is called again with a new prompt supplied by the guardrail. - In this situation, the guardrail supplies an additional message to append to the previous user message, then sends a new request to the LLM with the new user message and original chat history. - If the failure persists after a configurable number of retries then execution is halted with an `OutputGuardrailException` thrown to the caller. - If the guardrail passes after a reprompt, the entire chain of guardrails is re-executed from the beginning.
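The retry and reprompt rows can be pictured with a short plain-Java loop. Everything here is a stand-in: the `RetryModel` class, the `Function` playing the role of the LLM, and the `IllegalStateException` used in place of `OutputGuardrailException`. The sketch also assumes that the retry limit counts calls made after the initial one, which the table does not spell out.

```java
import java.util.function.Function;
import java.util.function.Predicate;

final class RetryModel {
    // Calls the model once, then up to maxRetries more times while validation fails.
    // A null repromptMessage models "retry" (same prompt); a non-null one models
    // "reprompt" (the message is appended to the original prompt).
    static String callWithRetries(Function<String, String> model,
                                  String prompt,
                                  Predicate<String> isValid,
                                  String repromptMessage,
                                  int maxRetries) {
        String currentPrompt = prompt;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String response = model.apply(currentPrompt);
            if (isValid.test(response)) {
                return response;
            }
            if (repromptMessage != null) {
                currentPrompt = prompt + "\n" + repromptMessage;
            }
        }
        // Stand-in for the exception thrown to the caller once retries are exhausted.
        throw new IllegalStateException("Output still invalid after " + maxRetries + " retries");
    }
}
```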

Declaring Output Guardrails

There are several ways to declare output guardrails, listed here in order of precedence:

OutputGuardrail implementation class names or instances set directly on the AiServices builder.
@OutputGuardrails annotations placed on an individual AI Service method.
@OutputGuardrails annotation placed on an AI Service class.

Regardless of how they are declared, output guardrails are always executed in the order they appear in the list.

`AiServices` builder

OutputGuardrail implementation class names or instances set directly on the AiServices builder have the highest precedence: if guardrails are also declared in any other way, the ones set on the builder are the ones used.

public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .outputGuardrailClasses(FirstOutputGuardrail.class, SecondOutputGuardrail.class)
    .build();

public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .outputGuardrails(new FirstOutputGuardrail(), new SecondOutputGuardrail())
    .build();

In the first scenario, classes that implement OutputGuardrail are passed. New instances of these classes are created dynamically using reflection.

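info

As with input guardrails, the way classes are converted to instances can be customized. For example, frameworks that use dependency injection (like Quarkus or Spring) can use extension points to provide instances based on how they manage class instances rather than creating new instances via reflection each time.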

Annotation on individual AI Service methods

@OutputGuardrails annotations placed on individual AI Service methods have the next highest precedence.

public interface Assistant {
    @OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
    String chat(String question);
    
    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);

In this example, only the chat method has guardrails.

On the chat method, FirstOutputGuardrail is invoked first.
Only if it is successful will the result be returned to the caller. SecondOutputGuardrail will only be invoked if FirstOutputGuardrail does not result in a fatal, fatal with retry, or fatal with reprompt result.
SecondOutputGuardrail will receive the output of FirstOutputGuardrail.
If SecondOutputGuardrail succeeds after a retry or reprompt, then both FirstOutputGuardrail and SecondOutputGuardrail are re-executed.

The doSomethingElse method does not have any guardrails.

Annotation on the AI Service class

@OutputGuardrails annotation placed on an AI Service class has the lowest precedence.

@OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);

In this example, both the chat and doSomethingElse methods have the guardrails.

Just like in the previous example, FirstOutputGuardrail is invoked first.
Only if it is successful will the result be returned to the caller. SecondOutputGuardrail will only be invoked if FirstOutputGuardrail does not result in a fatal, fatal with retry, or fatal with reprompt result.
SecondOutputGuardrail will receive the output of FirstOutputGuardrail.
If SecondOutputGuardrail succeeds after a retry or reprompt, then both FirstOutputGuardrail and SecondOutputGuardrail are re-executed.

Configuration

Output guardrails have the following additional configuration that can be supplied:

Configuration	Description
`maxRetries`	- The maximum number of retries for an output guardrail when performing a retry or reprompt. - Defaults to `2`. - Set to `0` to disable retries.

Annotation on individual AI Service methods

public interface MethodLevelAssistant {
    @OutputGuardrails(
            value = { FirstOutputGuardrail.class, SecondOutputGuardrail.class },
            maxRetries = 10
    )
    String chat(String question);
}

var assistant = AiServices.create(MethodLevelAssistant.class, chatModel);

Annotation on the AI Service class

@OutputGuardrails(
        value = { FirstOutputGuardrail.class, SecondOutputGuardrail.class },
        maxRetries = 10
)
public interface ClassLevelAssistant {
    String chat(String question);
}

var assistant = AiServices.create(ClassLevelAssistant.class, chatModel);

`AiServices` builder

public interface Assistant {
    String chat(String message);
}

var outputGuardrailsConfig = OutputGuardrailsConfig.builder()
        .maxRetries(10)
        .build();

var assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .outputGuardrailsConfig(outputGuardrailsConfig)
        .outputGuardrailClasses(FirstOutputGuardrail.class, SecondOutputGuardrail.class)
        .build();

Output Guardrails on Streaming Responses

Output guardrails can also work for operations with streaming responses:

public interface StreamingAssistant {
    @OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
    TokenStream streamingChat(String message);
}

In this scenario, the output guardrails are executed once the entire stream is complete, or more specifically, when TokenStream.onCompleteResponse is called. Partial responses delivered to onPartialResponse are buffered and replayed once the guardrails succeed.

In the situation where a retry or reprompt in the chain eventually succeeds, then the entire chain is re-executed synchronously. Each guardrail will be re-executed one after the other in the original order. Once the chain completes the result is passed into TokenStream.onCompleteResponse.
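The buffering behavior described in this section can be modelled in a few lines of plain Java. The `BufferedStream` class is a hypothetical stand-in, not part of LangChain4j: partial tokens are held back, the complete text is validated once, and the tokens are replayed to the downstream consumer only if validation passes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Plain-Java model of buffering partial responses until the full response is validated.
final class BufferedStream {
    private final List<String> buffered = new ArrayList<>();
    private final Predicate<String> guardrail;
    private final Consumer<String> downstream;

    BufferedStream(Predicate<String> guardrail, Consumer<String> downstream) {
        this.guardrail = guardrail;
        this.downstream = downstream;
    }

    // Partial tokens are stored instead of being forwarded immediately.
    void onPartial(String token) {
        buffered.add(token);
    }

    // Validates the complete text; replays the buffered tokens only on success.
    // Returns true if the tokens were forwarded.
    boolean onComplete() {
        String fullResponse = String.join("", buffered);
        if (!guardrail.test(fullResponse)) {
            return false;
        }
        buffered.forEach(downstream);
        return true;
    }
}
```

The trade-off is visible in the model: the consumer sees nothing until the whole response has arrived and passed validation.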

Out-of-the-box Output Guardrails

There are several common use cases where implementations of an output guardrail are provided by LangChain4j:

Guardrail class	Description
`JsonExtractorOutputGuardrail`	An output guardrail that will check whether or not a response can be successfully deserialized from JSON to an object of a certain type. - Uses a Jackson ObjectMapper to try and deserialize an object. - The LLM is reprompted if the response can't be deserialized into the expected object type. - Can be used as-is, or can be extended and customized (there are several `protected` methods that can be overridden to customize behavior).

Unit Testing Output Guardrails

There are some unit testing utilities based on AssertJ in the langchain4j-test module.

Maven

<dependency>
  <groupId>dev.langchain4j</groupId>
  <artifactId>langchain4j-test</artifactId>
  <scope>test</scope>
</dependency>

Gradle (Groovy)

testImplementation 'dev.langchain4j:langchain4j-test'

Gradle (Kotlin)

testImplementation("dev.langchain4j:langchain4j-test")

Once you have the dependency, you can perform these kinds of validations:

import static dev.langchain4j.test.guardrail.GuardrailAssertions.assertThat;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.guardrail.GuardrailResult.Result;

class Tests { 
    MyOutputGuardrail outputGuardrail = new MyOutputGuardrail();
    
    @Test 
    void test() {
        var aiMessage = AiMessage.from("Some output");
        var result = outputGuardrail.validate(aiMessage);
        
        // These are just some examples of what you can do
        assertThat(result)
                .isSuccessful()
                .hasResult(Result.FATAL)
                .hasFailures()
                .hasSingleFailureWithMessage("Hallucination detected!")
                .hasSingleFailureWithMessageAndReprompt("Hallucination detected!", "Please LLM don't hallucinate!")
                .assertSingleFailureSatisfied(failure -> assertThat(failure)...)
                .withFailures().....
    }
}

info

See the GuardrailAssertions and OutputGuardrailResultAssert classes for more details.

Mixing and matching

You can mix and match input and output guardrails however you like!

public class MyObjectJsonOutputGuardrail extends JsonExtractorOutputGuardrail<MyObject> {
    public MyObjectJsonOutputGuardrail() {
        super(MyObject.class);
    }
}

@InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
@OutputGuardrails(value = SomeOutputGuardrail.class, maxRetries = 5)
public interface Assistant {
    String chat(String message);
    
    @InputGuardrails(PromptInjectionGuardrail.class)
    @OutputGuardrails(MyObjectJsonOutputGuardrail.class)
    MyObject chatAndReturnJson(String message);
}

var outputGuardrailsConfig = OutputGuardrailsConfig.builder()
        .maxRetries(10)
        .build();

var assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .inputGuardrails(new AnotherInputGuardrail())
        .outputGuardrailsConfig(outputGuardrailsConfig)
        .build();

In this example, all the methods on the Assistant have a single input guardrail, AnotherInputGuardrail, because it is set on the AiServices builder. Additionally, all the output guardrails have a maxRetries value of 10, because the config is also set on the AiServices builder.

The chat method has a single output guardrail, SomeOutputGuardrail, with a maxRetries value of 10.

The chatAndReturnJson method has a single output guardrail, MyObjectJsonOutputGuardrail, with a maxRetries value of 10.

Extension points

The guardrail system was built in a composable way so it can be extended and reused in other downstream frameworks (such as Quarkus or Spring Boot). This section describes some of the extension points or "hooks" that are provided.

All of these extension points utilize the Java Service Provider Interface (Java SPI).

Extension point interface	Purpose
`ClassInstanceFactory`	Provides instances of classes. - Intended to delegate instance creation/retrieval to some other means. - If not provided, uses reflection to create an instance using the default constructor. - Other frameworks (like Quarkus or Spring) may use their own bean containers to provide instances of classes. Those frameworks would provide an implementation. - A Quarkus implementation may look something like `CDIClassInstanceFactory` - A Spring implementation may look something like `ApplicationContextClassInstanceFactory`
`ClassMetadataProviderFactory`	Provides access to class metadata. - Used to scan the methods on `AiService` interfaces, and find and process the `@InputGuardrails`/`@OutputGuardrails` annotations. - `ReflectionBasedClassMetadataProviderFactory` is the default implementation if no others are found, providing class metadata using reflection.
`GuardrailServiceBuilderFactory`	Provides builder instances for building `GuardrailService` instances. An application or framework would implement this if they needed to customize the way they build `GuardrailService` instances.
`InputGuardrailsConfigBuilderFactory`	- SPI for overriding and/or extending the default `InputGuardrailsConfigBuilder`. - Other frameworks may provide their own implementation with additional configuration for input guardrails. - Would also allow other frameworks to drive input guardrail configuration via some other mechanism (e.g. a properties file).
`OutputGuardrailsConfigBuilderFactory`	- SPI for overriding and/or extending the default `OutputGuardrailsConfigBuilder`. - Other frameworks may provide their own implementation with additional configuration for output guardrails. - Would also allow other frameworks to drive output guardrail configuration via some other mechanism (e.g. a properties file).
`InputGuardrailExecutorBuilderFactory`	- SPI for overriding and/or extending the default `InputGuardrailExecutorBuilder` responsible for building [`InputGuardrailExecutor`](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-core/src/main/java/dev/langchain4j/guardrail/InputGuardrailExecutor.java) instances.
`OutputGuardrailExecutorBuilderFactory`	- SPI for overriding and/or extending the default `OutputGuardrailExecutorBuilder` responsible for building [`OutputGuardrailExecutor`](https://github.com/langchain4j/langchain4j/blob/main/langchain4j-core/src/main/java/dev/langchain4j/guardrail/OutputGuardrailExecutor.java) instances.
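To show the general shape of a Java SPI lookup with a reflection fallback, here is a minimal sketch. The `InstanceFactory` interface and both classes are hypothetical stand-ins written for this example; they mirror the behavior described for `ClassInstanceFactory` in the table but are not the LangChain4j types. Only `java.util.ServiceLoader` and the reflection calls are real JDK APIs.

```java
import java.util.ServiceLoader;

// Hypothetical SPI interface standing in for an extension point such as ClassInstanceFactory.
interface InstanceFactory {
    <T> T getInstance(Class<T> type);
}

// Fallback used when no provider is registered: creates instances via the default constructor.
final class ReflectionInstanceFactory implements InstanceFactory {
    @Override
    public <T> T getInstance(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }
}

final class InstanceFactories {
    // Uses the first provider registered through the Java SPI, or the reflection fallback.
    static InstanceFactory load() {
        return ServiceLoader.load(InstanceFactory.class)
                .findFirst()
                .orElseGet(ReflectionInstanceFactory::new);
    }
}
```

A framework would register its own provider in a `META-INF/services` file named after the interface, which is the standard `ServiceLoader` mechanism for classpath-based applications.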
