Guardrails

note

Guardrails is an experimental feature. Its API and behavior might change in future versions.

Guardrails are mechanisms that let you validate the input and output of the LLM to ensure it meets your expectations. You can do some of the following things with guardrails:

  • Verify the user input is not out of scope
  • Ensure the input meets some criteria before calling the LLM (e.g. guard against a prompt injection attack)
  • Ensure the output format is correct (e.g. it is a JSON document with the correct schema)
  • Ensure the LLM output is coherent with business rules and constraints (e.g. if this is a chatbot for company X, the response should not contain any reference to competitor Y)
  • Detect hallucinations

Those are just examples. You can do many other things with guardrails.

note

Guardrails are only available when using AI Services. They are a higher-level construct that cannot be applied to a ChatModel or StreamingChatModel.

The implementation was originally done in the Quarkus LangChain4j extension and was backported here.

Implementing Guardrails

Ideally, guardrail implementations should follow the single responsibility principle, meaning that each guardrail class should validate one thing. Then, chain guardrails together to guard against multiple things.

The order of guardrails in the chain is important. The first guardrail in the chain to fail will trigger the overall failure. Ensure guardrails that catch the most failures are early in the chain, whereas more specific guardrails that may fail very infrequently are towards the end of the chain.

Also keep in mind that guardrails can themselves call other services or even invoke other LLM interactions. If these kinds of guardrails have an execution penalty or monetary cost associated with them, make sure you take that into account. You might want to put more expensive guardrails towards the end of the chain.

note

The term expensive can mean that something takes a long time to execute, has a monetary cost associated with it, or both.
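
The ordering advice above can be illustrated with a small, library-independent sketch. CostedCheck and ChainCostSketch are hypothetical stand-ins invented for this example, not LangChain4j types, and the costs are made-up relative numbers. The sketch assumes a rejecting check halts the chain, and shows how much cost is spent depending on where the expensive check sits.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a guardrail: a name, a relative cost, and a check. */
record CostedCheck(String name, int cost, Predicate<String> passes) {}

final class ChainCostSketch {

    /** Runs the checks in order, stops at the first rejection, and returns the cost spent. */
    static int costOf(List<CostedCheck> chain, String input) {
        int spent = 0;
        for (CostedCheck check : chain) {
            spent += check.cost();
            if (!check.passes().test(input)) {
                break; // this sketch treats every rejection as halting the chain
            }
        }
        return spent;
    }

    public static void main(String[] args) {
        CostedCheck notBlank = new CostedCheck("notBlank", 1, s -> !s.isBlank());
        // Pretend this check calls another LLM, so it is far more expensive.
        CostedCheck llmJudge = new CostedCheck("llmJudge", 100, s -> true);

        System.out.println(costOf(List.of(notBlank, llmJudge), "   ")); // prints 1
        System.out.println(costOf(List.of(llmJudge, notBlank), "   ")); // prints 101
    }
}
```

With the cheap check first, a blank input is rejected before the expensive check ever runs.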

Input Guardrails

Input guardrails are functions invoked before the LLM is called. Failing an input guardrail prevents the LLM from being called. Input guardrails are the last step prior to calling the LLM. They are invoked after any RAG operations have happened.

Implementing Input Guardrails

To create an input guardrail, implement the InputGuardrail interface. It has two variants of the validate method, at least one of which must be implemented:

InputGuardrailResult validate(UserMessage userMessage);
InputGuardrailResult validate(InputGuardrailRequest params);

The first variant is used for simple guardrails, or when the guardrail only needs access to the UserMessage.

The second variant is for more complex guardrails that need more information, such as the chat memory/history, user message template, augmentation results, or variables that were passed to the template. See InputGuardrailRequest for more information.

Some examples of things you could do:

  • Check that there are enough documents in the augmentation results
  • Ensure the user is not asking the same question multiple times
  • Mitigate potential prompt injection attacks
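
As an illustration of the "same question multiple times" idea, here is a minimal plain-Java helper that a guardrail's validate(InputGuardrailRequest) method might delegate to. RepeatedQuestionCheck is a hypothetical name, and how the earlier user messages are read from the request is deliberately left out, since that depends on the InputGuardrailRequest API. The normalization rule (trim, collapse whitespace, lowercase) is an assumption chosen for this sketch.

```java
import java.util.List;
import java.util.Locale;

final class RepeatedQuestionCheck {

    /** Normalizes text so differences in case and spacing do not hide a repeat. */
    static String normalize(String text) {
        return text.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    /** Returns true if the question matches any earlier user message after normalization. */
    static boolean isRepeat(List<String> earlierUserMessages, String question) {
        String candidate = normalize(question);
        return earlierUserMessages.stream()
                .map(RepeatedQuestionCheck::normalize)
                .anyMatch(candidate::equals);
    }
}
```

A guardrail could then return a failure outcome when isRepeat is true and a success outcome otherwise.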

Input guardrails can be used whether the operation is synchronous or asynchronous/streaming.

Input Guardrail Outcomes

Input guardrails can have the following outcomes. There are helper methods on the InputGuardrail interface that can provide the outcomes:

| Outcome | Helper method on InputGuardrail | Description |
|---|---|---|
| success | success() | The input is valid. The next guardrail in the chain is executed. The LLM is called if the last guardrail passes. |
| success with alternate result | successWith(String) | Similar to success except the user message is altered before proceeding to the next step (next guardrail in the chain or calling the LLM). |
| failure | failure(String) or failure(String, Throwable) | The input is invalid but the next guardrails in the chain continue to be executed in order to accumulate all possible validation problems. The LLM is not called. |
| fatal | fatal(String) or fatal(String, Throwable) | The input is invalid and execution is halted with an InputGuardrailException. The LLM is not called. |
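
The four outcomes can be modeled in a few lines of plain Java. This is a simplified sketch of the semantics described in the table, not LangChain4j's implementation: InputKind, InputOutcome, InputCheck, InputChainResult and InputChainSketch are hypothetical stand-ins, and a fatal outcome is reported through the result object here rather than by throwing an InputGuardrailException.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for this sketch; these are not LangChain4j classes. */
enum InputKind { SUCCESS, SUCCESS_WITH, FAILURE, FATAL }

record InputOutcome(InputKind kind, String text) {}

interface InputCheck {
    InputOutcome validate(String userMessage);
}

record InputChainResult(boolean callLlm, String messageForLlm, List<String> failures) {}

final class InputChainSketch {

    static InputChainResult run(List<InputCheck> chain, String userMessage) {
        String current = userMessage;
        List<String> failures = new ArrayList<>();
        for (InputCheck check : chain) {
            InputOutcome outcome = check.validate(current);
            switch (outcome.kind()) {
                case SUCCESS -> { }                               // move on to the next guardrail
                case SUCCESS_WITH -> current = outcome.text();    // later guardrails see the rewrite
                case FAILURE -> failures.add(outcome.text());     // keep going to collect problems
                case FATAL -> {                                   // halt immediately
                    failures.add(outcome.text());
                    return new InputChainResult(false, current, failures);
                }
            }
        }
        // The LLM is only called when no guardrail reported a problem.
        return new InputChainResult(failures.isEmpty(), current, failures);
    }
}
```

Note how a failure lets the remaining guardrails run, while a fatal outcome stops the chain at once; in both cases the LLM is not called.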

Declaring Input Guardrails

There are several ways to declare input guardrails, listed here in order of precedence:

  1. InputGuardrail implementation class names or instances set directly on the AiServices builder.
  2. @InputGuardrails annotations placed on an individual AI Service method.
  3. @InputGuardrails annotation placed on an AI Service class.

Regardless of how they are declared, input guardrails are always executed in the order they appear in the list.
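
The precedence rule can be pictured as a simple fallback lookup. The sketch below is hypothetical (GuardrailPrecedenceSketch is not a LangChain4j class) and represents guardrails by name only. It assumes that the lists declared at different levels are not merged: the highest-precedence level that declares anything wins outright.

```java
import java.util.List;

final class GuardrailPrecedenceSketch {

    /**
     * Picks the guardrail list for one method: builder first, then the method
     * annotation, then the class annotation. An empty list means that nothing
     * was declared at that level.
     */
    static List<String> resolve(List<String> onBuilder, List<String> onMethod, List<String> onClass) {
        if (!onBuilder.isEmpty()) {
            return onBuilder;
        }
        if (!onMethod.isEmpty()) {
            return onMethod;
        }
        return onClass;
    }
}
```

For instance, a method annotated with its own guardrails ignores the class-level list, and anything set on the builder replaces both.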

AiServices builder

InputGuardrail implementation class names or instances set directly on the AiServices builder have the highest precedence. If input guardrails are also declared in any other way, the ones set on the builder are the ones used.

public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .inputGuardrailClasses(FirstInputGuardrail.class, SecondInputGuardrail.class)
    .build();

or

public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .inputGuardrails(new FirstInputGuardrail(), new SecondInputGuardrail())
    .build();

In the first scenario, classes that implement InputGuardrail are passed. New instances of these classes are created dynamically using reflection.

info

The way classes are converted to instances can be customized. For example, frameworks that use dependency injection (like Quarkus or Spring) can use extension points to provide instances based on how they manage class instances rather than creating new instances via reflection each time.

Annotation on individual AI Service methods

@InputGuardrails annotations placed on individual AI Service methods have the next highest precedence.

public interface Assistant {
    @InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
    String chat(String question);

    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);

In this example, only the chat method has guardrails.

  • On the chat method, FirstInputGuardrail is invoked first.
  • Only if it is successful will the LLM be called.
  • SecondInputGuardrail will only be invoked if FirstInputGuardrail does not result in a fatal result.
  • Either FirstInputGuardrail or SecondInputGuardrail could re-write the user message.
  • If FirstInputGuardrail re-writes the user message, then SecondInputGuardrail will receive the new user message as input.

The doSomethingElse method does not have any guardrails.

Annotation on the AI Service class

An @InputGuardrails annotation placed on an AI Service class has the lowest precedence.

@InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);

In this example, both the chat and doSomethingElse methods have the guardrails.

  • Just like in the previous example, FirstInputGuardrail is invoked first.
  • Only if it is successful will the LLM be called.
  • SecondInputGuardrail will only be invoked if FirstInputGuardrail does not result in a fatal result.
  • Either FirstInputGuardrail or SecondInputGuardrail could re-write the user message.
  • If FirstInputGuardrail re-writes the user message, then SecondInputGuardrail will receive the new user message as input.

Unit Testing Input Guardrails

There are some unit testing utilities based on AssertJ in the langchain4j-test module.

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-test</artifactId>
    <scope>test</scope>
</dependency>

Once you have the dependency, you can perform these kinds of validations:

import static dev.langchain4j.test.guardrail.GuardrailAssertions.assertThat;

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.guardrail.GuardrailResult.Result;

class Tests {
    MyInputGuardrail inputGuardrail = new MyInputGuardrail();

    @Test
    void test() {
        var userMessage = UserMessage.from("Some user message");
        var result = inputGuardrail.validate(userMessage);

        // These are just some examples of what you can do
        assertThat(result)
            .isSuccessful()
            .hasResult(Result.FATAL)
            .hasFailures()
            .hasSingleFailureWithMessage("Prompt injection detected")
            .assertSingleFailureSatisfied(failure -> assertThat(failure)...)
            .withFailures().....
    }
}

info

See the GuardrailAssertions and InputGuardrailResultAssert classes for more details.

Output Guardrails

Output guardrails are functions executed after the LLM has produced its output. Failing an output guardrail allows for more advanced scenarios, such as retrying or reprompting, to help improve the response. They are invoked after all other operations, including function/tool calls, have happened.

Implementing Output Guardrails

As with input guardrails, you create an output guardrail by implementing an interface, in this case OutputGuardrail. It has two variants of the validate method, at least one of which must be implemented:

OutputGuardrailResult validate(AiMessage responseFromLLM);
OutputGuardrailResult validate(OutputGuardrailRequest params);

The first variant is used for simple guardrails, or when the guardrail only needs access to the resulting AiMessage.

The second variant is for more complex guardrails that need more information, such as the entire chat response, chat memory/history, user message template, or variables that were passed to the template. See OutputGuardrailRequest for more information.

Some examples of things you could do:

  • Ensure the output format is correct (e.g. it is a JSON document with the correct schema)
  • Detect an LLM hallucination
  • Validate that the LLM response contains certain information

Output Guardrail Outcomes

Output guardrails can have the following outcomes. There are helper methods on the OutputGuardrail interface that can provide the outcomes:

| Outcome | Helper method on OutputGuardrail | Description |
|---|---|---|
| success | success() | The output is valid. The next guardrail in the chain is executed. If the last guardrail passes, the output is returned to the caller. |
| success with rewrite | successWith(String) or successWith(String, Object) | Similar to success, except the output isn't valid in its original form and has been rewritten to make it valid. The next guardrail is executed against the rewritten output. If the last guardrail passes, the output is returned to the caller. |
| failure | failure(String) or failure(String, Throwable) | The output is invalid, but the next guardrails in the chain continue to be executed in order to accumulate all possible validation problems. The validation failure is returned to the caller as an OutputGuardrailException. |
| fatal | fatal(String) or fatal(String, Throwable) | The output is invalid and execution is halted with an OutputGuardrailException thrown to the caller. |
| fatal with retry | retry(String) or retry(String, Throwable) | Similar to fatal, except the LLM is called again with the same prompt and chat history as the original call. If the failure persists after a configurable number of retries, execution is halted with an OutputGuardrailException thrown to the caller. If the guardrail passes after a retry, the entire chain of guardrails is re-executed from the beginning. |
| fatal with reprompt | reprompt(String, String) or reprompt(String, Throwable, String) | Similar to fatal with retry, except the LLM is called again with a new prompt supplied by the guardrail. In this situation, the guardrail supplies an additional message to append to the previous user message, then a new request is sent to the LLM with the new user message and the original chat history. If the failure persists after a configurable number of retries, execution is halted with an OutputGuardrailException thrown to the caller. If the guardrail passes after a reprompt, the entire chain of guardrails is re-executed from the beginning. |
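
The retry and reprompt outcomes amount to a bounded loop around the LLM call. The following plain-Java sketch models that loop for a single guardrail. Verdict, Check and RetryLoopSketch are hypothetical stand-ins, the LLM is reduced to a function from prompt to response, and an IllegalStateException stands in for OutputGuardrailException. It does not model re-running a whole chain of guardrails after a successful retry.

```java
import java.util.function.Function;

/** Hypothetical stand-ins for this sketch; these are not LangChain4j classes. */
enum Verdict { OK, RETRY, REPROMPT }

record Check(Verdict verdict, String repromptText) {
    static Check ok() { return new Check(Verdict.OK, null); }
    static Check retry() { return new Check(Verdict.RETRY, null); }
    static Check reprompt(String text) { return new Check(Verdict.REPROMPT, text); }
}

final class RetryLoopSketch {

    static String callWithGuardrail(Function<String, String> llm,
                                    Function<String, Check> guardrail,
                                    String prompt,
                                    int maxRetries) {
        String currentPrompt = prompt;
        String response = llm.apply(currentPrompt);
        int retries = 0;
        while (true) {
            Check check = guardrail.apply(response);
            if (check.verdict() == Verdict.OK) {
                return response;
            }
            if (retries == maxRetries) {
                // Stand-in for the OutputGuardrailException described above.
                throw new IllegalStateException(
                        "Output guardrail still failing after " + maxRetries + " retries");
            }
            retries++;
            if (check.verdict() == Verdict.REPROMPT) {
                // Reprompt: append the guardrail's extra message to the previous prompt.
                currentPrompt = currentPrompt + "\n" + check.repromptText();
            }
            // Retry: call the LLM again with the (possibly extended) prompt.
            response = llm.apply(currentPrompt);
        }
    }
}
```

With maxRetries set to 0, the first retry or reprompt outcome fails immediately, which matches the idea of disabling retries.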

Declaring Output Guardrails

There are several ways to declare output guardrails, listed here in order of precedence:

  1. OutputGuardrail implementation class names or instances set directly on the AiServices builder.
  2. @OutputGuardrails annotations placed on an individual AI Service method.
  3. @OutputGuardrails annotation placed on an AI Service class.

Regardless of how they are declared, output guardrails are always executed in the order they appear in the list.

AiServices builder

OutputGuardrail implementation class names or instances set directly on the AiServices builder have the highest precedence. If output guardrails are also declared in any other way, the ones set on the builder are the ones used.

public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .outputGuardrailClasses(FirstOutputGuardrail.class, SecondOutputGuardrail.class)
    .build();

or

public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .outputGuardrails(new FirstOutputGuardrail(), new SecondOutputGuardrail())
    .build();

In the first scenario, classes that implement OutputGuardrail are passed. New instances of these classes are created dynamically using reflection.

info

The way classes are converted to instances can be customized. For example, frameworks that use dependency injection (like Quarkus or Spring) can use extension points to provide instances based on how they manage class instances rather than creating new instances via reflection each time.

Annotation on individual AI Service methods

@OutputGuardrails annotations placed on individual AI Service methods have the next highest precedence.

public interface Assistant {
    @OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
    String chat(String question);

    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);

In this example, only the chat method has guardrails.

  • On the chat method, FirstOutputGuardrail is invoked first.
  • Only if it is successful will the result be returned to the caller. SecondOutputGuardrail will only be invoked if FirstOutputGuardrail does not result in a fatal, fatal with retry, or fatal with reprompt result.
  • SecondOutputGuardrail will receive the output of FirstOutputGuardrail.
  • If SecondOutputGuardrail succeeds after a retry or reprompt, then both FirstOutputGuardrail and SecondOutputGuardrail are re-executed.

The doSomethingElse method does not have any guardrails.

Annotation on the AI Service class

An @OutputGuardrails annotation placed on an AI Service class has the lowest precedence.

@OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
public interface Assistant {
    String chat(String question);
    String doSomethingElse(String question);
}

var assistant = AiServices.create(Assistant.class, chatModel);

In this example, both the chat and doSomethingElse methods have the guardrails.

  • Just like in the previous example, FirstOutputGuardrail is invoked first.
  • Only if it is successful will the result be returned to the caller. SecondOutputGuardrail will only be invoked if FirstOutputGuardrail does not result in a fatal, fatal with retry, or fatal with reprompt result.
  • SecondOutputGuardrail will receive the output of FirstOutputGuardrail.
  • If SecondOutputGuardrail succeeds after a retry or reprompt, then both FirstOutputGuardrail and SecondOutputGuardrail are re-executed.

Configuration

Output guardrails have the following additional configuration that can be supplied:

| Configuration | Description |
|---|---|
| maxRetries | The maximum number of retries for an output guardrail when performing a retry or reprompt. Defaults to 2. Set to 0 to disable retries. |

Annotation on individual AI Service methods

public interface MethodLevelAssistant {
    @OutputGuardrails(
        value = { FirstOutputGuardrail.class, SecondOutputGuardrail.class },
        maxRetries = 10
    )
    String chat(String question);
}

var assistant = AiServices.create(MethodLevelAssistant.class, chatModel);

Annotation on the AI Service class

@OutputGuardrails(
    value = { FirstOutputGuardrail.class, SecondOutputGuardrail.class },
    maxRetries = 10
)
public interface ClassLevelAssistant {
    String chat(String question);
}

var assistant = AiServices.create(ClassLevelAssistant.class, chatModel);

AiServices builder

public interface Assistant {
    String chat(String message);
}

var outputGuardrailsConfig = OutputGuardrailsConfig.builder()
    .maxRetries(10)
    .build();

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .outputGuardrailsConfig(outputGuardrailsConfig)
    .outputGuardrailClasses(FirstOutputGuardrail.class, SecondOutputGuardrail.class)
    .build();

Output Guardrails on Streaming Responses

Output guardrails can also work for operations with streaming responses:

public interface StreamingAssistant {
    @OutputGuardrails({ FirstOutputGuardrail.class, SecondOutputGuardrail.class })
    TokenStream streamingChat(String message);
}

In this scenario, the output guardrails are executed once the entire stream is complete, or more specifically, when TokenStream.onCompleteResponse is called. Partial responses passed to onPartialResponse are buffered and replayed once the guardrails succeed.

In the situation where a retry or reprompt in the chain eventually succeeds, then the entire chain is re-executed synchronously. Each guardrail will be re-executed one after the other in the original order. Once the chain completes the result is passed into TokenStream.onCompleteResponse.
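
The buffer-and-replay behavior can be sketched without the library. BufferedStreamSketch below is a hypothetical class, not part of LangChain4j: it collects partial responses, validates the joined text with a single predicate standing in for the guardrail chain, and only then forwards the buffered tokens and the complete response to downstream callbacks. Retries and reprompts are not modeled, and an IllegalStateException stands in for the guardrail exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

final class BufferedStreamSketch {
    private final List<String> buffered = new ArrayList<>();
    private final Predicate<String> guardrail;
    private final Consumer<String> downstreamPartial;
    private final Consumer<String> downstreamComplete;

    BufferedStreamSketch(Predicate<String> guardrail,
                         Consumer<String> downstreamPartial,
                         Consumer<String> downstreamComplete) {
        this.guardrail = guardrail;
        this.downstreamPartial = downstreamPartial;
        this.downstreamComplete = downstreamComplete;
    }

    /** Holds each token back instead of forwarding it right away. */
    void onPartialResponse(String token) {
        buffered.add(token);
    }

    /** Validates the full text, then replays the tokens and signals completion. */
    void onCompleteResponse() {
        String full = String.join("", buffered);
        if (!guardrail.test(full)) {
            throw new IllegalStateException("Output guardrail rejected the streamed response");
        }
        buffered.forEach(downstreamPartial);
        downstreamComplete.accept(full);
    }
}
```

The consequence for callers is that nothing reaches the downstream handlers until the whole response has been validated.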

Out-of-the-box Output Guardrails

LangChain4j provides output guardrail implementations for some common use cases:

| Guardrail class | Description |
|---|---|
| JsonExtractorOutputGuardrail | An output guardrail that will check whether or not a response can be successfully deserialized from JSON to an object of a certain type. Uses a Jackson ObjectMapper to try and deserialize an object. The LLM is reprompted if the response can't be deserialized into the expected object type. Can be used as-is, or can be extended and customized (there are several protected methods that can be overridden to customize behavior). |

Unit Testing Output Guardrails

There are some unit testing utilities based on AssertJ in the langchain4j-test module.

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-test</artifactId>
    <scope>test</scope>
</dependency>

Once you have the dependency, you can perform these kinds of validations:

import static dev.langchain4j.test.guardrail.GuardrailAssertions.assertThat;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.guardrail.GuardrailResult.Result;

class Tests {
    MyOutputGuardrail outputGuardrail = new MyOutputGuardrail();

    @Test
    void test() {
        var aiMessage = AiMessage.from("Some output");
        var result = outputGuardrail.validate(aiMessage);

        // These are just some examples of what you can do
        assertThat(result)
            .isSuccessful()
            .hasResult(Result.FATAL)
            .hasFailures()
            .hasSingleFailureWithMessage("Hallucination detected!")
            .hasSingleFailureWithMessageAndReprompt("Hallucination detected!", "Please LLM don't hallucinate!")
            .assertSingleFailureSatisfied(failure -> assertThat(failure)...)
            .withFailures().....
    }
}

info

See the GuardrailAssertions and OutputGuardrailResultAssert classes for more details.

Mixing and matching

You can mix and match input and output guardrails however you like!

public class MyObjectJsonOutputGuardrail extends JsonExtractorOutputGuardrail<MyObject> {
    public MyObjectJsonOutputGuardrail() {
        super(MyObject.class);
    }
}

@InputGuardrails({ FirstInputGuardrail.class, SecondInputGuardrail.class })
@OutputGuardrails(value = SomeOutputGuardrail.class, maxRetries = 5)
public interface Assistant {
    String chat(String message);

    @InputGuardrails(PromptInjectionGuardrail.class)
    @OutputGuardrails(MyObjectJsonOutputGuardrail.class)
    MyObject chatAndReturnJson(String message);
}

var outputGuardrailsConfig = OutputGuardrailsConfig.builder()
    .maxRetries(10)
    .build();

var assistant = AiServices.builder(Assistant.class)
    .chatModel(chatModel)
    .inputGuardrails(new AnotherInputGuardrail())
    .outputGuardrailsConfig(outputGuardrailsConfig)
    .build();

In this example, every method on the Assistant has a single input guardrail, AnotherInputGuardrail, because it is set on the AiServices builder and therefore takes precedence over the @InputGuardrails annotations. Additionally, all the output guardrails use a maxRetries value of 10, because the config is also set on the AiServices builder.

The chat method has a single output guardrail, SomeOutputGuardrail, with a maxRetries value of 10.

The chatAndReturnJson method has a single output guardrail, MyObjectJsonOutputGuardrail, with a maxRetries value of 10.

Extension points

The guardrail system was built in a composable way so it can be extended and reused in other downstream frameworks (such as Quarkus or Spring Boot). This section describes some of the extension points or "hooks" that are provided.

All of these extension points utilize the Java Service Provider Interface (Java SPI).

| Extension point interface | Purpose |
|---|---|
| ClassInstanceFactory | Provides instances of classes. Intended to delegate instance creation/retrieval to some other means. If not provided, uses reflection to create an instance using the default constructor. Other frameworks (like Quarkus or Spring) may use their own bean containers to provide instances of classes. Those frameworks would provide an implementation. A Quarkus implementation may look something like CDIClassInstanceFactory. A Spring implementation may look something like ApplicationContextClassInstanceFactory. |
| ClassMetadataProviderFactory | Provides access to class metadata. Used to scan the methods on AiService interfaces, and find and process the @InputGuardrails/@OutputGuardrails annotations. ReflectionBasedClassMetadataProviderFactory is the default implementation if no others are found, providing class metadata using reflection. |
| InputGuardrailsConfigBuilderFactory | SPI for overriding and/or extending the default InputGuardrailsConfigBuilder. Other frameworks may provide their own implementation with additional configuration for input guardrails. Would also allow other frameworks to drive input guardrail configuration via some other mechanism (e.g. a properties file). |
| OutputGuardrailsConfigBuilderFactory | SPI for overriding and/or extending the default OutputGuardrailsConfigBuilder. Other frameworks may provide their own implementation with additional configuration for output guardrails. Would also allow other frameworks to drive output guardrail configuration via some other mechanism (e.g. a properties file). |
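
To show the general pattern behind these extension points, here is a minimal sketch of a Java SPI lookup with a reflection-based fallback. InstanceFactory, ReflectiveInstanceFactory and InstanceFactories are hypothetical names loosely modeled on the ClassInstanceFactory description above. They are not the library's actual types or signatures.

```java
import java.util.ServiceLoader;

/** Hypothetical extension point, shaped like the ClassInstanceFactory described above. */
interface InstanceFactory {
    <T> T getInstanceOfClass(Class<T> type);
}

/** Fallback used when no provider is registered: calls the default constructor via reflection. */
final class ReflectiveInstanceFactory implements InstanceFactory {
    @Override
    public <T> T getInstanceOfClass(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No usable default constructor on " + type.getName(), e);
        }
    }
}

final class InstanceFactories {

    /** Uses the first provider found through the Java SPI, otherwise the reflective fallback. */
    static InstanceFactory load() {
        return ServiceLoader.load(InstanceFactory.class)
                .findFirst()
                .orElseGet(ReflectiveInstanceFactory::new);
    }
}
```

In standard Java SPI usage, a framework registers its provider by listing the implementation class in a META-INF/services file named after the interface. A dependency injection framework could supply a provider that looks up beans in its container instead of constructing new instances.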