Observability
LLM Observability
Certain implementations of ChatLanguageModel and StreamingChatLanguageModel (see the "Observability" column) allow configuring ChatModelListener(s) to listen for events such as:
- Requests to the LLM
- Responses from the LLM
- Errors
These events include various attributes, as described in the OpenTelemetry Generative AI Semantic Conventions, such as:
- Request:
  - Model
  - Temperature
  - Top P
  - Max Tokens
  - Messages
  - Tools
- Response:
  - ID
  - Model
  - Token Usage
  - Finish Reason
  - Assistant Message
Here is an example of using ChatModelListener:
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.chat.listener.ChatModelErrorContext;
import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequest;
import dev.langchain4j.model.chat.listener.ChatModelRequestContext;
import dev.langchain4j.model.chat.listener.ChatModelResponse;
import dev.langchain4j.model.chat.listener.ChatModelResponseContext;
import dev.langchain4j.model.openai.OpenAiChatModel;

import java.util.List;
import java.util.Map;

import static dev.langchain4j.model.openai.OpenAiChatModelName.GPT_4_O_MINI;

ChatModelListener listener = new ChatModelListener() {

    @Override
    public void onRequest(ChatModelRequestContext requestContext) {
        // Called before the request is sent to the LLM
        ChatModelRequest request = requestContext.request();
        Map<Object, Object> attributes = requestContext.attributes();
        ...
    }

    @Override
    public void onResponse(ChatModelResponseContext responseContext) {
        // Called after a successful response is received from the LLM
        ChatModelResponse response = responseContext.response();
        ChatModelRequest request = responseContext.request();
        Map<Object, Object> attributes = responseContext.attributes();
        ...
    }

    @Override
    public void onError(ChatModelErrorContext errorContext) {
        // Called when the request to the LLM fails
        Throwable error = errorContext.error();
        ChatModelRequest request = errorContext.request();
        ChatModelResponse partialResponse = errorContext.partialResponse();
        Map<Object, Object> attributes = errorContext.attributes();
        ...
    }
};

ChatLanguageModel model = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName(GPT_4_O_MINI)
        .listeners(List.of(listener))
        .build();

model.generate("Tell me a joke about Java");
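Inside these callbacks, the request and response objects expose the attributes listed above. As an illustration, here is a minimal sketch of a listener that surfaces them under OpenTelemetry GenAI attribute names; the accessor names used (model(), temperature(), tokenUsage(), and so on) are assumed to match your langchain4j version, and the sketch simply prints the values instead of recording them on a real span:

// A sketch: logs request/response data under OpenTelemetry GenAI attribute names.
// A real setup would record these on a Span instead of printing them.
ChatModelListener otelStyleListener = new ChatModelListener() {

    @Override
    public void onRequest(ChatModelRequestContext requestContext) {
        ChatModelRequest request = requestContext.request();
        System.out.println("gen_ai.request.model = " + request.model());
        System.out.println("gen_ai.request.temperature = " + request.temperature());
        System.out.println("gen_ai.request.top_p = " + request.topP());
        System.out.println("gen_ai.request.max_tokens = " + request.maxTokens());
    }

    @Override
    public void onResponse(ChatModelResponseContext responseContext) {
        ChatModelResponse response = responseContext.response();
        System.out.println("gen_ai.response.id = " + response.id());
        System.out.println("gen_ai.response.model = " + response.model());
        System.out.println("gen_ai.usage.input_tokens = " + response.tokenUsage().inputTokenCount());
        System.out.println("gen_ai.usage.output_tokens = " + response.tokenUsage().outputTokenCount());
        System.out.println("gen_ai.response.finish_reasons = " + response.finishReason());
    }

    @Override
    public void onError(ChatModelErrorContext errorContext) {
        System.out.println("error.type = " + errorContext.error().getClass().getName());
    }
};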
The attributes map allows passing information between the onRequest, onResponse, and onError methods.
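For instance, a listener can store a timestamp in onRequest and read it back in onResponse to measure the latency of the LLM call. A minimal sketch; the "start-time" key is an arbitrary choice, not a predefined constant:

// A sketch: uses the attributes map to carry the request start time
// from onRequest to onResponse.
ChatModelListener timingListener = new ChatModelListener() {

    @Override
    public void onRequest(ChatModelRequestContext requestContext) {
        // Record when the request was sent, under an arbitrary key
        requestContext.attributes().put("start-time", System.nanoTime());
    }

    @Override
    public void onResponse(ChatModelResponseContext responseContext) {
        // The same attributes instance is available here
        long startTime = (long) responseContext.attributes().get("start-time");
        long elapsedMillis = (System.nanoTime() - startTime) / 1_000_000;
        System.out.println("LLM call took " + elapsedMillis + " ms");
    }

    @Override
    public void onError(ChatModelErrorContext errorContext) {
        // The attributes map is also accessible when the call fails
    }
};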