Class AiServiceTokenStream
- All Implemented Interfaces:
  - TokenStream
Constructor Summary

| Constructor | Description |
|---|---|
| AiServiceTokenStream(AiServiceTokenStreamParameters parameters) | Creates a new instance of AiServiceTokenStream with the given parameters. |
Method Summary

| Modifier and Type | Method | Description |
|---|---|---|
| TokenStream | beforeToolExecution(Consumer<BeforeToolExecution> beforeToolExecutionHandler) | The provided consumer will be invoked right before a tool is executed. |
| TokenStream | ignoreErrors() | All errors during streaming will be ignored (but will be logged with a WARN log level). |
| TokenStream | onCompleteResponse(Consumer<ChatResponse> completionHandler) | The provided consumer will be invoked when a language model finishes streaming the final chat response, as opposed to an intermediate response (see TokenStream.onIntermediateResponse(Consumer)). |
| TokenStream | onError(Consumer<Throwable> errorHandler) | The provided consumer will be invoked when an error occurs during streaming. |
| TokenStream | onIntermediateResponse(Consumer<ChatResponse> intermediateResponseHandler) | The provided consumer will be invoked when a language model finishes streaming an intermediate chat response, as opposed to the final response (see TokenStream.onCompleteResponse(Consumer)). |
| TokenStream | onPartialResponse(Consumer<String> partialResponseHandler) | The provided consumer will be invoked every time a new partial textual response (usually a single token) from a language model is available. |
| TokenStream | onPartialResponseWithContext(BiConsumer<PartialResponse, PartialResponseContext> handler) | The provided consumer will be invoked every time a new partial textual response (usually a single token) from a language model is available. |
| TokenStream | onPartialThinking(Consumer<PartialThinking> partialThinkingHandler) | The provided consumer will be invoked every time new partial thinking/reasoning text (usually a single token) from a language model is available. |
| TokenStream | onPartialThinkingWithContext(BiConsumer<PartialThinking, PartialThinkingContext> handler) | The provided consumer will be invoked every time new partial thinking/reasoning text (usually a single token) from a language model is available. |
| TokenStream | onRetrieved(Consumer<List<Content>> contentsHandler) | The provided consumer will be invoked if any Contents are retrieved using RetrievalAugmentor. |
| TokenStream | onToolExecuted(Consumer<ToolExecution> toolExecutionHandler) | The provided consumer will be invoked right after a tool is executed. |
| void | start() | Completes the current token stream building and starts processing. |
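
To show how these builder-style methods fit together, here is a minimal, hedged sketch. The `Assistant` interface and the `assistant` instance are hypothetical stand-ins for an AI service whose method returns a `TokenStream`; only the `TokenStream` methods themselves come from this class:

```java
import dev.langchain4j.service.TokenStream;

// Hypothetical AI service interface; an implementation would normally be
// built by AiServices and return a TokenStream for streaming calls.
interface Assistant {
    TokenStream chat(String userMessage);
}

// Handlers are registered fluently; no request is sent until start().
TokenStream stream = assistant.chat("Tell me a joke");
stream.onPartialResponse(System.out::print)
        .onCompleteResponse(response -> System.out.println("\n[done]"))
        .onError(Throwable::printStackTrace)
        .start();
```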
Constructor Details

AiServiceTokenStream

public AiServiceTokenStream(AiServiceTokenStreamParameters parameters)

Creates a new instance of AiServiceTokenStream with the given parameters.
- Parameters:
  - parameters - the parameters for creating the token stream
Method Details
onPartialResponse

public TokenStream onPartialResponse(Consumer<String> partialResponseHandler)

Description copied from interface: TokenStream
The provided consumer will be invoked every time a new partial textual response (usually a single token) from a language model is available. Either this or the TokenStream.onPartialResponseWithContext(BiConsumer) callback can be used if you want to consume tokens as soon as they become available.
- Specified by:
  - onPartialResponse in interface TokenStream
- Parameters:
  - partialResponseHandler - lambda that will be invoked when a model generates a new partial textual response
- Returns:
  - token stream instance used to configure or start stream processing
- See Also:
  - TokenStream.onPartialResponseWithContext(BiConsumer)
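
A hedged sketch of the typical use: append each partial response to a buffer while echoing it. The `tokenStream` variable is assumed to have been returned by an AI service call:

```java
StringBuilder fullText = new StringBuilder();

tokenStream.onPartialResponse(partial -> {
            fullText.append(partial);  // accumulate the streamed tokens
            System.out.print(partial); // and show them as they arrive
        })
        .onError(Throwable::printStackTrace)
        .start();
```

Because each partial response is usually a single token, the handler should stay cheap; heavier work is better deferred to onCompleteResponse(Consumer).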
onPartialResponseWithContext

public TokenStream onPartialResponseWithContext(BiConsumer<PartialResponse, PartialResponseContext> handler)

Description copied from interface: TokenStream
The provided consumer will be invoked every time a new partial textual response (usually a single token) from a language model is available. Either this or the TokenStream.onPartialResponse(Consumer) callback can be used if you want to consume tokens as soon as they become available.
- Specified by:
  - onPartialResponseWithContext in interface TokenStream
- Parameters:
  - handler - lambda that will be invoked when a model generates a new partial textual response
- Returns:
  - token stream instance used to configure or start stream processing
- See Also:
  - TokenStream.onPartialResponse(Consumer)
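
The BiConsumer variant additionally receives a context object. A sketch, assuming `PartialResponse` exposes its text via a `text()` accessor (an assumption, as is `tokenStream`):

```java
tokenStream.onPartialResponseWithContext((partial, context) -> {
            // partial.text() is assumed to return the streamed fragment;
            // PartialResponseContext is assumed to carry per-event metadata.
            System.out.print(partial.text());
        })
        .onError(Throwable::printStackTrace)
        .start();
```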
onPartialThinking

public TokenStream onPartialThinking(Consumer<PartialThinking> partialThinkingHandler)

Description copied from interface: TokenStream
The provided consumer will be invoked every time new partial thinking/reasoning text (usually a single token) from a language model is available. Either this or the TokenStream.onPartialThinkingWithContext(BiConsumer) callback can be used if you want to consume thinking tokens as soon as they become available.
- Specified by:
  - onPartialThinking in interface TokenStream
- Parameters:
  - partialThinkingHandler - lambda that will be invoked when a model generates new partial thinking/reasoning text
- Returns:
  - token stream instance used to configure or start stream processing
- See Also:
  - TokenStream.onPartialThinkingWithContext(BiConsumer)
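
A sketch that keeps thinking tokens separate from the visible answer; `PartialThinking.text()` and `tokenStream` are assumptions here:

```java
StringBuilder thinking = new StringBuilder();

tokenStream.onPartialThinking(t -> thinking.append(t.text())) // reasoning tokens, assumed accessor
        .onPartialResponse(System.out::print)                 // visible answer tokens
        .onError(Throwable::printStackTrace)
        .start();
```

The onPartialThinkingWithContext(BiConsumer) variant below follows the same shape, with an extra context argument.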
onPartialThinkingWithContext

public TokenStream onPartialThinkingWithContext(BiConsumer<PartialThinking, PartialThinkingContext> handler)

Description copied from interface: TokenStream
The provided consumer will be invoked every time new partial thinking/reasoning text (usually a single token) from a language model is available. Either this or the TokenStream.onPartialThinking(Consumer) callback can be used if you want to consume thinking tokens as soon as they become available.
- Specified by:
  - onPartialThinkingWithContext in interface TokenStream
- Parameters:
  - handler - lambda that will be invoked when a model generates new partial thinking/reasoning text
- Returns:
  - token stream instance used to configure or start stream processing
- See Also:
  - TokenStream.onPartialThinking(Consumer)
onRetrieved

public TokenStream onRetrieved(Consumer<List<Content>> contentsHandler)

Description copied from interface: TokenStream
The provided consumer will be invoked if any Contents are retrieved using RetrievalAugmentor. The invocation happens before any call is made to the language model.
- Specified by:
  - onRetrieved in interface TokenStream
- Parameters:
  - contentsHandler - lambda that consumes all retrieved contents
- Returns:
  - token stream instance used to configure or start stream processing
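
A sketch for inspecting what the RetrievalAugmentor found before the model is called; `Content.textSegment()` is the accessor assumed here, and `tokenStream` is a stand-in:

```java
import dev.langchain4j.rag.content.Content;

tokenStream.onRetrieved(contents -> {
            // Invoked once, before the LLM call, with all retrieved Contents.
            for (Content content : contents) {
                System.out.println("retrieved: " + content.textSegment().text());
            }
        })
        .onPartialResponse(System.out::print)
        .start();
```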
onIntermediateResponse

public TokenStream onIntermediateResponse(Consumer<ChatResponse> intermediateResponseHandler)

Description copied from interface: TokenStream
The provided consumer will be invoked when a language model finishes streaming an intermediate chat response, as opposed to the final response (see TokenStream.onCompleteResponse(Consumer)). Intermediate chat responses contain ToolExecutionRequests; the AI service will execute them after returning from this consumer.
- Specified by:
  - onIntermediateResponse in interface TokenStream
- Parameters:
  - intermediateResponseHandler - lambda that consumes intermediate chat responses
- Returns:
  - token stream instance used to configure or start stream processing
- See Also:
  - TokenStream.onCompleteResponse(Consumer)
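
A sketch that logs the tool calls requested in each intermediate response; treat the exact accessor names (`aiMessage()`, `toolExecutionRequests()`, `name()`) and `tokenStream` as assumptions:

```java
tokenStream.onIntermediateResponse(intermediate -> {
            // Each intermediate response may carry ToolExecutionRequests;
            // the AI service executes them after this consumer returns.
            intermediate.aiMessage().toolExecutionRequests()
                    .forEach(request -> System.out.println("tool requested: " + request.name()));
        })
        .onPartialResponse(System.out::print)
        .start();
```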
beforeToolExecution

public TokenStream beforeToolExecution(Consumer<BeforeToolExecution> beforeToolExecutionHandler)

Description copied from interface: TokenStream
The provided consumer will be invoked right before a tool is executed.
- Specified by:
  - beforeToolExecution in interface TokenStream
- Parameters:
  - beforeToolExecutionHandler - lambda that consumes BeforeToolExecution
- Returns:
  - token stream instance used to configure or start stream processing

(A combined usage sketch covering this hook together with onToolExecuted(Consumer) appears after the onToolExecuted entry below.)
onToolExecuted

public TokenStream onToolExecuted(Consumer<ToolExecution> toolExecutionHandler)

Description copied from interface: TokenStream
The provided consumer will be invoked right after a tool is executed. The invocation happens after the tool method has finished and before any other tool is executed.
- Specified by:
  - onToolExecuted in interface TokenStream
- Parameters:
  - toolExecutionHandler - lambda that consumes ToolExecution
- Returns:
  - token stream instance used to configure or start stream processing
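
A combined sketch of the two tool hooks; the accessor names `request()`, `name()`, and `result()` are assumptions about BeforeToolExecution and ToolExecution, and `tokenStream` is a stand-in:

```java
tokenStream.beforeToolExecution(before ->
                System.out.println("about to execute: " + before.request().name())) // assumed accessors
        .onToolExecuted(execution ->
                System.out.println("tool returned: " + execution.result()))          // assumed accessor
        .onPartialResponse(System.out::print)
        .start();
```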
onCompleteResponse

public TokenStream onCompleteResponse(Consumer<ChatResponse> completionHandler)

Description copied from interface: TokenStream
The provided consumer will be invoked when a language model finishes streaming the final chat response, as opposed to an intermediate response (see TokenStream.onIntermediateResponse(Consumer)).
Please note that ChatResponse.tokenUsage() contains the aggregate token usage across all calls to the LLM: it is the sum of the ChatResponse.tokenUsage() values of all intermediate responses (TokenStream.onIntermediateResponse(Consumer)).
- Specified by:
  - onCompleteResponse in interface TokenStream
- Parameters:
  - completionHandler - lambda that will be invoked when the language model finishes streaming
- Returns:
  - token stream instance used to configure or start stream processing
- See Also:
  - TokenStream.onIntermediateResponse(Consumer)
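
A sketch that reads the final message and the aggregate usage; ChatResponse.tokenUsage() sums across all intermediate responses, as noted above. `tokenStream` and `totalTokenCount()` are assumptions:

```java
tokenStream.onPartialResponse(System.out::print)
        .onCompleteResponse(response -> {
            // Aggregate usage across every call made to the LLM during this stream.
            System.out.println("\ntotal tokens: " + response.tokenUsage().totalTokenCount());
        })
        .start();
```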
onError

public TokenStream onError(Consumer<Throwable> errorHandler)

Description copied from interface: TokenStream
The provided consumer will be invoked when an error occurs during streaming.
- Specified by:
  - onError in interface TokenStream
- Parameters:
  - errorHandler - lambda that will be invoked when an error occurs
- Returns:
  - token stream instance used to configure or start stream processing
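
A sketch routing all streaming failures to one handler (the alternative being ignoreErrors() below); `tokenStream` is a stand-in:

```java
tokenStream.onPartialResponse(System.out::print)
        .onError(error -> {
            // Single place to surface streaming failures.
            System.err.println("streaming failed: " + error.getMessage());
        })
        .start();
```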
ignoreErrors

public TokenStream ignoreErrors()

Description copied from interface: TokenStream
All errors during streaming will be ignored (but will be logged with a WARN log level).
- Specified by:
  - ignoreErrors in interface TokenStream
- Returns:
  - token stream instance used to configure or start stream processing
start

public void start()

Description copied from interface: TokenStream
Completes the current token stream building and starts processing. Will send a request to the LLM and start response streaming.
- Specified by:
  - start in interface TokenStream
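
Since start() triggers the request and streaming then happens asynchronously, a common pattern, sketched here under the assumption that `tokenStream` comes from an AI service call, is to bridge completion into a CompletableFuture:

```java
import java.util.concurrent.CompletableFuture;
import dev.langchain4j.model.chat.response.ChatResponse;

CompletableFuture<ChatResponse> done = new CompletableFuture<>();

tokenStream.onPartialResponse(System.out::print)
        .onCompleteResponse(done::complete)   // resolve on the final response
        .onError(done::completeExceptionally) // or fail the future
        .start();                             // sends the request to the LLM

ChatResponse response = done.join(); // block until streaming finishes
```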