java.lang.Object

dev.langchain4j.service.AiServiceTokenStream

All Implemented Interfaces: TokenStream

public class AiServiceTokenStream extends Object implements TokenStream

Constructor Summary

| Constructor | Description |
| --- | --- |
| `AiServiceTokenStream(AiServiceTokenStreamParameters parameters)` | Creates a new instance of AiServiceTokenStream with the given parameters. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| `TokenStream` | `beforeToolExecution(Consumer<BeforeToolExecution> beforeToolExecutionHandler)` | The provided consumer will be invoked right before a tool is executed. |
| `TokenStream` | `ignoreErrors()` | All errors during streaming will be ignored (but will be logged with a WARN log level). |
| `TokenStream` | `onCompleteResponse(Consumer<ChatResponse> completionHandler)` | The provided consumer will be invoked when a language model finishes streaming the final chat response, as opposed to the intermediate response (see TokenStream.onIntermediateResponse(Consumer)). |
| `TokenStream` | `onError(Consumer<Throwable> errorHandler)` | The provided consumer will be invoked when an error occurs during streaming. |
| `TokenStream` | `onIntermediateResponse(Consumer<ChatResponse> intermediateResponseHandler)` | The provided consumer will be invoked when a language model finishes streaming the intermediate chat response, as opposed to the final response (see TokenStream.onCompleteResponse(Consumer)). |
| `TokenStream` | `onPartialResponse(Consumer<String> partialResponseHandler)` | The provided consumer will be invoked every time a new partial textual response (usually a single token) from a language model is available. |
| `TokenStream` | `onPartialResponseWithContext(BiConsumer<PartialResponse, PartialResponseContext> handler)` | The provided consumer will be invoked every time a new partial textual response (usually a single token) from a language model is available. |
| `TokenStream` | `onPartialThinking(Consumer<PartialThinking> partialThinkingHandler)` | The provided consumer will be invoked every time a new partial thinking/reasoning text (usually a single token) from a language model is available. |
| `TokenStream` | `onPartialThinkingWithContext(BiConsumer<PartialThinking, PartialThinkingContext> handler)` | The provided consumer will be invoked every time a new partial thinking/reasoning text (usually a single token) from a language model is available. |
| `TokenStream` | `onRetrieved(Consumer<List<Content>> contentsHandler)` | The provided consumer will be invoked if any Contents are retrieved using RetrievalAugmentor. |
| `TokenStream` | `onToolExecuted(Consumer<ToolExecution> toolExecutionHandler)` | The provided consumer will be invoked right after a tool is executed. |
| `void` | `start()` | Completes the current token stream building and starts processing. |

Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
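
The methods summarized above follow a configure-then-start shape: each handler registration returns the token stream, and start() begins processing. The sketch below illustrates only that shape. SketchTokenStream is a hypothetical stand-in that replays a fixed token list instead of calling a model; it is not the langchain4j class, its completion handler takes a String where the real one takes a ChatResponse, and its error routing is an assumption made for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class TokenStreamSketchDemo {

    // Registers three handlers, starts the stand-in stream, and returns the observed events.
    static List<String> run(List<String> tokens) {
        List<String> events = new ArrayList<>();
        new SketchTokenStream(tokens)
                .onPartialResponse(token -> events.add("partial:" + token))
                .onCompleteResponse(text -> events.add("complete:" + text))
                .onError(error -> events.add("error:" + error.getMessage()))
                .start();
        return events;
    }

    public static void main(String[] args) {
        run(List.of("Hel", "lo")).forEach(System.out::println);
    }
}

// Hypothetical stand-in, NOT the langchain4j implementation.
final class SketchTokenStream {
    private final List<String> tokens;
    private Consumer<String> partialHandler = token -> { };
    private Consumer<String> completeHandler = text -> { };
    private Consumer<Throwable> errorHandler = error -> { };

    SketchTokenStream(List<String> tokens) {
        this.tokens = tokens;
    }

    // Each registration method returns this instance so that calls can be chained.
    SketchTokenStream onPartialResponse(Consumer<String> handler) {
        this.partialHandler = handler;
        return this;
    }

    SketchTokenStream onCompleteResponse(Consumer<String> handler) {
        this.completeHandler = handler;
        return this;
    }

    SketchTokenStream onError(Consumer<Throwable> handler) {
        this.errorHandler = handler;
        return this;
    }

    // Ends configuration and replays the tokens to the registered handlers.
    void start() {
        StringBuilder full = new StringBuilder();
        try {
            for (String token : tokens) {
                partialHandler.accept(token);
                full.append(token);
            }
        } catch (RuntimeException e) {
            // Assumption of this sketch: a failure while replaying goes to the error handler.
            errorHandler.accept(e);
            return;
        }
        completeHandler.accept(full.toString());
    }
}
```

Nothing is delivered to any handler until start() is called, which is why it returns void and ends the chain.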

Constructor Details
- AiServiceTokenStream
  
  public AiServiceTokenStream(AiServiceTokenStreamParameters parameters)
  
  Creates a new instance of AiServiceTokenStream with the given parameters.
  
  Parameters:
  
  parameters - the parameters for creating the token stream
Method Details
- onPartialResponse
  public TokenStream onPartialResponse(Consumer<String> partialResponseHandler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked every time a new partial textual response (usually a single token) from a language model is available.
  Either this or the TokenStream.onPartialResponseWithContext(BiConsumer) callback can be used if you want to consume tokens as soon as they become available.
  
  Specified by:
  
  onPartialResponse in interface TokenStream
  
  Parameters:
  
  partialResponseHandler - lambda that will be invoked when a model generates a new partial textual response
  
  Returns:
  
  token stream instance used to configure or start stream processing
  
  See Also:
  
  TokenStream.onPartialResponseWithContext(BiConsumer)
- onPartialResponseWithContext
  public TokenStream onPartialResponseWithContext(BiConsumer<PartialResponse, PartialResponseContext> handler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked every time a new partial textual response (usually a single token) from a language model is available.
  Either this or the TokenStream.onPartialResponse(Consumer) callback can be used if you want to consume tokens as soon as they become available.
  
  Specified by:
  
  onPartialResponseWithContext in interface TokenStream
  
  Parameters:
  
  handler - lambda that will be invoked when a model generates a new partial textual response
  
  Returns:
  
  token stream instance used to configure or start stream processing
  
  See Also:
  
  TokenStream.onPartialResponse(Consumer)
- onPartialThinking
  public TokenStream onPartialThinking(Consumer<PartialThinking> partialThinkingHandler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked every time a new partial thinking/reasoning text (usually a single token) from a language model is available.
  Either this or the TokenStream.onPartialThinkingWithContext(BiConsumer) callback can be used if you want to consume thinking tokens as soon as they become available.
  
  Specified by:
  
  onPartialThinking in interface TokenStream
  
  Parameters:
  
  partialThinkingHandler - lambda that will be invoked when a model generates a new partial thinking/reasoning text
  
  Returns:
  
  token stream instance used to configure or start stream processing
  
  See Also:
  
  TokenStream.onPartialThinkingWithContext(BiConsumer)
- onPartialThinkingWithContext
  public TokenStream onPartialThinkingWithContext(BiConsumer<PartialThinking, PartialThinkingContext> handler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked every time a new partial thinking/reasoning text (usually a single token) from a language model is available.
  Either this or the TokenStream.onPartialThinking(Consumer) callback can be used if you want to consume thinking tokens as soon as they become available.
  
  Specified by:
  
  onPartialThinkingWithContext in interface TokenStream
  
  Parameters:
  
  handler - lambda that will be invoked when a model generates a new partial thinking/reasoning text
  
  Returns:
  
  token stream instance used to configure or start stream processing
  
  See Also:
  
  TokenStream.onPartialThinking(Consumer)
- onRetrieved
  
  public TokenStream onRetrieved(Consumer<List<Content>> contentsHandler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked if any Contents are retrieved using RetrievalAugmentor.
  The invocation happens before any call is made to the language model.
  
  Specified by:
  
  onRetrieved in interface TokenStream
  
  Parameters:
  
  contentsHandler - lambda that consumes all retrieved contents
  
  Returns:
  
  token stream instance used to configure or start stream processing
- onIntermediateResponse
  public TokenStream onIntermediateResponse(Consumer<ChatResponse> intermediateResponseHandler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked when a language model finishes streaming the intermediate chat response, as opposed to the final response (see TokenStream.onCompleteResponse(Consumer)). Intermediate chat responses contain ToolExecutionRequests; the AI service executes them after this consumer returns.
  
  Specified by:
  
  onIntermediateResponse in interface TokenStream
  
  Parameters:
  
  intermediateResponseHandler - lambda that consumes intermediate chat responses
  
  Returns:
  
  token stream instance used to configure or start stream processing
  
  See Also:
  
  TokenStream.onCompleteResponse(Consumer)
- beforeToolExecution
  
  public TokenStream beforeToolExecution(Consumer<BeforeToolExecution> beforeToolExecutionHandler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked right before a tool is executed.
  
  Specified by:
  
  beforeToolExecution in interface TokenStream
  
  Parameters:
  
  beforeToolExecutionHandler - lambda that consumes BeforeToolExecution
  
  Returns:
  
  token stream instance used to configure or start stream processing
- onToolExecuted
  
  public TokenStream onToolExecuted(Consumer<ToolExecution> toolExecutionHandler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked right after a tool is executed.
  The invocation happens after the tool method has finished and before any other tool is executed.
  
  Specified by:
  
  onToolExecuted in interface TokenStream
  
  Parameters:
  
  toolExecutionHandler - lambda that consumes ToolExecution
  
  Returns:
  
  token stream instance used to configure or start stream processing
- onCompleteResponse
  public TokenStream onCompleteResponse(Consumer<ChatResponse> completionHandler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked when a language model finishes streaming the final chat response, as opposed to the intermediate response (see TokenStream.onIntermediateResponse(Consumer)).
  Please note that ChatResponse.tokenUsage() contains the aggregate token usage across all calls to the LLM. It is the sum of the ChatResponse.tokenUsage() values of all intermediate responses (see TokenStream.onIntermediateResponse(Consumer)).
  
  Specified by:
  
  onCompleteResponse in interface TokenStream
  
  Parameters:
  
  completionHandler - lambda that will be invoked when the language model finishes streaming
  
  Returns:
  
  token stream instance used to configure or start stream processing
  
  See Also:
  
  TokenStream.onIntermediateResponse(Consumer)
- onError
  
  public TokenStream onError(Consumer<Throwable> errorHandler)
  
  Description copied from interface: TokenStream
  
  The provided consumer will be invoked when an error occurs during streaming.
  
  Specified by:
  
  onError in interface TokenStream
  
  Parameters:
  
  errorHandler - lambda that will be invoked when an error occurs
  
  Returns:
  
  token stream instance used to configure or start stream processing
- ignoreErrors
  
  public TokenStream ignoreErrors()
  
  Description copied from interface: TokenStream
  
  All errors during streaming will be ignored (but will be logged with a WARN log level).
  
  Specified by:
  
  ignoreErrors in interface TokenStream
  
  Returns:
  
  token stream instance used to configure or start stream processing
- start
  
  public void start()
  
  Description copied from interface: TokenStream
  
  Completes the current token stream building and starts processing.
  Sends a request to the LLM and starts streaming the response.
  
  Specified by:
  
  start in interface TokenStream
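
The ordering notes in the method details (tools run after the intermediate-response handler returns, onToolExecuted fires before any other tool runs, and the final response carries aggregate token usage) can be sketched as a plain event trace. Everything here is a hypothetical illustration: the Usage class stands in for the library's token usage type, the tool names and token counts are made up, and the trace assumes tools execute one after another.

```java
import java.util.ArrayList;
import java.util.List;

final class StreamingOrderSketch {

    // Hypothetical per-call usage figures; not the langchain4j token usage class.
    static final class Usage {
        final int inputTokens;
        final int outputTokens;

        Usage(int inputTokens, int outputTokens) {
            this.inputTokens = inputTokens;
            this.outputTokens = outputTokens;
        }

        Usage plus(Usage other) {
            return new Usage(inputTokens + other.inputTokens, outputTokens + other.outputTokens);
        }

        int total() {
            return inputTokens + outputTokens;
        }
    }

    // Replays one round trip: an intermediate response that requests tools,
    // sequential tool execution, then the final response.
    static List<String> simulate(List<String> toolNames, Usage firstCall, Usage secondCall) {
        List<String> events = new ArrayList<>();
        events.add("onIntermediateResponse");          // this handler returns before any tool runs
        for (String tool : toolNames) {
            events.add("beforeToolExecution:" + tool); // right before the tool
            events.add("tool runs:" + tool);
            events.add("onToolExecuted:" + tool);      // before the next tool starts
        }
        // The final response reports usage summed across both calls to the model.
        Usage aggregate = firstCall.plus(secondCall);
        events.add("onCompleteResponse:totalTokens=" + aggregate.total());
        return events;
    }

    public static void main(String[] args) {
        simulate(List.of("getWeather", "getTime"), new Usage(40, 12), new Usage(75, 20))
                .forEach(System.out::println);
    }
}
```

With the made-up figures above, the two calls use 52 and 95 tokens, so the final event reports a total of 147.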
