Class AzureOpenAiTokenizer

java.lang.Object
dev.langchain4j.model.azure.AzureOpenAiTokenizer
All Implemented Interfaces:
Tokenizer

public class AzureOpenAiTokenizer extends Object implements Tokenizer
This class can be used to estimate the cost (in tokens) before calling OpenAI or when using streaming. The magic numbers in this class were determined empirically during testing. Integration tests verify that the calculations here stay very close to OpenAI's own.
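One typical use of such an estimate is a budget check before a request is sent. The following is a minimal sketch under stated assumptions, not part of langchain4j: TokenBudgetCheck, fitsBudget and the whitespace-based countWords stand-in are hypothetical names chosen for illustration. With the library on the classpath, the counter argument could presumably be new AzureOpenAiTokenizer()::estimateTokenCountInText.

```java
import java.util.function.ToIntFunction;

/** Illustrative sketch: decide whether a prompt fits a token budget before sending it. */
final class TokenBudgetCheck {

    /**
     * Returns true when the estimated prompt size plus the tokens reserved
     * for the reply stays within the model's context window.
     *
     * @param counter any token estimator (hypothetical parameter); a real tokenizer's
     *                estimateTokenCountInText method could be passed as a method reference
     */
    static boolean fitsBudget(String prompt, int contextWindow, int reservedForReply,
                              ToIntFunction<String> counter) {
        int promptTokens = counter.applyAsInt(prompt);
        return promptTokens + reservedForReply <= contextWindow;
    }

    /** Crude stand-in estimator that counts whitespace-separated words. NOT a real tokenizer. */
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}
```

Passing the estimator as a function keeps the check independent of any particular tokenizer, so the stand-in can be swapped for a real one without changing fitsBudget.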
  • Constructor Details

    • AzureOpenAiTokenizer

      public AzureOpenAiTokenizer()
      Creates an instance of the AzureOpenAiTokenizer for the "gpt-3.5-turbo" model. It should be suitable for most OpenAI models, since most of them (GPT-4o being an exception) use the same cl100k_base encoding.
    • AzureOpenAiTokenizer

      public AzureOpenAiTokenizer(AzureOpenAiChatModelName modelName)
      Creates an instance of the AzureOpenAiTokenizer for a given AzureOpenAiChatModelName.
    • AzureOpenAiTokenizer

      public AzureOpenAiTokenizer(AzureOpenAiEmbeddingModelName modelName)
      Creates an instance of the AzureOpenAiTokenizer for a given AzureOpenAiEmbeddingModelName.
    • AzureOpenAiTokenizer

      public AzureOpenAiTokenizer(AzureOpenAiLanguageModelName modelName)
      Creates an instance of the AzureOpenAiTokenizer for a given AzureOpenAiLanguageModelName.
    • AzureOpenAiTokenizer

      public AzureOpenAiTokenizer(String modelName)
      Creates an instance of the AzureOpenAiTokenizer for a given model name.
  • Method Details

    • estimateTokenCountInText

      public int estimateTokenCountInText(String text)
      Description copied from interface: Tokenizer
      Estimates the count of tokens in the given text.
      Specified by:
      estimateTokenCountInText in interface Tokenizer
      Parameters:
      text - the text.
      Returns:
      the estimated count of tokens.
    • estimateTokenCountInMessage

      public int estimateTokenCountInMessage(ChatMessage message)
      Description copied from interface: Tokenizer
      Estimates the count of tokens in the given message.
      Specified by:
      estimateTokenCountInMessage in interface Tokenizer
      Parameters:
      message - the message.
      Returns:
      the estimated count of tokens.
    • estimateTokenCountInMessages

      public int estimateTokenCountInMessages(Iterable<ChatMessage> messages)
      Description copied from interface: Tokenizer
      Estimates the count of tokens in the given messages.
      Specified by:
      estimateTokenCountInMessages in interface Tokenizer
      Parameters:
      messages - the messages.
      Returns:
      the estimated count of tokens.
    • estimateTokenCountInToolSpecifications

      public int estimateTokenCountInToolSpecifications(Iterable<ToolSpecification> toolSpecifications)
      Description copied from interface: Tokenizer
      Estimates the count of tokens in the given tool specifications.
      Specified by:
      estimateTokenCountInToolSpecifications in interface Tokenizer
      Parameters:
      toolSpecifications - the tool specifications.
      Returns:
      the estimated count of tokens.
    • estimateTokenCountInForcefulToolSpecification

      public int estimateTokenCountInForcefulToolSpecification(ToolSpecification toolSpecification)
      Description copied from interface: Tokenizer
      Estimates the count of tokens in the given tool specification.
      Specified by:
      estimateTokenCountInForcefulToolSpecification in interface Tokenizer
      Parameters:
      toolSpecification - the tool specification.
      Returns:
      the estimated count of tokens.
    • encode

      public List<Integer> encode(String text)
    • encode

      public List<Integer> encode(String text, int maxTokensToEncode)
    • decode

      public String decode(List<Integer> tokens)
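The encode and decode methods carry no description on this page. One plausible use, assuming that encode(text, maxTokensToEncode) returns at most maxTokensToEncode leading token ids and that decode reverses encode, is cutting a text down to a token limit. The sketch below is self-contained and hypothetical: TokenCodec only mirrors the three signatures listed above, and CodePointCodec is a toy that maps one Unicode code point to one token, which is not how OpenAI encodings work.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical interface mirroring the encode/decode signatures listed above. */
interface TokenCodec {
    List<Integer> encode(String text);
    List<Integer> encode(String text, int maxTokensToEncode);
    String decode(List<Integer> tokens);
}

/** Toy codec for illustration: one token per Unicode code point. */
final class CodePointCodec implements TokenCodec {
    @Override
    public List<Integer> encode(String text) {
        return text.codePoints().boxed().collect(Collectors.toList());
    }

    @Override
    public List<Integer> encode(String text, int maxTokensToEncode) {
        // Assumed semantics: keep only the first maxTokensToEncode tokens.
        return text.codePoints().limit(maxTokensToEncode).boxed().collect(Collectors.toList());
    }

    @Override
    public String decode(List<Integer> tokens) {
        StringBuilder sb = new StringBuilder();
        tokens.forEach(sb::appendCodePoint);
        return sb.toString();
    }
}

final class TokenTruncation {
    /** Keeps at most maxTokens leading tokens of the text and turns them back into a string. */
    static String truncate(TokenCodec codec, String text, int maxTokens) {
        return codec.decode(codec.encode(text, maxTokens));
    }
}
```

With the toy codec, truncate(new CodePointCodec(), "hello world", 5) yields "hello". With a real byte-pair encoding the cut falls on a token boundary instead, so the result may end in the middle of a word.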
    • estimateTokenCountInToolExecutionRequests

      public int estimateTokenCountInToolExecutionRequests(Iterable<ToolExecutionRequest> toolExecutionRequests)
      Description copied from interface: Tokenizer
      Estimates the count of tokens in the given tool execution requests.
      Specified by:
      estimateTokenCountInToolExecutionRequests in interface Tokenizer
      Parameters:
      toolExecutionRequests - the tool execution requests.
      Returns:
      the estimated count of tokens.
    • estimateTokenCountInForcefulToolExecutionRequest

      public int estimateTokenCountInForcefulToolExecutionRequest(ToolExecutionRequest toolExecutionRequest)
      Description copied from interface: Tokenizer
      Estimates the count of tokens in the given tool execution request.
      Specified by:
      estimateTokenCountInForcefulToolExecutionRequest in interface Tokenizer
      Parameters:
      toolExecutionRequest - the tool execution request.
      Returns:
      the estimated count of tokens.
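A message-level estimator such as estimateTokenCountInMessages can be used to keep a conversation history within a token budget. The sketch below is an assumption-laden illustration rather than library code: HistoryTrimmer, trimToBudget and the totalLength stand-in are hypothetical names. The estimator is passed as a function, so with langchain4j on the classpath it could presumably be a method reference to estimateTokenCountInMessages with ChatMessage as the element type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Illustrative sketch: shrink a message history until its estimate fits a budget. */
final class HistoryTrimmer {

    /**
     * Drops the oldest messages one by one until the estimate for the
     * remaining messages is at most maxTokens. The input list is not modified.
     */
    static <M> List<M> trimToBudget(List<M> messages, int maxTokens,
                                    ToIntFunction<Iterable<M>> estimator) {
        List<M> kept = new ArrayList<>(messages);
        while (!kept.isEmpty() && estimator.applyAsInt(kept) > maxTokens) {
            kept.remove(0); // discard the oldest message first
        }
        return kept;
    }

    /** Stand-in estimator that sums string lengths. NOT a real token count. */
    static int totalLength(Iterable<String> messages) {
        int total = 0;
        for (String message : messages) {
            total += message.length();
        }
        return total;
    }
}
```

This naive loop re-estimates the whole list on every pass and would also drop a leading system message; a production version would likely pin such messages and cache per-message estimates.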