dev.langchain4j.classification.EmbeddingModelTextClassifier<L>

Type Parameters: L - The type of the label (e.g., String or an Enum)

All Implemented Interfaces: TextClassifier<L>

public class EmbeddingModelTextClassifier<L> extends Object implements TextClassifier<L>

A TextClassifier that uses an EmbeddingModel and predefined examples to perform classification. Classification is done by comparing the embedding of the text being classified with the embeddings of the predefined examples. Classification quality generally improves as more examples are provided for each label. Examples can be easily generated with the help of an LLM.

Example:

 enum Sentiment {
     POSITIVE, NEUTRAL, NEGATIVE
 }

 // A few example texts per label; constants are qualified so no static import is needed
 Map<Sentiment, List<String>> examples = Map.of(
     Sentiment.POSITIVE, List.of("This is great!", "Wow, awesome!"),
     Sentiment.NEUTRAL,  List.of("Well, it's fine", "It's ok"),
     Sentiment.NEGATIVE, List.of("It is pretty bad", "Worst experience ever!")
 );

 // The same model embeds both the examples and the text to classify
 EmbeddingModel embeddingModel = new AllMiniLmL6V2QuantizedEmbeddingModel();

 TextClassifier<Sentiment> classifier = new EmbeddingModelTextClassifier<>(embeddingModel, examples);

 List<Sentiment> sentiments = classifier.classify("Awesome!");
 System.out.println(sentiments); // [POSITIVE]
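
The comparison of embeddings described above can be illustrated without any embedding model. The following is a minimal sketch, not the library's implementation: it assumes that hand-written two-dimensional vectors stand in for real embeddings, that similarity is cosine similarity rescaled to [0..1], and that the label of the single closest example wins. The class and method names (EmbeddingComparisonSketch, cosineSimilarity, relevanceScore, closestLabel) are hypothetical and exist only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EmbeddingComparisonSketch {

    // Cosine similarity of two equal-length, non-zero vectors; result is in [-1, 1].
    public static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Assumption: the similarity score is the cosine similarity rescaled to [0, 1].
    public static double relevanceScore(double[] a, double[] b) {
        return (cosineSimilarity(a, b) + 1) / 2;
    }

    // Returns the label that owns the example vector most similar to the text vector.
    public static String closestLabel(double[] text, Map<String, List<double[]>> examplesByLabel) {
        String best = null;
        double bestScore = -1;
        for (Map.Entry<String, List<double[]>> entry : examplesByLabel.entrySet()) {
            for (double[] example : entry.getValue()) {
                double score = relevanceScore(text, example);
                if (score > bestScore) {
                    bestScore = score;
                    best = entry.getKey();
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy "embeddings": real ones come from an EmbeddingModel and have many more dimensions.
        Map<String, List<double[]>> examples = new LinkedHashMap<>();
        examples.put("POSITIVE", List.of(new double[]{0.9, 0.1}, new double[]{0.8, 0.3}));
        examples.put("NEGATIVE", List.of(new double[]{0.1, 0.9}, new double[]{0.2, 0.8}));

        double[] text = {0.85, 0.2};
        System.out.println(closestLabel(text, examples)); // prints POSITIVE
    }
}
```

Picking only the closest example is the simplest possible rule; the constructor parameters documented below describe how the actual classifier combines scores across all examples of a label.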

Constructor Summary

- EmbeddingModelTextClassifier(EmbeddingModel embeddingModel, Map<L,? extends Collection<String>> examplesByLabel)
  
  Creates a classifier with the default values for maxResults (1), minScore (0) and meanToMaxScoreRatio (0.5).
- EmbeddingModelTextClassifier(EmbeddingModel embeddingModel, Map<L,? extends Collection<String>> examplesByLabel, int maxResults, double minScore, double meanToMaxScoreRatio)
  
  Creates a classifier with the given values for maxResults, minScore and meanToMaxScoreRatio.

Method Summary

- ClassificationResult<L> classifyWithScores(String text)
  
  Classifies the given text and returns labels with scores.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface dev.langchain4j.classification.TextClassifier
classify, classify, classify, classifyWithScores, classifyWithScores

Constructor Details
- EmbeddingModelTextClassifier
  
  public EmbeddingModelTextClassifier(EmbeddingModel embeddingModel, Map<L,? extends Collection<String>> examplesByLabel)
  
  Creates a classifier with the default values for maxResults (1), minScore (0) and meanToMaxScoreRatio (0.5).
  
  Parameters:
  
  embeddingModel - The embedding model used for embedding both the examples and the text to be classified.
  
  examplesByLabel - A map containing examples of texts for each label. The more examples, the better. Examples can be easily generated by an LLM.
- EmbeddingModelTextClassifier
  
  public EmbeddingModelTextClassifier(EmbeddingModel embeddingModel, Map<L,? extends Collection<String>> examplesByLabel, int maxResults, double minScore, double meanToMaxScoreRatio)
  
  Creates a classifier with the given values for maxResults, minScore and meanToMaxScoreRatio.
  
  Parameters:
  
  embeddingModel - The embedding model used for embedding both the examples and the text to be classified.
  
  examplesByLabel - A map containing examples of texts for each label. The more examples, the better. Examples can be easily generated by an LLM.
  
  maxResults - The maximum number of labels to return for each classification.
  
  minScore - The minimum similarity score required for classification, in the range [0..1]. Labels scoring lower than this value will be discarded.
  
  meanToMaxScoreRatio - A ratio, in the range [0..1], between the mean and max scores used for calculating the final score. During classification, the embeddings of examples for each label are compared to the embedding of the text being classified. This results in two metrics: the mean and max scores. The mean score is the average similarity score for all examples associated with a given label. The max score is the highest similarity score, corresponding to the example most similar to the text being classified. A value of 0 means that only the mean score will be used for ranking labels. A value of 0.5 means that both scores will contribute equally to the final score. A value of 1 means that only the max score will be used for ranking labels.
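
How maxResults, minScore and meanToMaxScoreRatio interact can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the linear blend in blend(...) is simply one formula consistent with the description above (0 gives the mean only, 0.5 weights both equally, 1 gives the max only), the per-example similarity scores are made up, and the names LabelScoringSketch, blend and rank are hypothetical. Each label is assumed to have at least one example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LabelScoringSketch {

    // Assumed blend: ratio 0 -> mean only, 0.5 -> equal weight, 1 -> max only.
    public static double blend(double mean, double max, double meanToMaxScoreRatio) {
        return (1 - meanToMaxScoreRatio) * mean + meanToMaxScoreRatio * max;
    }

    // Scores each label, drops labels below minScore, and keeps the top maxResults.
    public static List<Map.Entry<String, Double>> rank(
            Map<String, double[]> similarityScoresByLabel,
            int maxResults, double minScore, double meanToMaxScoreRatio) {

        List<Map.Entry<String, Double>> ranked = new ArrayList<>();
        for (Map.Entry<String, double[]> entry : similarityScoresByLabel.entrySet()) {
            double sum = 0;
            double max = 0;
            for (double score : entry.getValue()) {
                sum += score;
                max = Math.max(max, score);
            }
            double mean = sum / entry.getValue().length;
            double finalScore = blend(mean, max, meanToMaxScoreRatio);
            if (finalScore >= minScore) {
                ranked.add(Map.entry(entry.getKey(), finalScore));
            }
        }
        // Highest final score first, then cut the list to at most maxResults labels.
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return ranked.subList(0, Math.min(maxResults, ranked.size()));
    }

    public static void main(String[] args) {
        // Made-up similarity scores between one text and each label's examples.
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("POSITIVE", new double[]{0.9, 0.7});
        scores.put("NEUTRAL", new double[]{0.6, 0.5});
        scores.put("NEGATIVE", new double[]{0.3, 0.2});

        // POSITIVE ranks first, then NEUTRAL; NEGATIVE falls below minScore and is dropped.
        for (Map.Entry<String, Double> result : rank(scores, 2, 0.5, 0.5)) {
            System.out.println(result.getKey() + " " + result.getValue());
        }
    }
}
```

In this sketch, raising meanToMaxScoreRatio favors labels with one very close example, while lowering it favors labels whose examples are all reasonably close to the text.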
Method Details
- classifyWithScores
  
  public ClassificationResult<L> classifyWithScores(String text)
  
  Description copied from interface: TextClassifier
  
  Classifies the given text and returns labels with scores.
  
  Specified by:
  
  classifyWithScores in interface TextClassifier<L>
  
  Parameters:
  
  text - Text to classify.
  
  Returns:
  
  a result object containing a list of labels with corresponding scores. Can contain zero, one, or multiple labels.
