Class AzureCosmosDbMongoVCoreEmbeddingStore

java.lang.Object
dev.langchain4j.store.embedding.azure.cosmos.mongo.vcore.AzureCosmosDbMongoVCoreEmbeddingStore
All Implemented Interfaces:
EmbeddingStore<TextSegment>

public class AzureCosmosDbMongoVCoreEmbeddingStore extends Object implements EmbeddingStore<TextSegment>
Represents an Azure CosmosDB Mongo vCore as an embedding store.

More information is available on setting up MongoDB as a vector database.

  • Constructor Details

    • AzureCosmosDbMongoVCoreEmbeddingStore

      public AzureCosmosDbMongoVCoreEmbeddingStore(com.mongodb.client.MongoClient mongoClient, String connectionString, String databaseName, String collectionName, String indexName, String applicationName, com.mongodb.client.model.CreateCollectionOptions createCollectionOptions, Boolean createIndex, String kind, Integer numLists, Integer dimensions, Integer m, Integer efConstruction, Integer efSearch)
      Parameters:
      mongoClient - the MongoClient used to connect to Azure CosmosDB Mongo vCore
      connectionString - the connection string required to connect to Azure CosmosDB Mongo vCore
      databaseName - the database name in Mongo vCore
      collectionName - the collection name in Mongo vCore
      indexName - the index name for the Mongo vCore collection
      applicationName - the application name for the client, used for tracking and logging
      createCollectionOptions - options for creating the collection
      createIndex - set to true if the application should create the index, or false if you want to create it manually
      kind - the type of vector index to create. Possible options are:
        - vector-ivf
        - vector-hnsw: available as a preview feature only; to enable it, visit https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/preview-features
      numLists - the number of clusters that the inverted file (IVF) index uses to group the vector data. The recommended value is documentCount/1000 for up to 1 million documents and sqrt(documentCount) for more than 1 million documents. A numLists value of 1 is akin to performing a brute-force search, which has limited performance.
      dimensions - the number of dimensions for vector similarity. The maximum number of supported dimensions is 2000.
      m - used only for vector-hnsw. The maximum number of connections per layer (16 by default, minimum value is 2, maximum value is 100). A higher m is suitable for datasets with high dimensionality and/or high accuracy requirements.
      efConstruction - used only for vector-hnsw. The size of the dynamic candidate list for constructing the graph (64 by default, minimum value is 4, maximum value is 1000). A higher efConstruction results in better index quality and higher accuracy, but also increases the time required to build the index. efConstruction has to be at least 2 * m.
      efSearch - used only for vector-hnsw. The size of the dynamic candidate list for search (40 by default). A higher value provides better recall at the cost of speed.
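      The numLists recommendation and the vector-hnsw limits described above can be sketched as a small helper. This is only an illustration of the documented rules: the class VectorIndexParams and its methods are hypothetical and are not part of langchain4j or the MongoDB driver. Clamping the numLists result to a minimum of 1 is an assumption made here so that small collections still get a usable value.

```java
// Hypothetical helper, NOT part of langchain4j: it only encodes the rules
// quoted in the parameter descriptions above.
final class VectorIndexParams {

    private VectorIndexParams() {
    }

    /**
     * numLists heuristic for vector-ivf: documentCount / 1000 for up to
     * 1 million documents, sqrt(documentCount) above that.
     * Assumption: the result is clamped to at least 1.
     */
    static int recommendedNumLists(long documentCount) {
        if (documentCount < 0) {
            throw new IllegalArgumentException("documentCount must not be negative");
        }
        long lists = documentCount <= 1_000_000L
                ? documentCount / 1000
                : (long) Math.sqrt((double) documentCount);
        // Keep the value inside the int range expected by the constructor.
        return (int) Math.min(Integer.MAX_VALUE, Math.max(1L, lists));
    }

    /**
     * Checks the documented vector-hnsw limits:
     * 2 <= m <= 100, 4 <= efConstruction <= 1000, efConstruction >= 2 * m.
     */
    static void validateHnsw(int m, int efConstruction) {
        if (m < 2 || m > 100) {
            throw new IllegalArgumentException("m must be between 2 and 100");
        }
        if (efConstruction < 4 || efConstruction > 1000) {
            throw new IllegalArgumentException("efConstruction must be between 4 and 1000");
        }
        if (efConstruction < 2 * m) {
            throw new IllegalArgumentException("efConstruction must be at least 2 * m");
        }
    }
}
```

      A caller might compute numLists with recommendedNumLists(documentCount) before choosing vector-ivf, or call validateHnsw(m, efConstruction) before choosing vector-hnsw, and then pass the values to the constructor. Whether the store itself performs equivalent checks is not stated on this page.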
  • Method Details