Class IVFIndexBuilder
This index builder configures an Inverted File Flat index on the embedding column of the EmbeddingTable.
Inverted File Flat (IVF) is the only type of Neighbor Partition vector index supported. An Inverted File Flat index (IVF Flat, or simply IVF) is a partition-based index that balances high search quality with reasonable speed.
-
Field Summary
Fields -
Method Summary
Method: Description
build(): Builds the index object configured by this builder.
createOption(CreateOption createOption): Configures the option to create (or not create) an index.
degreeOfParallelism(int degreeOfParallelism): Configures the degree of parallelism of the index.
minVectorsPerPartition(int minVectorsPerPartition): Configures the target minimum number of vectors per partition.
name(indexName): Sets the index name.
neighborPartitions(int neighborPartitions): Configures the number of neighbor partitions.
samplePerPartition(int samplePerPartition): Configures the total number of vectors that are passed to the clustering algorithm.
targetAccuracy(int targetAccuracy): Configures the target accuracy.
-
Field Details
-
indexName
The name of the index, or null if no name was set.
-
-
Method Details
-
targetAccuracy
Configures the target accuracy.
- Parameters:
targetAccuracy
- Percentage value.
- Returns:
- This builder.
- Throws:
IllegalArgumentException
- If the target accuracy is not between 1 and 100.
-
degreeOfParallelism
Configures the degree of parallelism of the index.
- Parameters:
degreeOfParallelism
- The degree of parallelism.
- Returns:
- This builder.
-
neighborPartitions
Configures the number of neighbor partitions.
This is an IVF-specific parameter. It determines the number of centroid partitions that are created by the index.
- Parameters:
neighborPartitions
- The number of neighbor partitions.
- Returns:
- This builder.
- Throws:
IllegalArgumentException
- If the number of neighbor partitions is not between 1 and 10000000, or if the vector type is not IVF.
-
samplePerPartition
Configures the total number of vectors that are passed to the clustering algorithm.
This is an IVF-specific parameter. It determines the total number of vectors passed to the clustering algorithm (the number of samples per partition times the number of neighbor partitions).
Note that passing all the vectors would significantly increase the total time to create the index. Instead, aim to pass a subset of vectors that captures the data distribution.
- Parameters:
samplePerPartition
- The total number of vectors that are passed to the clustering algorithm.
- Returns:
- This builder.
- Throws:
IllegalArgumentException
- If the number of samples per partition is lower than 1.
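
The multiplication described above can be sketched in a few lines. This is only an illustration of the arithmetic: `ClusteringSampleSize` and `totalSampledVectors` are hypothetical names, not part of IVFIndexBuilder, and the range checks simply mirror the limits stated in this page.

```java
// Hypothetical helper, not part of IVFIndexBuilder: it only illustrates
// how the sample size for clustering follows from the two settings.
final class ClusteringSampleSize {

    // Returns samplePerPartition * neighborPartitions, as described in the text.
    static long totalSampledVectors(int samplePerPartition, int neighborPartitions) {
        if (samplePerPartition < 1) {
            throw new IllegalArgumentException("samplePerPartition must be at least 1");
        }
        if (neighborPartitions < 1 || neighborPartitions > 10_000_000) {
            throw new IllegalArgumentException("neighborPartitions must be between 1 and 10000000");
        }
        // Widen to long before multiplying so the product cannot overflow int.
        return (long) samplePerPartition * neighborPartitions;
    }

    public static void main(String[] args) {
        // 256 samples per partition across 1000 partitions.
        System.out.println(totalSampledVectors(256, 1_000)); // prints 256000
    }
}
```

If the table holds far more rows than this product, only a subset is clustered, which is the trade-off the note above recommends.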
-
minVectorsPerPartition
public IVFIndexBuilder minVectorsPerPartition(int minVectorsPerPartition) throws IllegalArgumentException
Configures the target minimum number of vectors per partition.
This is an IVF-specific parameter. It represents the target minimum number of vectors per partition. Aim to trim out any partition that can end up with fewer than 100 vectors. This may result in fewer centroids. Its value can range from 0 (no trimming of centroids) to num_vectors (which would result in 1 neighbor partition).
- Parameters:
minVectorsPerPartition
- The target minimum number of vectors per partition.
- Returns:
- This builder.
- Throws:
IllegalArgumentException
- If the target minimum number of vectors per partition is lower than 0.
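
To see why a larger minimum can reduce the number of centroids, the sketch below computes a simple upper bound: if every surviving partition must hold at least m vectors, at most num_vectors / m partitions can survive. This is an assumption-based estimate, not the database's actual trimming logic (the value is only a target), and `PartitionTrimEstimate` is a hypothetical name.

```java
// Hypothetical estimate, not part of IVFIndexBuilder or the database:
// an upper bound on partitions that can each hold the target minimum.
final class PartitionTrimEstimate {

    static long partitionUpperBound(long numVectors, int requestedPartitions,
                                    int minVectorsPerPartition) {
        if (numVectors < 1) {
            throw new IllegalArgumentException("numVectors must be at least 1");
        }
        if (requestedPartitions < 1) {
            throw new IllegalArgumentException("requestedPartitions must be at least 1");
        }
        if (minVectorsPerPartition < 0) {
            throw new IllegalArgumentException("minVectorsPerPartition must not be negative");
        }
        if (minVectorsPerPartition == 0) {
            return requestedPartitions; // 0 means no trimming of centroids
        }
        // At most floor(numVectors / min) partitions can each hold min vectors;
        // at least one partition always remains.
        long cap = Math.max(1L, numVectors / minVectorsPerPartition);
        return Math.min(requestedPartitions, cap);
    }

    public static void main(String[] args) {
        System.out.println(partitionUpperBound(1_000_000L, 20_000, 0));         // prints 20000
        System.out.println(partitionUpperBound(1_000_000L, 20_000, 100));       // prints 10000
        System.out.println(partitionUpperBound(1_000_000L, 20_000, 1_000_000)); // prints 1
    }
}
```

The last call matches the boundary case in the text: a minimum equal to num_vectors leaves a single neighbor partition.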
-
build
Builds the index object configured by this builder.
- Returns:
- The index object.
-
createOption
Configures the option to create (or not create) an index. The default is CreateOption.CREATE_IF_NOT_EXISTS, which means that an index will be created if an index with the same name does not already exist.
- Parameters:
createOption
- The create option.
- Returns:
- This builder.
- Throws:
IllegalArgumentException
- If createOption is null.
-
name
Sets the index name.
- Parameters:
indexName
- The name of the index.
- Returns:
- This builder.
-
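
The fluent style and the argument checks documented above can be illustrated with a small stand-in. This is a sketch only: `IvfBuilderSketch` and `CreateOptionSketch` are hypothetical classes written for this example, the `String` type of the name parameter and the summary string returned by `build()` are assumptions, and none of this is the real IVFIndexBuilder implementation.

```java
// Hypothetical stand-in that mirrors the documented method names and
// validation rules; it is NOT the real IVFIndexBuilder.
final class IvfBuilderSketch {

    // Only the constant named in the text is modeled here.
    enum CreateOptionSketch { CREATE_IF_NOT_EXISTS }

    String indexName; // null if no name was set
    CreateOptionSketch createOption = CreateOptionSketch.CREATE_IF_NOT_EXISTS;
    Integer targetAccuracy;
    Integer neighborPartitions;
    Integer samplePerPartition;
    Integer minVectorsPerPartition;

    private static void check(boolean ok, String message) {
        if (!ok) throw new IllegalArgumentException(message);
    }

    IvfBuilderSketch name(String indexName) { // parameter type is an assumption
        this.indexName = indexName;
        return this;
    }

    IvfBuilderSketch createOption(CreateOptionSketch createOption) {
        check(createOption != null, "createOption cannot be null");
        this.createOption = createOption;
        return this;
    }

    IvfBuilderSketch targetAccuracy(int value) {
        check(value >= 1 && value <= 100, "targetAccuracy must be between 1 and 100");
        this.targetAccuracy = value;
        return this;
    }

    IvfBuilderSketch neighborPartitions(int value) {
        check(value >= 1 && value <= 10_000_000, "neighborPartitions must be between 1 and 10000000");
        this.neighborPartitions = value;
        return this;
    }

    IvfBuilderSketch samplePerPartition(int value) {
        check(value >= 1, "samplePerPartition must be at least 1");
        this.samplePerPartition = value;
        return this;
    }

    IvfBuilderSketch minVectorsPerPartition(int value) {
        check(value >= 0, "minVectorsPerPartition must not be negative");
        this.minVectorsPerPartition = value;
        return this;
    }

    // The real build() returns an index object; a summary string stands in here.
    String build() {
        return "IVF[name=" + indexName + ", create=" + createOption
                + ", accuracy=" + targetAccuracy + ", partitions=" + neighborPartitions
                + ", samples=" + samplePerPartition + ", minVectors=" + minVectorsPerPartition + "]";
    }

    public static void main(String[] args) {
        // Each setter returns the builder, so calls can be chained.
        System.out.println(new IvfBuilderSketch()
                .name("DOCS_IVF_IDX")
                .targetAccuracy(90)
                .neighborPartitions(100)
                .samplePerPartition(256)
                .minVectorsPerPartition(100)
                .build());
        try {
            new IvfBuilderSketch().targetAccuracy(0); // outside 1..100
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Validating in each setter, rather than in `build()`, surfaces a bad value at the call that supplied it, which matches the per-method Throws entries listed in this page.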