Class HnswGraphBuilder

java.lang.Object
org.apache.lucene.util.hnsw.HnswGraphBuilder

public final class HnswGraphBuilder extends Object
Builder for HNSW graph. See HnswGraph for a gloss on the algorithm and the meaning of the hyperparameters.
  • Field Details

  • Constructor Details

    • HnswGraphBuilder

      public HnswGraphBuilder(RandomAccessVectorValuesProducer vectors, VectorSimilarityFunction similarityFunction, int M, int beamWidth, long seed) throws IOException
      Reads all the vectors from a VectorValues, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
      Parameters:
      vectors - the vectors whose relations are represented by the graph - must provide a different view over those vectors than the one used to add via addGraphNode.
      M - – graph fanout parameter used to calculate the maximum number of connections a node can have – M on upper layers, and M * 2 on the lowest level.
      beamWidth - the size of the beam search to use when finding nearest neighbors.
      seed - the seed for a random number generator used during graph construction. Provide this to ensure repeatable construction.
      Throws:
      IOException
  • Method Details

    • build

      public OnHeapHnswGraph build(RandomAccessVectorValues vectors) throws IOException
      Reads all the vectors from two copies of a random access VectorValues. Providing two copies enables efficient retrieval without extra data copying, while avoiding collision of the returned values.
      Parameters:
      vectors - the vectors for which to build a nearest neighbors graph. Must be an independet accessor for the vectors
      Throws:
      IOException
    • setInfoStream

      public void setInfoStream(InfoStream infoStream)
      Set info-stream to output debugging information *
    • addGraphNode

      void addGraphNode(int node, float[] value) throws IOException
      Inserts a doc with vector value to the graph
      Throws:
      IOException
    • printGraphBuildStatus

      private long printGraphBuildStatus(int node, long start, long t)
    • addDiverseNeighbors

      private void addDiverseNeighbors(int level, int node, NeighborQueue candidates) throws IOException
      Throws:
      IOException
    • selectAndLinkDiverse

      private void selectAndLinkDiverse(NeighborArray neighbors, NeighborArray candidates, int maxConnOnLevel) throws IOException
      Throws:
      IOException
    • popToScratch

      private void popToScratch(NeighborQueue candidates)
    • diversityCheck

      private boolean diversityCheck(float[] candidate, float score, NeighborArray neighbors, RandomAccessVectorValues vectorValues) throws IOException
      Parameters:
      candidate - the vector of a new candidate neighbor of a node n
      score - the score of the new candidate and node n, to be compared with scores of the candidate and n's neighbors
      neighbors - the neighbors selected so far
      vectorValues - source of values used for making comparisons between candidate and existing neighbors
      Returns:
      whether the candidate is diverse given the existing neighbors
      Throws:
      IOException
    • findWorstNonDiverse

      private int findWorstNonDiverse(NeighborArray neighbors) throws IOException
      Find first non-diverse neighbour among the list of neighbors starting from the most distant neighbours
      Throws:
      IOException
    • getRandomGraphLevel

      private static int getRandomGraphLevel(double ml, SplittableRandom random)