hannoy 🗼

hannoy is a key-value backed HNSW implementation based on arroy.

Motivation

Many popular HNSW libraries are built in memory, meaning you need enough RAM to store all the vectors you're indexing. Instead, hannoy uses LMDB, a memory-mapped KV store, as its storage backend. This makes it better suited to machines running multiple programs, or to datasets that won't fit in memory. LMDB also supports non-blocking concurrent reads by design, so it's safe to query the index from multiple threads.

Features

  • Supported metrics: euclidean, cosine, manhattan, and hamming, as well as their quantized counterparts.
  • Multithreaded builds using rayon
  • On-disk index builds via LMDB, so you can index big datasets that won't fit into memory
  • Compressed bitmaps for graph edges, adding only ~200 bytes of overhead per vector
  • Dynamic document insertions and deletions (see the sketch after this list)
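
hannoy is based on arroy, whose Writer exposes del_item alongside add_item. Here is a minimal sketch of an edit cycle, assuming hannoy mirrors that API (del_item is an assumption here, so check the crate docs) and reusing the env, db, rng, and DIM from the demo in Usage below:

// Sketch: dynamic edits, assuming hannoy keeps arroy's `Writer::del_item`.
// `env`, `db`, `rng`, and `DIM` are set up as in the Usage demo below.
let mut wtxn = env.write_txn()?;
let writer: Writer<Cosine> = Writer::new(db, 0, DIM);

writer.del_item(&mut wtxn, 2)?; // remove an indexed vector by its item id
writer.add_item(&mut wtxn, 3, &[0.5, 0.5, 0.0])?; // insert a new one

// rebuild so the graph reflects the edits, then commit
let mut builder = writer.builder(&mut rng);
builder.build::<16, 32>(&mut wtxn)?;
wtxn.commit()?;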

Missing Features

  • Python support
  • GPU-accelerated indexing

Usage

Here's a quick demo:

use hannoy::{distances::Cosine, Database, Reader, Result, Writer};
use heed::EnvOpenOptions;
use rand::{rngs::StdRng, SeedableRng};

fn main() -> Result<()> {
    const DIM: usize = 3;
    let vecs: Vec<[f32; DIM]> = vec![[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];

    let env = unsafe {
        EnvOpenOptions::new()
            .map_size(1024 * 1024 * 1024 * 1) // 1GiB
            .open("./")
    }
    .unwrap();

    let mut wtxn = env.write_txn().unwrap();
    let db: Database<Cosine> = env.create_database(&mut wtxn, None)?;
    let writer: Writer<Cosine> = Writer::new(db, 0, DIM);

    // insert into lmdb
    writer.add_item(&mut wtxn, 0, &vecs[0])?;
    writer.add_item(&mut wtxn, 1, &vecs[1])?;
    writer.add_item(&mut wtxn, 2, &vecs[2])?;

    // ...and build hnsw
    let mut rng = StdRng::seed_from_u64(42);

    let mut builder = writer.builder(&mut rng);
    // M = 16 neighbours per node on upper layers, M0 = 32 on the bottom layer
    builder.ef_construction(100).build::<16, 32>(&mut wtxn)?;
    wtxn.commit()?;

    // search hnsw using a new lmdb read transaction
    let rtxn = env.read_txn()?;
    let reader = Reader::<Cosine>::open(&rtxn, 0, db)?;

    let query = vec![0.0, 1.0, 0.0];
    let nns = reader.nns(1).ef_search(10).by_vector(&rtxn, &query)?;

    println!("{:?}", nns);
    Ok(())
}
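
You can also search using a vector that is already in the index. arroy exposes this as by_item, and since hannoy is based on arroy a counterpart likely exists (by_item is an assumption here, not confirmed hannoy API, so check the docs):

// Sketch: query with item 0's stored vector instead of passing a slice.
// `by_item` mirrors arroy's API and is an assumption.
let nns = reader.nns(2).ef_search(10).by_item(&rtxn, 0)?;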

Tips and tricks

Reducing cold start latencies

Search in an HNSW always traverses the graph from the top layer down to the bottom, so we know a priori that some vectors will be needed. Using madvise, we can hint to the kernel that these vectors (and their neighbours) should be loaded into RAM to speed up search.

Doing so can reduce cold-start latencies by several milliseconds, and is configured through the HANNOY_READER_PREFETCH_MEMORY environment variable.

For example, to prefetch 10 MiB of vectors into RAM:

export HANNOY_READER_PREFETCH_MEMORY=10485760
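
If you'd rather set this from Rust than from the shell, here is a minimal sketch using the standard library, assuming hannoy reads the variable at search time (set it before opening a Reader; note that set_var requires an unsafe block as of the 2024 edition):

// Sketch: a 10 MiB prefetch budget, set before any Reader is opened.
// Mutating the environment is only sound while the process is single-threaded.
unsafe { std::env::set_var("HANNOY_READER_PREFETCH_MEMORY", (10 * 1024 * 1024).to_string()) };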