Benchmarking and comparison of vector nearest neighbour in Elasticsearch.

mattjw/elasticsearch-nn-benchmarks

Elasticsearch nearest neighbour vector scoring benchmarks

Performance evaluation of vector nearest neighbour solutions in Elasticsearch.

Run the evaluation

Prerequisites

Docker resources: Ensure Docker is configured with sufficient resources for reliable performance results. For reference, our original experiments were run with the following setup:

  • Hardware: MacBook Pro, 15-inch, 2018; 2.6 GHz Intel Core i7; 6 cores; 16 GB RAM.
  • Docker: Docker for Mac 2.1.0.3; Docker Engine 19.03.2; 11 GiB memory, 6 CPUs; 1 GiB swap.

Install dependencies:

pipenv sync

Prepare the data needed to run the experiments. This involves downloading a large (1 GB+) dataset of vectors:

pipenv run invoke prep

Run experiments

Execute all experiments with:

pipenv run python -m experiments

Measurements can be found in reports/results.csv.
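As a starting point for inspecting the measurements, here is a minimal sketch that loads the CSV with the Python standard library. It makes no assumption about which columns the file contains; it simply reports whatever header row is present. The helper name load_results is hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def load_results(path="reports/results.csv"):
    """Return (column_names, rows); each row is a dict of strings keyed by column."""
    with Path(path).open(newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        return list(reader.fieldnames or []), rows


# Only attempt to read the file if the experiments have already produced it.
if Path("reports/results.csv").exists():
    columns, rows = load_results()
    print("columns:", columns)
    print("rows:", len(rows))
```

All values come back as strings, so numeric columns need converting before any aggregation.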

Methodology

Experimental setup

The hardware used to produce these results is listed under Prerequisites above. See the Docker file for software versions.

Indexes were configured with a single shard and a single replica. The experiments take no direct action to normalise cold vs. warm cache scenarios.
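To illustrate the index configuration described above, the sketch below builds an index-creation body with one primary shard and one replica. It reads "single replica" as number_of_replicas set to 1, which is an assumption. The field name embedding and the dimension 300 are placeholders, not values taken from the experiments, and the dense_vector mapping applies only to the dense approach.

```python
import json


def build_index_body(field="embedding", dims=300):
    """Build an index-creation body: one primary shard, one replica, one vector field.

    `field` and `dims` are illustrative assumptions, not the benchmark's values.
    """
    return {
        "settings": {
            "index": {
                "number_of_shards": 1,
                "number_of_replicas": 1,  # assumed meaning of "single replica"
            }
        },
        "mappings": {
            "properties": {
                field: {"type": "dense_vector", "dims": dims},
            }
        },
    }


print(json.dumps(build_index_body(), indent=2))
```

The resulting JSON could be sent as the body of an index-creation request; the index name and client used are left to the reader.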

Nearest neighbour implementations

dense: Elasticsearch's vector scoring functionality introduced in version 7.3. Relies on the dense vector datatype.
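For orientation, a query using this functionality might look like the sketch below, which builds a script_score search body around Elasticsearch's cosineSimilarity script function. The field name embedding, the result size, and the choice of cosine similarity are assumptions for illustration; the queries actually issued by the experiments may differ. The doc['field'] argument form matches early 7.x releases, and later versions changed how the field is passed.

```python
import json


def build_dense_query(query_vector, field="embedding", size=10):
    """Build a script_score search body that ranks all documents by cosine similarity.

    `field` and `size` are hypothetical defaults, not taken from the benchmark.
    """
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps scores non-negative, as Elasticsearch requires.
                    "source": f"cosineSimilarity(params.query_vector, doc['{field}']) + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }


print(json.dumps(build_dense_query([0.1, 0.2, 0.3]), indent=2))
```

Because the inner query is match_all, every document in the index is scored, so query time grows with index size.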

fcs: The Cookpad fork of the Fast Cosine Similarity plugin for Elasticsearch. Originally written by StaySense and subsequently forked into various projects.
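As a reference point for what cosine-based scoring computes, the following plain-Python sketch ranks a small set of vectors by cosine similarity to a query vector. It is a brute-force illustration only and says nothing about how either implementation is written internally; the function names and sample data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbours(query, vectors, k=3):
    """Return the k (doc_id, score) pairs with the highest cosine similarity to `query`."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


sample = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
for doc_id, score in nearest_neighbours([1.0, 0.0], sample):
    print(doc_id, round(score, 4))
```

A small exact ranking like this can also serve as a correctness check for results returned by either approach.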

Results

Query performance

[Figure: Query performance]

Insertion performance

[Figure: Insertion performance]