Performance evaluation of vector nearest neighbour solutions in Elasticsearch.
Docker resources: Ensure Docker is configured with sufficient resources for reliable performance results. For reference, our original experiments were run with the following setup:
- Hardware: MacBook Pro, 15-inch, 2018; 2.6 GHz Intel Core i7; 6 cores; 16 GB RAM.
- Docker: Docker for Mac 2.1.0.3; Docker Engine 19.03.2; 11 GiB memory, 6 CPUs; 1 GiB swap.
Install dependencies:
pipenv sync
Prepare the data needed to run the experiments. This involves downloading a large (1 GB+) dataset of vectors:
pipenv run invoke prep
Execute all experiments with:
pipenv run python -m experiments
Measurements can be found in reports/results.csv.
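For a quick look at the measurements, a minimal sketch along these lines may help. It makes no assumption about column names and only reads whatever header the file contains; the `load_results` helper is hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def load_results(path="reports/results.csv"):
    """Return the rows of a results CSV as a list of dicts keyed by the header row."""
    with Path(path).open(newline="") as handle:
        return list(csv.DictReader(handle))


# Only read the report if the experiments have already produced it.
results_path = Path("reports/results.csv")
if results_path.exists():
    rows = load_results(results_path)
    columns = list(rows[0]) if rows else []
    print(f"{len(rows)} measurements; columns: {columns}")
```

All values come back as strings, so numeric columns need converting before any aggregation.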
The hardware used to produce these results is listed above; see the Docker file for software versions.
Indexes were configured with a single shard and a single replica. The experiments take no direct action to normalise cold versus warm cache scenarios.
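As an illustration of the index configuration described above, the sketch below builds an index-creation body using Elasticsearch's standard `number_of_shards` and `number_of_replicas` settings. Reading "single replica" as `number_of_replicas: 1` is an assumption, and the `index_settings` helper is hypothetical; the settings the experiments actually use live in the repository code.

```python
import json


def index_settings(shards=1, replicas=1):
    """Build an index-creation body with explicit shard and replica counts."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                # Assumption: "single replica" means one replica copy per primary shard.
                "number_of_replicas": replicas,
            }
        }
    }


body = index_settings()
print(json.dumps(body, indent=2))
```

On a single-node Docker setup a replica shard cannot be allocated, so the effective layout is one primary shard serving all queries.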
dense
: Elasticsearch's vector scoring functionality, introduced in version 7.3. Relies on the dense vector datatype.
fcs
: The Cookpad fork of the Fast Cosine Similarity plugin for Elasticsearch. Originally written by StaySense and subsequently forked into various projects.
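To make the `dense` approach more concrete, here is a rough sketch of the request bodies involved: a mapping that declares a `dense_vector` field and a `script_score` query that ranks documents with `cosineSimilarity`. The field name `vector`, the helper names, and the example vector are assumptions for illustration, and the script syntax follows the form documented around version 7.3 (later releases changed it), so check it against the version in use. This is not necessarily how the experiments build their queries.

```python
import json


def dense_vector_mapping(field, dims):
    """Mapping body declaring `field` as a dense_vector with `dims` dimensions."""
    return {"mappings": {"properties": {field: {"type": "dense_vector", "dims": dims}}}}


def cosine_score_query(field, query_vector, size=10):
    """script_score query ranking all documents by cosine similarity to query_vector."""
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 shifts cosine similarity from [-1, 1] to [0, 2],
                    # because Elasticsearch does not accept negative scores.
                    "source": f"cosineSimilarity(params.query_vector, doc['{field}']) + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }


# Hypothetical field name and 3-dimensional vector, purely for illustration.
mapping = dense_vector_mapping("vector", 3)
query = cosine_score_query("vector", [0.1, 0.2, 0.3], size=5)
print(json.dumps(query, indent=2))
```

Because the script is evaluated for every document matched by the inner query, this is an exact (brute-force) search, which is part of why query latency is worth measuring.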