FLOAT_VECTOR Similarity
What changed
MonkDB now supports explicit similarity selection per FLOAT_VECTOR column:
- Column DDL option: WITH (similarity = '...')
- Similarity stored in mappings and used by indexing/query planning
- VECTOR_SIMILARITY and KNN_MATCH accept optional similarity override
- Default remains euclidean for backward compatibility
Supported similarity values
- euclidean (default)
- cosine
- dot_product
- maximum_inner_product
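To make the similarity values concrete, here is a minimal Python sketch of the raw quantities they are conventionally based on. The function names are hypothetical and this is not MonkDB's implementation: the score MonkDB returns may be scaled or normalized differently, and the document does not state how. Both dot_product and maximum_inner_product rank by inner product; in many vector engines dot_product assumes unit-length vectors while maximum_inner_product does not, but whether MonkDB follows that convention is an assumption.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line (L2) distance; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Cosine of the angle between a and b; larger means more similar."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if norm == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot(a, b) / norm
```

Cosine ignores vector length, while the inner product rewards both alignment and magnitude, which is why the choice of similarity should follow how the embeddings were trained.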
Accepted aliases include:
- l2 for euclidean
- dotproduct, dot-product for dot product
- cosine_similarity, cosine-similarity for cosine
- mips, max_inner_product for maximum inner product
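The alias list above can be read as a lookup table from alias to canonical name. The Python sketch below illustrates that reading; normalize_similarity is a hypothetical helper, not a MonkDB function, and it matches names exactly because the document does not say whether MonkDB treats them case-insensitively. The list above says aliases "include" these entries, so the real set may be larger.

```python
# Canonical names and aliases copied from the lists above.
CANONICAL = ("euclidean", "cosine", "dot_product", "maximum_inner_product")

ALIASES = {
    "l2": "euclidean",
    "dotproduct": "dot_product",
    "dot-product": "dot_product",
    "cosine_similarity": "cosine",
    "cosine-similarity": "cosine",
    "mips": "maximum_inner_product",
    "max_inner_product": "maximum_inner_product",
}


def normalize_similarity(name):
    """Map a similarity name or alias to its canonical name.

    Exact matching only: case handling in MonkDB is not documented here.
    """
    if name in CANONICAL:
        return name
    if name in ALIASES:
        return ALIASES[name]
    raise ValueError("unsupported similarity: " + repr(name))
```

For example, normalize_similarity("mips") yields "maximum_inner_product", and an unknown name such as "manhattan" raises ValueError.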
DDL examples
CREATE TABLE euclid_default (
  id STRING PRIMARY KEY,
  embedding FLOAT_VECTOR(128)
);

CREATE TABLE word_embeddings (
  text STRING PRIMARY KEY,
  embedding FLOAT_VECTOR(4) WITH (similarity = 'cosine')
);

CREATE TABLE docs_dp (
  id STRING PRIMARY KEY,
  embedding FLOAT_VECTOR(768) WITH (similarity = 'dot_product')
);

CREATE TABLE recsys_mips (
  id STRING PRIMARY KEY,
  embedding FLOAT_VECTOR(256) WITH (similarity = 'maximum_inner_product')
);
Query examples
Column-declared similarity (cosine for word_embeddings), no override:
WITH param AS (SELECT [0.3, 0.6, 0.0, 0.9] AS sv)
SELECT text,
VECTOR_SIMILARITY(embedding, (SELECT sv FROM param)) AS score
FROM word_embeddings
WHERE KNN_MATCH(embedding, (SELECT sv FROM param), 2)
ORDER BY score DESC;
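Override similarity (must be literal, non-null). Because a KNN_MATCH override that differs from the column similarity fails early (see Validation rules), the KNN_MATCH override here matches the column's cosine similarity, while VECTOR_SIMILARITY scores with dot_product:
WITH param AS (SELECT [0.3, 0.6, 0.0, 0.9] AS sv)
SELECT text,
VECTOR_SIMILARITY(embedding, (SELECT sv FROM param), 'dot_product') AS score
FROM word_embeddings
WHERE KNN_MATCH(embedding, (SELECT sv FROM param), 2, 'cosine')
ORDER BY score DESC;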
Validation rules
- Similarity override must be a non-null literal string.
- Non-literal expressions are rejected at planning time.
- For KNN_MATCH, mismatch between override and column similarity fails early.
- Unsupported WITH options for FLOAT_VECTOR are rejected during analysis.
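The override rules above can be summarized as a small decision procedure. The Python sketch below is only an illustration under stated assumptions: LiteralArg, ExpressionArg, and check_override are hypothetical names, not MonkDB internals; similarity names are compared as exact canonical strings (how MonkDB treats an alias in an override versus the column's canonical name is not documented here); and the check for unsupported similarity values is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LiteralArg:
    """A literal argument; value None models SQL NULL."""
    value: Optional[str]


@dataclass(frozen=True)
class ExpressionArg:
    """Any non-literal argument, e.g. a column reference or function call."""
    sql: str


def check_override(function_name, override, column_similarity):
    """Return the similarity to use, or raise ValueError.

    override is None when the optional argument is omitted entirely.
    """
    if override is None:
        # No override: fall back to the column-declared similarity.
        return column_similarity
    if isinstance(override, ExpressionArg):
        raise ValueError("similarity override must be a literal: " + override.sql)
    if override.value is None:
        raise ValueError("similarity override must not be NULL")
    if function_name == "KNN_MATCH" and override.value != column_similarity:
        raise ValueError(
            "KNN_MATCH override " + repr(override.value)
            + " does not match column similarity " + repr(column_similarity)
        )
    return override.value
```

In this model, check_override("VECTOR_SIMILARITY", LiteralArg("dot_product"), "cosine") is accepted, while the same override passed for "KNN_MATCH" on a cosine column raises ValueError.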
Do / Don't
Do:
- Prefer column-declared similarity and omit override unless needed.
- Keep override values as literal strings.
Don't:
- Pass NULL or expressions as similarity overrides.
- Use unsupported options in FLOAT_VECTOR ... WITH (...).