Embedding Similarity Boost — Design

Problem

The guided search endpoint can return semantically irrelevant results even when better matches exist. For example, a query like "Getränkemarkt in Europa 2010-2030" may return unrelated content (dementia statistics, productivity software) alongside the correct beverage market results. This happens because:

  • The primary lexical path has zero semantic signal — ranking is purely text-match + function_score boosts.
  • The fallback hybrid path (RRF with kNN) merges ranks but doesn't weight semantic similarity strongly enough — RRF treats lexical and kNN as equal rank contributors.

Solution

Add a script_score function using cosineSimilarity to the existing function_score query. When the experiment variant embedding_similarity_boost is active, the query embedding is fetched from Bedrock and injected into the scoring functions. OpenSearch computes cosine similarity per document, boosting semantically relevant results higher.

Approach: script_score with cosineSimilarity in function_score

Why this over alternatives:

  • vs. application-side reranking: Single round-trip to OpenSearch. No need to fetch embedding vectors back in results. Simpler.
  • vs. always-on hybrid RRF: RRF fuses ranks without weighting. script_score gives direct, tunable control over how much semantic similarity influences the final score.
  • vs. separate script_score query wrapper: Fits naturally into the existing function_score.functions array — minimal structural change.

Architecture

Data Flow (experiment variant active)

Request → Handler: parse ExperimentVariant::EmbeddingSimilarityBoost
                 → fetch embedding from Bedrock (always, not just on fallback)
                 → pass Option<&Embedding> to query builders
                 ↓
         Query builder: append script_score function to function_score.functions
                 ↓
         OpenSearch: computes cosineSimilarity(query_vector, doc.embedding) per hit
                   → weighted score added to function_score total
                   → multiplied with base text relevance (boost_mode: multiply)

Affected Paths

| Path                 | Current behavior                  | With experiment variant                  |
|----------------------|-----------------------------------|------------------------------------------|
| Lexical (primary)    | Pure text + function_score boosts | + script_score cosine similarity         |
| Hybrid (fallback)    | RRF(function_score, kNN)          | RRF(function_score + script_score, kNN)  |
| No variant / unknown | Unchanged                         | Unchanged                                |

Scoring Math

{
  "script_score": {
    "script": {
      "source": "cosineSimilarity(params.query_vector, doc['embedding']) + 1.0",
      "params": {
        "query_vector": [/* 512 floats from Bedrock */]
      }
    }
  },
  "weight": 10.0
}
  • cosineSimilarity returns [-1, 1]. The + 1.0 shifts the range to [0, 2], so the function never contributes a negative score and a poor match can't zero out the total under the existing boost_mode: "multiply".
  • weight controls influence relative to other functions (decay at ~0.1, timeframe decay at ~3.0, pinned at 1000). Starting at 10.0.
  • Net effect: semantically close documents get up to 2 × 10.0 = 20.0 added to their function_score sum before multiplying with text relevance. Unrelated documents get ~1.0 × 10.0 = 10.0 (neutral).
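
The arithmetic above can be checked directly. A minimal sketch (the helper name is illustrative, not part of the codebase):

```rust
/// Contribution of the embedding-similarity function to the
/// function_score sum: (cosine + 1.0) * weight, per the design above.
fn similarity_contribution(cosine: f64, weight: f64) -> f64 {
    (cosine + 1.0) * weight
}

fn main() {
    let weight = 10.0;
    // Perfect semantic match: cosine = 1.0 → 2.0 * 10.0 = 20.0
    assert_eq!(similarity_contribution(1.0, weight), 20.0);
    // Orthogonal (unrelated) document: cosine = 0.0 → neutral 10.0
    assert_eq!(similarity_contribution(0.0, weight), 10.0);
    // Opposite direction: cosine = -1.0 → 0.0, never negative
    assert_eq!(similarity_contribution(-1.0, weight), 0.0);
}
```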

Changes by File

1. crates/usearch/src/v1/domain/experiment_variant.rs

Add variant:

pub enum ExperimentVariant {
    UseLowerWeightsForTitleAndSubtitle,
    EmbeddingSimilarityBoost, // NEW
}

Map the string "embedding_similarity_boost" → ExperimentVariant::EmbeddingSimilarityBoost.
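
A minimal sketch of the mapping (the real enum lives in experiment_variant.rs; the FromStr wiring and the other variant's string are assumptions for illustration):

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
pub enum ExperimentVariant {
    UseLowerWeightsForTitleAndSubtitle,
    EmbeddingSimilarityBoost,
}

impl FromStr for ExperimentVariant {
    type Err = ();

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            // Assumed string for the existing variant.
            "use_lower_weights_for_title_and_subtitle" => {
                Ok(Self::UseLowerWeightsForTitleAndSubtitle)
            }
            "embedding_similarity_boost" => Ok(Self::EmbeddingSimilarityBoost),
            _ => Err(()), // unknown variants fall back to default behavior
        }
    }
}

fn main() {
    assert_eq!(
        "embedding_similarity_boost".parse::<ExperimentVariant>(),
        Ok(ExperimentVariant::EmbeddingSimilarityBoost)
    );
    assert!("unknown_variant".parse::<ExperimentVariant>().is_err());
}
```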

2. crates/usearch/src/v1/queries/search_params.rs

Add embedding_similarity_weight: f64 to SearchParams (or a new EmbeddingSimilarityParams struct). Default value: 10.0.
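
One way this could look (field name and default from this document; the Default impl and the elided fields are assumptions about the real struct):

```rust
pub struct SearchParams {
    pub q: String,
    // ... existing fields elided in this sketch ...
    /// Weight of the embedding-similarity script_score function.
    pub embedding_similarity_weight: f64,
}

impl Default for SearchParams {
    fn default() -> Self {
        Self {
            q: String::new(),
            embedding_similarity_weight: 10.0,
        }
    }
}

fn main() {
    assert_eq!(SearchParams::default().embedding_similarity_weight, 10.0);
}
```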

3. crates/usearch/src/v1/queries/guided.rs

  • create_lexical_query and create_hybrid_query: add embedding: Option<&Embedding> parameter.
  • build_scoring_functions: when embedding is Some, append the script_score function.
  • New helper build_embedding_similarity_function(embedding: &Embedding, weight: f64) -> Value.
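
A dependency-free sketch of the new helper. The real code presumably builds a serde_json::Value; string formatting here just keeps the sketch self-contained:

```rust
/// Sketch of build_embedding_similarity_function: renders the
/// script_score entry that gets appended to function_score.functions.
fn build_embedding_similarity_function(embedding: &[f32], weight: f64) -> String {
    let vector = embedding
        .iter()
        .map(|v| v.to_string())
        .collect::<Vec<_>>()
        .join(", ");
    format!(
        r#"{{
  "script_score": {{
    "script": {{
      "source": "cosineSimilarity(params.query_vector, doc['embedding']) + 1.0",
      "params": {{ "query_vector": [{vector}] }}
    }}
  }},
  "weight": {weight}
}}"#
    )
}

fn main() {
    let json = build_embedding_similarity_function(&[0.1, 0.2], 10.0);
    assert!(json.contains("cosineSimilarity"));
    assert!(json.contains("\"weight\": 10"));
}
```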

4. crates/usearch/src/v1/routes/guided.rs

  • When ExperimentVariant::EmbeddingSimilarityBoost is parsed:
    • Fetch embedding from br_client.create_embedding(&params.q) before the first query.
    • Pass Some(&embedding) to create_lexical_query.
    • On fallback, reuse the same embedding for create_hybrid_query (no double Bedrock call).
  • When no variant or different variant: pass None (existing behavior).
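
The fetch-once, reuse-on-fallback flow can be sketched with a stub client. CountingClient and the function names are illustrative stand-ins, not the real Bedrock client API:

```rust
use std::cell::Cell;

type Embedding = Vec<f32>;

/// Stub standing in for the Bedrock client; counts embedding calls
/// to demonstrate that the fallback path reuses the same vector.
struct CountingClient {
    calls: Cell<u32>,
}

impl CountingClient {
    fn create_embedding(&self, _query: &str) -> Embedding {
        self.calls.set(self.calls.get() + 1);
        vec![0.0; 512]
    }
}

fn handle_guided(client: &CountingClient, q: &str, variant_active: bool) {
    // Fetch the embedding up front only when the experiment is active.
    let embedding: Option<Embedding> =
        variant_active.then(|| client.create_embedding(q));

    run_lexical_query(embedding.as_ref());
    // Simulated empty-result fallback: reuse the same vector,
    // no second Bedrock call.
    run_hybrid_query(embedding.as_ref());
}

fn run_lexical_query(_embedding: Option<&Embedding>) {}
fn run_hybrid_query(_embedding: Option<&Embedding>) {}

fn main() {
    let client = CountingClient { calls: Cell::new(0) };
    handle_guided(&client, "Getränkemarkt in Europa 2010-2030", true);
    assert_eq!(client.calls.get(), 1); // exactly one embedding fetch

    handle_guided(&client, "any query", false);
    assert_eq!(client.calls.get(), 1); // no variant → no Bedrock call
}
```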

Latency Impact

  • With experiment active: +50-100ms per query (Bedrock embedding call on every request, not just fallback).
  • Without experiment: Zero impact — no code path changes.

Testing Plan

  1. Unit tests in guided.rs:

    • test_embedding_similarity_function_in_lexical_query — verify script_score appears in function_score.functions when embedding is provided.
    • test_no_embedding_similarity_without_embedding — verify no script_score when None.
    • test_embedding_similarity_in_hybrid_query — verify it appears in the function_score branch of the hybrid query.
    • test_embedding_similarity_weight_configurable — verify the weight value from SearchParams.
  2. Production validation (manual):

    • Run with ?experiment_variant=embedding_similarity_boost against production OpenSearch.
    • Compare result ordering for known problem queries (e.g., "Getränkemarkt in Europa 2010-2030").

Open Questions

  • Weight tuning: Starting at 10.0 — will need empirical tuning. Could be made dynamic per language or content type in the future.
  • Interaction with pinned boosts: Pinned boosts use weight 1000 — the embedding similarity boost (max ~20) won't interfere with pinning. Good.
  • Documents without embeddings: If some documents lack the embedding field, calling cosineSimilarity on a missing vector can fail the script rather than return 0; guard with a size check (e.g. doc['embedding'].size() == 0 ? 1.0 : cosineSimilarity(params.query_vector, doc['embedding']) + 1.0). With the guard, such documents get a neutral boost, not a penalty. Acceptable.