15. Reciprocal Rank Fusion (RRF)

When you retrieve from more than one source (multi-query variants, or dense + BM25), you get several ranked lists with incompatible score scales. RRF merges them fairly by ignoring scores and using only rank.

The problem it solves

dense scores: 0.81, 0.79, 0.77 (cosine)
BM25 scores: 14.2, 9.6, 7.1 (totally different scale)
→ you can't just add these together
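
To make the scale mismatch concrete, here is a minimal sketch that adds the raw scores anyway. The score values are the ones listed above; the document ids (`a`, `b`, `c`, `d`) and their assignment to scores are made up for illustration.

```python
# Hypothetical doc ids; the score values are the ones listed above.
dense_scores = {"a": 0.81, "b": 0.79, "c": 0.77}   # cosine similarity
bm25_scores = {"c": 14.2, "d": 9.6, "a": 7.1}      # BM25, unbounded scale

# Naive fusion: add raw scores, treating a missing score as 0.
all_ids = set(dense_scores) | set(bm25_scores)
naive = {
    doc_id: dense_scores.get(doc_id, 0.0) + bm25_scores.get(doc_id, 0.0)
    for doc_id in all_ids
}
naive_ranking = sorted(naive, key=naive.get, reverse=True)
print(naive_ranking)  # ['c', 'd', 'a', 'b']
```

The top three positions are exactly the BM25 order (c, d, a). The dense scores are too small to change anything, so the dense retriever has effectively no vote.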

The formula

RRF(d) = Σ_{R ∈ lists} 1 / (k + rank_R(d))
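  • rank_R(d) = the position of document d in list R (1 = best). A list that does not contain d contributes nothing to the sum.
  • k = a smoothing constant (commonly 60) that shrinks the gap between top ranks, so a single first place in one list cannot outweigh consistent placement across several lists.
  • A document ranked highly in multiple lists accumulates a large score: RRF rewards consensus.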
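
The effect of k is easiest to see with numbers. The sketch below is an illustration only: `rrf_score` is a hypothetical helper, documents A and B are invented, and k = 0 is used purely as a contrast to the common k = 60.

```python
def rrf_score(ranks, k):
    """Sum 1 / (k + rank) over the lists in which a document appears."""
    return sum(1 / (k + rank) for rank in ranks)

# Hypothetical documents: A is ranked 1st in a single list,
# B is ranked 3rd in each of two lists.
ranks_a = [1]
ranks_b = [3, 3]

for k in (0, 60):
    a, b = rrf_score(ranks_a, k), rrf_score(ranks_b, k)
    winner = "A" if a > b else "B"
    print(f"k={k}: A={a:.4f}  B={b:.4f}  winner={winner}")
```

With k = 0 the single first place wins (1.0 against roughly 0.67). With k = 60 the two third places win (2/63 against 1/61), which is the consensus behaviour described above.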

Code

def reciprocal_rank_fusion(result_lists, k=60):
    """result_lists: list of ranked lists of doc ids (best first)."""
    scores = {}
    for ranked in result_lists:
        # Ranks start at 1, so the best document in a list adds 1 / (k + 1).
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0) + 1 / (k + rank)
    # Return doc ids ordered by fused score, highest first.
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d7", "d2"]
bm25 = ["d1", "d9", "d3", "d5"]
fused = reciprocal_rank_fusion([dense, bm25])
# fused == ['d1', 'd3', 'd9', 'd7', 'd2', 'd5']; d1 and d3 lead because both lists contain them

Where it fits

list A ─┐
list B ─┼─▶ RRF (keep ranks, drop scores) ─▶ one fused ranking
list C ─┘
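
As a rough sketch of that pipeline, the snippet below fuses three lists and keeps only the top few results along with their fused scores. The function name `fuse_top_n`, the `top_n` parameter, and the three example lists are assumptions made for illustration, not part of any particular library.

```python
def fuse_top_n(result_lists, k=60, top_n=3):
    """Fuse ranked lists of doc ids with RRF; return the top_n (doc_id, score) pairs."""
    scores = {}
    for ranked in result_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ordered[:top_n]

# Hypothetical outputs of three retrievers (best first).
list_a = ["d3", "d1", "d7"]   # e.g. dense retrieval
list_b = ["d1", "d9", "d3"]   # e.g. BM25
list_c = ["d1", "d3", "d4"]   # e.g. a rephrased query

top = fuse_top_n([list_a, list_b, list_c])
for doc_id, score in top:
    print(doc_id, round(score, 4))
```

Here d1 and d3 appear in all three lists and take the first two places; d9, which appears once at rank 2, comes third. Only the fused top_n would typically be passed on to the next stage.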

RRF = keep ranks, drop scores, reward consensus. It is available as a built-in fusion method in Elasticsearch, OpenSearch, Weaviate, Qdrant, and Azure AI Search.
