Fig. 2 · The retrieval funnel: recall cheap · precision expensive
[Figure: four stages, left to right. Corpus (10⁶+ passages · indexed offline) → Retrieve (~150 · hybrid: bi-encoder + BM25 fused with RRF · cheap) → Rerank (~10 · cross-encoder · expensive) → Context (5–10 · shown to the model). Arrow label: cost / item → precision worth paying for.]
Legend: items surviving each stage; compute cost per item (spend it where the set is small).
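The Retrieve stage in Fig. 2 fuses a bi-encoder ranking and a BM25 ranking with reciprocal rank fusion (RRF). A minimal sketch of that fusion step follows, assuming each retriever returns passage ids ordered best first. The function name `rrf_fuse`, the passage ids, and the constant `k=60` (a commonly used default, not a value taken from the figure) are illustrative assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=150):
    """Fuse ranked lists of passage ids with reciprocal rank fusion.

    Each passage scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    fused = sorted(scores, key=lambda d: (-scores[d], d))
    return fused[:top_n]


# Hypothetical rankings from the two cheap retrievers.
dense_hits = ["p3", "p1", "p7"]  # bi-encoder order
bm25_hits = ["p1", "p9", "p3"]   # BM25 order

candidates = rrf_fuse([dense_hits, bm25_hits])
print(candidates)  # ['p1', 'p3', 'p9', 'p7']
```

RRF uses only rank positions, so the bi-encoder and BM25 scores never need to be put on a common scale. The fused list (about 150 items in the figure) is what the more expensive cross-encoder would then rerank.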