Meta Engineering · April 21, 2026

Facebook Groups Search: Hybrid Retrieval Architecture with LLM Evaluation

Meta modernized Facebook Groups Search with a hybrid retrieval architecture that combines traditional lexical search with dense vector embeddings to improve discovery and relevance. The system addresses the limitations of keyword-only search by modeling natural-language intent, uses multi-task learning for ranking, and applies LLMs for automated evaluation, leading to significant improvements in user engagement.


The article details the architectural transformation of Facebook Groups Search, moving from a purely keyword-based system to a sophisticated hybrid retrieval architecture. This shift was motivated by critical user pain points: poor discovery due to lexical mismatch, high "effort tax" for consumption (sifting through comments), and difficulty validating information. The new system aims to surface more relevant community content by understanding user intent better than traditional methods.

Hybrid Retrieval Architecture Overview

The core of the solution is a parallel retrieval strategy that combines the strengths of both lexical and semantic search. This ensures comprehensive coverage, capturing both exact matches and conceptual similarities. The system involves three key stages: query preprocessing, parallel retrieval, and L2 ranking.
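The three stages can be sketched end to end. This is a minimal, illustrative skeleton, not Meta's internal API: the function names, the toy keyword matcher, and the stubbed semantic path are all assumptions made for clarity.

```python
# Illustrative three-stage search pipeline: preprocess -> parallel retrieval -> rank.
# All names and logic are a sketch standing in for the production system.

def preprocess(query: str) -> str:
    """Stage 1: normalize the raw query (here, just lowercase and collapse spaces)."""
    return " ".join(query.lower().split())

def lexical_retrieve(query: str, docs: dict) -> set:
    """Stage 2a: keyword path -- return doc ids sharing any term with the query."""
    terms = set(query.split())
    return {doc_id for doc_id, text in docs.items()
            if terms & set(text.lower().split())}

def semantic_retrieve(query: str, docs: dict) -> set:
    """Stage 2b: placeholder for the embedding/ANN path (stubbed out here)."""
    return set()  # a real system would run a vector-index lookup instead

def l2_rank(candidates: set, query: str) -> list:
    """Stage 3: merge and rank; a trivial alphabetical stand-in for the ranker."""
    return sorted(candidates)

def search(query: str, docs: dict) -> list:
    q = preprocess(query)
    candidates = lexical_retrieve(q, docs) | semantic_retrieve(q, docs)
    return l2_rank(candidates, q)

docs = {"d1": "best cappuccino in town", "d2": "weekend hiking trails"}
print(search("Cappuccino  TIPS", docs))  # -> ['d1']
```

The key structural point is that both retrieval paths consume the same preprocessed query and feed one shared ranking stage, which is what lets each path evolve independently.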

  • Query Preprocessing: User queries are tokenized, normalized, and rewritten to ensure clean inputs for both retrieval paths.
  • Parallel Retrieval: This stage involves two simultaneous pathways: a lexical path using Facebook's Unicorn inverted index for precise keyword matching and a semantic path using a 12-layer, 200-million-parameter search semantic retriever (SSR) model. The SSR encodes natural language into dense vectors, enabling approximate nearest neighbor (ANN) search on a Faiss vector index for conceptual similarity. This allows queries like "Italian coffee drink" to match "cappuccino."
  • L2 Ranking with Multi-Task Multi-Label (MTML) Architecture: Results from both lexical and semantic retrieval are merged. The ranking model ingests both lexical features (TF-IDF, BM25) and semantic features (cosine similarity scores). It utilizes an MTML supermodel to optimize for multiple engagement objectives simultaneously (clicks, shares, comments), moving beyond single-objective models and maintaining modularity.
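To make the semantic path concrete, here is a toy dense-retrieval example. Brute-force cosine similarity stands in for the Faiss ANN index, and hand-written 3-dimensional vectors stand in for the 200M-parameter SSR embeddings; the document names and numbers are invented for illustration.

```python
import math

# Pretend embeddings: in production these come from the SSR encoder and are
# indexed in Faiss for approximate nearest-neighbor search. Here we use tiny
# hand-picked vectors and exact brute-force scoring instead.
doc_vectors = {
    "cappuccino_post": [0.9, 0.1, 0.0],
    "hiking_post":     [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def semantic_top_k(query_vec, k=1):
    """Return the k doc ids most similar to the query embedding."""
    scored = sorted(doc_vectors.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# A query like "Italian coffee drink" would embed near the cappuccino post,
# so it matches despite sharing no keywords with it:
query_vec = [0.85, 0.15, 0.05]
print(semantic_top_k(query_vec))  # -> ['cappuccino_post']
```

This is exactly the lexical-mismatch case the article highlights: no token overlap, yet the nearest-neighbor lookup still retrieves the conceptually related post.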

Automated Offline Evaluation with LLMs


LLMs for Search Quality Evaluation

One significant innovation is the integration of an automated evaluation framework using Llama 3 with multimodal capabilities as an "automated judge." This addresses the challenge of validating semantic search quality at scale, where high-dimensional vector similarity is not always intuitive. The evaluation prompts are designed for nuance, recognizing categories like "somewhat relevant" to measure improvements in result diversity and conceptual matching without human labeling bottlenecks.
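A graded judging rubric like the one described can be sketched as a prompt template plus a label parser. The prompt wording, the `call_llm` hook, and the fail-closed default are all assumptions for illustration; they are not Meta's actual evaluation prompts.

```python
# Sketch of an "LLM as judge" relevance evaluator with a graded scale,
# including the "somewhat relevant" middle band mentioned above.
# The prompt text and call_llm interface are hypothetical.

JUDGE_PROMPT = """You are evaluating a search result for a Facebook Groups query.
Query: {query}
Result: {result}
Label the result as exactly one of: relevant, somewhat_relevant, not_relevant.
Answer with the label only."""

VALID_LABELS = {"relevant", "somewhat_relevant", "not_relevant"}

def judge(query: str, result: str, call_llm) -> str:
    """call_llm is any function taking a prompt string and returning text
    (in production, a Llama 3 endpoint; here, whatever the caller supplies)."""
    raw = call_llm(JUDGE_PROMPT.format(query=query, result=result))
    label = raw.strip().lower()
    # Fail closed: anything the parser does not recognize counts as not relevant.
    return label if label in VALID_LABELS else "not_relevant"

# Usage with a stub standing in for the model:
fake_llm = lambda prompt: "somewhat_relevant"
print(judge("italian coffee drink", "post about espresso machines", fake_llm))
```

Running judgments like this over sampled query/result pairs yields label distributions that can be compared across ranking experiments without waiting on human raters.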

Tags: search engine, hybrid search, semantic search, lexical search, vector embeddings, ranking, LLM evaluation, Faiss
