Finalist Selection Pipeline

This guide explains, end to end, how the chat pipeline narrows thousands of catalog candidates down to the three "finalist" products that appear in the streamed response.

Overview

The system turns a user's conversational request into three gift recommendations through a multi-phase pipeline that balances relevance, diversity, and personalization.

High-Level Flow

1. Request Entry

The journey begins when a user sends a message:

API Route (app/api/chat/route.ts) - Validates the API key, parses the ChatRequest, and defers to the ParallelOrchestrator
ParallelOrchestrator (app/api/chat/orchestrators/parallel-orchestrator.ts) - Performs query validation, launches context extraction and search in parallel, and streams a skeleton response immediately for better UX
SearchOrchestrator (app/api/chat/orchestrators/search-orchestrator.ts) - Runs the retrieval, funnel, rerank, diversity, and logging phases

2. Routing Branches

SearchOrchestrator supports multiple routing strategies optimized for different use cases:

Author and Book category paths - Bypass generic flow, stop at ~20 results for precision
Specific product fast path - Single filtered search, skips funnel/rerank for low latency
Clarification path - Exits early when user intent is unclear
Standard path - Full pipeline (most common case)
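
The branch choice can be pictured as a small dispatch function. This is only a sketch: `RouteKind`, `QueryIntent`, its fields, and the order of the checks are assumptions for illustration, not the real SearchOrchestrator types.

```typescript
// Hypothetical shapes: the real SearchOrchestrator types are not shown in this guide.
type RouteKind =
  | "clarification"
  | "author"
  | "book-category"
  | "specific-product"
  | "standard";

interface QueryIntent {
  needsClarification: boolean;
  authorName?: string;
  isBookCategory: boolean;
  specificProductName?: string;
}

// Return the first matching branch; fall through to the full pipeline.
// The order of the checks is an assumption.
function chooseRoute(intent: QueryIntent): RouteKind {
  if (intent.needsClarification) return "clarification";
  if (intent.authorName) return "author";
  if (intent.isBookCategory) return "book-category";
  if (intent.specificProductName) return "specific-product";
  return "standard";
}
```

Keeping the decision in one place makes it easy to see that the standard path is simply the default when no special case applies.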

Phase 1: Multi-Query Retrieval

The system generates multiple search variations to cast a wide net while maintaining relevance.

How It Works

Variation Generation - QueryRewritingService.generateVariations creates focused queries:
- Occasion-focused: "birthday gift for teenager"
- Budget-focused: "gifts under €50"
- Product-type-focused: "tech gadgets"
- And more...
Parallel Convex Search - ProductSearchService.searchMulti fires all variations simultaneously:
- Each variation requests up to 50 hits
- Deterministic seeds for repeatable results
- Random seeds for "show more" functionality
- Automatic category filters prevent full table scans
Intelligent Merging:
- Results weighted by variation type
- Deduplicated by product ID
- Book-only fallbacks when requested
- Language fallbacks (Estonian → English)
- "Exclude books" enforcement

Output: 100-300 candidate products ready for filtering
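
The merge step can be sketched as follows. The `Hit` and `VariationResult` shapes, the `weight` field, and the "keep the best weighted score" rule are assumptions; the guide only states that results are weighted by variation type and deduplicated by product ID.

```typescript
// Hypothetical shapes for illustration; field names are assumptions.
interface Hit {
  productId: string;
  score: number;
}

interface VariationResult {
  weight: number; // weight assigned to this variation type
  hits: Hit[];
}

// Merge hits from all variations: weight each score by its variation,
// keep the best weighted score per product ID, and sort descending.
function mergeVariationResults(results: VariationResult[]): Hit[] {
  const best = new Map<string, Hit>();
  for (const { weight, hits } of results) {
    for (const hit of hits) {
      const weighted = hit.score * weight;
      const existing = best.get(hit.productId);
      if (!existing || weighted > existing.score) {
        best.set(hit.productId, { productId: hit.productId, score: weighted });
      }
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```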

Phase 2: Three-Stage Funnel

The funnel progressively narrows the candidate pool from hundreds to 20 finalists.

Stage A: Merge & Score

Goal: Keep only the ~60 highest-scoring candidates

Applies boost multipliers (merchandising priorities, trending items)
Sorts by score descending
Quick initial cut to reduce processing load
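
A minimal sketch of Stage A, assuming each candidate carries an optional `boost` multiplier (a hypothetical field). The default cap of 60 matches `MAX_CANDIDATES_STAGE_A` in the configuration section.

```typescript
// Hypothetical candidate shape; `boost` stands in for merchandising/trending multipliers.
interface StageACandidate {
  id: string;
  score: number;
  boost?: number;
}

const MAX_CANDIDATES_STAGE_A = 60; // documented default

// Apply boosts, sort by score descending, and keep the top `cap` candidates.
function stageA(
  candidates: StageACandidate[],
  cap: number = MAX_CANDIDATES_STAGE_A
): StageACandidate[] {
  return candidates
    .map((c) => ({ ...c, score: c.score * (c.boost ?? 1) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, cap);
}
```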

Stage B: Constraint Filtering

Goal: Enforce budget, product type, category, and safety constraints

Implemented in app/api/chat/services/stage-b.ts:

Budget enforcement with 20% soft tolerance
Product type matching (e.g., "tech gadgets" only)
Category constraints (occasion-specific)
Exclude IDs (don't show previously seen products)
Cultural safety checks
Material/allergy constraints with fallback logic
Emergency bypass prevents zero results
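
To illustrate the budget rule and the emergency bypass, here is a sketch that treats the 20% soft tolerance as an upper bound on price. That interpretation, the `Priced` shape, and the bypass behavior (return the unfiltered list) are assumptions; the real logic in stage-b.ts may differ.

```typescript
// Hypothetical product shape for illustration.
interface Priced {
  id: string;
  price: number;
}

const SOFT_TOLERANCE = 0.2; // the 20% soft tolerance described above

// Keep products priced at or below budget + 20%.
// Emergency bypass (assumed behavior): if nothing survives, return the input unchanged.
function filterByBudget(products: Priced[], budget: number): Priced[] {
  const ceiling = budget * (1 + SOFT_TOLERANCE);
  const withinBudget = products.filter((p) => p.price <= ceiling);
  return withinBudget.length > 0 ? withinBudget : products;
}
```

With a €50 budget, a €59 product passes and a €61 product is dropped, unless dropping it would leave no results at all.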

Stage C: Category Diversification

Goal: Limit to 20 finalists with max 5 per category

Ensures variety before rerank
Prevents category dominance (e.g., all books)
Logs skipped candidates for tuning

Output: FunnelResult with 20 finalists + stage statistics + warnings
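
The per-category cap can be sketched as a single pass over score-sorted candidates. The `Categorized` shape and the returned `skipped` list are illustrative assumptions; the limits of 20 finalists and 5 per category come from the description above.

```typescript
// Hypothetical candidate shape for illustration.
interface Categorized {
  id: string;
  category: string;
  score: number;
}

// Walk candidates in score order, admitting at most `maxPerCategory`
// from any one category until `maxFinalists` are selected.
function diversifyByCategory(
  sorted: Categorized[],
  maxFinalists: number = 20,
  maxPerCategory: number = 5
): { finalists: Categorized[]; skipped: Categorized[] } {
  const finalists: Categorized[] = [];
  const skipped: Categorized[] = [];
  const perCategory = new Map<string, number>();
  for (const candidate of sorted) {
    if (finalists.length >= maxFinalists) break;
    const used = perCategory.get(candidate.category) ?? 0;
    if (used >= maxPerCategory) {
      skipped.push(candidate); // kept so the skip can be logged for tuning
      continue;
    }
    perCategory.set(candidate.category, used + 1);
    finalists.push(candidate);
  }
  return { finalists, skipped };
}
```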

Phase 3: LLM Reranking

The LLM reranking phase uses contextual understanding to score products by relevance.

How It Works

Implemented in app/api/chat/services/rerank.ts:

Threshold Check - Only reranks if ≥ 4 finalists exist (RERANK_MIN_FINALISTS)

LLM Prompt - Sends contextual summary + up to 9 top products to Groq:

Context: "Looking for birthday gift for 25yo female, budget €50, loves reading"
Products: [product1, product2, ...]
Score each 0-100 with reasoning

Score Application:
- Top 9 products get LLM scores
- Products beyond cap get fallback scores from funnel
- On timeout/error: deterministic funnel-based scoring
Enrichment - Each finalist gets:
- rerankScore: 0-100 relevance score
- rerankReasoning: Short explanation

Output: Products re-sorted by contextual relevance
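
The threshold check and score application might look roughly like this. The LLM call itself is left out: `llmScores` stands for whatever the model returned, or `null` after a timeout or error. The type names, `RERANK_LLM_CAP`, and the assumption that funnel scores share the 0-100 scale are all hypothetical; only `RERANK_MIN_FINALISTS`, `rerankScore`, and `rerankReasoning` come from this guide.

```typescript
// Hypothetical shapes; funnelScore is assumed to be on the same 0-100 scale.
interface Finalist {
  id: string;
  funnelScore: number;
}

interface LlmScore {
  score: number;
  reasoning: string;
}

interface RerankedFinalist extends Finalist {
  rerankScore: number;
  rerankReasoning: string;
}

const RERANK_MIN_FINALISTS = 4;
const RERANK_LLM_CAP = 9; // "up to 9 top products"; the constant name is assumed

// True when the pool is large enough to justify an LLM call.
function shouldRerank(finalistCount: number): boolean {
  return finalistCount >= RERANK_MIN_FINALISTS;
}

// Apply LLM scores to the top products. Products beyond the cap, or all
// products when llmScores is null, fall back to their funnel score.
function applyRerankScores(
  finalists: Finalist[],
  llmScores: Map<string, LlmScore> | null
): RerankedFinalist[] {
  const enriched = finalists.map((f, index): RerankedFinalist => {
    const llm = index < RERANK_LLM_CAP && llmScores ? llmScores.get(f.id) : undefined;
    return llm
      ? { ...f, rerankScore: llm.score, rerankReasoning: llm.reasoning }
      : { ...f, rerankScore: f.funnelScore, rerankReasoning: "funnel score fallback" };
  });
  return enriched.sort((a, b) => b.rerankScore - a.rerankScore);
}
```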

Phase 4: Quality Filter & Gender Boost

Quality Filter

Implemented in QualityFilter.applyQualityFilter:

Preferred threshold: score ≥ 40
Minimum threshold: score ≥ 25
Fallback: Keep top 3 if both thresholds yield zero
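
One way to read the three rules is as a cascade: try the preferred threshold, then the minimum, then fall back to the top 3. The sketch below assumes that reading; `qualityFilterSketch` and `ScoredProduct` are hypothetical names, not the signature of `QualityFilter.applyQualityFilter`.

```typescript
// Hypothetical product shape; rerankScore is the 0-100 score from Phase 3.
interface ScoredProduct {
  id: string;
  rerankScore: number;
}

// Cascade: preferred threshold, then minimum threshold, then top-N fallback.
function qualityFilterSketch(
  products: ScoredProduct[],
  preferred: number = 40,
  minimum: number = 25,
  fallbackCount: number = 3
): ScoredProduct[] {
  const sorted = products.slice().sort((a, b) => b.rerankScore - a.rerankScore);
  const preferredTier = sorted.filter((p) => p.rerankScore >= preferred);
  if (preferredTier.length > 0) return preferredTier;
  const minimumTier = sorted.filter((p) => p.rerankScore >= minimum);
  if (minimumTier.length > 0) return minimumTier;
  return sorted.slice(0, fallbackCount);
}
```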

Gender Boost (Phase 4.5)

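When the recipient's gender is known, per-category multipliers adjust each product's score: categories historically preferred for that gender are boosted (multiplier above 1.0) and others are damped (multiplier below 1.0):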

// Example: Female recipient
Electronics: 0.8x multiplier
Cosmetics: 1.3x multiplier
Books: 1.1x multiplier

Products are re-sorted after this personalization step.
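
A sketch of the multiplier step using the example values above. The `BoostableProduct` shape and function name are assumptions, and categories without a listed multiplier are assumed to keep their score unchanged.

```typescript
// Hypothetical product shape for illustration.
interface BoostableProduct {
  id: string;
  category: string;
  score: number;
}

// Multipliers from the female-recipient example above.
const FEMALE_RECIPIENT_MULTIPLIERS: Record<string, number> = {
  Electronics: 0.8,
  Cosmetics: 1.3,
  Books: 1.1,
};

// Scale each score by its category multiplier (1.0 when none is listed),
// then re-sort by the adjusted score.
function applyCategoryMultipliers(
  products: BoostableProduct[],
  multipliers: Record<string, number>
): BoostableProduct[] {
  return products
    .map((p) => ({ ...p, score: p.score * (multipliers[p.category] ?? 1) }))
    .sort((a, b) => b.score - a.score);
}
```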

Phase 5: Diversity Selection

The diversity phase ensures the final 3 products offer varied options to the user.

Diversity Algorithm

Implemented in app/api/chat/services/diversity.ts:

Slot 1 - Always the highest scoring product (ensures best match)

Slots 2 & 3 - Composite "diversity bonus" rewards:

New product type: +50 points
New category: +30 points
New price tier: +20 points

Penalties:

Repeated product type: -50 to -80 points
Repeated category in slot 3: -80 points

Strict Type Enforcement: When user explicitly requests specific verticals (e.g., "gift cards only"), diversity respects that constraint.
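
The slot logic can be sketched as a greedy selection. Several details here are assumptions: the candidate shape (including `priceTier`), the split of the repeated-type penalty into -50 for slot 2 and -80 for slot 3, and the choice to add the adjustment directly to the relevance score. Strict type enforcement is omitted.

```typescript
// Hypothetical candidate shape for illustration.
interface DiversityCandidate {
  id: string;
  score: number;
  productType: string;
  category: string;
  priceTier: string;
}

// Bonus/penalty for a candidate given what is already picked.
function diversityAdjustment(
  c: DiversityCandidate,
  picked: DiversityCandidate[],
  slot: number
): number {
  const seenType = picked.some((p) => p.productType === c.productType);
  const seenCategory = picked.some((p) => p.category === c.category);
  const seenTier = picked.some((p) => p.priceTier === c.priceTier);
  let adjustment = seenType ? (slot === 3 ? -80 : -50) : 50; // penalty split is assumed
  if (!seenCategory) adjustment += 30;
  else if (slot === 3) adjustment -= 80;
  if (!seenTier) adjustment += 20;
  return adjustment;
}

// Slot 1 is the top scorer; later slots maximize score + diversity adjustment.
function selectDiverse(candidates: DiversityCandidate[], slots: number = 3): DiversityCandidate[] {
  const pool = candidates.slice().sort((a, b) => b.score - a.score);
  const picked: DiversityCandidate[] = [];
  while (picked.length < slots && pool.length > 0) {
    const slot = picked.length + 1;
    let bestIndex = 0;
    let bestValue = -Infinity;
    pool.forEach((c, i) => {
      const value = slot === 1 ? c.score : c.score + diversityAdjustment(c, picked, slot);
      if (value > bestValue) {
        bestValue = value;
        bestIndex = i;
      }
    });
    picked.push(...pool.splice(bestIndex, 1));
  }
  return picked;
}
```

Under these assumptions, a second high-scoring book loses to a lower-scoring gadget in slot 2, because the gadget collects all three bonuses while the book takes the repeated-type penalty.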

Diversity Metrics

The service returns metrics for monitoring:

Category distribution
Price range distribution
Average relevance score
Diversity score

Response Layer Finalists

Even if SearchOrchestrator returns 20 products, ParallelOrchestrator enforces the 3-product limit:

const rawProducts = searchResult.products.slice(0, 20);  // Internal pool
const displayProducts = rawProducts.slice(0, 3);         // User sees these

The 3-product limit is therefore enforced in two places:

Diversity Service selects 3 (when enabled)
Response layer slices to 3 (always)

Configuration & Feature Flags

Location: app/api/chat/orchestrators/search-orchestrator.config.ts

Phase Toggles

PHASE2_ENABLED = true   // Funnel
PHASE3_ENABLED = true   // Reranking
PHASE4_ENABLED = false  // Diversity (described as Phase 5 in this guide; currently disabled)

Key Constants

Constant	Default	Purpose
`MAX_CANDIDATES_STAGE_A`	60	Stage A output cap
`MAX_CANDIDATES_STAGE_B`	40	Stage B output cap
`MAX_FINALISTS`	20	Stage C output (funnel result)
`RERANK_MIN_FINALISTS`	4	Minimum for LLM rerank
`DIVERSITY_SLOT_COUNT`	3	Final products shown

Environment Variables

Runtime tuning without redeploys:

FUNNEL_STAGE_A_MAX - Widen Stage A
FUNNEL_MAX_FINALISTS - Shrink rerank payload
RERANK_THRESHOLD - Quality filter cutoff
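
How the overrides are parsed is not shown in this guide. A defensive reader could look like the sketch below, where `readIntSetting` is a hypothetical helper that falls back to the documented default when the variable is missing or invalid.

```typescript
// Hypothetical helper: read a positive integer from an environment-like
// object, falling back to a default when unset or not a positive number.
function readIntSetting(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number
): number {
  const raw = env[name];
  if (raw === undefined) return fallback;
  const parsed = parseInt(raw, 10);
  return isNaN(parsed) || parsed <= 0 ? fallback : parsed;
}
```

In a Node.js runtime this would be called as `readIntSetting(process.env, "FUNNEL_STAGE_A_MAX", 60)`; passing the environment in as a parameter keeps the helper easy to test.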

Debug & Monitoring

Each phase emits detailed metrics:

{
  multiSearchDurationMs: 145,
  funnelDurationMs: 23,
  rerankDurationMs: 312,
  diversityDurationMs: 8,
  
  funnelStats: {
    inputCount: 247,
    stageACount: 60,
    stageBCount: 38,
    stageCCount: 20
  },
  
  warnings: [
    "Budget relaxed by 15%",
    "Material constraint fallback applied"
  ]
}

These metrics make it straightforward to tell which step reduced the result to 3 finalists:

Limited initial search
Aggressive filtering
Diversity rules
Response layer slice
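
As an illustration of reading these metrics, the hypothetical helper below reports which funnel stage dropped the most candidates. It relies only on the `funnelStats` fields shown in the example payload; `largestDrop` itself is not part of the codebase.

```typescript
// Shape of the funnelStats object from the example payload above.
interface FunnelStats {
  inputCount: number;
  stageACount: number;
  stageBCount: number;
  stageCCount: number;
}

// Return the stage that removed the most candidates.
function largestDrop(stats: FunnelStats): { stage: string; dropped: number } {
  const drops = [
    { stage: "Stage A", dropped: stats.inputCount - stats.stageACount },
    { stage: "Stage B", dropped: stats.stageACount - stats.stageBCount },
    { stage: "Stage C", dropped: stats.stageBCount - stats.stageCCount },
  ];
  return drops.reduce((worst, d) => (d.dropped > worst.dropped ? d : worst));
}
```

For the example payload, Stage A removes 187 of the 247 inputs, which is expected given its cap of 60; a large Stage B drop would instead point to aggressive constraint filtering.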

Key Takeaways

Multiple Safeguards

The 3-finalist result emerges from multiple safeguards:

Category caps (max 5 per category)
Rerank quality thresholds
Diversity selection rules
Response layer slice

Show More Behavior

"Show more" requests:

Skip diversity phase
Honor Stage C 20-product limit
Allow UI to reveal more without backend changes

Modular Design

Each phase can be toggled and configured independently:

Disable diversity → SearchOrchestrator returns up to 20 products (the response layer still slices to 3)
Adjust Stage C cap → change rerank input size
Tune quality thresholds → affect final count

Performance Optimized

Parallel searches reduce latency
Early exits for author/book queries
Fast path for specific products
Skeleton streaming for perceived speed

Related Documentation

Search Orchestrator - Detailed orchestration logic
Query Variations - How search queries are generated
Funnel Stages - Deep dive into Stage A/B/C
LLM Reranking - Reranking model details
Diversity Algorithm - Diversity calculation

File References:

API Route: app/api/chat/route.ts
ParallelOrchestrator: app/api/chat/orchestrators/parallel-orchestrator.ts
SearchOrchestrator: app/api/chat/orchestrators/search-orchestrator.ts
Funnel: app/api/chat/services/funnel.ts
Stage B: app/api/chat/services/stage-b.ts
Reranking: app/api/chat/services/rerank.ts
Diversity: app/api/chat/services/diversity.ts
Config: app/api/chat/orchestrators/search-orchestrator.config.ts
