Finalist Selection Pipeline

This guide explains, end to end, how the chat pipeline narrows thousands of catalog candidates down to the three "finalist" products that appear in the streamed response.

Overview

The pipeline turns a user's conversational request into three gift recommendations. Candidates pass through retrieval, funnel, rerank, quality-filter, and diversity phases, which together balance relevance, diversity, and personalization.

High-Level Flow

1. Request Entry

The journey begins when a user sends a message:

  1. API Route (app/api/chat/route.ts) - Validates the API key, parses the ChatRequest, and defers to the ParallelOrchestrator
  2. ParallelOrchestrator (app/api/chat/orchestrators/parallel-orchestrator.ts) - Performs query validation, launches context extraction and search in parallel, and streams a skeleton response immediately for better UX
  3. SearchOrchestrator (app/api/chat/orchestrators/search-orchestrator.ts) - Runs the retrieval, funnel, rerank, diversity, and logging phases

2. Routing Branches

SearchOrchestrator supports multiple routing strategies optimized for different use cases:

  • Author and Book category paths - Bypass generic flow, stop at ~20 results for precision
  • Specific product fast path - Single filtered search, skips funnel/rerank for low latency
  • Clarification path - Exits early when user intent is unclear
  • Standard path - Full pipeline (most common case)
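
As a minimal sketch of how such branching might look, the function below maps a parsed intent to one of the routes listed above. The QueryIntent shape, the selectRoute name, and the priority order are all assumptions for illustration; the guide does not show how SearchOrchestrator actually decides.

```typescript
// Hypothetical intent flags; the real SearchOrchestrator inputs are not shown in this guide.
type Route =
  | "author"
  | "book_category"
  | "specific_product"
  | "clarification"
  | "standard";

interface QueryIntent {
  needsClarification: boolean;
  authorName?: string;
  isBookCategoryQuery?: boolean;
  specificProductName?: string;
}

// Pick a routing branch. The priority order here is an assumption.
function selectRoute(intent: QueryIntent): Route {
  if (intent.needsClarification) return "clarification";
  if (intent.authorName) return "author";
  if (intent.isBookCategoryQuery) return "book_category";
  if (intent.specificProductName) return "specific_product";
  return "standard"; // full pipeline, the most common case
}
```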

Phase 1: Multi-Query Retrieval

The system generates multiple search variations to cast a wide net while maintaining relevance.

How It Works

  1. Variation Generation - QueryRewritingService.generateVariations creates focused queries:

    • Occasion-focused: "birthday gift for teenager"
    • Budget-focused: "gifts under €50"
    • Product-type-focused: "tech gadgets"
    • And more...
  2. Parallel Convex Search - ProductSearchService.searchMulti fires all variations simultaneously:

    • Each variation requests up to 50 hits
    • Deterministic seeds for repeatable results
    • Random seeds for "show more" functionality
    • Automatic category filters prevent full table scans
  3. Intelligent Merging:

    • Results weighted by variation type
    • Deduplicated by product ID
    • Book-only fallbacks when requested
    • Language fallbacks (Estonian → English)
    • "Exclude books" enforcement

Output: 100-300 candidate products ready for filtering
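
The weighting and deduplication steps can be sketched roughly as follows. The Hit and VariationResult shapes and the "keep the best weighted score per product" rule are assumptions, not the actual ProductSearchService behavior; fallbacks and the "exclude books" rule are left out.

```typescript
// Hypothetical shapes; the real search result types may differ.
interface Hit {
  productId: string;
  score: number;
}

interface VariationResult {
  weight: number; // weight assigned to this variation type
  hits: Hit[];
}

// Weight each hit by its variation, then keep the best-scoring copy per product ID.
function mergeVariationResults(results: VariationResult[]): Hit[] {
  const best = new Map<string, Hit>();
  for (const result of results) {
    for (const hit of result.hits) {
      const weighted = hit.score * result.weight;
      const existing = best.get(hit.productId);
      if (existing === undefined || weighted > existing.score) {
        best.set(hit.productId, { productId: hit.productId, score: weighted });
      }
    }
  }
  // Highest weighted score first, ready for the funnel.
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```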

Phase 2: Three-Stage Funnel

The funnel progressively narrows the candidate pool from hundreds to 20 finalists.

Stage A: Merge & Score

Goal: Keep only the ~60 highest-scoring candidates

  • Applies boost multipliers (merchandising priorities, trending items)
  • Sorts by score descending
  • Quick initial cut to reduce processing load
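
A minimal sketch of Stage A, assuming each candidate carries an optional boost multiplier (a hypothetical field; the guide does not say where boosts are stored):

```typescript
// Hypothetical candidate shape for illustration.
interface ScoredCandidate {
  productId: string;
  score: number;
  boost?: number; // e.g. merchandising or trending multiplier
}

// Apply boosts, sort by score descending, and cap the pool.
function stageA(candidates: ScoredCandidate[], maxCandidates = 60): ScoredCandidate[] {
  return candidates
    .map((c) => ({ ...c, score: c.score * (c.boost === undefined ? 1 : c.boost) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxCandidates);
}
```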

Stage B: Constraint Filtering

Goal: Enforce budget, product type, category, and safety constraints

Implemented in app/api/chat/services/stage-b.ts:

  • Budget enforcement with 20% soft tolerance
  • Product type matching (e.g., "tech gadgets" only)
  • Category constraints (occasion-specific)
  • Exclude IDs (don't show previously seen products)
  • Cultural safety checks
  • Material/allergy constraints with fallback logic
  • Emergency bypass prevents zero results
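
To illustrate the budget tolerance, exclude IDs, and emergency bypass, here is a small sketch. It covers only those three rules; the function name, result shape, warning text, and the exact bypass behavior (dropping the budget constraint when nothing survives) are assumptions rather than what stage-b.ts does.

```typescript
// Hypothetical shapes for illustration.
interface PricedCandidate {
  productId: string;
  price: number;
}

interface StageBSketchResult {
  kept: PricedCandidate[];
  warnings: string[];
}

const BUDGET_SOFT_TOLERANCE = 0.2; // 20% soft tolerance, as described above

function applyBudgetAndExclusions(
  candidates: PricedCandidate[],
  maxBudget: number,
  excludeIds: string[],
): StageBSketchResult {
  // Drop products the user has already seen.
  const notSeen = candidates.filter((c) => excludeIds.indexOf(c.productId) === -1);

  // Allow prices up to 20% over the stated budget.
  const ceiling = maxBudget * (1 + BUDGET_SOFT_TOLERANCE);
  const kept = notSeen.filter((c) => c.price <= ceiling);
  if (kept.length > 0) return { kept, warnings: [] };

  // Emergency bypass (assumed behavior): relax the budget rather than return nothing.
  return { kept: notSeen, warnings: ["Budget constraint bypassed to avoid zero results"] };
}
```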

Stage C: Category Diversification

Goal: Limit to 20 finalists with max 5 per category

  • Ensures variety before rerank
  • Prevents category dominance (e.g., all books)
  • Logs skipped candidates for tuning

Output: FunnelResult with 20 finalists + stage statistics + warnings
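
The Stage C cap can be sketched as a single pass over candidates that are already sorted by score. The types and the stageC name are illustrative assumptions; the defaults mirror the limits stated above.

```typescript
// Hypothetical candidate shape for illustration.
interface CategorizedCandidate {
  productId: string;
  category: string;
  score: number;
}

// Walk candidates in score order, keeping at most `maxPerCategory` per category
// and at most `maxFinalists` overall. Skipped items are returned for logging.
function stageC(
  candidates: CategorizedCandidate[],
  maxFinalists = 20,
  maxPerCategory = 5,
): { finalists: CategorizedCandidate[]; skipped: CategorizedCandidate[] } {
  const perCategory = new Map<string, number>();
  const finalists: CategorizedCandidate[] = [];
  const skipped: CategorizedCandidate[] = [];

  for (const candidate of candidates) {
    const count = perCategory.get(candidate.category) || 0;
    if (finalists.length >= maxFinalists || count >= maxPerCategory) {
      skipped.push(candidate);
      continue;
    }
    perCategory.set(candidate.category, count + 1);
    finalists.push(candidate);
  }
  return { finalists, skipped };
}
```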

Phase 3: LLM Reranking

The LLM reranking phase uses contextual understanding to score products by relevance.

How It Works

Implemented in app/api/chat/services/rerank.ts:

  1. Threshold Check - Only reranks if ≥ 4 finalists exist (RERANK_MIN_FINALISTS)

  2. LLM Prompt - Sends contextual summary + up to 9 top products to Groq:

    Context: "Looking for birthday gift for 25yo female, budget €50, loves reading"
    Products: [product1, product2, ...]
    Score each 0-100 with reasoning
  3. Score Application:

    • Top 9 products get LLM scores
    • Products beyond cap get fallback scores from funnel
    • On timeout/error: deterministic funnel-based scoring
  4. Enrichment - Each finalist gets:

    • rerankScore: 0-100 relevance score
    • rerankReasoning: Short explanation

Output: Products re-sorted by contextual relevance
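
The score-application step might look roughly like this. It is a sketch under stated assumptions: the type names are hypothetical, funnel scores are assumed to already be on a 0-100 scale, and a null verdict map stands in for a timeout or error. The actual Groq call is not shown.

```typescript
// Hypothetical shapes; rerank.ts may use different names.
interface FunnelFinalist {
  productId: string;
  funnelScore: number; // assumed to be on a 0-100 scale
}

interface LlmVerdict {
  score: number;
  reasoning: string;
}

interface RerankedProduct extends FunnelFinalist {
  rerankScore: number;
  rerankReasoning: string;
}

const RERANK_MIN_FINALISTS = 4; // threshold from the guide
const RERANK_CAP = 9; // "up to 9 top products"

function applyRerankScores(
  finalists: FunnelFinalist[],
  verdicts: Map<string, LlmVerdict> | null, // null models a timeout or error
): RerankedProduct[] {
  // Below the threshold, ignore LLM output and use funnel scores throughout.
  const scores = finalists.length >= RERANK_MIN_FINALISTS ? verdicts : null;

  const scored = finalists.map((f, index): RerankedProduct => {
    const verdict = scores !== null && index < RERANK_CAP ? scores.get(f.productId) : undefined;
    return verdict !== undefined
      ? { ...f, rerankScore: verdict.score, rerankReasoning: verdict.reasoning }
      : { ...f, rerankScore: f.funnelScore, rerankReasoning: "fallback: funnel score" };
  });

  return scored.sort((a, b) => b.rerankScore - a.rerankScore);
}
```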

Phase 4: Quality Filter & Gender Boost

Quality Filter

Implemented in QualityFilter.applyQualityFilter:

  • Preferred threshold: score ≥ 40
  • Minimum threshold: score ≥ 25
  • Fallback: Keep top 3 if both thresholds yield zero
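
One plausible reading of these thresholds is a tiered fallback: use the preferred cutoff if anything passes it, otherwise the minimum cutoff, otherwise the top 3. The sketch below encodes that reading; it is an assumption about the ordering, not the body of QualityFilter.applyQualityFilter.

```typescript
interface Scored {
  rerankScore: number;
}

// Tiered filter: preferred threshold, then minimum threshold, then top-N fallback.
function qualityFilterSketch<T extends Scored>(
  products: T[],
  preferred = 40,
  minimum = 25,
  fallbackCount = 3,
): T[] {
  const preferredHits = products.filter((p) => p.rerankScore >= preferred);
  if (preferredHits.length > 0) return preferredHits;

  const minimumHits = products.filter((p) => p.rerankScore >= minimum);
  if (minimumHits.length > 0) return minimumHits;

  // Both thresholds yielded nothing: keep the best few so the user still sees results.
  return products
    .slice()
    .sort((a, b) => b.rerankScore - a.rerankScore)
    .slice(0, fallbackCount);
}
```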

Gender Boost (Phase 4.5)

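When the recipient's gender is known, each product's score is multiplied by a per-category factor: categories historically preferred by that gender are boosted (factor above 1), and others are dampened (factor below 1):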

// Example: Female recipient
Electronics: 0.8x multiplier
Cosmetics: 1.3x multiplier
Books: 1.1x multiplier

Products are re-sorted after this personalization step.
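
A minimal sketch of the boost, using the multipliers from the female-recipient example above. The function name and product shape are hypothetical, and treating unlisted categories as a neutral 1.0x is an assumption.

```typescript
// Hypothetical product shape for illustration.
interface RankedProduct {
  productId: string;
  category: string;
  rerankScore: number;
}

// Multipliers copied from the example above; other recipient profiles are not shown in this guide.
const FEMALE_RECIPIENT_MULTIPLIERS: Record<string, number | undefined> = {
  Electronics: 0.8,
  Cosmetics: 1.3,
  Books: 1.1,
};

// Scale each score by its category multiplier (1.0 if unlisted), then re-sort.
function applyCategoryMultipliers(
  products: RankedProduct[],
  multipliers: Record<string, number | undefined>,
): RankedProduct[] {
  return products
    .map((p) => {
      const factor = multipliers[p.category];
      return { ...p, rerankScore: p.rerankScore * (factor === undefined ? 1 : factor) };
    })
    .sort((a, b) => b.rerankScore - a.rerankScore);
}
```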

Phase 5: Diversity Selection

When enabled, the diversity phase ensures that the final 3 products give the user varied options.

Diversity Algorithm

Implemented in app/api/chat/services/diversity.ts:

Slot 1 - Always the highest scoring product (ensures best match)

Slots 2 & 3 - Composite "diversity bonus" rewards:

  • New product type: +50 points
  • New category: +30 points
  • New price tier: +20 points

Penalties:

  • Repeated product type: -50 to -80 points
  • Repeated category in slot 3: -80 points

Strict Type Enforcement: When user explicitly requests specific verticals (e.g., "gift cards only"), diversity respects that constraint.
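
The slot-filling logic can be sketched as a greedy selection. This is an illustration under assumptions, not the code in diversity.ts: the candidate shape is hypothetical, the repeated-type penalty is taken as -50 for slot 2 and -80 for slot 3 (the guide only gives the range), and strict type enforcement is omitted.

```typescript
// Hypothetical candidate shape for illustration.
interface DiverseCandidate {
  productId: string;
  productType: string;
  category: string;
  priceTier: string;
  score: number;
}

// Bonus/penalty for adding `candidate` to the slots already picked.
function diversityAdjustment(
  candidate: DiverseCandidate,
  picked: DiverseCandidate[],
  slot: number,
): number {
  const seenType = picked.some((p) => p.productType === candidate.productType);
  const seenCategory = picked.some((p) => p.category === candidate.category);
  const seenTier = picked.some((p) => p.priceTier === candidate.priceTier);

  let adjustment = 0;
  adjustment += seenType ? (slot === 3 ? -80 : -50) : 50; // assumed split of the -50..-80 range
  if (!seenCategory) adjustment += 30;
  else if (slot === 3) adjustment -= 80;
  if (!seenTier) adjustment += 20;
  return adjustment;
}

function selectDiverse(candidates: DiverseCandidate[], slotCount = 3): DiverseCandidate[] {
  const remaining = candidates.slice().sort((a, b) => b.score - a.score);
  const picked = remaining.splice(0, 1); // slot 1: always the highest-scoring product

  while (picked.length < slotCount && remaining.length > 0) {
    const slot = picked.length + 1;
    let bestIndex = 0;
    let bestValue = -Infinity;
    for (let i = 0; i < remaining.length; i++) {
      const value = remaining[i].score + diversityAdjustment(remaining[i], picked, slot);
      if (value > bestValue) {
        bestValue = value;
        bestIndex = i;
      }
    }
    picked.push(remaining.splice(bestIndex, 1)[0]);
  }
  return picked;
}
```

With these weights, a second product of an already-picked type needs a relevance lead of roughly 150 points over a fully novel alternative to win slot 2, so near-duplicates are effectively pushed out whenever any alternative exists.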

Diversity Metrics

The service returns metrics for monitoring:

  • Category distribution
  • Price range distribution
  • Average relevance score
  • Diversity score

Response Layer Finalists

Even if SearchOrchestrator returns 20 products, ParallelOrchestrator enforces the 3-product limit:

const rawProducts = searchResult.products.slice(0, 20); // Internal pool
const displayProducts = rawProducts.slice(0, 3);        // User sees these

This dual constraint means:

  1. Diversity Service selects 3 (when enabled)
  2. Response layer slices to 3 (always)

Configuration & Feature Flags

Location: app/api/chat/orchestrators/search-orchestrator.config.ts

Phase Toggles

PHASE2_ENABLED = true   // Funnel
PHASE3_ENABLED = true   // Reranking
PHASE4_ENABLED = false  // Diversity (described as Phase 5 in this guide; currently disabled)

Key Constants

Constant                  Default  Purpose
MAX_CANDIDATES_STAGE_A    60       Stage A output cap
MAX_CANDIDATES_STAGE_B    40       Stage B output cap
MAX_FINALISTS             20       Stage C output (funnel result)
RERANK_MIN_FINALISTS      4        Minimum for LLM rerank
DIVERSITY_SLOT_COUNT      3        Final products shown

Environment Variables

Runtime tuning without redeploys:

  • FUNNEL_STAGE_A_MAX - Widen Stage A
  • FUNNEL_MAX_FINALISTS - Shrink rerank payload
  • RERANK_THRESHOLD - Quality filter cutoff
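
A sketch of how such overrides could be read with safe defaults. The helper names are hypothetical, and pairing FUNNEL_STAGE_A_MAX with the default of 60 and FUNNEL_MAX_FINALISTS with 20 is an inference from the constants table, not something the guide states. RERANK_THRESHOLD is left out because its default is not given. The environment is passed in as a plain object so the snippet does not depend on a particular runtime.

```typescript
type Env = Record<string, string | undefined>;

// Parse a positive integer from the environment, falling back on missing or invalid values.
function readPositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined) return fallback;
  const parsed = parseInt(raw, 10);
  return parsed > 0 ? parsed : fallback; // NaN and non-positive values fall back
}

// Hypothetical loader; defaults are assumed to match the Key Constants table.
function loadFunnelConfig(env: Env): { stageAMax: number; maxFinalists: number } {
  return {
    stageAMax: readPositiveInt(env, "FUNNEL_STAGE_A_MAX", 60),
    maxFinalists: readPositiveInt(env, "FUNNEL_MAX_FINALISTS", 20),
  };
}
```

In a Node.js runtime this would typically be called as loadFunnelConfig(process.env).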

Debug & Monitoring

Each phase emits detailed metrics:

{
  multiSearchDurationMs: 145,
  funnelDurationMs: 23,
  rerankDurationMs: 312,
  diversityDurationMs: 8,

  funnelStats: {
    inputCount: 247,
    stageACount: 60,
    stageBCount: 38,
    stageCCount: 20
  },

  warnings: [
    "Budget relaxed by 15%",
    "Material constraint fallback applied"
  ]
}

These logs make it straightforward to diagnose which of the following narrowed the result to 3 finalists:

  • Limited initial search
  • Aggressive filtering
  • Diversity rules
  • Response layer slice
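
As a rough illustration of that diagnosis, the helper below walks the funnelStats counts and reports the first stage at which the pool had already shrunk to the display limit. The function and its labels are hypothetical; it is a reading aid for the metrics, not part of the pipeline.

```typescript
// Field names follow the funnelStats object in the sample metrics.
interface FunnelStats {
  inputCount: number;
  stageACount: number;
  stageBCount: number;
  stageCCount: number;
}

// Report the earliest point at which the pool was at or below the display limit.
function findNarrowingStage(stats: FunnelStats, displayLimit = 3): string {
  if (stats.inputCount <= displayLimit) return "limited initial search";
  if (stats.stageACount <= displayLimit) return "Stage A cap";
  if (stats.stageBCount <= displayLimit) return "aggressive filtering (Stage B)";
  if (stats.stageCCount <= displayLimit) return "category diversification (Stage C)";
  // The funnel still had spare candidates, so the cut happened later.
  return "diversity rules or response layer slice";
}
```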

Key Takeaways

Multiple Safeguards

The 3-finalist result emerges from multiple safeguards:

  • Category caps (max 5 per category)
  • Rerank quality thresholds
  • Diversity selection rules
  • Response layer slice

Show More Behavior

"Show more" requests:

  • Skip diversity phase
  • Honor Stage C 20-product limit
  • Allow UI to reveal more without backend changes

Modular Design

Each phase can be toggled and configured independently:

  • Disable diversity → get 20 products (sliced to 3)
  • Adjust Stage C cap → change rerank input size
  • Tune quality thresholds → affect final count

Performance Optimized

  • Parallel searches reduce latency
  • Early exits for author/book queries
  • Fast path for specific products
  • Skeleton streaming for perceived speed

File References:

  • API Route: app/api/chat/route.ts
  • ParallelOrchestrator: app/api/chat/orchestrators/parallel-orchestrator.ts
  • SearchOrchestrator: app/api/chat/orchestrators/search-orchestrator.ts
  • Funnel: app/api/chat/services/funnel.ts
  • Stage B: app/api/chat/services/stage-b.ts
  • Reranking: app/api/chat/services/rerank.ts
  • Diversity: app/api/chat/services/diversity.ts
  • Config: app/api/chat/orchestrators/search-orchestrator.config.ts