Product Inquiry Detection

This document explains the product inquiry misclassification problem discovered during multi-turn followup testing and the architectural solution implemented to fix it.

The Problem

Observed Failure Patterns

During multi-turn testing, 50% of product inquiry scenarios failed with the following pattern:

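| User Message | Expected Intent | Actual Intent | Products |
| --- | --- | --- | --- |
| "Mis on see esimene toode?" (What is this first product?) | product_inquiry | unknown | 0 |
| "Kas see sobib 8-aastasele?" (Is this suitable for an 8-year-old?) | product_inquiry | gift_search | 3 (wrong) |
| "Kirjelda kolmandat" (Describe the third one) | product_inquiry | unknown | 0 |
| "Räägi mulle teise toote kohta" (Tell me about the second product) | product_inquiry | unknown | 0 |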

Root Cause Analysis

The misclassification stemmed from three architectural issues:

Issue 1: Followup Router Guard

The LLM-based followup router (which could detect question_about_shown) was gated behind hasContextToRestore:

function shouldRunFollowupRouter(state: OrchestrationState): boolean {
  // ❌ This returns FALSE when no prior context exists
  return Boolean(state.hasContextToRestore);
}

Result: The followup router was never called for product inquiry messages.

Issue 2: LLM Router Misclassification

When the followup router was bypassed, messages went to the LLM query router which:

  • Classified ordinal references ("esimene toode") as CLARIFICATION_PATH (too vague)
  • Classified pronoun + age patterns ("Kas see sobib 8-aastasele?") as GIFT_OCCASION_PATH

Issue 3: Mixed Intent Blindness

Questions like "Kas see sobib 8-aastasele?" contain two signals:

  1. Pronoun "see" → Asking about shown product
  2. Age pattern "8-aastasele" → Gift recipient context

The LLM prioritized the gift context pattern, missing the pronoun reference entirely.

The Solution

Why Regex Over LLM?

We implemented deterministic regex detection instead of adding another LLM call for several reasons:

1. Latency Impact

With LLM (hypothetical):
User → [Product Inquiry LLM ~600ms] → [LLM Router ~800ms] → [Context Extraction ~1000ms]
↑ +600ms to EVERY request

With Regex (implemented):
User → [Regex Detection ~2ms] → [LLM Router ~800ms] → ...
↑ Near-instant, only runs when stored products exist

2. Estonian Patterns Are Deterministic

Estonian ordinals follow predictable inflection patterns, so a small set of regular expressions can match them reliably:

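| Ordinal | Nominative | Genitive | Partitive |
| --- | --- | --- | --- |
| 1st (first) | esimene | esimese | esimest |
| 2nd (second) | teine | teise | teist |
| 3rd (third) | kolmas | kolmanda | kolmandat |
| 4th (fourth) | neljas | neljanda | neljandat |
| 5th (fifth) | viies | viienda | viiendat |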
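
Because the forms in the table are a closed set, they map naturally onto a lookup from word form to list position. The following is a minimal sketch, not the project's implementation: `resolveOrdinalIndex` and `ORDINAL_FORMS` are hypothetical names, and only the three cases listed in the table are covered.

```typescript
// Hypothetical lookup: inflected Estonian ordinal -> 1-based position.
// Only the nominative, genitive and partitive forms from the table are included.
const ORDINAL_FORMS = new Map<string, number>([
  ["esimene", 1], ["esimese", 1], ["esimest", 1],
  ["teine", 2], ["teise", 2], ["teist", 2],
  ["kolmas", 3], ["kolmanda", 3], ["kolmandat", 3],
  ["neljas", 4], ["neljanda", 4], ["neljandat", 4],
  ["viies", 5], ["viienda", 5], ["viiendat", 5],
]);

// Returns the position named by the first ordinal found, or null if none is present.
function resolveOrdinalIndex(message: string): number | null {
  // Split on whitespace and common punctuation so "toode?" does not hide a word.
  const words = message.toLowerCase().split(/[\s.,!?;:"'()]+/);
  for (const word of words) {
    const position = ORDINAL_FORMS.get(word);
    if (position !== undefined) return position;
  }
  return null;
}
```

One caveat with any lookup of this kind: "teine" also means "other" in Estonian, so a match on the second-ordinal forms is a signal rather than proof that the user means the second product.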

3. Codebase Architecture Consistency

The existing routeQueryHybrid already follows this pattern:

// PRIORITY 0: Context Recovery Detection (deterministic) ← Regex
// PRIORITY 1: Keyword Detection (deterministic) ← Regex
// PRIORITY 2: LLM routing (semantic fallback) ← LLM

We added product inquiry detection at PRIORITY -2, so it runs before every other check in the chain.
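
A priority chain of this kind can be sketched as a list of deterministic detectors tried in order, with the LLM as the last resort. This is an illustration under assumptions, not the body of routeQueryHybrid: `RoutingInput`, `storedProductCount`, `routeByPriority`, and the stubbed detectors are hypothetical, and the real LLM fallback would be an asynchronous model call.

```typescript
type QueryRoute = 'PRODUCT_INQUIRY_PATH' | 'CONTEXT_RECOVERY_PATH' | 'CLARIFICATION_PATH';

interface RoutingInput {
  userMessage: string;
  storedProductCount: number; // assumed field: products shown in earlier turns
}

// A detector returns a route, or null to hand over to the next priority level.
type Detector = (input: RoutingInput) => QueryRoute | null;

// Earliest check: only meaningful when there are stored products to refer to.
const detectProductInquiry: Detector = ({ userMessage, storedProductCount }) => {
  if (storedProductCount === 0) return null;
  const ordinal = /\b(esimene|esimese|esimest|teine|teise|teist|kolmas|kolmanda|kolmandat)\b/i;
  return ordinal.test(userMessage) ? 'PRODUCT_INQUIRY_PATH' : null;
};

// Placeholder for the existing PRIORITY 0 check; always defers in this sketch.
const detectContextRecovery: Detector = () => null;

// Stand-in for the LLM router (PRIORITY 2).
const llmFallback = (_input: RoutingInput): QueryRoute => 'CLARIFICATION_PATH';

function routeByPriority(input: RoutingInput): QueryRoute {
  const detectors: Detector[] = [detectProductInquiry, detectContextRecovery];
  for (const detect of detectors) {
    const route = detect(input);
    if (route !== null) return route; // first deterministic hit wins
  }
  return llmFallback(input);
}
```

Putting the cheapest and most specific check first means the slower stages are reached only when the deterministic ones have nothing to say.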

4. Cost Efficiency

| Approach | Cost per Request | Monthly @ 100k requests |
| --- | --- | --- |
| Regex | $0 | $0 |
| LLM | ~$0.002 | ~$200 |

5. Layered Defense

Regex doesn't replace the LLM; it is a fast path. If regex misses, the pipeline still has:

  • Followup router (when context exists)
  • resolveProductInquiryIntent() in refine stage

Implementation Details

New Route Type

export type QueryRoute =
  | 'SPECIFIC_PRODUCT_FAST_PATH'
  | 'AUTHOR_SEARCH_PATH'
  | 'GIFT_OCCASION_PATH'
  | 'CLARIFICATION_PATH'
  | 'SHOW_MORE_PATH'
  | 'CONTEXT_RECOVERY_PATH'
  | 'PRODUCT_INQUIRY_PATH'; // ← NEW

Detection Function

function detectProductInquirySignals(userMessage: string): {
  detected: boolean;
  type: 'ordinal' | 'pronoun' | 'mixed' | null;
  confidence: number;
}

Pattern Categories

Ordinal Patterns (Estonian)

const ordinalPatterns = [
  /\b(esimene|esimese|esimest|esimesel|esimesest)\b/i,      // first
  /\b(teine|teise|teist|teisel|teisest)\b/i,                // second
  /\b(kolmas|kolmanda|kolmandat|kolmandal|kolmandast)\b/i,  // third
  /\b([1-5])\.?\s*(?:toode|toote|toodet|variant|valik)\b/i, // "1. toode"
];

Pronoun Patterns (Estonian)

const pronounPatterns = [
  /\bsee\s+(?:toode|asi|kingitus|mäng|raamat|variant)\b/i, // "see toode" (this product)
  /\bselles\b/i,                                           // "selles" (in this)
  /\bseda\b/i,                                             // "seda" (this one)
  /\bkas\s+see\b/i,                                        // "kas see" (is this)
  /\bkas\s+see\b.*\d+\s*-?\s*aastase/i,                    // "kas see sobib 8-aastasele" (also matched by the pattern above)
];

Mixed Patterns (Highest Confidence)

const mixedPatterns = [
  /\b(?:mis|kas|mitu|kuidas|kirjelda|räägi)\b.*\b(?:esime|tei|kolma)/i, // question or command word + ordinal stem
  /\b(?:mis|kas|mitu|kuidas)\b.*\b(?:see|selle|selles|seda)\b/i,        // question word + pronoun
  /\bkas\s+see\s+sobib\b.*\d+\s*-?\s*aastase/i,                         // "kas see sobib 8-aastasele?"
];
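
To see which category each failing message falls into, the three pattern lists can be run against the test messages. The sketch below copies the documented patterns verbatim; the helper names `matchesAny` and `classify` are hypothetical and exist only for this check.

```typescript
const ordinalPatterns: RegExp[] = [
  /\b(esimene|esimese|esimest|esimesel|esimesest)\b/i,
  /\b(teine|teise|teist|teisel|teisest)\b/i,
  /\b(kolmas|kolmanda|kolmandat|kolmandal|kolmandast)\b/i,
  /\b([1-5])\.?\s*(?:toode|toote|toodet|variant|valik)\b/i,
];
const pronounPatterns: RegExp[] = [
  /\bsee\s+(?:toode|asi|kingitus|mäng|raamat|variant)\b/i,
  /\bselles\b/i,
  /\bseda\b/i,
  /\bkas\s+see\b/i,
  /\bkas\s+see\b.*\d+\s*-?\s*aastase/i,
];
const mixedPatterns: RegExp[] = [
  /\b(?:mis|kas|mitu|kuidas|kirjelda|räägi)\b.*\b(?:esime|tei|kolma)/i,
  /\b(?:mis|kas|mitu|kuidas)\b.*\b(?:see|selle|selles|seda)\b/i,
  /\bkas\s+see\s+sobib\b.*\d+\s*-?\s*aastase/i,
];

// True if at least one pattern in the list matches the text.
const matchesAny = (patterns: RegExp[], text: string): boolean =>
  patterns.some((pattern) => pattern.test(text));

// Reports which of the three categories fire for a message.
function classify(text: string): { ordinal: boolean; pronoun: boolean; mixed: boolean } {
  return {
    ordinal: matchesAny(ordinalPatterns, text),
    pronoun: matchesAny(pronounPatterns, text),
    mixed: matchesAny(mixedPatterns, text),
  };
}

const messages = [
  "Mis on see esimene toode?",
  "Kas see sobib 8-aastasele?",
  "Kirjelda kolmandat",
  "Räägi mulle teise toote kohta",
  "Kas teil on lauamänge?", // not a product inquiry: "Do you have board games?"
];
for (const message of messages) {
  console.log(message, classify(message));
}
```

All four documented messages hit a mixed pattern. The last message is worth noting: the stem `tei` in the first mixed pattern has no trailing word boundary, so it also matches "teil" (a form of "you"), which is unrelated to ordinals. Whether that matters in practice depends on how strictly the stored-products precondition filters requests before these patterns run.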

Routing Flow

Confidence Levels

| Pattern Type | Confidence | Example |
| --- | --- | --- |
| Mixed (question + ordinal) | 0.95 | "Mis on see esimene toode?" |
| Ordinal + question | 0.90 | "Kas kolmas sobib?" |
| Pronoun + question | 0.85 | "Kas see sobib?" |
| Ordinal alone | 0.80 | "Kolmas" |
| Pronoun alone | 0.60 | "see toode" |

Threshold: Detection requires confidence ≥ 0.7, so a pronoun alone (0.60) does not trigger the product inquiry route.
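
The threshold can be applied when the result object is built. The following is a rough sketch only: the result shape comes from the detectProductInquirySignals signature, while `toSignals` and `CONFIDENCE_THRESHOLD` are hypothetical names, and how the real function derives its confidence values is not shown in this document.

```typescript
// Result shape taken from the documented detectProductInquirySignals signature.
interface ProductInquirySignals {
  detected: boolean;
  type: 'ordinal' | 'pronoun' | 'mixed' | null;
  confidence: number;
}

const CONFIDENCE_THRESHOLD = 0.7; // value stated in the document

// Hypothetical helper: wraps a pattern hit and its confidence into the result shape.
// A hit below the threshold keeps its type and score but is not reported as detected.
function toSignals(
  type: ProductInquirySignals['type'],
  confidence: number,
): ProductInquirySignals {
  const detected = type !== null && confidence >= CONFIDENCE_THRESHOLD;
  return { detected, type, confidence };
}

const ordinalAlone = toSignals('ordinal', 0.8); // "Kolmas"
const pronounAlone = toSignals('pronoun', 0.6); // "see toode"
console.log(ordinalAlone.detected, pronounAlone.detected);
```

Keeping the score on sub-threshold results leaves room for later stages, such as the followup router, to weigh a weak signal instead of discarding it.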

Test Results After Fix

| Scenario | Before | After |
| --- | --- | --- |
| "Mis on see esimene toode?" | unknown | product_inquiry |
| "Kas see sobib 8-aastasele?" | gift_search | product_inquiry |
| "Kirjelda kolmandat" | unknown | product_inquiry |
| "Räägi mulle teise toote kohta" | unknown | product_inquiry |

Related Files

  • Detection Logic: app/api/chat/orchestrators/context-orchestrator/llm-query-router.ts
  • Route Handler: app/api/chat/orchestrators/context-orchestrator/stages/route.ts
  • Product Resolution: app/api/chat/orchestrators/context-orchestrator/product-inquiry.ts
  • Test Suite: qa-surface/test-multi-turn-followups.ts