Query Specificity Detection

Overview

This document explains how the system differentiates between specific product queries (like "show me gift cards") and vague exploratory queries (like "show me valentine day gifts under 100 euro"). The distinction determines the search strategy, the confidence score, and whether the system asks a clarifying question.

Two Types of Queries

Specific Query

Example: "Show me gift cards" / "Näita mulle kinkekaarte"

Characteristics:

Direct product mention (gift cards, books, games)
Short and focused (2-5 words)
No occasion or recipient context
Single, clear product type
High confidence intent

System Response: Focused product search in one category

Vague Query

Example: "Show me valentine day gifts under 100 euro" / "Kingitus emadepäevaks alla 50 euro" ("Gift for Mother's Day under 50 euro")

Characteristics:

Occasion or recipient mentioned
Multiple possible product types
Budget constraints
Exploratory intent
Lower confidence

System Response: Multi-category exploration or clarifying question

Detection Flow

Specificity Scoring System

Location: app/api/chat/services/query-rewriting/index.ts:33-100

The system uses a weighted scoring system with 5 detection criteria:

function detectQuerySpecificity(query: string, context: GiftContext): boolean {
  const specificityScore = 
    (matchesSpecificPattern ? 3 : 0) +      // Weight: 3
    (hasSingleProductType ? 2 : 0) +        // Weight: 2
    (isProductSearchIntent ? 2 : 0) +       // Weight: 2
    (isShortQuery ? 1 : 0) +                // Weight: 1
    (noOccasionOrRecipient ? 1 : 0);        // Weight: 1
  
  return specificityScore >= 4; // Threshold
}

Criteria Breakdown

1. Pattern Matching (Weight: 3)

Detects direct product request patterns:

Estonian Patterns:

näita mulle [toode] → "show me [product]"
kas teil on [toode] → "do you have [product]"
soovin [toode] → "I want [product]"
otsin [toode] → "looking for [product]"

English Patterns:

show me gift cards
do you have books
looking for games

Products Detected: kinkekaart (gift card), raamat (book), mäng (game), film (movie), muusika (music)
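
The request patterns and product list above could be combined into a single regular expression along the following lines. This is a minimal sketch, not the project's actual code: the names `REQUEST_PREFIXES`, `PRODUCT_STEMS`, and `matchesSpecificPattern` are assumptions, and the Estonian entries are shortened to stems so that inflected forms such as "kinkekaarte" also match.

```typescript
// Hypothetical sketch: names and stems are illustrative, not the project's constants.
const REQUEST_PREFIXES: string[] = [
  "näita mulle", // "show me"
  "kas teil on", // "do you have"
  "soovin",      // "I want"
  "otsin",       // "looking for"
  "show me",
  "do you have",
  "looking for",
];

const PRODUCT_STEMS: string[] = [
  "kinkekaar", // kinkekaart, kinkekaardi, kinkekaarte (gift card)
  "raamat",    // raamat, raamatu, raamatuid (book)
  "mäng",      // mäng, mängu, mänge (game)
  "film",      // film, filmi (movie)
  "muusika",   // music
  "gift cards?",
  "books?",
  "games?",
  "movies?",
  "music",
];

// A request prefix must be followed directly by one of the product stems.
const SPECIFIC_PATTERN = new RegExp(
  `^\\s*(?:${REQUEST_PREFIXES.join("|")})\\s+(?:${PRODUCT_STEMS.join("|")})`,
  "i",
);

function matchesSpecificPattern(query: string): boolean {
  return SPECIFIC_PATTERN.test(query);
}
```

With this sketch, "show me gift cards" and "Näita mulle kinkekaarte" match, while "Show me valentine day gifts under 100 euro" does not, because "valentine" is not a product stem.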

2. Single Product Type (Weight: 2)

Check:

hasSingleProductType =
  Boolean(context.productType) &&
  (!context.productTypeHints || context.productTypeHints.length <= 1) &&
  context.productType !== 'Kingitused' // "Gifts" is too generic

Example:

Specific: productType: "Kinkekaart", hints: []
Vague: productType: "Kingitused", hints: ["Raamat", "Mängud", "Kodu ja aed"]

3. Product Search Intent (Weight: 2) 🔎

Check: context.intent === 'product_search'

Contrasted with:

valentines_day_gift (occasion-based)
birthday_gift (occasion-based)
general_gift (exploratory)
product_recommendation (needs suggestions)

4. Short Query (Weight: 1) 📏

Check: wordCount >= 2 && wordCount <= 5

Examples:

Short (Specific): "show me gift cards" (4 words)
Long (Vague): "I need a valentine day gift for my girlfriend under 100 euro" (12 words)
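
The word-count check can be sketched as follows. The helper names `countWords` and `isShortQuery` are assumptions, and splitting on whitespace is only one plausible way to count words; the real implementation may tokenize differently.

```typescript
// Hypothetical sketch of the short-query check (2 to 5 words inclusive).
function countWords(query: string): number {
  const trimmed = query.trim();
  // An empty or whitespace-only query has zero words.
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function isShortQuery(query: string): boolean {
  const wordCount = countWords(query);
  return wordCount >= 2 && wordCount <= 5;
}
```

A one-word query such as "books" fails the lower bound, so under this check it earns no short-query point even though it is brief.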

5. No Occasion/Recipient (Weight: 1) 🚫

Check: !context.occasion && !context.recipient

Rationale: Occasion/recipient indicates exploratory gift search, not direct product purchase.

Examples:

No Context (Specific): "show me gift cards"
Has Context (Vague): "gift for mother" (recipient: "ema")

Example Scoring

Example 1: "Show me gift cards"

Breakdown:

Matches pattern: show me gift cards (+3)
Single product type: Kinkekaart (+2)
Intent: product_search (+2)
Word count: 4 words (+1)
No occasion/recipient (+1)

Total Score: 9 ≥ 4 → SPECIFIC

Example 2: "Show me valentine day gifts under 100 euro"

Breakdown:

No pattern match: "gifts" is generic, not specific product (+0)
Multiple product types: Kingitused, hints: ["Raamat", "Kodu ja aed", "Mängud"] (+0)
Intent: valentines_day_gift (occasion-based, not product_search) (+0)
Word count: 8 words (too long) (+0)
Has occasion: valentinipäev (+0)

Total Score: 0 < 4 → VAGUE
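
The two breakdowns above can be reproduced with a small weighted sum. This is a sketch under stated assumptions: the `SpecificitySignals` interface, the `WEIGHTS` table, and the function names are illustrative, while the weights and the threshold of 4 are taken from this document.

```typescript
// Hypothetical sketch: the five boolean signals are assumed to be computed elsewhere.
interface SpecificitySignals {
  matchesSpecificPattern: boolean;
  hasSingleProductType: boolean;
  isProductSearchIntent: boolean;
  isShortQuery: boolean;
  noOccasionOrRecipient: boolean;
}

const WEIGHTS: Record<keyof SpecificitySignals, number> = {
  matchesSpecificPattern: 3,
  hasSingleProductType: 2,
  isProductSearchIntent: 2,
  isShortQuery: 1,
  noOccasionOrRecipient: 1,
};

const SPECIFICITY_THRESHOLD = 4;

// Sum the weights of every signal that is true.
function specificityScore(signals: SpecificitySignals): number {
  let score = 0;
  for (const key of Object.keys(WEIGHTS) as Array<keyof SpecificitySignals>) {
    if (signals[key]) score += WEIGHTS[key];
  }
  return score;
}

function isSpecific(signals: SpecificitySignals): boolean {
  return specificityScore(signals) >= SPECIFICITY_THRESHOLD;
}

// Example 1: every signal fires.
const giftCards: SpecificitySignals = {
  matchesSpecificPattern: true,
  hasSingleProductType: true,
  isProductSearchIntent: true,
  isShortQuery: true,
  noOccasionOrRecipient: true,
};

// Example 2: no signal fires.
const valentineGifts: SpecificitySignals = {
  matchesSpecificPattern: false,
  hasSingleProductType: false,
  isProductSearchIntent: false,
  isShortQuery: false,
  noOccasionOrRecipient: false,
};
```

Note that the threshold can be reached without a pattern match: a single product type plus product_search intent already gives 2 + 2 = 4.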

Signal-Based Confidence Scoring

Location: app/api/chat/services/context-understanding/context-normalizer.ts:59-81

The system calculates confidence from the number of meaningful signals present in the context: product signals, gift context, and budget info.

Signal Detection

Product Signals (`hasMeaningfulProductSignals`)

Location: app/api/chat/utils/context-signals.ts:59-70

Checks:

Has explicit productType (non-empty string)
Has explicit category (non-empty string)
Has productTypeHints (non-empty array)
Has categoryHints (non-empty array)
AND signals are NOT fallback defaults

Fallback Defaults (don't count as meaningful):

Product type: "Kingitused" (generic "Gifts")
Category hints: ["Kruusid", "Küünlad", "Vaasid"] (generic gift categories)

Gift Context (`isGiftContextMissing`)

Location: app/api/chat/utils/context-signals.ts:72-74

Gift context counts as present (that is, `isGiftContextMissing` returns false) if:

context.recipient || context.occasion

Examples:

Has context: recipient: "ema" (for mother)
Has context: occasion: "valentinipäev" (Valentine's Day)
Missing: No recipient or occasion

Budget Info (`isBudgetMissing`)

Location: app/api/chat/utils/context-signals.ts:76-85

Budget info counts as present (that is, `isBudgetMissing` returns false) if:

budget.min || budget.max || budget.hint

Examples:

Has budget: { max: 100 }
Has budget: { hint: "alla 50 euro" }
Missing: budget: undefined
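
The three signal checks described above might look roughly like this. It is a sketch, not the contents of context-signals.ts: the `ContextSketch` shape, the helper `isFallbackHints`, and `countSignals` are assumptions, and so is the choice to treat any one non-fallback product field as sufficient.

```typescript
// Hypothetical context shape; the real GiftContext type may differ.
interface ContextSketch {
  productType?: string;
  category?: string;
  productTypeHints?: string[];
  categoryHints?: string[];
  recipient?: string;
  occasion?: string;
  budget?: { min?: number; max?: number; hint?: string };
}

const FALLBACK_PRODUCT_TYPE = "Kingitused";
const FALLBACK_CATEGORY_HINTS = ["Kruusid", "Küünlad", "Vaasid"];

// True when the hints are exactly the generic fallback categories.
function isFallbackHints(hints: string[]): boolean {
  return (
    hints.length === FALLBACK_CATEGORY_HINTS.length &&
    hints.every((hint) => FALLBACK_CATEGORY_HINTS.indexOf(hint) !== -1)
  );
}

// Assumption: any single non-fallback product field counts as a meaningful signal.
function hasMeaningfulProductSignals(ctx: ContextSketch): boolean {
  const hasType = Boolean(ctx.productType) && ctx.productType !== FALLBACK_PRODUCT_TYPE;
  const hasCategory = Boolean(ctx.category);
  const hasTypeHints = ctx.productTypeHints !== undefined && ctx.productTypeHints.length > 0;
  const hasCategoryHints =
    ctx.categoryHints !== undefined &&
    ctx.categoryHints.length > 0 &&
    !isFallbackHints(ctx.categoryHints);
  return hasType || hasCategory || hasTypeHints || hasCategoryHints;
}

function isGiftContextMissing(ctx: ContextSketch): boolean {
  return !(ctx.recipient || ctx.occasion);
}

function isBudgetMissing(ctx: ContextSketch): boolean {
  const budget = ctx.budget;
  if (!budget) return true;
  return !(budget.min || budget.max || budget.hint);
}

// Number of signal groups present (0 to 3).
function countSignals(ctx: ContextSketch): number {
  return (
    (hasMeaningfulProductSignals(ctx) ? 1 : 0) +
    (isGiftContextMissing(ctx) ? 0 : 1) +
    (isBudgetMissing(ctx) ? 0 : 1)
  );
}
```

For example, a context with only `productType: "Kinkekaart"` yields one signal, while a context holding just the fallback product type and fallback category hints yields none.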

Search Strategy Differentiation

Specific Query Strategy

When: Specificity score ≥ 4

Approach: Focused single-category search

Example: "Show me gift cards"

Code Reference: app/api/chat/services/query-rewriting/index.ts:236-248

if (isSpecificQuery) {
  console.log('SPECIFIC QUERY DETECTED - Focus all variations on:', {
    productType: context.productType,
    query: originalQuery
  });
  // Generate variations ONLY for this product type
  // No cross-category exploration
}

Vague Query Strategy

When: Specificity score < 4 OR low confidence

Approach: Multi-category exploration or clarifying question

Example: "Show me valentine day gifts under 100 euro"

Code References:

Clarifying question logic: app/api/chat/handlers/clarifying-question-handler.ts:60-128
Multi-category generation: app/api/chat/services/query-rewriting/index.ts:250-983

🚦 Clarifying Question Flow

Trigger Conditions:

Confidence < 0.7 AND
No meaningful product signals AND
No explicit product type or category AND
Not a follow-up intent (show_more, cheaper_alternatives)
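
The four trigger conditions combine with a logical AND, which can be sketched as below. The input shape `ClarifyInput` and the function name `shouldClarifySketch` are assumptions made for illustration; the real handler takes different arguments (see Key Implementation Files).

```typescript
// Hypothetical input shape, flattened for the example.
interface ClarifyInput {
  confidence: number;
  hasMeaningfulProductSignals: boolean;
  productType?: string;
  category?: string;
  intent: string;
}

const CLARIFY_CONFIDENCE_THRESHOLD = 0.7;
const FOLLOW_UP_INTENTS = ["show_more", "cheaper_alternatives"];

// All four conditions must hold before a clarifying question is asked.
function shouldClarifySketch(input: ClarifyInput): boolean {
  const lowConfidence = input.confidence < CLARIFY_CONFIDENCE_THRESHOLD;
  const noExplicitProduct = !input.productType && !input.category;
  const isFollowUp = FOLLOW_UP_INTENTS.indexOf(input.intent) !== -1;
  return (
    lowConfidence &&
    !input.hasMeaningfulProductSignals &&
    noExplicitProduct &&
    !isFollowUp
  );
}
```

Because the conditions are conjunctive, a follow-up such as show_more never triggers a question, however low the confidence is.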

Example Clarifying Questions:

Estonian:

Millist tüüpi kingitust otsite?

Näiteks:
• Raamatud ja kirjandus
• Mängud ja pusled
• Kodu- ja aiatarbed
• Kosmeetika ja ilu

English:

What type of gift are you looking for?

For example:
• Books and literature
• Games and puzzles
• Home and garden
• Cosmetics and beauty

Complete Example Walkthroughs

Example 1: Specific Query - "Show me gift cards"

Example 2: Vague Query - "Show me valentine day gifts under 100 euro"

🔧 Key Implementation Files

1. Query Specificity Detection

File: app/api/chat/services/query-rewriting/index.ts
Lines: 33-100
Function: detectQuerySpecificity(query, context, debug)

Responsibility: Scores query specificity (0-9) and returns boolean (>= 4 is specific)

2. Context Signal Analysis

File: app/api/chat/utils/context-signals.ts
Lines: 59-85
Functions:

hasMeaningfulProductSignals(context) - checks for non-fallback product hints
isGiftContextMissing(context) - checks for recipient/occasion
isBudgetMissing(context) - checks for budget info

3. Confidence Normalization

File: app/api/chat/services/context-understanding/context-normalizer.ts
Lines: 48-240
Function: normalizeContext(context, userMessage)

Responsibility: Adjusts confidence based on signal count (0 signals → 0.3, 1 signal → 0.55, 2+ → keep original)
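
The signal-count mapping can be sketched as a small function. The name `adjustConfidence` is hypothetical, and whether the real normalizer overwrites the confidence or only caps it is not stated here; this sketch assumes it overwrites.

```typescript
// Hypothetical sketch of the mapping: 0 signals -> 0.3, 1 signal -> 0.55, 2+ -> unchanged.
function adjustConfidence(originalConfidence: number, signalCount: number): number {
  if (signalCount <= 0) return 0.3;
  if (signalCount === 1) return 0.55;
  return originalConfidence;
}
```

Under this mapping, any context with fewer than two signals ends up below the 0.7 clarifying-question threshold.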

4. Clarifying Question Handler

File: app/api/chat/handlers/clarifying-question-handler.ts
Lines: 60-128
Function: shouldAskClarifyingQuestion(giftContext, intentResult, debug)

Responsibility: Decides whether to ask clarifying question instead of searching

5. Prompt Classification Rules

File: app/api/chat/services/context-understanding/prompts.ts
Lines: 3-160 (Fast Classifier), 162-320 (Main Extractor)

Responsibility: LLM prompts with examples for classifying specific products vs vague gifts

Configuration Options

Environment Variables

# Enable/disable clarifying questions
CLARIFYING_QUESTIONS_ENABLED=true

# Enable debug logging for specificity detection
CHAT_DEBUG_LOGS=true

# Confidence threshold for clarifying questions
# Default: 0.7 (queries below this may trigger clarification)
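
As an illustration of how the two named flags might be read, here is a small sketch. The `ChatFlags` shape, the helper names, and the default values are assumptions; the project may parse its environment differently.

```typescript
// Hypothetical flag container; field names are illustrative.
interface ChatFlags {
  clarifyingQuestionsEnabled: boolean;
  debugLogs: boolean;
}

// Treat only the literal string "true" (case-insensitive) as enabled.
function parseFlag(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

// Defaults below are assumptions, not documented behavior.
function readChatFlags(env: Record<string, string | undefined>): ChatFlags {
  return {
    clarifyingQuestionsEnabled: parseFlag(env["CLARIFYING_QUESTIONS_ENABLED"], true),
    debugLogs: parseFlag(env["CHAT_DEBUG_LOGS"], false),
  };
}
```

In a Node.js process this would be called as `readChatFlags(process.env)`; passing the environment as an argument keeps the function easy to test.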

Tunable Parameters

In query-rewriting/index.ts:75:

const SPECIFICITY_THRESHOLD = 4; // Minimum score for "specific"

In context-normalizer.ts:70-78:

const CONFIDENCE_0_SIGNALS = 0.3;  // Very low confidence
const CONFIDENCE_1_SIGNAL = 0.55;  // Low confidence
const CONFIDENCE_2_SIGNALS = 0.7;  // High confidence (keeps original)

In clarifying-question-handler.ts:67:

const CLARIFY_CONFIDENCE_THRESHOLD = 0.7;

Best Practices

When Adding New Product Types

Update pattern matching in query-rewriting/index.ts:37-46:

const SPECIFIC_PATTERNS = [
  /näita.*mulle.*(?:kinkekaart|raamat|NEW_PRODUCT)/i,
  // ... add patterns
];

Add to product type constants in prompts.ts:132-144
Test specificity scoring for new patterns
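
One way to keep the pattern list and its checks in sync is to build the pattern from a product list and assert on sample queries. This is a sketch only: `escapeRegExp`, `buildShowMePattern`, and the added product "pusle" (puzzle) are hypothetical and stand in for NEW_PRODUCT.

```typescript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a "näita mulle [toode]" pattern from a list of product names.
function buildShowMePattern(products: string[]): RegExp {
  const alternatives = products.map(escapeRegExp).join("|");
  return new RegExp(`näita.*mulle.*(?:${alternatives})`, "i");
}

// "pusle" is a hypothetical new product type used only for this example.
const showMePattern = buildShowMePattern(["kinkekaart", "raamat", "pusle"]);

// Quick checks to run after adding a product.
const shouldMatch = ["Näita mulle pusle", "näita mulle kinkekaarte"];
const shouldNotMatch = ["Näita mulle kingitusi"];

const allPass =
  shouldMatch.every((query) => showMePattern.test(query)) &&
  shouldNotMatch.every((query) => !showMePattern.test(query));
```

Keeping a negative sample such as the generic "kingitusi" (gifts) in the checks helps catch patterns that are too broad.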

When Tuning Confidence Thresholds

To classify more queries as specific (fewer clarifying questions):

Lower CLARIFY_CONFIDENCE_THRESHOLD from 0.7 to 0.6
Lower SPECIFICITY_THRESHOLD from 4 to 3

To be stricter about what counts as specific (more clarifying questions):

Raise CLARIFY_CONFIDENCE_THRESHOLD from 0.7 to 0.8
Raise SPECIFICITY_THRESHOLD from 4 to 5

🐛 Debugging Tips

Enable Debug Logging

export CHAT_DEBUG_LOGS=true

Output includes:

QUERY SPECIFICITY ANALYSIS: {
  query: "show me gift cards",
  score: 9,
  isSpecific: true,
  signals: {
    matchesPattern: true,
    singleProductType: true,
    productSearchIntent: true,
    shortQuery: true,
    noOccasionRecipient: true
  }
}

Check Signal Counts

Look for:

SIGNAL COUNT: {
  productSignals: true,
  giftContext: false,
  budget: false,
  totalSignals: 1,
  confidence: 0.55
}

Related Documentation

Fast Classifier - Pre-classification before context extraction
Context Extraction - Full semantic context extraction
Intent Classification - Intent taxonomy and routing
Gift Context and Follow-up System - Gift-specific context handling
