
Fast Classifier

Purpose: Ultra-low-latency intent pre-classifier that short-circuits simple flows (show more, greetings, budget tweaks) before the full semantic extractor runs.

Performance: ~200-300ms, substantially faster than the full semantic extraction
Success Rate: 60-70% of queries can use the fast path


Overview

The Fast Classifier is a performance optimization that quickly classifies common query types using a lightweight LLM call, allowing the system to skip the heavier enhanced semantic extraction for simple, unambiguous requests.

Quick Facts

  • Model: meta-llama/llama-4-scout-17b-16e-instruct (Groq)
  • Latency: ~200-300ms (substantially faster than full extraction)
  • Confidence Threshold: 0.1 minimum
  • Fast-Path Intents: show_more_products, greeting, question, cheaper_alternatives, occasion gifts
  • Skip Logic: Automatically bypassed for author queries and pronouns

Architecture Flow


How It Works

Step-by-Step Process


Execution Modes

Sequential Mode (Default)

Flow:

  1. Fast classifier runs first
  2. If the result is fast-path eligible → immediate return (enhanced extraction never runs)
  3. If not → falls through to enhanced extraction (see the sketch below)
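Below is a minimal TypeScript sketch of this flow. The helper names runFastClassifier and runEnhancedExtraction and their signatures are assumptions for illustration; createContextFromClassifier and the config exports are documented later on this page, and ClassifierResult / GiftContext are the types described below.

import {
  FAST_CLASSIFIER_FAST_PATH_INTENTS,
  FAST_CLASSIFIER_MIN_CONFIDENCE,
} from './config';

// Hypothetical stand-ins for the real functions in index.ts
declare function runFastClassifier(query: string): Promise<ClassifierResult>;
declare function runEnhancedExtraction(query: string): Promise<GiftContext>;
declare function createContextFromClassifier(query: string, result: ClassifierResult): GiftContext;

// True when the classifier result can be returned without enhanced extraction
function isFastPathEligible(result: ClassifierResult): boolean {
  return (
    result.intent !== undefined &&
    (result.confidence ?? 0) >= FAST_CLASSIFIER_MIN_CONFIDENCE &&
    FAST_CLASSIFIER_FAST_PATH_INTENTS.has(result.intent)
  );
}

async function extractContextSequential(query: string): Promise<GiftContext> {
  const result = await runFastClassifier(query);        // 1. classifier runs first
  if (isFastPathEligible(result)) {
    return createContextFromClassifier(query, result);  // 2. immediate return
  }
  return runEnhancedExtraction(query);                   // 3. fall through
}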

Parallel Mode (Feature Flag)

Enable: PARALLEL_CONTEXT_EXTRACTION_ENABLED=true

Flow:

  1. Fast classifier gets a 200ms head start
  2. Enhanced extraction starts after the delay
  3. First acceptable result wins
  4. No wasted wait when the classifier result is not usable, since enhanced extraction is already running (see the sketch below)

Configuration:

  • PARALLEL_CLASSIFIER_HEADSTART_MS (default: 200ms)
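A sketch of the parallel race, reusing isFastPathEligible and the hypothetical helpers from the sequential sketch above. The real implementation in index.ts may handle cancellation and telemetry differently; this only illustrates the head-start-and-race idea.

async function extractContextParallel(query: string): Promise<GiftContext> {
  const headStartMs = Number(process.env.PARALLEL_CLASSIFIER_HEADSTART_MS ?? 200);

  // Classifier branch: rejects when its result is not fast-path eligible
  const fastPath = runFastClassifier(query).then((result) => {
    if (!isFastPathEligible(result)) throw new Error('not fast-path eligible');
    return createContextFromClassifier(query, result);
  });

  // Enhanced branch: starts only after the head-start delay
  const enhanced = new Promise<void>((resolve) => setTimeout(resolve, headStartMs))
    .then(() => runEnhancedExtraction(query));

  // First fulfilled branch wins; if the classifier branch rejects, enhanced wins
  return Promise.any([fastPath, enhanced]);
}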

Fast-Path Intents

These intents can be returned immediately after fast classification:

  • show_more_products
  • greeting
  • question
  • cheaper_alternatives
  • occasion-based gift intents

Not fast-path (require enhanced extraction):

  • product_search
  • author_search
  • product_inquiry
  • category_search

Routing & Skip Logic

When Classifier is Skipped

The fast classifier is automatically bypassed for queries that need deep context understanding:

Skip Patterns

Explicit Author Names:

/\b[A-ZÕÄÖÜõäöü][a-zõäöü]+(?:\s+[A-ZÕÄÖÜõäöü][a-zõäöü.]+)*?(?:lt|i\s+teosed|i\s+raamat)\b|by\s+[A-Z]|'s\s+books|from\s+[A-Z]/i

Examples:

  • "raamatuid Tolkienilt"
  • "Andrus Kivirähkilt" (with diacritics)
  • "books by Agatha Christie"
  • "Stephen King's books"

Author Pronouns:

/\b(tema|teda|temalt|selle\s+autori|sama\s+autorilt|that\s+author|...)\b/i

Examples:

  • "näita veel tema teoseid"
  • "sama autorilt"
  • "selle autori raamatuid"

Why skip for authors? Author queries require conversation state and pronoun resolution, which the fast classifier doesn't have access to. Skipping ensures these queries get the full enhanced LLM treatment with 9 few-shot examples.
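As a quick standalone illustration (not the production routing code), the pronoun pattern above, with the elided alternatives omitted, matches each of the listed examples:

// Subset of the author-pronoun pattern shown above
const authorPronounPattern = /\b(tema|teda|temalt|selle\s+autori|sama\s+autorilt|that\s+author)\b/i;

const examples = ['näita veel tema teoseid', 'sama autorilt', 'selle autori raamatuid'];
for (const q of examples) {
  console.log(q, authorPronounPattern.test(q)); // true for all three, so the classifier is skipped
}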


Data Flow


ClassifierResult Interface

interface ClassifierResult {
  intent?: GiftContext['intent'];  // Classified intent
  occasion?: string | null;        // Birthday, Christmas, etc.
  recipient?: string | null;       // Friend, mom, teacher, etc.
  budgetMin?: number | null;       // Minimum budget
  budgetMax?: number | null;       // Maximum budget
  budgetHint?: string | null;      // Budget description
  confidence?: number;             // 0.0 - 1.0
  isPopularQuery?: boolean;        // Trending/bestseller flag
  durationMs: number;              // Execution time
}

Note: No authorName field - this is why author queries are routed to enhanced extraction instead.


Configuration

File Locations

  • Entry point: app/api/chat/services/context-understanding/index.ts:220-255
  • Classifier: app/api/chat/services/context-understanding/fast-classifier.ts
  • Prompt: app/api/chat/services/context-understanding/prompts.ts:3-160
  • Config: app/api/chat/services/context-understanding/config.ts

Key Settings

// Model
export const FAST_CLASSIFIER_MODEL =
  process.env.FAST_CLASSIFIER_MODEL || 'meta-llama/llama-4-scout-17b-16e-instruct';

// Timeout
export const FAST_CLASSIFIER_TIMEOUT_MS =
  Number(process.env.CONTEXT_CLASSIFIER_TIMEOUT_MS ?? 4000);

// Minimum confidence
export const FAST_CLASSIFIER_MIN_CONFIDENCE = 0.1;

// Fast-path intents
export const FAST_CLASSIFIER_FAST_PATH_INTENTS = new Set([
  'show_more_products',
  'greeting',
  'question',
  'cheaper_alternatives',
  // ... occasion gifts ...
]);
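One plausible way FAST_CLASSIFIER_TIMEOUT_MS is enforced is to race the LLM call against a timer. Below is a minimal sketch under that assumption; the real code in fast-classifier.ts may use an AbortController or the client's own timeout option instead, and classifyWithLlm is a hypothetical name for the classifier call.

// Rejects if the wrapped promise does not settle within `ms`
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`fast classifier timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Usage: a timed-out or failed classification simply falls through to enhanced extraction
// const result = await withTimeout(classifyWithLlm(query), FAST_CLASSIFIER_TIMEOUT_MS);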

Decision Tree


Example Scenarios

Scenario 1: Fast-Path Success

Query: "näita rohkem"

Routing check: No author signals

Fast classifier runs: fast

Result: {
intent: "show_more_products",
confidence: 0.85
}

Check: show_more_products in fast-path? YES ✓

Return immediately (saved fast!)

Total: fast vs moderate

Scenario 2: Skip for Author Query

Query: "raamatuid Tolkienilt"

Routing check: hasAuthorPattern = TRUE

Skip fast classifier

Enhanced extraction runs: moderate

Result: {
intent: "author_search",
authorName: "Tolkien",
confidence: 0.55
}

Total: moderate
Benefit: Correct author extraction!

Scenario 3: Classifier Falls Through

Query: "Tahan kingitust sünnipäevaks"

Routing check: No skip signals

Fast classifier runs: fast

Result: {
intent: "product_search",
confidence: 0.6
}

Check: product_search in fast-path? NO ✗

Fall through to enhanced extraction: moderate

Total: ~750ms (classifier + enhanced)
Note: Telemetry collected, not wasted

Performance Comparison

Performance Gains:

  • Fast-path: classifier only (~200-300ms total)
  • Enhanced-only: full extraction latency
  • Fallback (both): ~770ms total

Routing Logic Details

Pattern Detection (Lightweight Regex)

// Pattern 1: Explicit author names (with Estonian diacritics)
const hasAuthorPattern = /\b[A-ZÕÄÖÜõäöü][a-zõäöü]+...(?:lt|i\s+teosed|i\s+raamat)\b|by\s+[A-Z]|'s\s+books/i.test(query);

// Pattern 2: Author pronouns
const hasAuthorPronoun = /\b(tema|selle\s+autori|sama\s+autorilt|that\s+author|...)\b/i.test(query);

// Pattern 3: Check conversation state
const hasAuthorContext =
  Boolean(conversationState?.primaryAuthor) || (conversationState?.authors?.length ?? 0) > 0;

// Decision: skip when the query itself carries an author signal
const skipClassifier = hasAuthorPattern || hasAuthorPronoun;

Cost: ~0.2ms (negligible)


Prompt Structure

Fast Classifier Prompt (Simplified)

export const FAST_CLASSIFIER_PROMPT = `
Sa oled kiire klassifitseerija.
Tagasta JSON:
{
"intent": "...",
"occasion": "...",
"recipient": "...",
"budgetMin": number,
"budgetMax": number,
"confidence": 0-1,
"isPopularQuery": boolean
}

REEGLID:
- Autori päringud → intent: "author_search"
- Kingitused + saaja → intent varies
- "näita rohkem" → intent: "show_more_products"
...
`;

(Translation of the Estonian prompt: "You are a fast classifier. Return JSON: { intent, occasion, recipient, budgetMin, budgetMax, confidence, isPopularQuery }. RULES: author queries → intent: "author_search"; gifts + recipient → intent varies; "näita rohkem" ("show more") → intent: "show_more_products".)

Features:

  • No conversation state (keeps it fast)
  • Last 2 user turns for context
  • Focused on quick classification
  • Explicit author detection rules
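Putting those features together, the classification call might be assembled roughly as follows. callGroqChatCompletion is a hypothetical wrapper for the Groq chat-completions request (the real client code lives in fast-classifier.ts and is not reproduced here), and ClassifierResult is the interface shown earlier.

import { FAST_CLASSIFIER_MODEL, FAST_CLASSIFIER_TIMEOUT_MS } from './config';
import { FAST_CLASSIFIER_PROMPT } from './prompts';

// Hypothetical thin wrapper around the Groq chat-completions call
declare function callGroqChatCompletion(opts: {
  model: string;
  messages: { role: 'system' | 'user'; content: string }[];
  timeoutMs: number;
}): Promise<string>;

async function classify(userTurns: string[]): Promise<ClassifierResult> {
  const started = Date.now();
  const raw = await callGroqChatCompletion({
    model: FAST_CLASSIFIER_MODEL,
    messages: [
      { role: 'system', content: FAST_CLASSIFIER_PROMPT },
      // Only the last 2 user turns are sent; omitting conversation state keeps the call fast
      ...userTurns.slice(-2).map((content) => ({ role: 'user' as const, content })),
    ],
    timeoutMs: FAST_CLASSIFIER_TIMEOUT_MS,
  });
  // The prompt asks for JSON, so the response is parsed directly
  const parsed = JSON.parse(raw) as Omit<ClassifierResult, 'durationMs'>;
  return { ...parsed, durationMs: Date.now() - started };
}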

createContextFromClassifier

Converts lightweight ClassifierResult into full GiftContext:

Fields mapped:

  • intent, occasion, recipient
  • budget (min, max, hint)
  • confidence, isPopularQuery

Not mapped: authorName (the field does not exist in ClassifierResult), which is why author queries must go through enhanced extraction. A sketch of the conversion follows the next list.

Adds automatically:

  • language (detected from message)
  • ageGroup (deterministic signals)
  • meta flags (classifierUsed, duration, etc.)
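A hedged sketch of the shape of that conversion; detectLanguage and detectAgeGroup are hypothetical helper names, and the real GiftContext has more fields than shown here.

declare function detectLanguage(message: string): string;
declare function detectAgeGroup(message: string): string | null;

function createContextFromClassifier(message: string, result: ClassifierResult): GiftContext {
  return {
    intent: result.intent,
    occasion: result.occasion ?? null,
    recipient: result.recipient ?? null,
    budget: { min: result.budgetMin ?? null, max: result.budgetMax ?? null, hint: result.budgetHint ?? null },
    confidence: result.confidence ?? 0,
    isPopularQuery: result.isPopularQuery ?? false,
    // authorName is never set here: ClassifierResult has no such field
    language: detectLanguage(message),   // detected from the message
    ageGroup: detectAgeGroup(message),   // deterministic signals
    meta: {
      classifierUsed: true,
      classifierConfidence: result.confidence ?? 0,
      classifierDurationMs: result.durationMs,
    },
  } as GiftContext;
}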

Observability

Debug Logs

Enable with: CHAT_DEBUG_LOGS=true

// Always logged (unconditional)
FAST CLASSIFIER CALLED: { query, timestamp }

// Debug mode
FAST CLASSIFIER RESULT: { intent, confidence, duration }
SKIPPING FAST CLASSIFIER: { reason, hasAuthorPattern, hasAuthorPronoun }
FAST CLASSIFIER FAST-PATH: { intent, confidence }

Meta Flags

Every context includes telemetry:

context.meta = {
  classifierUsed: boolean,        // true if classifier ran
  classifierConfidence: number,   // classifier confidence score
  classifierDurationMs: number,   // how long it took
  fallbackTriggered: boolean,     // true if enhanced ran after classifier failed
  extractionDurationMs: number,   // total extraction time
  parallelMode: boolean           // true if parallel mode used
}
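These flags are enough to compute the fast-path hit rate quoted at the top of this page; a minimal sketch, assuming the meta object has exactly the shape shown above:

type ExtractionMeta = {
  classifierUsed: boolean;
  fallbackTriggered: boolean;
  extractionDurationMs: number;
};

// Fraction of requests answered by the fast classifier alone (no enhanced fallback)
function fastPathRate(metas: ExtractionMeta[]): number {
  if (metas.length === 0) return 0;
  const fastPath = metas.filter((m) => m.classifierUsed && !m.fallbackTriggered).length;
  return fastPath / metas.length;
}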

Configuration Guide

Disabling Fast Classifier

# Disable completely (always use enhanced extraction)
export CONTEXT_CLASSIFIER_DISABLED=true

When to disable:

  • Debugging extraction issues
  • Need highest accuracy
  • Author queries not being detected
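A minimal sketch of how the flag can gate the classifier; the real guard lives in index.ts, and hasAuthorSignals stands in for the routing regexes described earlier.

declare function hasAuthorSignals(query: string): boolean;

function shouldRunFastClassifier(query: string): boolean {
  // Kill switch: always use enhanced extraction when disabled
  if (process.env.CONTEXT_CLASSIFIER_DISABLED === 'true') return false;
  // Routing skip logic: author names and pronouns bypass the classifier
  return !hasAuthorSignals(query);
}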

Adjusting Thresholds

// In config.ts: raise for a stricter fast-path (default: 0.1)
export const FAST_CLASSIFIER_MIN_CONFIDENCE = 0.3;

# Environment variable: shorter timeout for lower latency (default: 4000)
export CONTEXT_CLASSIFIER_TIMEOUT_MS=2000

Adding Fast-Path Intents

// In config.ts
export const FAST_CLASSIFIER_FAST_PATH_INTENTS = new Set([
'show_more_products',
'greeting',
// Add new intent here
'my_new_fast_intent',
]);

Criteria for fast-path:

  • Simple, unambiguous classification
  • No deep context needed
  • No conversation state required
  • High confidence from classifier

Known Limitations

1. No Conversation State

Fast classifier doesn't have access to:

  • Previous authors mentioned
  • Previously shown products
  • Conversation history beyond last 2 turns

Solution: Skip classifier for queries needing this context (authors, pronouns)


2. No authorName Support

ClassifierResult doesn't have authorName field.

Impact: Can't extract author names in fast-path
Solution: Author queries always skip to enhanced extraction


3. show_more_products Can Hijack Pronouns

Problem:

Query: "näita veel tema teoseid"
Classifier sees: "näita veel" → show_more_products
Fast-path returns → Misses "tema" pronoun

Solution: Routing logic skips classifier when pronouns detected


Best Practices

DO

  1. Use for simple, unambiguous queries

    • Greetings, show more, budget changes
    • Occasion-based gift requests
  2. Update prompt when taxonomy changes

    • New product types
    • New recipients or occasions
  3. Monitor classifier confidence

    • Low confidence suggests prompt needs improvement
  4. Keep fast-path intents minimal

    • Only add intents that truly don't need context

DON'T

  1. Add author_search to fast-path

    • Needs conversation state for pronouns
    • Needs enhanced examples for extraction
  2. Remove routing skip logic

    • Critical for preventing hijacking
    • Ensures correct queries get enhanced treatment
  3. Set confidence too high

    • Will cause most queries to fall through
    • Defeats the performance benefit

Troubleshooting

Issue: Classifier always timing out

Check:

  • CONTEXT_CLASSIFIER_TIMEOUT_MS value (default: 4000ms)
  • Groq API connectivity
  • Model availability

Fix:

export CONTEXT_CLASSIFIER_TIMEOUT_MS=6000

Issue: Author queries still returning wrong results

Check:

  • Are patterns being detected? Look for SKIPPING FAST CLASSIFIER in logs
  • Is enhanced extraction running?
  • Check CONTEXT_CLASSIFIER_DISABLED

Debug:

export CHAT_DEBUG_LOGS=true
# Look for skip reasons and routing decisions

Issue: Performance regression

Check:

  • Is classifier being skipped too often?
  • Are too many queries falling through to enhanced?
  • Check fast-path intent list coverage

Metrics:

// In logs, check:
classifierUsed: true/false
fallbackTriggered: true/false
extractionDurationMs: number

File Reference

File | Purpose | Lines
index.ts | Routing, skip logic, fast-path return | 220-255, 257-269
fast-classifier.ts | LLM call, parsing, result creation | 27-178
prompts.ts | Fast classifier prompt | 3-160
config.ts | Thresholds, fast-path intents | 14-45

Summary

The Fast Classifier is a performance optimization that:

  • Skips enhanced extraction entirely for 60-70% of queries
  • Maintains accuracy through confidence thresholds
  • Intelligently skips when deep context needed
  • Provides telemetry for monitoring

Key insight: speed matters, but correctness matters more. Routing logic ensures complex queries (authors, pronouns) get the full enhanced treatment they need.