Fast Classifier
Purpose: Ultra-low-latency intent pre-classifier that short-circuits simple flows (show more, greetings, budget tweaks) before the full semantic extractor runs.
Performance: ~200-300ms, vs. moderate latency for full extraction
Success Rate: 60-70% of queries can use fast-path
Overview
The Fast Classifier is a performance optimization that quickly classifies common query types using a lightweight LLM call, allowing the system to skip the heavier enhanced semantic extraction for simple, unambiguous requests.
Quick Facts
- Model: meta-llama/llama-4-scout-17b-16e-instruct (Groq)
- Latency: ~200-300ms (vs moderate latency for full extraction)
- Confidence Threshold: 0.1 minimum
- Fast-Path Intents: show_more_products, greeting, question, cheaper_alternatives, occasion gifts
- Skip Logic: Automatically bypassed for author queries and pronouns
Architecture Flow
How It Works
Step-by-Step Process
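At a high level, the routing wraps the classifier call as in the minimal TypeScript sketch below. The helper names injected via deps (shouldSkipClassifier, runFastClassifier, runEnhancedExtraction, createContextFromClassifier) are illustrative stand-ins for the real exports in index.ts, and the types are simplified:

// Minimal sketch of the sequential routing flow; names and types are simplified.
type ClassifierResult = { intent?: string; confidence?: number; durationMs: number };
type GiftContext = { intent?: string; [key: string]: unknown };

const FAST_PATH_INTENTS = new Set(['show_more_products', 'greeting', 'question', 'cheaper_alternatives']);
const MIN_CONFIDENCE = 0.1;

interface Deps {
  shouldSkipClassifier: (message: string) => boolean;                 // cheap regex routing check
  runFastClassifier: (message: string) => Promise<ClassifierResult>;
  runEnhancedExtraction: (message: string) => Promise<GiftContext>;
  createContextFromClassifier: (r: ClassifierResult, message: string) => GiftContext;
}

async function understandContext(message: string, deps: Deps): Promise<GiftContext> {
  // 1. Routing check: author names / pronouns bypass the classifier entirely.
  if (deps.shouldSkipClassifier(message)) {
    return deps.runEnhancedExtraction(message);
  }
  // 2. Lightweight LLM classification (~200-300ms on Groq).
  let result: ClassifierResult | null = null;
  try {
    result = await deps.runFastClassifier(message);
  } catch {
    result = null; // timeout or API failure → treat as not eligible
  }
  // 3. Fast-path return for simple intents with acceptable confidence.
  if (result?.intent && FAST_PATH_INTENTS.has(result.intent) && (result.confidence ?? 0) >= MIN_CONFIDENCE) {
    return deps.createContextFromClassifier(result, message);
  }
  // 4. Otherwise fall through to the full enhanced extraction.
  return deps.runEnhancedExtraction(message);
}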
Execution Modes
Sequential Mode (Default)
Flow:
- Fast classifier runs first (fast)
- If eligible → immediate return (enhanced extraction skipped)
- If not → falls through to enhanced extraction
Parallel Mode (Feature Flag)
Enable: PARALLEL_CONTEXT_EXTRACTION_ENABLED=true
Flow:
- Fast classifier gets 200ms head start
- Enhanced extraction starts after delay
- First acceptable result wins
- Saves time when classifier fails (no wasted wait)
Configuration:
PARALLEL_CLASSIFIER_HEADSTART_MS (default: 200ms)
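The head-start race can be sketched as follows, reusing the illustrative Deps type and constants from the sequential sketch above; again, this is the shape of the logic, not the real implementation:

// Sketch of parallel mode: classifier starts immediately, enhanced extraction after the head start.
const HEADSTART_MS = Number(process.env.PARALLEL_CLASSIFIER_HEADSTART_MS ?? 200);
const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function understandContextParallel(message: string, deps: Deps): Promise<GiftContext> {
  // Classifier result maps to a context only if it is fast-path eligible; errors count as ineligible.
  const classifierPromise = deps
    .runFastClassifier(message)
    .then((r) =>
      r.intent && FAST_PATH_INTENTS.has(r.intent) && (r.confidence ?? 0) >= MIN_CONFIDENCE
        ? deps.createContextFromClassifier(r, message)
        : null,
    )
    .catch(() => null);

  // Enhanced extraction only starts once the head start has elapsed.
  const enhancedPromise = delay(HEADSTART_MS).then(() => deps.runEnhancedExtraction(message));

  // First acceptable result wins.
  const first = await Promise.race([classifierPromise, enhancedPromise]);
  if (first) return first;   // classifier was eligible, or enhanced finished first
  return enhancedPromise;    // classifier ineligible → wait for enhanced (already running, no wasted wait)
}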
Fast-Path Intents
These intents can be returned immediately after fast classification:
Not fast-path (require enhanced extraction):
product_search, author_search, product_inquiry, category_search
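Eligibility is simply set membership against the config (plus the confidence floor), for example:

// Using the exported set from config.ts (see Key Settings below):
FAST_CLASSIFIER_FAST_PATH_INTENTS.has('show_more_products'); // true  → can return immediately
FAST_CLASSIFIER_FAST_PATH_INTENTS.has('author_search');      // false → falls through to enhanced extraction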
Routing & Skip Logic
When Classifier is Skipped
The fast classifier is automatically bypassed for queries that need deep context understanding:
Skip Patterns
Explicit Author Names:
/\b[A-ZÕÄÖÜõäöü][a-zõäöü]+(?:\s+[A-ZÕÄÖÜõäöü][a-zõäöü.]+)*?(?:lt|i\s+teosed|i\s+raamat)\b|by\s+[A-Z]|'s\s+books|from\s+[A-Z]/i
Examples:
- "raamatuid Tolkienilt"
- "Andrus Kivirähkilt" (with diacritics)
- "books by Agatha Christie"
- "Stephen King's books"
Author Pronouns:
/\b(tema|teda|temalt|selle\s+autori|sama\s+autorilt|that\s+author|...)\b/i
Examples:
- "näita veel tema teoseid"
- "sama autorilt"
- "selle autori raamatuid"
Why skip for authors? Author queries require conversation state and pronoun resolution, which the fast classifier doesn't have access to. Skipping ensures these queries get the full enhanced LLM treatment with 9 few-shot examples.
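As a quick sanity check, the explicit-author pattern above can be run directly against the example queries (the pronoun pattern is shown abbreviated in this doc, so only the name pattern is exercised here):

// Exercising the documented author-name pattern against the sample queries.
const authorNamePattern =
  /\b[A-ZÕÄÖÜõäöü][a-zõäöü]+(?:\s+[A-ZÕÄÖÜõäöü][a-zõäöü.]+)*?(?:lt|i\s+teosed|i\s+raamat)\b|by\s+[A-Z]|'s\s+books|from\s+[A-Z]/i;

const samples: Array<[string, boolean]> = [
  ['raamatuid Tolkienilt', true],      // Estonian "-lt" case ending → skip classifier
  ['books by Agatha Christie', true],  // "by <Name>" → skip classifier
  ["Stephen King's books", true],      // "'s books" → skip classifier
  ['näita rohkem', false],             // no author signal → classifier may run
];

for (const [query, expected] of samples) {
  console.assert(authorNamePattern.test(query) === expected, query);
}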
Data Flow
ClassifierResult Interface
interface ClassifierResult {
intent?: GiftContext['intent']; // Classified intent
occasion?: string | null; // Birthday, Christmas, etc.
recipient?: string | null; // Friend, mom, teacher, etc.
budgetMin?: number | null; // Minimum budget
budgetMax?: number | null; // Maximum budget
budgetHint?: string | null; // Budget description
confidence?: number; // 0.0 - 1.0
isPopularQuery?: boolean; // Trending/bestseller flag
durationMs: number; // Execution time
}
Note: No authorName field - this is why author queries are routed to enhanced extraction instead.
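For example, a typical fast-path result for "näita rohkem" (values mirror Scenario 1 below; durationMs is illustrative) looks like:

const example: ClassifierResult = {
  intent: 'show_more_products',
  occasion: null,
  recipient: null,
  budgetMin: null,
  budgetMax: null,
  budgetHint: null,
  confidence: 0.85,
  isPopularQuery: false,
  durationMs: 240, // illustrative; real calls land in the ~200-300ms band
};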
Configuration
File Locations
- Entry point: app/api/chat/services/context-understanding/index.ts:220-255
- Classifier: app/api/chat/services/context-understanding/fast-classifier.ts
- Prompt: app/api/chat/services/context-understanding/prompts.ts:3-160
- Config: app/api/chat/services/context-understanding/config.ts
Key Settings
// Model
export const FAST_CLASSIFIER_MODEL =
process.env.FAST_CLASSIFIER_MODEL || 'meta-llama/llama-4-scout-17b-16e-instruct';
// Timeout
export const FAST_CLASSIFIER_TIMEOUT_MS =
Number(process.env.CONTEXT_CLASSIFIER_TIMEOUT_MS ?? 4000);
// Minimum confidence
export const FAST_CLASSIFIER_MIN_CONFIDENCE = 0.1;
// Fast-path intents
export const FAST_CLASSIFIER_FAST_PATH_INTENTS = new Set([
'show_more_products',
'greeting',
'question',
'cheaper_alternatives',
// ... occasion gifts ...
]);
Decision Tree
Example Scenarios
Scenario 1: Fast-Path Success
Query: "näita rohkem"
↓
Routing check: No author signals
↓
Fast classifier runs: fast
↓
Result: {
intent: "show_more_products",
confidence: 0.85
}
↓
Check: show_more_products in fast-path? YES ✓
↓
Return immediately (enhanced extraction skipped)
↓
Total: fast vs moderate
Scenario 2: Skip for Author Query
Query: "raamatuid Tolkienilt"
↓
Routing check: hasAuthorPattern = TRUE
↓
Skip fast classifier
↓
Enhanced extraction runs: moderate
↓
Result: {
intent: "author_search",
authorName: "Tolkien",
confidence: 0.55
}
↓
Total: moderate
Benefit: Correct author extraction!
Scenario 3: Classifier Falls Through
Query: "Tahan kingitust sünnipäevaks"
↓
Routing check: No skip signals
↓
Fast classifier runs: fast
↓
Result: {
intent: "product_search",
confidence: 0.6
}
↓
Check: product_search in fast-path? NO ✗
↓
Fall through to enhanced extraction: moderate
↓
Total: ~750ms (classifier + enhanced)
Note: Telemetry collected, not wasted
Performance Comparison
Performance Gains:
- Fast-path: fast total
- Enhanced-only: moderate total
- Fallback (both): ~770ms total
Routing Logic Details
Pattern Detection (Lightweight Regex)
// Pattern 1: Explicit author names (with Estonian diacritics)
const hasAuthorPattern = /\b[A-ZÕÄÖÜõäöü][a-zõäöü]+...(?:lt|i\s+teosed|i\s+raamat)\b|by\s+[A-Z]|'s\s+books/i.test(message);
// Pattern 2: Author pronouns
const hasAuthorPronoun = /\b(tema|selle\s+autori|sama\s+autorilt|that\s+author|...)\b/i.test(message);
// Pattern 3: Check conversation state
const hasAuthorContext = Boolean(conversationState?.primaryAuthor) || (conversationState?.authors?.length ?? 0) > 0;
// Decision: Skip if ANY author signal present
const skipClassifier = hasAuthorPattern || hasAuthorPronoun;
Cost: ~0.2ms (negligible)
Prompt Structure
Fast Classifier Prompt (Simplified)
export const FAST_CLASSIFIER_PROMPT = `
Sa oled kiire klassifitseerija.
Tagasta JSON:
{
"intent": "...",
"occasion": "...",
"recipient": "...",
"budgetMin": number,
"budgetMax": number,
"confidence": 0-1,
"isPopularQuery": boolean
}
REEGLID:
- Autori päringud → intent: "author_search"
- Kingitused + saaja → intent varies
- "näita rohkem" → intent: "show_more_products"
...
`;
Features:
- No conversation state (keeps it fast)
- Last 2 user turns for context
- Focused on quick classification
- Explicit author detection rules
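Putting the pieces together, the classifier call itself is roughly shaped like the sketch below (reusing the ClassifierResult interface from the Data Flow section). This assumes Groq's OpenAI-compatible REST endpoint and JSON response mode; the actual fast-classifier.ts may use an SDK and handle errors and parsing more defensively:

async function classifyFast(systemPrompt: string, lastUserTurns: string[]): Promise<ClassifierResult> {
  const started = Date.now();
  const response = await fetch('https://api.groq.com/openai/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.GROQ_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: process.env.FAST_CLASSIFIER_MODEL || 'meta-llama/llama-4-scout-17b-16e-instruct',
      temperature: 0,
      response_format: { type: 'json_object' },          // ask for strict JSON back
      messages: [
        { role: 'system', content: systemPrompt },        // FAST_CLASSIFIER_PROMPT
        ...lastUserTurns.slice(-2).map((content) => ({ role: 'user', content })), // last 2 user turns
      ],
    }),
    // Enforce the classifier timeout (CONTEXT_CLASSIFIER_TIMEOUT_MS, default 4000ms).
    signal: AbortSignal.timeout(Number(process.env.CONTEXT_CLASSIFIER_TIMEOUT_MS ?? 4000)),
  });
  const data = await response.json();
  const parsed = JSON.parse(data.choices[0].message.content);
  return { ...parsed, durationMs: Date.now() - started };
}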
createContextFromClassifier
Converts lightweight ClassifierResult into full GiftContext:
Fields mapped:
- intent, occasion, recipient
- budget (min, max, hint)
- confidence, isPopularQuery
- authorName is not mapped (ClassifierResult has no such field)
Adds automatically:
- language (detected from message)
- ageGroup (deterministic signals)
- meta flags (classifierUsed, duration, etc.)
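A simplified sketch of that conversion is shown below. Field names follow the ClassifierResult interface and the mapping list above; language detection is reduced to a crude illustrative check, and the deterministic age-group signals are omitted:

function createContextFromClassifier(result: ClassifierResult, message: string) {
  return {
    // Mapped 1:1 from the classifier result
    intent: result.intent,
    occasion: result.occasion ?? null,
    recipient: result.recipient ?? null,
    budgetMin: result.budgetMin ?? null,
    budgetMax: result.budgetMax ?? null,
    budgetHint: result.budgetHint ?? null,
    confidence: result.confidence ?? 0,
    isPopularQuery: result.isPopularQuery ?? false,
    // authorName is intentionally absent: ClassifierResult has no such field.
    // Added automatically
    language: /[õäöü]/i.test(message) ? 'et' : 'en', // crude illustrative stand-in for real detection
    ageGroup: null,                                   // deterministic age-group detection omitted here
    meta: {
      classifierUsed: true,
      classifierConfidence: result.confidence ?? 0,
      classifierDurationMs: result.durationMs,
    },
  };
}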
Observability
Debug Logs
Enable with: CHAT_DEBUG_LOGS=true
// Always logged (unconditional)
FAST CLASSIFIER CALLED: { query, timestamp }
// Debug mode
FAST CLASSIFIER RESULT: { intent, confidence, duration }
SKIPPING FAST CLASSIFIER: { reason, hasAuthorPattern, hasAuthorPronoun }
FAST CLASSIFIER FAST-PATH: { intent, confidence }
Meta Flags
Every context includes telemetry:
context.meta = {
classifierUsed: boolean, // true if classifier ran
classifierConfidence: number, // classifier confidence score
classifierDurationMs: number, // how long it took
fallbackTriggered: boolean, // true if enhanced ran after classifier failed
extractionDurationMs: number, // total extraction time
parallelMode: boolean // true if parallel mode used
}
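These flags are what make fast-path performance measurable in aggregate. As a hedged example, the fast-path hit rate could be computed from collected contexts like this:

// A context counts as a fast-path hit when the classifier ran and no fallback was needed.
type Meta = { classifierUsed?: boolean; fallbackTriggered?: boolean; extractionDurationMs?: number };

function fastPathHitRate(contexts: Array<{ meta?: Meta }>): number {
  const hits = contexts.filter((c) => c.meta?.classifierUsed && !c.meta?.fallbackTriggered).length;
  return contexts.length > 0 ? hits / contexts.length : 0;
}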
Configuration Guide
Disabling Fast Classifier
# Disable completely (always use enhanced extraction)
export CONTEXT_CLASSIFIER_DISABLED=true
When to disable:
- Debugging extraction issues
- Need highest accuracy
- Author queries not being detected
Adjusting Thresholds
// In config.ts: raise for a stricter fast-path (fewer classifier results accepted)
export const FAST_CLASSIFIER_MIN_CONFIDENCE = 0.3; // Default: 0.1
# In the environment: shorter timeout for lower worst-case latency
export CONTEXT_CLASSIFIER_TIMEOUT_MS=2000  # Default: 4000
Adding Fast-Path Intents
// In config.ts
export const FAST_CLASSIFIER_FAST_PATH_INTENTS = new Set([
'show_more_products',
'greeting',
// Add new intent here
'my_new_fast_intent',
]);
Criteria for fast-path:
- Simple, unambiguous classification
- No deep context needed
- No conversation state required
- High confidence from classifier
Known Limitations
1. No Conversation State
Fast classifier doesn't have access to:
- Previous authors mentioned
- Previously shown products
- Conversation history beyond last 2 turns
Solution: Skip classifier for queries needing this context (authors, pronouns)
2. No authorName Support
ClassifierResult doesn't have an authorName field.
Impact: Can't extract author names in fast-path
Solution: Author queries always skip to enhanced extraction
3. show_more_products Can Hijack Pronouns
Problem:
Query: "näita veel tema teoseid"
Classifier sees: "näita veel" → show_more_products
Fast-path returns → Misses "tema" pronoun
Solution: Routing logic skips classifier when pronouns detected
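Concretely, the pronoun pattern (abbreviated here as it is earlier in this doc) already fires on this query, so the hijack never happens:

const authorPronounPattern = /\b(tema|teda|temalt|selle\s+autori|sama\s+autorilt|that\s+author)\b/i; // elided alternatives omitted
authorPronounPattern.test('näita veel tema teoseid'); // true → classifier skipped, enhanced extraction resolves "tema"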
Best Practices
DO
- Use for simple, unambiguous queries
  - Greetings, show more, budget changes
  - Occasion-based gift requests
- Update prompt when taxonomy changes
  - New product types
  - New recipients or occasions
- Monitor classifier confidence
  - Low confidence suggests the prompt needs improvement
- Keep fast-path intents minimal
  - Only add intents that truly don't need context
DON'T
- Add author_search to fast-path
  - Needs conversation state for pronouns
  - Needs enhanced examples for extraction
- Remove routing skip logic
  - Critical for preventing hijacking
  - Ensures correct queries get enhanced treatment
- Set confidence too high
  - Will cause most queries to fall through
  - Defeats the performance benefit
Troubleshooting
Issue: Classifier always timing out
Check:
- CONTEXT_CLASSIFIER_TIMEOUT_MS value (default: 4000ms)
- Groq API connectivity
- Model availability
Fix:
export CONTEXT_CLASSIFIER_TIMEOUT_MS=6000
Issue: Author queries still returning wrong results
Check:
- Are patterns being detected? Look for SKIPPING FAST CLASSIFIER in the logs
- Is enhanced extraction running?
- Check CONTEXT_CLASSIFIER_DISABLED
Debug:
export CHAT_DEBUG_LOGS=true
# Look for skip reasons and routing decisions
Issue: Performance regression
Check:
- Is classifier being skipped too often?
- Are too many queries falling through to enhanced?
- Check fast-path intent list coverage
Metrics:
// In logs, check:
classifierUsed: true/false
fallbackTriggered: true/false
extractionDurationMs: number
File Reference
| File | Purpose | Lines |
|---|---|---|
| index.ts | Routing, skip logic, fast-path return | 220-255, 257-269 |
| fast-classifier.ts | LLM call, parsing, result creation | 27-178 |
| prompts.ts | Fast classifier prompt | 3-160 |
| config.ts | Thresholds, fast-path intents | 14-45 |
Summary
The Fast Classifier is a performance optimization that:
- Handles 60-70% of queries on the fast path, skipping the heavier enhanced extraction
- Maintains accuracy through confidence thresholds
- Intelligently skips when deep context needed
- Provides telemetry for monitoring
Key insight: Speed matters, but correctness matters more - routing logic ensures complex queries (authors, pronouns) get the full enhanced treatment they need.
Related Documentation
- Context Extraction - Main extraction system
- Intent Classification - Intent system overview
- Author Intent - Author handling details
- Gift Context System - Complete context flow