
Fast Classifier

Purpose: Ultra-low-latency intent pre-classifier that short-circuits simple flows (show more, greetings, budget tweaks) before the full semantic extractor runs.

Performance: ~200-300ms, substantially faster than the full semantic extraction
Success Rate: 60-70% of queries can use the fast path


Overview

The Fast Classifier is a performance optimization that quickly classifies common query types using a lightweight LLM call, allowing the system to skip the heavier enhanced semantic extraction for simple, unambiguous requests.

Quick Facts

  • Model: meta-llama/llama-4-scout-17b-16e-instruct (Groq)
  • Latency: ~200-300ms (substantially faster than full extraction)
  • Confidence Threshold: 0.1 minimum
  • Fast-Path Intents: show_more_products, greeting, question, cheaper_alternatives, occasion gifts
  • Skip Logic: Automatically bypassed for author queries and pronouns

Architecture Flow


How It Works

Step-by-Step Process


Execution Modes

Sequential Mode (Default)

Flow:

  1. Fast classifier runs first
  2. If the result is fast-path eligible → immediate return (enhanced extraction never runs)
  3. If not → falls through to enhanced extraction (see the sketch below)
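Below is a minimal TypeScript sketch of this flow. The helper names runFastClassifier and runEnhancedExtraction and their signatures are assumptions for illustration; createContextFromClassifier and the config exports are documented later on this page, and ClassifierResult / GiftContext are the types described below.

import {
  FAST_CLASSIFIER_FAST_PATH_INTENTS,
  FAST_CLASSIFIER_MIN_CONFIDENCE,
} from './config';

// Hypothetical stand-ins for the real functions in index.ts
declare function runFastClassifier(query: string): Promise<ClassifierResult>;
declare function runEnhancedExtraction(query: string): Promise<GiftContext>;
declare function createContextFromClassifier(query: string, result: ClassifierResult): GiftContext;

// True when the classifier result can be returned without enhanced extraction
function isFastPathEligible(result: ClassifierResult): boolean {
  return (
    result.intent !== undefined &&
    (result.confidence ?? 0) >= FAST_CLASSIFIER_MIN_CONFIDENCE &&
    FAST_CLASSIFIER_FAST_PATH_INTENTS.has(result.intent)
  );
}

async function extractContextSequential(query: string): Promise<GiftContext> {
  const result = await runFastClassifier(query);        // 1. classifier runs first
  if (isFastPathEligible(result)) {
    return createContextFromClassifier(query, result);  // 2. immediate return
  }
  return runEnhancedExtraction(query);                   // 3. fall through
}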

Parallel Mode (Feature Flag)

Enable: PARALLEL_CONTEXT_EXTRACTION_ENABLED=true

Flow:

  1. Fast classifier gets a 200ms head start
  2. Enhanced extraction starts after the delay
  3. First acceptable result wins
  4. No wasted wait when the classifier result is not usable, since enhanced extraction is already running (see the sketch below)

Configuration:

  • PARALLEL_CLASSIFIER_HEADSTART_MS (default: 200ms)
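A sketch of the parallel race, reusing isFastPathEligible and the hypothetical helpers from the sequential sketch above. The real implementation in index.ts may handle cancellation and telemetry differently; this only illustrates the head-start-and-race idea.

async function extractContextParallel(query: string): Promise<GiftContext> {
  const headStartMs = Number(process.env.PARALLEL_CLASSIFIER_HEADSTART_MS ?? 200);

  // Classifier branch: rejects when its result is not fast-path eligible
  const fastPath = runFastClassifier(query).then((result) => {
    if (!isFastPathEligible(result)) throw new Error('not fast-path eligible');
    return createContextFromClassifier(query, result);
  });

  // Enhanced branch: starts only after the head-start delay
  const enhanced = new Promise<void>((resolve) => setTimeout(resolve, headStartMs))
    .then(() => runEnhancedExtraction(query));

  // First fulfilled branch wins; if the classifier branch rejects, enhanced wins
  return Promise.any([fastPath, enhanced]);
}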

Fast-Path Intents

These intents can be returned immediately after fast classification:

  • show_more_products
  • greeting
  • question
  • cheaper_alternatives
  • occasion-based gift intents

Not fast-path (require enhanced extraction):

  • product_search
  • author_search
  • product_inquiry
  • category_search

Routing & Skip Logic

When Classifier is Skipped

The fast classifier is automatically bypassed for queries that need deep context understanding:

Skip Patterns

Explicit Author Names:

/\b[A-ZÕÄÖÜõäöü][a-zõäöü]+(?:\s+[A-ZÕÄÖÜõäöü][a-zõäöü.]+)*?(?:lt|i\s+teosed|i\s+raamat)\b|by\s+[A-Z]|'s\s+books|from\s+[A-Z]/i

Examples:

  • "raamatuid Tolkienilt"
  • "Andrus Kivirähkilt" (with diacritics)
  • "books by Agatha Christie"
  • "Stephen King's books"

Author Pronouns:

/\b(tema|teda|temalt|selle\s+autori|sama\s+autorilt|that\s+author|...)\b/i

Examples:

  • "näita veel tema teoseid"
  • "sama autorilt"
  • "selle autori raamatuid"

Why skip for authors? Author queries require conversation state and pronoun resolution, which the fast classifier doesn't have access to. Skipping ensures these queries get the full enhanced LLM treatment with 9 few-shot examples.
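As a quick standalone illustration (not the production routing code), the pronoun pattern above, with the elided alternatives omitted, matches each of the listed examples:

// Subset of the author-pronoun pattern shown above
const authorPronounPattern = /\b(tema|teda|temalt|selle\s+autori|sama\s+autorilt|that\s+author)\b/i;

const examples = ['näita veel tema teoseid', 'sama autorilt', 'selle autori raamatuid'];
for (const q of examples) {
  console.log(q, authorPronounPattern.test(q)); // true for all three, so the classifier is skipped
}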


Data Flow


ClassifierResult Interface

interface ClassifierResult {
  intent?: GiftContext['intent'];  // Classified intent
  occasion?: string | null;        // Birthday, Christmas, etc.
  recipient?: string | null;       // Friend, mom, teacher, etc.
  budgetMin?: number | null;       // Minimum budget
  budgetMax?: number | null;       // Maximum budget
  budgetHint?: string | null;      // Budget description
  confidence?: number;             // 0.0 - 1.0
  isPopularQuery?: boolean;        // Trending/bestseller flag
  durationMs: number;              // Execution time
}

Note: No authorName field - this is why author queries are routed to enhanced extraction instead.


Configuration

File Locations

  • Entry point: app/api/chat/services/context-understanding/index.ts:220-255
  • Classifier: app/api/chat/services/context-understanding/fast-classifier.ts
  • Prompt: app/api/chat/services/context-understanding/prompts.ts:3-160
  • Config: app/api/chat/services/context-understanding/config.ts

Key Settings

// Model
export const FAST_CLASSIFIER_MODEL =
  process.env.FAST_CLASSIFIER_MODEL || 'meta-llama/llama-4-scout-17b-16e-instruct';

// Timeout
export const FAST_CLASSIFIER_TIMEOUT_MS =
  Number(process.env.CONTEXT_CLASSIFIER_TIMEOUT_MS ?? 4000);

// Minimum confidence
export const FAST_CLASSIFIER_MIN_CONFIDENCE = 0.1;

// Fast-path intents
export const FAST_CLASSIFIER_FAST_PATH_INTENTS = new Set([
  'show_more_products',
  'greeting',
  'question',
  'cheaper_alternatives',
  // ... occasion gifts ...
]);
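One plausible way FAST_CLASSIFIER_TIMEOUT_MS is enforced is to race the LLM call against a timer. Below is a minimal sketch under that assumption; the real code in fast-classifier.ts may use an AbortController or the client's own timeout option instead, and classifyWithLlm is a hypothetical name for the classifier call.

// Rejects if the wrapped promise does not settle within `ms`
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`fast classifier timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Usage: a timed-out or failed classification simply falls through to enhanced extraction
// const result = await withTimeout(classifyWithLlm(query), FAST_CLASSIFIER_TIMEOUT_MS);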

Decision Tree


Example Scenarios

Scenario 1: Fast-Path Success

Query: "näita rohkem"

Routing check: No author signals

Fast classifier runs: fast

Result: {
intent: "show_more_products",
confidence: 0.85
}

Check: show_more_products in fast-path? YES ✓

Return immediately (saved fast!)

Total: fast vs moderate

Scenario 2: Skip for Author Query

Query: "raamatuid Tolkienilt"

Routing check: hasAuthorPattern = TRUE

Skip fast classifier

Enhanced extraction runs: moderate

Result: {
intent: "author_search",
authorName: "Tolkien",
confidence: 0.55
}

Total: moderate
Benefit: Correct author extraction!

Scenario 3: Classifier Falls Through

Query: "Tahan kingitust sünnipäevaks"

Routing check: No skip signals

Fast classifier runs: fast

Result: {
intent: "product_search",
confidence: 0.6
}

Check: product_search in fast-path? NO ✗

Fall through to enhanced extraction: moderate

Total: ~750ms (classifier + enhanced)
Note: Telemetry collected, not wasted

Performance Comparison

Performance Gains:

  • Fast-path: classifier only (~200-300ms total)
  • Enhanced-only: full extraction latency
  • Fallback (both): ~770ms total

Routing Logic Details

Pattern Detection (Lightweight Regex)

// Pattern 1: Explicit author names (with Estonian diacritics)
const hasAuthorPattern = /\b[A-ZÕÄÖÜõäöü][a-zõäöü]+...(?:lt|i\s+teosed|i\s+raamat)\b|by\s+[A-Z]|'s\s+books/i.test(query);

// Pattern 2: Author pronouns
const hasAuthorPronoun = /\b(tema|selle\s+autori|sama\s+autorilt|that\s+author|...)\b/i.test(query);

// Pattern 3: Check conversation state
const hasAuthorContext =
  Boolean(conversationState?.primaryAuthor) || (conversationState?.authors?.length ?? 0) > 0;

// Decision: skip when the query itself carries an author signal
const skipClassifier = hasAuthorPattern || hasAuthorPronoun;

Cost: ~0.2ms (negligible)


Prompt Structure

Fast Classifier Prompt (Simplified)

export const FAST_CLASSIFIER_PROMPT = `
Sa oled kiire klassifitseerija.
Tagasta JSON:
{
"intent": "...",
"occasion": "...",
"recipient": "...",
"budgetMin": number,
"budgetMax": number,
"confidence": 0-1,
"isPopularQuery": boolean
}

REEGLID:
- Autori päringud → intent: "author_search"
- Kingitused + saaja → intent varies
- "näita rohkem" → intent: "show_more_products"
...
`;

(Translation of the Estonian prompt: "You are a fast classifier. Return JSON: { intent, occasion, recipient, budgetMin, budgetMax, confidence, isPopularQuery }. RULES: author queries → intent: "author_search"; gifts + recipient → intent varies; "näita rohkem" ("show more") → intent: "show_more_products".)

Features:

  • No conversation state (keeps it fast)
  • Last 2 user turns for context
  • Focused on quick classification
  • Explicit author detection rules
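Putting those features together, the classification call might be assembled roughly as follows. callGroqChatCompletion is a hypothetical wrapper for the Groq chat-completions request (the real client code lives in fast-classifier.ts and is not reproduced here), and ClassifierResult is the interface shown earlier.

import { FAST_CLASSIFIER_MODEL, FAST_CLASSIFIER_TIMEOUT_MS } from './config';
import { FAST_CLASSIFIER_PROMPT } from './prompts';

// Hypothetical thin wrapper around the Groq chat-completions call
declare function callGroqChatCompletion(opts: {
  model: string;
  messages: { role: 'system' | 'user'; content: string }[];
  timeoutMs: number;
}): Promise<string>;

async function classify(userTurns: string[]): Promise<ClassifierResult> {
  const started = Date.now();
  const raw = await callGroqChatCompletion({
    model: FAST_CLASSIFIER_MODEL,
    messages: [
      { role: 'system', content: FAST_CLASSIFIER_PROMPT },
      // Only the last 2 user turns are sent; omitting conversation state keeps the call fast
      ...userTurns.slice(-2).map((content) => ({ role: 'user' as const, content })),
    ],
    timeoutMs: FAST_CLASSIFIER_TIMEOUT_MS,
  });
  // The prompt asks for JSON, so the response is parsed directly
  const parsed = JSON.parse(raw) as Omit<ClassifierResult, 'durationMs'>;
  return { ...parsed, durationMs: Date.now() - started };
}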

createContextFromClassifier

Converts lightweight ClassifierResult into full GiftContext:

Fields mapped:

  • intent, occasion, recipient
  • budget (min, max, hint)
  • confidence, isPopularQuery

Not mapped: authorName (the field does not exist in ClassifierResult), which is why author queries must go through enhanced extraction. A sketch of the conversion follows the next list.

Adds automatically:

  • language (detected from message)
  • ageGroup (deterministic signals)
  • meta flags (classifierUsed, duration, etc.)
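A hedged sketch of the shape of that conversion; detectLanguage and detectAgeGroup are hypothetical helper names, and the real GiftContext has more fields than shown here.

declare function detectLanguage(message: string): string;
declare function detectAgeGroup(message: string): string | null;

function createContextFromClassifier(message: string, result: ClassifierResult): GiftContext {
  return {
    intent: result.intent,
    occasion: result.occasion ?? null,
    recipient: result.recipient ?? null,
    budget: { min: result.budgetMin ?? null, max: result.budgetMax ?? null, hint: result.budgetHint ?? null },
    confidence: result.confidence ?? 0,
    isPopularQuery: result.isPopularQuery ?? false,
    // authorName is never set here: ClassifierResult has no such field
    language: detectLanguage(message),   // detected from the message
    ageGroup: detectAgeGroup(message),   // deterministic signals
    meta: {
      classifierUsed: true,
      classifierConfidence: result.confidence ?? 0,
      classifierDurationMs: result.durationMs,
    },
  } as GiftContext;
}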

Observability

Debug Logs

Enable with: CHAT_DEBUG_LOGS=true

// Always logged (unconditional)
FAST CLASSIFIER CALLED: { query, timestamp }

// Debug mode
FAST CLASSIFIER RESULT: { intent, confidence, duration }
SKIPPING FAST CLASSIFIER: { reason, hasAuthorPattern, hasAuthorPronoun }
FAST CLASSIFIER FAST-PATH: { intent, confidence }

Meta Flags

Every context includes telemetry:

context.meta = {
  classifierUsed: boolean,        // true if classifier ran
  classifierConfidence: number,   // classifier confidence score
  classifierDurationMs: number,   // how long it took
  fallbackTriggered: boolean,     // true if enhanced ran after classifier failed
  extractionDurationMs: number,   // total extraction time
  parallelMode: boolean           // true if parallel mode used
}
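These flags are enough to compute the fast-path hit rate quoted at the top of this page; a minimal sketch, assuming the meta object has exactly the shape shown above:

type ExtractionMeta = {
  classifierUsed: boolean;
  fallbackTriggered: boolean;
  extractionDurationMs: number;
};

// Fraction of requests answered by the fast classifier alone (no enhanced fallback)
function fastPathRate(metas: ExtractionMeta[]): number {
  if (metas.length === 0) return 0;
  const fastPath = metas.filter((m) => m.classifierUsed && !m.fallbackTriggered).length;
  return fastPath / metas.length;
}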

Configuration Guide

Disabling Fast Classifier

# Disable completely (always use enhanced extraction)
export CONTEXT_CLASSIFIER_DISABLED=true

When to disable:

  • Debugging extraction issues
  • Need highest accuracy
  • Author queries not being detected
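A minimal sketch of how the flag can gate the classifier; the real guard lives in index.ts, and hasAuthorSignals stands in for the routing regexes described earlier.

declare function hasAuthorSignals(query: string): boolean;

function shouldRunFastClassifier(query: string): boolean {
  // Kill switch: always use enhanced extraction when disabled
  if (process.env.CONTEXT_CLASSIFIER_DISABLED === 'true') return false;
  // Routing skip logic: author names and pronouns bypass the classifier
  return !hasAuthorSignals(query);
}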

Adjusting Thresholds

// In config.ts: raise for a stricter fast-path (default: 0.1)
export const FAST_CLASSIFIER_MIN_CONFIDENCE = 0.3;

# Environment variable: shorter timeout for lower latency (default: 4000)
export CONTEXT_CLASSIFIER_TIMEOUT_MS=2000

Adding Fast-Path Intents

// In config.ts
export const FAST_CLASSIFIER_FAST_PATH_INTENTS = new Set([
'show_more_products',
'greeting',
// Add new intent here
'my_new_fast_intent',
]);

Criteria for fast-path:

  • Simple, unambiguous classification
  • No deep context needed
  • No conversation state required
  • High confidence from classifier

Known Limitations

1. No Conversation State

Fast classifier doesn't have access to:

  • Previous authors mentioned
  • Previously shown products
  • Conversation history beyond last 2 turns

Solution: Skip classifier for queries needing this context (authors, pronouns)


2. No authorName Support

ClassifierResult doesn't have authorName field.

Impact: Can't extract author names in fast-path
Solution: Author queries always skip to enhanced extraction


3. show_more_products Can Hijack Pronouns

Problem:

Query: "näita veel tema teoseid"
Classifier sees: "näita veel" → show_more_products
Fast-path returns → Misses "tema" pronoun

Solution: Routing logic skips classifier when pronouns detected


Best Practices

DO

  1. Use for simple, unambiguous queries

    • Greetings, show more, budget changes
    • Occasion-based gift requests
  2. Update prompt when taxonomy changes

    • New product types
    • New recipients or occasions
  3. Monitor classifier confidence

    • Low confidence suggests prompt needs improvement
  4. Keep fast-path intents minimal

    • Only add intents that truly don't need context

DON'T

  1. Add author_search to fast-path

    • Needs conversation state for pronouns
    • Needs enhanced examples for extraction
  2. Remove routing skip logic

    • Critical for preventing hijacking
    • Ensures correct queries get enhanced treatment
  3. Set confidence too high

    • Will cause most queries to fall through
    • Defeats the performance benefit

Troubleshooting

Issue: Classifier always timing out

Check:

  • CONTEXT_CLASSIFIER_TIMEOUT_MS value (default: 4000ms)
  • Groq API connectivity
  • Model availability

Fix:

export CONTEXT_CLASSIFIER_TIMEOUT_MS=6000

Issue: Author queries still returning wrong results

Check:

  • Are patterns being detected? Look for SKIPPING FAST CLASSIFIER in logs
  • Is enhanced extraction running?
  • Check CONTEXT_CLASSIFIER_DISABLED

Debug:

export CHAT_DEBUG_LOGS=true
# Look for skip reasons and routing decisions

Issue: Performance regression

Check:

  • Is classifier being skipped too often?
  • Are too many queries falling through to enhanced?
  • Check fast-path intent list coverage

Metrics:

// In logs, check:
classifierUsed: true/false
fallbackTriggered: true/false
extractionDurationMs: number

File Reference

File | Purpose | Lines
index.ts | Routing, skip logic, fast-path return | 220-255, 257-269
fast-classifier.ts | LLM call, parsing, result creation | 27-178
prompts.ts | Fast classifier prompt | 3-160
config.ts | Thresholds, fast-path intents | 14-45

Summary

The Fast Classifier is a performance optimization that:

  • Skips enhanced extraction entirely for 60-70% of queries
  • Maintains accuracy through confidence thresholds
  • Intelligently skips when deep context needed
  • Provides telemetry for monitoring

Key insight: speed matters, but correctness matters more. Routing logic ensures complex queries (authors, pronouns) get the full enhanced treatment they need.