
LLM Context Extraction

Deep semantic understanding of user queries using llama-4-scout-17b-16e-instruct with conversation state injection.

Enhanced Semantic Prompt

Key Innovation: Pronoun resolution via conversation state

Conversation State Injection

interface ConversationState {
  authors: string[];          // All validated authors
  primaryAuthor?: string;     // From explicit query (highest priority)
  lastAuthor?: string;        // Most recently mentioned
  lastProductIds?: string[];  // For exclusion
  lastCategory?: string;      // For continuity
  lastProductType?: string;   // For continuity
}

Prompt Format:

VESTLUSE KONTEKST:
PEAMINE AUTOR: J.R.R. Tolkien
TEISED AUTORID: Agatha Christie, Stephen King
EELMINE KATEGOORIA: Krimi ja põnevus
EELMINE TÜÜP: Raamat
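
A minimal sketch of how the state could be rendered into the block above. The helper name `formatConversationState` is hypothetical, and two behaviours are assumptions rather than documented facts: `TEISED AUTORID` is taken to mean every validated author except the primary one, and unset fields are skipped.

```typescript
interface ConversationState {
  authors: string[];
  primaryAuthor?: string;
  lastAuthor?: string;
  lastProductIds?: string[];
  lastCategory?: string;
  lastProductType?: string;
}

// Hypothetical helper: renders the state as the Estonian context block.
function formatConversationState(state: ConversationState): string {
  const lines: string[] = [];
  if (state.primaryAuthor) lines.push(`PEAMINE AUTOR: ${state.primaryAuthor}`);

  // Assumption: "other authors" = validated authors minus the primary one.
  const others = state.authors.filter(a => a !== state.primaryAuthor);
  if (others.length > 0) lines.push(`TEISED AUTORID: ${others.join(", ")}`);

  if (state.lastCategory) lines.push(`EELMINE KATEGOORIA: ${state.lastCategory}`);
  if (state.lastProductType) lines.push(`EELMINE TÜÜP: ${state.lastProductType}`);

  // Nothing to inject: return an empty string so the prompt stays unchanged.
  if (lines.length === 0) return "";
  return ["VESTLUSE KONTEKST:", ...lines].join("\n");
}
```

Returning an empty string for an empty state keeps the first turn of a conversation free of a dangling header.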

Pronoun Resolution

Example Flow

Context State:
primaryAuthor: "J.R.R. Tolkien"

User: "näita veel tema teoseid" ("show more of his/her works")

LLM with state injection:
Input: PEAMINE AUTOR: J.R.R. Tolkien + user message
Resolves: "tema" → "J.R.R. Tolkien"
Output: authorName: "J.R.R. Tolkien", intent: "author_search"

Priority Rules

When multiple authors exist:

  1. Primary author (from user's original query) - Highest priority
  2. Last author (most recently mentioned) - Fallback
  3. Clarification request - Last resort
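
In the source this ordering is enforced through the prompt, so the LLM does the resolution. The same rule can be mirrored deterministically, for example as a post-check on the model's output. The sketch below is illustrative only; `resolveAuthor` and the `AuthorResolution` type are hypothetical names.

```typescript
type AuthorResolution =
  | { kind: "resolved"; authorName: string; source: "primary" | "last" }
  | { kind: "clarify" };

// Hypothetical helper applying the priority order:
// primary author, then last author, then a clarification request.
function resolveAuthor(state: {
  primaryAuthor?: string;
  lastAuthor?: string;
}): AuthorResolution {
  if (state.primaryAuthor) {
    return { kind: "resolved", authorName: state.primaryAuthor, source: "primary" };
  }
  if (state.lastAuthor) {
    return { kind: "resolved", authorName: state.lastAuthor, source: "last" };
  }
  // No usable author in state: ask the user which author they mean.
  return { kind: "clarify" };
}
```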

Few-Shot Examples in Prompt:

// Example 1: Primary author resolution
User previous: "Tolkienilt ja Lewiselt raamatuid"
primaryAuthor: "J.R.R. Tolkien"
User current: "näita tema teoseid"
→ Output: authorName: "J.R.R. Tolkien"

// Example 2: Last author fallback
authors: ["Stephen King", "Dean Koontz"]
lastAuthor: "Dean Koontz"
User: "tema uuemaid raamatuid"
→ Output: authorName: "Dean Koontz"

Extraction Features

1. Intent Classification

Classifies into 11 intent types with confidence scoring.

2. Taxonomy Extraction

{
  productType: "Raamat",
  category: "Krimi ja põnevus",
  categoryHints: ["Krimi ja põnevus", "Õudus", "Psühholoogiline triller"],
  productTypeHints: ["Raamat", "E-raamatud"]
}

3. Gift Context

{
  occasion: "sünnipäev",
  recipient: "õpetaja",
  recipientGender: "unknown",
  ageGroup: "adult",
  recipientAge: null
}

4. Budget Parsing

"kuni 30 euro" → { max: 30, hint: "kuni 30 euro" }
"15-35 eurot" → { min: 15, max: 35, hint: "15-35 eurot" }
"umbes 25" → { min: 20, max: 30, hint: "umbes 25" }
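
The source performs this parsing inside the LLM extraction. As a rough illustration of the three mappings above, here is a regex-based sketch, which is a swapped-in technique and not the documented implementation. The name `parseBudget` is hypothetical, and the ±20% window for "umbes" ("about") is an assumption inferred from the single example (25 → 20 to 30).

```typescript
interface Budget {
  min?: number;
  max?: number;
  hint: string; // original phrase, kept verbatim
}

// Hypothetical regex-based parser covering the three phrasings shown above.
function parseBudget(text: string): Budget | null {
  const hint = text.trim();

  // "15-35 eurot": explicit range
  const range = hint.match(/^(\d+)\s*-\s*(\d+)/);
  if (range) return { min: Number(range[1]), max: Number(range[2]), hint };

  // "kuni 30 euro": upper bound only ("kuni" = "up to")
  const upTo = hint.match(/^kuni\s+(\d+)/i);
  if (upTo) return { max: Number(upTo[1]), hint };

  // "umbes 25": approximate value; assumed window of +/-20%
  const approx = hint.match(/^umbes\s+(\d+)/i);
  if (approx) {
    const value = Number(approx[1]);
    return { min: Math.round(value * 0.8), max: Math.round(value * 1.2), hint };
  }

  return null; // no budget phrase recognized
}
```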

5. Constraint Extraction

"aga mitte raamat" → constraints: ["MITTE raamat"]
"armastab aiandust" → constraints: ["aiandus"]
"eelistab kohvi" → constraints: ["eelistab kohvi"]

Conversation History Processing

Filters out noise and keeps only the most recent messages (3-5; the snippet below keeps the last 3):

// Exclude budget-only messages (context poisoning prevention)
const isBudgetOnly = /^\d+\s*euro$/i;

const recentHistory = conversationHistory
  .filter(m => !isBudgetOnly.test(m.content))
  .slice(-3);

const historyContext = recentHistory.map(m =>
  `${m.role}: ${m.content}`
).join('\n');

Format:

Eelnev kontekst:
user: Tere, vajan kingitust
assistant: Kellele soovite kingitust valida?
user: Kolleegile

Praegune sõnum:
Sünnipäevaks
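
Putting the filter and the format together, a sketch of how the final user prompt might be assembled. `buildPrompt`, `ChatMessage`, and the `maxMessages` parameter are hypothetical names. Omitting the history header when no messages survive the filter is an assumption.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Same pattern as in the snippet above: messages that are only a budget.
const BUDGET_ONLY = /^\d+\s*euro$/i;

// Hypothetical helper: filters history, then renders the two-part prompt.
function buildPrompt(
  history: ChatMessage[],
  currentMessage: string,
  maxMessages = 3
): string {
  const recent = history
    .filter(m => !BUDGET_ONLY.test(m.content))
    .slice(-maxMessages);

  const parts: string[] = [];
  if (recent.length > 0) {
    parts.push("Eelnev kontekst:");
    for (const m of recent) parts.push(`${m.role}: ${m.content}`);
    parts.push(""); // blank line between history and current message
  }
  parts.push("Praegune sõnum:", currentMessage);
  return parts.join("\n");
}
```

Filtering before slicing matters: a budget-only message does not consume one of the limited history slots.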

Language Detection Override

Critical: the language value returned by the LLM is unreliable and often falls outside the expected enum.

LLM outputs:

  • "et" ✓
  • "estonian" ✗ (not enum)
  • "ET" ✗ (case mismatch)
  • "et|en" ✗ (pipe-separated)

Solution: Force pattern-based detection

// AFTER LLM extraction, ALWAYS override
const detectedLanguage = LanguageService.detectLanguage(userMessage);
extracted.language = detectedLanguage; // Override

Pattern-based accuracy: 100%, 0ms latency
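
The internals of `LanguageService.detectLanguage` are not shown in the source. The sketch below only illustrates what a pattern-based detector of this kind could look like. The two-value enum, the letter set, and the word list are all assumptions chosen for the example.

```typescript
type Language = "et" | "en";

// Assumption: a tiny list of unambiguous Estonian words, for illustration only.
const ESTONIAN_WORDS = new Set([
  "ja", "ei", "mulle", "tema", "veel", "raamat",
  "kingitust", "eurot", "kuni", "palun", "tere",
]);

// Hypothetical stand-in for LanguageService.detectLanguage.
function detectLanguage(message: string): Language {
  const text = message.toLowerCase();

  // Letters that appear in Estonian but not in plain English text.
  if (/[õäöü]/.test(text)) return "et";

  // Otherwise look for common Estonian words.
  const words = text.split(/[^a-zõäöüšž]+/).filter(Boolean);
  return words.some(w => ESTONIAN_WORDS.has(w)) ? "et" : "en";
}
```

Because the return type is a closed union, the result is always a valid enum value, which is exactly what the raw LLM output fails to guarantee.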

Estonian Morphology

Genre detection across case variations:

// All recognized:
"kriminaalraamat" (nominative)
"krimiraamatuid" (partitive plural)
"kriminaalromaane" (partitive plural)
→ category: "Krimi ja põnevus"

"fantaasiaraamat"
"fantasiaraamatuid"
→ category: "Fantaasia"
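
One simple way to tolerate case endings is to match on word stems instead of full forms. The sketch below assumes prefix matching on a hand-written stem table. `GENRE_STEMS` and `detectGenre` are hypothetical names, and the source does not say whether the real system uses stems, a lemmatizer, or the LLM itself.

```typescript
// Assumption: each category is keyed by a few stems, including spelling variants.
const GENRE_STEMS: Array<{ stems: string[]; category: string }> = [
  { stems: ["kriminaal", "krimi"], category: "Krimi ja põnevus" },
  { stems: ["fantaasia", "fantasia"], category: "Fantaasia" },
];

// Hypothetical helper: returns the first category whose stem prefixes a word.
function detectGenre(query: string): string | undefined {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  for (const { stems, category } of GENRE_STEMS) {
    if (words.some(w => stems.some(s => w.startsWith(s)))) return category;
  }
  return undefined;
}
```

Prefix matching covers the inflected endings (`-uid`, `-e`) without listing every case form, at the cost of occasional false positives on unrelated words that share a stem.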

Performance

Model Configuration:

{
  model: 'llama-4-scout-17b-16e-instruct',
  temperature: 0.1,
  response_format: { type: 'json_object' },
  max_tokens: 1500,
  timeout: 10000,
  seed: isDeterministic ? hashMessage(userMessage) : undefined
}
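
The configuration references `hashMessage` without showing it. Any stable string-to-integer hash would serve the purpose of giving the same message the same seed. The sketch below uses an FNV-1a style 32-bit hash over UTF-16 code units, which is an assumption about the approach and not the project's actual function.

```typescript
// Hypothetical implementation: FNV-1a style hash over UTF-16 code units.
// Returns an unsigned 32-bit integer, so the same message always yields the same seed.
function hashMessage(message: string): number {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < message.length; i++) {
    hash ^= message.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, wrapping at 32 bits
  }
  return hash >>> 0; // force an unsigned result
}
```

Whether a given provider honours `seed` for reproducible sampling depends on the API being called, so deterministic output should be verified rather than assumed.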

Typical Duration: 300-500ms