Response Validation

After GPT-5.1 generates a response, it undergoes validation to detect and prevent hallucinations, ensuring recommendations are grounded in actual product data.

Purpose

The validation layer prevents the AI from:

  • Inventing product details (plots, recipes, features)
  • Hallucinating titles that don't exist
  • Creating fake authors or characters
  • Describing content it hasn't seen
  • Mismatching categories (e.g. presenting a cookbook as a children's book)

Validation Pipeline

Implementation

Location: app/api/chat/system-prompt.ts:398-567

export function validateAIResponse(
  response: string,
  validatedProducts: ValidatedProduct[],
  userContext?: string
): ValidationResult {
  const invalidMentions: ValidationResult['invalidMentions'] = [];
  let totalMentions = 0;
  let validMentions = 0;

  // STEP 1: Extract valid content from products
  const validContent = extractValidContent(validatedProducts);

  // STEP 2: Extract quoted text (potential titles)
  const quotedMatches = response.match(/"([^"]+)"/g) || [];

  quotedMatches.forEach(quoted => {
    totalMentions++;
    const cleanQuoted = quoted.replace(/"/g, '').toLowerCase();
    const confidence = calculateMentionConfidence(
      cleanQuoted,
      validContent,
      'title_hallucination'
    );

    if (confidence < 0.5 && cleanQuoted.length > 3) {
      invalidMentions.push({
        text: quoted,
        category: 'title_hallucination',
        confidence: 1.0 - confidence
      });
    } else if (confidence >= 0.5) {
      validMentions++;
    }
  });

  // STEP 3: Extract proper nouns (potential authors/characters)
  const properNouns = response.match(/\b[A-ZÄÖÜÕ][a-zäöüõ]+(?:\s+[A-ZÄÖÜÕ][a-zäöüõ]+)*\b/g) || [];

  properNouns.forEach(noun => {
    totalMentions++;
    const confidence = calculateMentionConfidence(
      noun.toLowerCase(),
      validContent,
      'author_hallucination'
    );

    if (confidence < 0.3 && noun.length > 3) {
      invalidMentions.push({
        text: noun,
        category: 'author_hallucination',
        confidence: 0.8
      });
    } else if (confidence >= 0.3) {
      validMentions++;
    }
  });

  // STEP 4: Calculate overall confidence
  const highConfidenceViolations = invalidMentions.filter(
    m => m.confidence > 0.7
  );

  let overallConfidence = 1.0;
  let severity: 'low' | 'medium' | 'high' = 'low';

  if (highConfidenceViolations.length > 0) {
    overallConfidence = 0.2;
    severity = 'high';
  }

  // STEP 5: Return validation result
  return {
    isValid: highConfidenceViolations.length === 0,
    confidence: overallConfidence,
    severity,
    invalidMentions,
    validatedResponse: response,
    metrics: {
      totalMentions,
      validMentions,
      suspiciousMentions: invalidMentions.length
    }
  };
}
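The `extractValidContent` helper used in STEP 1 is not shown in this section. A minimal sketch of what it plausibly does, assuming a `ValidatedProduct` with `title`, `authors`, and `category` fields (the field names here are illustrative, not the actual type definition):

```typescript
// Hypothetical shape; the real ValidatedProduct type lives elsewhere in the app.
interface ValidatedProduct {
  title: string;
  authors?: string[];
  category?: string;
}

// Build the lookup set of lowercased strings that count as "grounded":
// product titles, author names, and categories.
function extractValidContent(products: ValidatedProduct[]): Set<string> {
  const valid = new Set<string>();
  for (const product of products) {
    valid.add(product.title.toLowerCase());
    for (const author of product.authors ?? []) {
      valid.add(author.toLowerCase());
    }
    if (product.category) {
      valid.add(product.category.toLowerCase());
    }
  }
  return valid;
}

const content = extractValidContent([
  { title: 'Detektiiv Smith', authors: ['John Smith'], category: 'Kriminaal' }
]);
```

Lowercasing everything up front keeps the later `has()` and `includes()` checks case-insensitive without repeated normalization.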

Pattern Detection

Location: system-prompt.ts:316-322

Estonian Patterns

const estonianPatterns = {
  // Proper nouns (capitalized words)
  properNouns: /\b[A-ZÄÖÜÕ][a-zäöüõ]+(?:\s+[A-ZÄÖÜÕ][a-zäöüõ]+)*\b/g,

  // Quoted text (potential book titles)
  quotedText: /"([^"]+)"/g,

  // Author names (2-3 capitalized words)
  authorNames: /\b[A-ZÄÖÜÕ][a-zäöüõ]+(?:\s+[A-ZÄÖÜÕ][a-zäöüõ]+){1,2}\b/g,

  // Character titles (inspector, commissioner, etc.)
  characterTitles: /\b(?:inspektor|komissar|uurija|doktor|professor|härra|proua)\s+[A-ZÄÖÜÕ][a-zäöüõ]+/gi,

  // Plot elements (case, murder, investigation, etc.)
  plotElements: /\b(?:juhtum|mõrv|kuritegu|uurimine|kahtlane|ohver|tunnistaja)\b/gi
};

Confidence Scoring

function calculateMentionConfidence(
  mention: string,
  validContent: Set<string>,
  category: string
): number {
  const lowerMention = mention.toLowerCase();

  // Exact match → 1.0 (perfect)
  if (validContent.has(lowerMention)) {
    return 1.0;
  }

  // Partial match → 0.7 (likely valid)
  for (const valid of validContent) {
    if (valid.includes(lowerMention) ||
        lowerMention.includes(valid)) {
      return 0.7;
    }
  }

  // Character/plot patterns → 0.3 (may come from product descriptions)
  if (category === 'content_discussion') {
    // Reset lastIndex: these patterns carry the /g flag,
    // which makes test() stateful across calls
    estonianPatterns.characterTitles.lastIndex = 0;
    estonianPatterns.plotElements.lastIndex = 0;
    if (estonianPatterns.characterTitles.test(mention) ||
        estonianPatterns.plotElements.test(mention)) {
      return 0.3;
    }
  }

  // No match → 0.0 (hallucination)
  return 0.0;
}
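The exact/partial tiers can be demonstrated in isolation. A standalone sketch of just those two tiers (the `scoreMention` name and the sample set are ours, for illustration only):

```typescript
// Standalone re-statement of the exact-match and partial-match tiers above.
function scoreMention(mention: string, validContent: Set<string>): number {
  const lower = mention.toLowerCase();
  if (validContent.has(lower)) return 1.0;                            // exact match
  for (const valid of validContent) {
    if (valid.includes(lower) || lower.includes(valid)) return 0.7;   // partial match
  }
  return 0.0;                                                         // no match → hallucination
}

const validContent = new Set(['harry potter', 'j. k. rowling']);
const exact = scoreMention('Harry Potter', validContent);     // 1.0
const partial = scoreMention('Potter', validContent);         // 0.7 (substring of a valid title)
const miss = scoreMention('Müstiline Saladus', validContent); // 0.0
```

The partial-match tier is deliberately symmetric: a mention that contains a valid string ("Harry Potter ja saladuste kamber") and a mention contained by one ("Potter") are both treated as likely valid.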

Validation Categories

1. Title Hallucination

Detection: Quoted text not matching product titles

// AI says: "Soovitan raamatut "Müstiline Saladus""
// Product: "Detektiiv Smith"
// → MISMATCH: Title hallucination

Confidence: 0.9 (high - likely invented)

2. Author Hallucination

Detection: Proper nouns not matching product authors

// AI says: "...autor Kristjan Tamm..."
// Product authors: "John Smith"
// → MISMATCH: Author hallucination

Confidence: 0.8 (high - likely invented)

3. Content Discussion

Detection: Plot elements or character names

// AI says: "Raamatus juhtub mõrv ja inspektor uurib..."
// → SUSPECT: Discussing content (shouldn't happen)

Confidence: 0.2 (low - might be from description)

Severity Levels

High Severity (>0.7 confidence violations)

{
  isValid: false,
  confidence: 0.2,
  severity: 'high',
  invalidMentions: [
    {
      text: '"Invented Title"',
      category: 'title_hallucination',
      confidence: 0.9
    }
  ]
}

Action: Replace with fallback response

Medium Severity (0.4-0.7 confidence)

{
  isValid: true,
  confidence: 0.6,
  severity: 'medium',
  invalidMentions: [
    {
      text: 'Ambiguous Name',
      category: 'author_hallucination',
      confidence: 0.5
    }
  ]
}

Action: Log for review, allow response

Low Severity (<0.4 confidence or no violations)

{
  isValid: true,
  confidence: 1.0,
  severity: 'low',
  invalidMentions: []
}

Action: Accept response as-is
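The three levels reduce to a mapping from the worst per-mention confidence to a severity bucket and action. A sketch of that mapping (the `classify` function is ours, summarizing the thresholds above):

```typescript
type Severity = 'low' | 'medium' | 'high';

// Map the highest per-mention violation confidence to a severity bucket
// and the corresponding action, per the thresholds listed above.
function classify(violationConfidences: number[]): { severity: Severity; action: string } {
  const worst = Math.max(0, ...violationConfidences);  // 0 when there are no violations
  if (worst > 0.7) return { severity: 'high', action: 'replace with fallback' };
  if (worst >= 0.4) return { severity: 'medium', action: 'log for review, allow response' };
  return { severity: 'low', action: 'accept response as-is' };
}
```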

Fallback Responses

When validation fails:

const FALLBACK_RESPONSES = {
  et: 'Vabandust, hetkel ei ole sobivaid tooteid saadaval. ' +
      'Palun täpsustage oma päringut või proovige teisi märksõnu.',

  en: 'Sorry, no suitable products are currently available. ' +
      'Please refine your query or try different keywords.'
};
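Swapping in the fallback is a one-line decision once the validation result is known. A minimal sketch (the `finalizeResponse` helper is hypothetical; language detection is assumed to happen upstream):

```typescript
// Localized fallbacks, mirroring the map above.
const FALLBACK_RESPONSES: Record<'et' | 'en', string> = {
  et: 'Vabandust, hetkel ei ole sobivaid tooteid saadaval. Palun täpsustage oma päringut või proovige teisi märksõnu.',
  en: 'Sorry, no suitable products are currently available. Please refine your query or try different keywords.'
};

// Return the original response when valid, otherwise the localized fallback.
function finalizeResponse(response: string, isValid: boolean, lang: 'et' | 'en'): string {
  return isValid ? response : FALLBACK_RESPONSES[lang];
}
```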

Context-Aware Validation

Category Mismatch Detection

// User asks for cooking books
if (userContext?.includes('kok')) {
  validatedProducts.forEach(product => {
    const isCookingRelated = /kok|retsept|toit|cook|recipe|food/i
      .test(`${product.title} ${product.category}`);

    if (!isCookingRelated) {
      invalidMentions.push({
        text: product.title,
        category: 'content_discussion',
        confidence: 0.8,
        context: 'Product does not match cooking context'
      });
    }
  });
}

Validation Metrics

interface ValidationResult {
  isValid: boolean;
  confidence: number;  // 0-1
  severity: 'low' | 'medium' | 'high';

  invalidMentions: {
    text: string;
    category: string;
    confidence: number;
    context?: string;
  }[];

  validatedResponse: string;

  metrics: {
    totalMentions: number;
    validMentions: number;
    suspiciousMentions: number;
  };
}

Example Validations

Valid Response

Response: "Soovitan raamatut **Harry Potter**..."
Products: [{ title: "Harry Potter" }]

Result: {
  isValid: true,
  confidence: 1.0,
  severity: 'low',
  invalidMentions: []
}

Invalid Response (Hallucination)

Response: "Raamatus juhtub mõrv ja inspektor Smith uurib..."
Products: [{ title: "Müsteerium", category: "Kriminaal" }]

Result: {
  isValid: false,
  confidence: 0.2,
  severity: 'high',
  invalidMentions: [
    { text: 'Smith', category: 'author_hallucination', confidence: 0.8 },
    { text: 'mõrv', category: 'content_discussion', confidence: 0.3 }
  ],
  validatedResponse: "Vabandust, hetkel ei ole sobivaid tooteid..."
}

Monitoring

Track validation metrics:

{
  validationRate: number,      // % of responses validated
  hallucinationRate: number,   // % with high-confidence violations
  averageConfidence: number,   // mean confidence score

  categoryCounts: {
    title_hallucination: number,
    author_hallucination: number,
    content_discussion: number
  },

  severityDistribution: {
    low: number,
    medium: number,
    high: number
  }
}
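These metrics are straightforward to derive from the stream of per-response `ValidationResult` objects. A sketch of an aggregator (the `aggregate` function and `ResultLike` shape are ours; only the output field names come from the document):

```typescript
// Minimal slice of ValidationResult needed for aggregation.
interface ResultLike {
  confidence: number;
  severity: 'low' | 'medium' | 'high';
}

// Fold per-response results into the monitoring shape above.
function aggregate(results: ResultLike[]) {
  const severityDistribution = { low: 0, medium: 0, high: 0 };
  let confidenceSum = 0;
  for (const r of results) {
    severityDistribution[r.severity]++;
    confidenceSum += r.confidence;
  }
  const n = results.length;
  return {
    hallucinationRate: n ? severityDistribution.high / n : 0,
    averageConfidence: n ? confidenceSum / n : 0,
    severityDistribution
  };
}

const stats = aggregate([
  { confidence: 1.0, severity: 'low' },
  { confidence: 0.2, severity: 'high' }
]);
```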