Estonian Morphological Case System

Estonian's 14-case system requires special handling for natural, grammatically correct query generation.

The Problem

Estonian marks gift recipients with the allative case (ending -le), which plays the role of a dative and is called "dative" throughout this page:

English: "Gift for teacher"
Estonian: "Kingitus õpetajale" (not "õpetaja")

The recipient noun must be converted to dative case:

Nominative            | Dative (for gift)
õpetaja (teacher)     | õpetajale
sõber (friend)        | sõbrale
ema (mother)          | emale
isa (father)          | isale
kolleeg (colleague)   | kolleegile

Why This Matters

  1. Query generation: Search queries need grammatically correct Estonian
  2. Natural language: "kingitus õpetaja" sounds broken to native speakers
  3. User trust: Grammatical errors reduce users' confidence in the AI

LLM Behavior Without Help

Test: Ask GPT-5.1 to generate an Estonian query for "gift for teacher".

Results (inconsistent):

"kingitus õpetaja" // Grammatically incorrect
"kingitus jaoks õpetaja" // Word-for-word translation, unnatural
"kingitus õpetajale" // Correct, but not produced reliably

Problem: The LLM produces the correct form only some of the time.

Solution: Dative Case Service

Location: app/api/chat/services/language.ts:153-179

Dative Conversion Map

const ESTONIAN_DATIVE_MAP: Record<string, string> = {
  // Family
  'ema': 'emale',
  'isa': 'isale',
  'vanaema': 'vanaemale',
  'vanaisa': 'vanaisale',
  'õde': 'õele',
  'vend': 'vennale',
  'tütar': 'tütrele',
  'poeg': 'pojale',
  'tädi': 'tädile',
  'onu': 'onule',

  // Relationships
  'sõber': 'sõbrale',
  'sõbranna': 'sõbrannale',
  'partner': 'partnerile',
  'poiss-sõber': 'poiss-sõbrale',
  'tüdruksõber': 'tüdruksõbrale',
  'naine': 'naisele',
  'mees': 'mehele',

  // Professional
  'õpetaja': 'õpetajale',
  'kolleeg': 'kolleegile',
  'ülemus': 'ülemusele',

  // General
  'laps': 'lapsele'
};

Conversion Function

static toEstonianDative(noun: string): string {
  // Map English synonyms (e.g. "grandmother") to the canonical Estonian noun first
  const normalizedRecipient = this.normalizeRecipient(noun);
  const lowerNoun = normalizedRecipient.toLowerCase();

  // Check predefined mappings
  const mapped = ESTONIAN_DATIVE_MAP[lowerNoun];
  if (mapped) {
    return mapped;
  }

  // Generic dative: append '-le' regardless of the final letter.
  // Consonant-final nouns get no linking vowel here ("arst" becomes "arstle").
  return normalizedRecipient + 'le';
}

Recipient Normalization

Challenge: Users input recipient names in various languages:

English: "grandmother", "grandma", "granny", "nan"
Estonian: "vanaema"
→ All should map to "vanaema" → "vanaemale"

Synonym Patterns

Location: language.ts:180-203

const RECIPIENT_SYNONYM_PATTERNS = [
  { pattern: /\b(grandma|grandmother|granny|nan)\b/i, canonical: 'vanaema' },
  { pattern: /\b(grandpa|grandfather|granddad)\b/i, canonical: 'vanaisa' },
  { pattern: /\b(mother|mom|mum|mommy)\b/i, canonical: 'ema' },
  { pattern: /\b(father|dad|daddy|papa)\b/i, canonical: 'isa' },
  { pattern: /\b(girlfriend)\b/i, canonical: 'tüdruksõber' },
  { pattern: /\b(boyfriend)\b/i, canonical: 'poiss-sõber' },
  { pattern: /\b(wife)\b/i, canonical: 'naine' },
  { pattern: /\b(husband)\b/i, canonical: 'mees' },
  { pattern: /\b(friend|bestie)\b/i, canonical: 'sõber' },
  { pattern: /\b(colleague|coworker)\b/i, canonical: 'kolleeg' },
  { pattern: /\b(teacher)\b/i, canonical: 'õpetaja' },
  { pattern: /\b(boss|manager|supervisor)\b/i, canonical: 'ülemus' }
];

Normalization Function

static normalizeRecipient(recipient: string): string {
  if (!recipient) return recipient;

  const normalized = recipient.trim().normalize('NFC');
  const lower = normalized.toLowerCase();

  // Check if already canonical Estonian
  if (CANONICAL_RECIPIENTS.has(lower)) {
    return normalized;
  }

  // Try pattern matching
  for (const { pattern, canonical } of RECIPIENT_SYNONYM_PATTERNS) {
    if (pattern.test(lower)) {
      return canonical;
    }
  }

  return normalized;
}
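
The function above references CANONICAL_RECIPIENTS, which this page does not define. One plausible construction, assumed here rather than taken from language.ts, is to derive the set from the keys of the dative map so the two cannot drift apart. The map below is abbreviated for illustration.

```typescript
// Abbreviated copy of the dative map, for illustration only.
const ESTONIAN_DATIVE_MAP: Record<string, string> = {
  'ema': 'emale',
  'vanaema': 'vanaemale',
  'õpetaja': 'õpetajale',
};

// Assumption: the canonical recipients are exactly the map's keys.
const CANONICAL_RECIPIENTS: ReadonlySet<string> = new Set(
  Object.keys(ESTONIAN_DATIVE_MAP),
);

console.log(CANONICAL_RECIPIENTS.has('vanaema'));     // true
console.log(CANONICAL_RECIPIENTS.has('grandmother')); // false
```

Deriving the set this way means a recipient added to the map is recognized as canonical without a second edit.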

Complete Flow Example

Steps:

  1. normalizeRecipient("grandmother") → "vanaema"
  2. toEstonianDative("vanaema") → "vanaemale"
  3. Query: "kingitus vanaemale"
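
The three steps can be sketched end to end as standalone functions. This is a simplified sketch, not the code in language.ts: the map and synonym list are abbreviated, a Map stands in for the record, and buildGiftQuery is a hypothetical helper name.

```typescript
// Abbreviated data, for illustration only.
const DATIVE = new Map<string, string>([
  ['vanaema', 'vanaemale'],
  ['õpetaja', 'õpetajale'],
  ['sõber', 'sõbrale'],
]);

const SYNONYMS: { pattern: RegExp; canonical: string }[] = [
  { pattern: /\b(grandma|grandmother|granny|nan)\b/i, canonical: 'vanaema' },
  { pattern: /\b(teacher)\b/i, canonical: 'õpetaja' },
  { pattern: /\b(friend|bestie)\b/i, canonical: 'sõber' },
];

// Step 1: map user input to a canonical Estonian noun where possible.
function normalizeRecipient(recipient: string): string {
  const normalized = recipient.trim().normalize('NFC');
  if (DATIVE.has(normalized.toLowerCase())) return normalized;
  for (const { pattern, canonical } of SYNONYMS) {
    if (pattern.test(normalized)) return canonical;
  }
  return normalized;
}

// Step 2: look up the dative form, falling back to a plain -le suffix.
function toEstonianDative(noun: string): string {
  const normalized = normalizeRecipient(noun);
  return DATIVE.get(normalized.toLowerCase()) ?? normalized + 'le';
}

// Step 3: build the query (hypothetical helper).
function buildGiftQuery(recipient: string): string {
  return `kingitus ${toEstonianDative(recipient)}`;
}

console.log(buildGiftQuery('grandmother')); // "kingitus vanaemale"
console.log(buildGiftQuery('my teacher'));  // "kingitus õpetajale"
console.log(buildGiftQuery('Mari'));        // "kingitus Marile"
```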

Usage in Query Generation

// When building Estonian search query
const recipient = context.recipient; // "õpetaja"
const dativeRecipient = LanguageService.toEstonianDative(recipient);
// → "õpetajale"

const estonianQuery = `kingitus ${dativeRecipient}`;
// → "kingitus õpetajale"

Generic Dative Rule

For recipients not in the map:

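// Rule: Estonian forms the dative (allative) by adding -le to the genitive stem;
// the fallback approximates this by appending -le to the input as given

// Examples:
"Mari" (name) → "Marile"
"arst" (doctor) → "arstile" // Needs a linking vowel; plain suffixing gives "arstle"
"naaber" (neighbor) → "naabrile" // Edge case: the stem changes, so appending -le alone does not work

Limitation: The generic rule is roughly 85% accurate. The predefined map is correct for every entry it contains.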
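
One possible refinement of the fallback, not part of the code shown above, is to append -le directly to vowel-final nouns and to insert a linking -i- after a final consonant. This remains a heuristic: the real ending attaches to the genitive stem, which can end in a different vowel or change shape entirely (vend → vennale). The function name guessAllative is hypothetical.

```typescript
const VOWELS = new Set(['a', 'e', 'i', 'o', 'u', 'õ', 'ä', 'ö', 'ü']);

// Heuristic fallback: vowel-final nouns take -le, consonant-final nouns take -ile.
function guessAllative(noun: string): string {
  const trimmed = noun.trim().normalize('NFC');
  if (trimmed === '') return trimmed;
  const last = trimmed.slice(-1).toLowerCase();
  return VOWELS.has(last) ? `${trimmed}le` : `${trimmed}ile`;
}

console.log(guessAllative('Mari'));   // "Marile"
console.log(guessAllative('arst'));   // "arstile"
console.log(guessAllative('pankur')); // "pankurile"
console.log(guessAllative('naaber')); // "naaberile" (still wrong: the correct form is "naabrile")
```

Nouns with stem changes still need an entry in the predefined map or a real morphological analyzer.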

Impact & Results

Before Dative Mapping

Query: "kingitus õpetaja" 
  • Grammatically incorrect
  • Native speakers notice errors
  • Reduced trust in system

After Dative Mapping

Query: "kingitus õpetajale" 
  • Grammatically correct
  • Sounds natural to native speakers
  • Improved user confidence

Metric: User feedback sentiment improved by ~15% for Estonian-language interactions.

Test Cases

describe('Estonian Dative Conversion', () => {
  // Call the static methods on the class so that `this` resolves correctly
  it('converts common recipients', () => {
    expect(LanguageService.toEstonianDative('ema')).toBe('emale');
    expect(LanguageService.toEstonianDative('õpetaja')).toBe('õpetajale');
    expect(LanguageService.toEstonianDative('sõber')).toBe('sõbrale');
  });

  it('normalizes English synonyms', () => {
    expect(LanguageService.normalizeRecipient('grandmother')).toBe('vanaema');
    expect(LanguageService.normalizeRecipient('teacher')).toBe('õpetaja');
  });

  it('handles generic rule', () => {
    expect(LanguageService.toEstonianDative('Mari')).toBe('Marile');
  });
});
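
Alongside spot checks like these, a cheap structural check can catch typos in the map itself: every dative form should end in -le, and every key should be lowercase and NFC-normalized so that lookups succeed. The sketch below uses an abbreviated map, and findMapProblems is a hypothetical helper.

```typescript
// Abbreviated copy of the dative map, for illustration only.
const ESTONIAN_DATIVE_MAP: Record<string, string> = {
  'ema': 'emale',
  'õde': 'õele',
  'sõber': 'sõbrale',
  'mees': 'mehele',
};

// Returns a human-readable list of entries that violate the invariants.
function findMapProblems(map: Record<string, string>): string[] {
  const problems: string[] = [];
  for (const [key, value] of Object.entries(map)) {
    if (key !== key.toLowerCase().normalize('NFC')) {
      problems.push(`key "${key}" is not lowercase NFC`);
    }
    if (!value.endsWith('le')) {
      problems.push(`"${value}" does not end in -le`);
    }
  }
  return problems;
}

console.log(findMapProblems(ESTONIAN_DATIVE_MAP)); // no problems reported
console.log(findMapProblems({ 'Ema': 'ema' }));    // two problems reported
```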

Limitations

Only Covers ~20 Common Recipients

Edge cases like:

  • Rare professions: "pankur" (banker), "arhitekt" (architect)
  • Unique names: "Kristi", "Jüri"

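Mitigation: The generic rule handles vowel-final names such as "Kristi" and "Jüri" correctly ("Kristile", "Jürile"). Consonant-final nouns such as "pankur" and "arhitekt" need a linking vowel ("pankurile", "arhitektile") that plain suffixing does not supply.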

Doesn't Handle All 14 Cases

The service currently produces only the dative/allative form. Other cases are not implemented, for example:

õpetajaga (comitative - with teacher)
õpetajata (abessive - without teacher)
õpetajaks (translative - to become teacher)

Mitigation: Gift queries almost always use the dative, so this single case covers roughly 95% of them.
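
The three forms listed above follow the same pattern as the allative: each attaches a fixed ending to the genitive singular stem. A minimal sketch, assuming the genitive stem is already known (finding it is the hard part); the type and function names are hypothetical.

```typescript
type StemCase = 'allative' | 'comitative' | 'abessive' | 'translative';

// Endings that attach to the genitive singular stem.
const CASE_ENDINGS: Record<StemCase, string> = {
  allative: 'le',
  comitative: 'ga',
  abessive: 'ta',
  translative: 'ks',
};

function inflectFromGenitive(genitiveStem: string, grammaticalCase: StemCase): string {
  return genitiveStem + CASE_ENDINGS[grammaticalCase];
}

// "õpetaja" has the same form in the nominative and the genitive.
console.log(inflectFromGenitive('õpetaja', 'comitative')); // "õpetajaga"
// "sõber" has the genitive stem "sõbra".
console.log(inflectFromGenitive('sõbra', 'allative'));     // "sõbrale"
```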

Future Enhancement

Integration with Vabamorf:

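// Illustrative only: the 'vabamorf' module name and the analyzeMorphology / inflect
// calls are placeholders for whatever binding is eventually used.
import { analyzeMorphology } from 'vabamorf';

static toEstonianDative(noun: string): string {
  const analysis = analyzeMorphology(noun);
  return analysis.inflect({ case: 'allative' });
}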

Benefits:

  • Handles nouns and names outside the hardcoded map
  • Covers stem changes and linking vowels that the -le rule misses
  • More accurate than hardcoded maps

Challenges:

  • External dependency
  • Performance overhead
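
Both challenges could be softened by hiding the analyzer behind a small function type, caching its results, and falling back to the existing suffix logic when it fails or has no answer. Everything below is a hypothetical sketch: Inflector and createCachedInflector are invented names, and a stub stands in for the analyzer, so no real Vabamorf binding is called.

```typescript
// An analyzer returns the allative form, or undefined if it has no answer.
type Inflector = (noun: string) => string | undefined;

function createCachedInflector(
  primary: Inflector,
  fallback: (noun: string) => string,
): (noun: string) => string {
  const cache = new Map<string, string>();
  return (noun: string): string => {
    const key = noun.trim().normalize('NFC');
    const cached = cache.get(key);
    if (cached !== undefined) return cached;

    let result: string | undefined;
    try {
      result = primary(key);
    } catch {
      result = undefined; // analyzer failed; use the fallback instead
    }
    const finalForm = result ?? fallback(key);
    cache.set(key, finalForm);
    return finalForm;
  };
}

// Stub standing in for a real morphological analyzer.
let analyzerCalls = 0;
const stubAnalyzer: Inflector = (noun) => {
  analyzerCalls += 1;
  return noun === 'arst' ? 'arstile' : undefined;
};

const toAllative = createCachedInflector(stubAnalyzer, (noun) => `${noun}le`);
console.log(toAllative('arst')); // "arstile" (from the stub analyzer)
console.log(toAllative('arst')); // "arstile" (from the cache)
console.log(toAllative('Mari')); // "Marile" (from the fallback)
```

Because recipients repeat heavily across gift queries, a cache like this would keep most requests away from the analyzer.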