Model
The public model is built around three hydrated DTOs:
- Lemma: the dictionary or lemma-like entry
- Surface: the normalized full form in context
- Selection: the exact observed highlight in learner text
Selections are always hydrated:
- a Selection always contains a Surface
- a Surface always contains a Lemma
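To make the nesting concrete, here is a minimal sketch that uses simplified stand-in interfaces. The names LemmaSketch, SurfaceSketch, and SelectionSketch are hypothetical and much smaller than the library's real DTOs. They only illustrate that a hydrated selection carries its surface, and the surface carries its lemma.

```typescript
// Simplified stand-ins for illustration; these are NOT dumling's real types.
interface LemmaSketch {
  canonicalLemma: string;
}

interface SurfaceSketch {
  normalizedFullSurface: string;
  lemma: LemmaSketch; // a surface always embeds its lemma
}

interface SelectionSketch {
  spelledSelection: string;
  surface: SurfaceSketch; // a selection always embeds its surface
}

const selectionSketch: SelectionSketch = {
  spelledSelection: "See",
  surface: {
    normalizedFullSurface: "See",
    lemma: { canonicalLemma: "see" },
  },
};

// No separate lookup is needed: the lemma is reachable by property access.
const lemmaForm: string = selectionSketch.surface.lemma.canonicalLemma;
console.log(lemmaForm);
```

Because every level is embedded, code that receives a selection does not need a second call to resolve its surface or lemma.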
Lemma
const seeLemma = dumling.de.create.lemma({
  canonicalLemma: "see",
  lemmaKind: "Lexeme",
  lemmaSubKind: "NOUN",
  inherentFeatures: {
    gender: "Masc",
  },
  meaningInEmojis: "🌊",
}) satisfies Lemma<"de", "Lexeme", "NOUN">;
A lemma is the canonical lexical object, or a lemma-like fused entry. It is where you put the language, the canonical form, the broad lemma kind, the concrete lemma subtype, inherent features, and a learner-facing meaning hint.
Surface
const seeSurface = dumling.de.create.surface.citation({
  lemma: seeLemma,
  normalizedFullSurface: "See",
}) satisfies Surface<
  "de",
  "Citation",
  "Lexeme",
  "NOUN"
>;
A citation surface uses surfaceKind: "Citation" and normally has the canonical lemma spelling as normalizedFullSurface.
Marked properties of the resolved surface live in surfaceFeatures. For example, a historical citation or inflection can carry surfaceFeatures: { historicalStatus: "Archaic" }.
Construction entries are citation-only today, so Construction/Fusion and Construction/PairedFrame only appear under Surface<Citation> and never under Surface<Inflection>.
An inflection surface uses surfaceKind: "Inflection" and adds inflectionalFeatures:
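// runLemma is assumed to be an English verb lemma created earlier
// with dumling.en.create.lemma(...); its definition is not shown here.
const ranSurface = dumling.en.create.surface.inflection({
  lemma: runLemma,
  normalizedFullSurface: "ran",
  inflectionalFeatures: {
    tense: "Past",
    verbForm: "Fin",
  },
});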
Selection
const seeSelection = dumling.de.create.selection({
  spelledSelection: "See",
  surface: seeSurface,
}) satisfies Selection<"de", "Citation", "Lexeme", "NOUN">;
A selection records what was observed in text. The normalized surface stays available through selection.surface, and the lemma entry stays available through selection.surface.lemma.
Only marked mismatches are stored on the selection itself. selectionFeatures can record:
- orthography: "Typo"
- coverage: "Partial"
- spelling: "Variant"
When selectionFeatures is omitted, the selection is implicitly standard, full, and canonically spelled relative to its resolved surface.
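The implicit defaults can be sketched as a small resolver. This is an illustration under assumptions, not the library's API: SelectionFeaturesSketch, ResolvedSelectionStatus, and resolveSelectionStatus are hypothetical names, and the labels "Standard", "Full", and "Canonical" are stand-ins for the unmarked case, which the library may not spell out at all.

```typescript
// Hypothetical shapes for illustration only; not dumling's exported types.
interface SelectionFeaturesSketch {
  orthography?: "Typo";
  coverage?: "Partial";
  spelling?: "Variant";
}

interface ResolvedSelectionStatus {
  orthography: "Standard" | "Typo";
  coverage: "Full" | "Partial";
  spelling: "Canonical" | "Variant";
}

// "Standard", "Full", and "Canonical" are assumed labels for the unmarked case.
function resolveSelectionStatus(
  features?: SelectionFeaturesSketch,
): ResolvedSelectionStatus {
  return {
    orthography: features?.orthography ?? "Standard",
    coverage: features?.coverage ?? "Full",
    spelling: features?.spelling ?? "Canonical",
  };
}

const unmarkedStatus = resolveSelectionStatus();
const typoStatus = resolveSelectionStatus({ orthography: "Typo" });
console.log(unmarkedStatus, typoStatus);
```

Storing only the marked values keeps the common case empty, and a resolver like this lets display code treat every selection uniformly.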
Descriptors
Descriptors are compact structural summaries of DTOs. They are useful when code needs to route by entity kind, language, lemma kind, or surface kind without carrying the whole object through the branch.
const descriptor = dumling.de.describe.as.selection(seeSelection);
descriptor.entityKind; // "Selection"
descriptor.language; // "de"
descriptor.lemmaKind; // "Lexeme"
descriptor.lemmaSubKind; // "NOUN"
descriptor.surfaceKind; // "Citation"
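As a rough sketch of routing on a descriptor, the function below picks a handler name from the structural fields alone. DescriptorSketch and pickRenderer are hypothetical, the renderer names are invented for the example, and treating surfaceKind as optional (absent on lemma descriptors) is an assumption.

```typescript
// Hypothetical descriptor shape mirroring the fields shown above.
interface DescriptorSketch {
  entityKind: "Lemma" | "Surface" | "Selection";
  language: string;
  lemmaKind: string;
  lemmaSubKind: string;
  surfaceKind?: "Citation" | "Inflection"; // assumed absent for lemmas
}

// Choose a handler from the summary alone, without touching the full DTO.
function pickRenderer(d: DescriptorSketch): string {
  if (d.entityKind === "Lemma") {
    return "dictionary-card";
  }
  if (d.entityKind === "Surface") {
    return d.surfaceKind === "Inflection" ? "inflection-table" : "citation-card";
  }
  return "highlight-popover";
}

const rendererName = pickRenderer({
  entityKind: "Selection",
  language: "de",
  lemmaKind: "Lexeme",
  lemmaSubKind: "NOUN",
  surfaceKind: "Citation",
});
console.log(rendererName); // "highlight-popover"
```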
IDs
IDs are stable strings produced from hydrated DTOs. Use the language-bound ID helpers when the caller already knows the language:
const id = dumling.de.id.encode(seeSelection);
const decoded = dumling.de.id.decodeAs("Selection", id);
Use inspectId(id) from the root entrypoint when you need metadata before full decoding.
Runtime Validation
Parsing returns an ApiResult instead of throwing:
const parsed = dumling.de.parse.selection(input);
if (!parsed.success) {
  console.error(parsed.error.code, parsed.error.issues);
}
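The result-instead-of-throw pattern can be sketched with a discriminated union. Only success, error.code, and error.issues appear in the snippet above. The data field name, the string types, and the ApiResultSketch, parsePositiveInt, and summarize names are assumptions made for this illustration.

```typescript
// Assumed result shape: the `data` field name is a guess, not confirmed API.
type ApiResultSketch<T> =
  | { success: true; data: T }
  | { success: false; error: { code: string; issues: string[] } };

// Toy parser standing in for a real one; it never throws.
function parsePositiveInt(input: unknown): ApiResultSketch<number> {
  if (typeof input === "number" && Number.isInteger(input) && input > 0) {
    return { success: true, data: input };
  }
  return {
    success: false,
    error: { code: "InvalidInput", issues: ["expected a positive integer"] },
  };
}

// Checking `success` narrows the union, so each branch sees only its fields.
function summarize(result: ApiResultSketch<number>): string {
  if (!result.success) {
    return `${result.error.code}: ${result.error.issues.join("; ")}`;
  }
  return `ok: ${result.data}`;
}

console.log(summarize(parsePositiveInt(3))); // "ok: 3"
console.log(summarize(parsePositiveInt("3"))); // "InvalidInput: expected a positive integer"
```

Returning a result value keeps invalid learner input on the normal control path, so callers decide how to report it rather than wrapping every parse in try/catch.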
The schema entrypoint exposes concrete Zod schemas when a caller needs direct validator access:
schemasFor.de.entity.Selection.Citation.Lexeme.NOUN().parse(value);