CMU-CS-03-147
Computer Science Department
School of Computer Science, Carnegie Mellon University




A Markov Model for the
Acquisition of Morphological Structure

Leonid Kontorovich, Dana Ron*, Yoram Singer**

June 2003



Keywords: Morphology, Markov, probabilistic suffix tree


We describe a new formalism for word morphology. Our model views word generation as a random walk on a trellis of units, where each unit is a set of (short) strings. The model naturally incorporates the segmentation of words into morphemes. We capture the statistics of unit generation using a probabilistic suffix tree (PST), a variant of variable-length Markov models. We present an efficient algorithm that learns a PST over the units; its output is a compact stochastic representation of morphological structure. We demonstrate the applicability of our approach by using the model in an allomorphy decision problem.
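
As a rough illustration of the variable-length Markov idea behind a PST, the sketch below predicts the next unit from the longest suffix of the history that was seen in training. It is not the algorithm from the report: the function names (`train_counts`, `next_unit_prob`), the `max_order` parameter, the hand-segmented toy words, and the use of plain count tables instead of a pruned tree are all assumptions made for this example. The trellis of units and the learning of segmentations are left out entirely.

```python
from collections import defaultdict

def train_counts(sequences, max_order=2):
    """Count next-unit occurrences for every context of length 0..max_order.

    Hypothetical helper: a real PST would prune contexts that do not
    change the prediction; here every observed context is kept.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, unit in enumerate(seq):
            for k in range(max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context of length k
                context = tuple(seq[i - k:i])
                counts[context][unit] += 1
    return counts

def next_unit_prob(counts, history, unit, max_order=2):
    """Estimate P(unit | history) from the longest suffix of history seen in training."""
    history = tuple(history)
    for k in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - k:]
        if context in counts:
            total = sum(counts[context].values())
            return counts[context].get(unit, 0) / total
    return 0.0  # reached only when nothing was trained

# Toy data, segmented by hand (an assumption; the report's model handles segmentation itself).
segmented = [["walk", "ed"], ["walk", "ing"], ["talk", "ed"], ["talk", "s"]]
counts = train_counts(segmented)

p_known = next_unit_prob(counts, ["walk"], "ed")    # context ("walk",) was seen
p_backoff = next_unit_prob(counts, ["jump"], "ed")  # falls back to the empty context
```

In this sketch an unseen history such as `["jump"]` falls back to the empty context, so the estimate reduces to the overall relative frequency of the unit. The point of a variable-length model is that the context used for prediction is as long as the data supports and no longer.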

18 pages

* Tel-Aviv University ** Hebrew University

