Computational Phonology

arXiv:cs.CL/0204023 v1 10 Apr 2002

Steven Bird

University of Pennsylvania

Phonology, as it is practiced, is deeply computational. Phonological analysis is data-intensive and the resulting models are nothing other than specialized data structures and algorithms. In the past, phonological computation (managing data and developing analyses) was done manually with pencil and paper. Increasingly, with the proliferation of affordable computers, IPA fonts and drawing software, phonologists are seeking to move their computational work online. Computational Phonology provides the theoretical and technological framework for this migration, building on methodologies and tools from computational linguistics. This piece consists of an apology for computational phonology, a history, and an overview of current research.

Documentation and Description. Phonological data is of essentially three types: texts, wordlists and paradigms. A text is any phonetically transcribed narrative or conversation. A wordlist is any compilation of linguistic forms which can be uttered in isolation, with information about pronunciation and meaning. A paradigm is broadly construed to mean any tabulation of words or phrases which illustrates contrasts and systematic variation. Any of these data types may be annotated with more abstract information originating from a phonological theory, such as syllable boundaries, stress marks and prosodic structure. Additionally, any of these data types may be associated with recordings of audio, video or physiological signals. Digitizing this documentation and description brings all the different media types together, makes the cross-links navigable, and opens up many new possibilities for management, access and preservation.
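As a rough illustration of how a wordlist and a paradigm might be held as data structures, the following Python sketch may help. The `Entry` class, the `paradigm` function, the annotation keys and the English forms are all hypothetical choices made for this example; none of them comes from a particular phonological database or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One wordlist entry: a form that can be uttered in isolation."""
    form: str                                        # pronunciation (phonemic transcription)
    gloss: str                                       # meaning
    annotations: dict = field(default_factory=dict)  # more abstract, theory-dependent information


def paradigm(entries, row_key, col_key):
    """Tabulate entries as {row: {column: form}} so that contrasts line up."""
    table = {}
    for entry in entries:
        row = entry.annotations[row_key]
        col = entry.annotations[col_key]
        table.setdefault(row, {})[col] = entry.form
    return table


# Illustrative data only (assumed for this sketch).
wordlist = [
    Entry("kæt", "cat", {"lemma": "cat", "number": "sg"}),
    Entry("kæts", "cats", {"lemma": "cat", "number": "pl"}),
    Entry("dɒg", "dog", {"lemma": "dog", "number": "sg"}),
    Entry("dɒgz", "dogs", {"lemma": "dog", "number": "pl"}),
]

table = paradigm(wordlist, "lemma", "number")
```

Here `table["cat"]` and `table["dog"]` place the singular and plural forms side by side, which makes the voicing contrast in the plural suffix easy to inspect.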

Exploration and Analysis. The data types described above are closely interconnected in phonological practice. For instance, the discovery of a new word in a text may require an update to the lexicon and the construction of a new paradigm (e.g. to correctly classify the word). Fresh insights may lead to new annotations and further elicitation, closing the loop in this perpetual, exploratory process. Phonological analysis typically involves defining a formal model, systematically testing it against data, and comparing it with other models. (In some cases, the model may be incorporated into a software system, e.g. for generating natural intonation in a text-to-speech system.) In this exploration and analysis (sorting, searching, tabulating, defining, testing and comparing), the principal task is computational.

Perhaps the earliest work in computational phonology was Bobrow and Fraser's Phonological Rule Tester (Bobrow and Fraser, 1968), an implementation of SPE (The Sound Pattern of English) designed to alleviate the problem of rule evaluation. Shortly afterwards, Johnson showed that, while SPE rules resemble general rewriting systems at the top of the Chomsky hierarchy, the way SPE rules are used in practice only requires finite-state power (Johnson, 1972). Independently, Kaplan and Kay discovered the connections between SPE grammars and finite-state transducers in the 1970s and 1980s, and laid down a complete algebraic foundation (ultimately reported in Kaplan and Kay, 1994). Significant implementations followed, including those of Koskenniemi (1983) and Beesley and Karttunen (2002). Attempts to apply finite-state devices to Autosegmental Phonology have largely foundered, but applications to Optimality Theory are thriving.
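To make the link between rewrite rules and finite-state transducers concrete, here is a minimal Python sketch of one hand-built transducer. It assumes a single illustrative rule, n → m / _ p, over lowercase ASCII symbols; the rule, the function names and the two-state layout are choices made for this example, and the machine is not derived by Kaplan and Kay's general construction. The machine holds back each n until the next symbol shows whether the rule applies.

```python
import string


def build_rule_transducer(alphabet):
    """Transitions for the rule n -> m / _ p.

    State 0: nothing is pending.
    State 1: an 'n' has been read but not yet written out.
    Each transition maps (state, input symbol) to (next state, output string).
    """
    delta = {}
    for c in alphabet:
        # State 0: hold back an 'n'; copy every other symbol straight through.
        delta[(0, c)] = (1, "") if c == "n" else (0, c)
        # State 1: the next symbol decides what the pending 'n' becomes.
        if c == "p":
            delta[(1, c)] = (0, "mp")      # rule applies: n surfaces as m
        elif c == "n":
            delta[(1, c)] = (1, "n")       # write the old n, hold the new one
        else:
            delta[(1, c)] = (0, "n" + c)   # rule does not apply
    final_output = {0: "", 1: "n"}         # flush a pending n at the end of the word
    return delta, final_output


def transduce(word, delta, final_output):
    """Run the transducer over a word and return the output string."""
    state, output = 0, []
    for c in word:
        state, emitted = delta[(state, c)]
        output.append(emitted)
    output.append(final_output[state])
    return "".join(output)


DELTA, FINAL = build_rule_transducer(string.ascii_lowercase)
```

With these definitions, `transduce("input", DELTA, FINAL)` yields `"imput"`, while a word such as `"pan"` passes through unchanged because its n is not followed by p.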

While finite-state phonology fixated on SPE, generative phonology continued its rapid evolution. The discovery of rule conspiracies (Kisseberth, 1970) and the abstractness controversy (Koutsoudas et al., 1974) led to calls for the reintroduction of surface structure constraints. Many theories arose from the fallout; most notable for its computational ramifications was Montague Phonology (Wheeler, 1981). This model adapted new lexicalist formalisms from syntax and semantics, providing a declarative (as opposed to procedural) account of phonological well-formedness, and providing the first computational account of underspecification (where the phonological content of a lexical entry is incompletely specified, to be filled in during a derivation). From these beginnings, Declarative Phonology was born, and subsequent work provided a mathematical foundation in first-order logic (Bird, 1995) and phonetic interpretation with links to Firthian prosodic analysis and speech synthesis (Coleman, 1997), with implementations generally in the Prolog programming language.
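One simple way to picture underspecification in a declarative setting is as unification of partial feature descriptions. The Python sketch below is a deliberately simplified stand-in: it assumes flat feature dictionaries and hypothetical feature names, and it is not the formalism of Wheeler (1981) or Bird (1995). It shows only the core idea that compatible partial descriptions combine, conflicting ones fail, and the order of combination does not matter.

```python
def unify(a, b):
    """Unify two flat feature descriptions.

    Returns the combined description, or None if the two descriptions
    assign different values to the same feature.
    """
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # conflicting information: no well-formed result
        result[feature] = value
    return result


# A lexical nasal that is underspecified for place of articulation (assumed example).
lexical_nasal = {"nasal": True}

# A constraint contributed by the context, e.g. a following labial stop.
place_from_context = {"place": "labial"}

surface_nasal = unify(lexical_nasal, place_from_context)

# A fully specified coronal nasal cannot be combined with a labial requirement.
clash = unify({"nasal": True, "place": "coronal"}, place_from_context)
```

Because `unify` only ever adds information, the same result is reached whichever description is supplied first, which is the sense in which such an account is declarative rather than procedural.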

A third major strand of development, complementing the finite-state and declarative models, is best characterized as statistical. It seeks to apply neural networks, information theory, and weighted automata to the automatic discovery of phonological information. Gasser trained a recurrent neural network to recognize syllables and to repair ill-formed syllables (Gasser, 1992). Ellison showed how a technique from information theory called minimum description length (MDL) could be applied to automatically identify syllable boundaries in phonemically transcribed texts (Ellison, 1992). Many researchers apply Markov models (a kind of weighted automaton) in speech recognition, mapping speech recordings to phonetic transcriptions and thence to orthographic words, using large, phonetically annotated corpora as training data (e.g. TIMIT; Garofolo et al., 1986).
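The minimum description length idea can be sketched with a generic two-part code: the cost of spelling out an inventory of units plus the cost of encoding the corpus as a sequence of those units. The Python example below is an assumption-laden toy, not Ellison's method. The four-word corpus, the function name and the particular coding scheme (a fixed number of bits per symbol for the inventory, and code lengths based on unit frequency for the corpus) are all chosen purely for illustration.

```python
from collections import Counter
from math import log2


def description_length(segmented_corpus, alphabet_size):
    """Two-part description length, in bits, of a segmented corpus.

    Inventory cost: each distinct unit is spelled out symbol by symbol,
    followed by an end marker.
    Corpus cost: each unit token costs -log2 of its relative frequency.
    """
    units = [unit for word in segmented_corpus for unit in word]
    counts = Counter(units)
    total = len(units)
    bits_per_symbol = log2(alphabet_size + 1)  # +1 for the end marker
    inventory_bits = sum((len(unit) + 1) * bits_per_symbol for unit in counts)
    corpus_bits = -sum(c * log2(c / total) for c in counts.values())
    return inventory_bits + corpus_bits


# Toy corpus (assumed): every word consists of two CV syllables.
WORDS = ["baba", "bada", "daba", "dada"]
ALPHABET_SIZE = len(set("".join(WORDS)))

as_words = [[w] for w in WORDS]                 # one unit per word
as_syllables = [[w[:2], w[2:]] for w in WORDS]  # boundary after each CV
as_phonemes = [list(w) for w in WORDS]          # one unit per segment

lengths = {
    "words": description_length(as_words, ALPHABET_SIZE),
    "syllables": description_length(as_syllables, ALPHABET_SIZE),
    "phonemes": description_length(as_phonemes, ALPHABET_SIZE),
}
```

On this toy corpus the syllable-sized segmentation gives the shortest description (20 bits, against 36 for single segments and 48 for whole words), which illustrates why minimizing description length can favor placing boundaries at syllable edges.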

Four key areas of ongoing research in computational phonology are Optimality Theory, automatic learning, interfaces to grammar and phonetics, and support for phonological description in the field. Comprehensive references to online research papers in these areas may be found on the SIGPHON website.

Computational phonology is generating sophisticated and rigorous ways for creating, exploring and disseminating multidimensional phonological information, encompassing primary recordings, texts, wordlists, paradigms, theories and analyses. As phonologists adopt the computational methods described above, extending and adapting them as needed, the consequences for the discipline will be increased accessibility, accountability, and stability of empirical research.

Resources. The Association for Computational Linguistics (ACL) has a special interest group in computational phonology (SIGPHON) with a homepage at http://www.cogsci.ed.ac.uk/sigphon/. The website contains online proceedings for SIGPHON workshops and information about relevant books, dissertations and articles. A special issue of Computational Linguistics devoted to computational phonology was published in 1994 (Bird, 1994).

References

Beesley, K. R. and Karttunen, L. (2002). Finite-State Morphology: Xerox Tools and Techniques. Studies in Natural Language Processing. Cambridge University Press.

Bird, S., editor (1994). Computational Linguistics: Special Issue on Computational Phonology, volume 20(3). MIT Press.

Bird, S. (1995). Computational Phonology: A Constraint-Based Approach. Studies in Natural Language Processing. Cambridge University Press.

Bobrow, D. G. and Fraser, J. B. (1968). A phonological rule tester. Communications of the ACM, 11:766-72.

Coleman, J. S. (1997). Phonological Representations: Their Names, Forms and Powers. Cambridge Studies in Linguistics. Cambridge University Press.

Ellison, T. M. (1992). Machine Learning of Phonological Structure. PhD thesis, University of Western Australia.

Garofolo, J. S., Lamel, L. F., Fisher, W. M., Fiscus, J. G., Pallett, D. S., and Dahlgren, N. L. (1986). The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM. NIST. http://www.ldc.upenn.edu/Catalog/LDC93S1.html.

Gasser, M. (1992). Learning distributed representations for syllables. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, pages 396-401. Hillsdale NJ: Lawrence Erlbaum Associates.

Johnson, C. D. (1972). Formal Aspects of Phonological Description. The Hague: Mouton.

Kaplan, R. M. and Kay, M. (1994). Regular models of phonological rule systems. Computational Linguistics, 20:331-78.

Kisseberth, C. W. (1970). On the functional unity of phonological rules. Linguistic Inquiry, 1:291-306.

Koskenniemi, K. (1983). Two-Level Morphology: A General Computational Model for Word-Form Recognition and Production. PhD thesis, University of Helsinki.

Koutsoudas, A., Sanders, G., and Noll, C. (1974). The application of phonological rules. Language, 50:1-28.

Wheeler, D. W. (1981). Aspects of a Categorial Theory of Phonology. PhD thesis, University of Massachusetts at Amherst.
