7/27/2019 ch01 cs834
1/19
An Introduction to Language Processing with Perl and Prolog
Chapter 1: An Overview of Language Processing
Pierre Nugues
Lund University
http://www.cs.lth.se/home/Pierre_Nugues/
Pierre Nugues An Introduction to Language Processing with Perl and Prolog 1 / 19
Applications of Language Processing
Spelling and grammatical checkers: MS Word
Text indexing and information retrieval on the Internet: Google, Microsoft Bing, Yahoo
Telephone information that understands some spoken questions: SJ (trains in Sweden) or Tellme.com in the United States
Speech dictation of letters or reports: IBM ViaVoice, Windows Vista
Translation: Google Translate, SYSTRAN
Linguistics Layers
Sounds
Phonemes
Words and morphology
Syntax and functions
Semantics
Dialogue
Sounds and Phonemes
Serious
C'est par là (It is that way)
Lexicon and Parts of Speech
The big cat ate the gray mouse
The/article big/adjective cat/noun ate/verb the/article gray/adjective mouse/noun
Le/article gros/adjectif chat/nom mange/verbe la/article souris/nom grise/adjectif
Die/Artikel große/Adjektiv Katze/Substantiv ißt/Verb die/Artikel graue/Adjektiv Maus/Substantiv
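The English annotation above can be reproduced with a simple lexicon lookup. This is only a minimal sketch: the `LEXICON` table and the `tag` function are hypothetical names introduced for illustration, and the table covers just the example sentence. A real tagger must also choose among several possible tags for the same word.

```python
# Hypothetical toy lexicon with one part of speech per word (an assumption:
# real lexicons list several candidate tags for many words).
LEXICON = {
    "the": "article",
    "big": "adjective",
    "cat": "noun",
    "ate": "verb",
    "gray": "adjective",
    "mouse": "noun",
}

def tag(sentence):
    """Annotate each word as word/tag; words missing from the lexicon get 'unknown'."""
    return [f"{word}/{LEXICON.get(word.lower(), 'unknown')}"
            for word in sentence.split()]

print(" ".join(tag("The big cat ate the gray mouse")))
```

The lookup is case-insensitive so that the sentence-initial "The" and the later "the" receive the same tag.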
Morphology
Word          Root form
worked        to work + verb + preterit
travaillé     travailler + verb + past participle
gearbeitet    arbeiten + verb + past participle
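The English row of the table can be approximated by stripping a suffix. The sketch below assumes a single hypothetical rule (regular verbs whose stem takes -ed directly) and a hypothetical function name `analyze`. It ignores that -ed also marks past participles, and it would mishandle forms such as "loved" or "stopped".

```python
def analyze(word):
    """Guess the root form and features of a regular English preterit.

    Assumption: the stem is the word without its final 'ed'.
    """
    if word.endswith("ed") and len(word) > 3:
        return f"to {word[:-2]} + verb + preterit"
    # No rule applies: report the word as unanalyzed.
    return f"{word} + unknown"

print(analyze("worked"))
```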
Syntactic Tree
sentence
    noun phrase
        article: The
        noun: boy
    verb phrase
        verb: hit
        noun phrase
            article: the
            noun: ball
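One possible way to hold such a tree in a program is as nested tuples. This is a sketch under assumptions: the tuple encoding and the names `TREE` and `leaves` are illustrative choices, not a representation prescribed by the slides.

```python
# Each node is (label, child, ...); a leaf node is (label, word).
TREE = (
    "sentence",
    ("noun phrase", ("article", "The"), ("noun", "boy")),
    ("verb phrase",
        ("verb", "hit"),
        ("noun phrase", ("article", "the"), ("noun", "ball"))),
)

def leaves(node):
    """Collect the words at the leaves of the tree, left to right."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words

print(" ".join(leaves(TREE)))
```

Reading the leaves from left to right gives back the original sentence.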
Syntax: A Classical View
A graph of dependencies and functions
The boy hit the ball
Functions: Subject (boy), Verb (hit), Object (ball)
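A dependency graph of this kind can be stored as a list of labeled links. The sketch below assumes a hypothetical triple format (dependent, function, head) and a hypothetical helper `functions_of`. It encodes only the subject and object links named on the slide.

```python
# Each link is (dependent, function, head); the verb "hit" is the root.
DEPENDENCIES = [
    ("boy", "subject", "hit"),
    ("ball", "object", "hit"),
]

def functions_of(head):
    """Map each grammatical function of `head` to the word that fills it."""
    return {function: dependent
            for dependent, function, h in DEPENDENCIES
            if h == head}

print(functions_of("hit"))
```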
Semantics
As opposed to syntax:
1. Colorless green ideas sleep furiously.
2. *Furiously sleep ideas green colorless.
Determining the logical form:
Sentence                    Logical representation
Frank is writing notes      writing(Frank, notes).
François écrit des notes    écrit(François, notes).
Franz schreibt Notizen      schreibt(Franz, Notizen).
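For these three sentences, the logical form can be derived with a very small program. This is only a sketch under strong assumptions: the sentence must have subject-verb-object order, and the hypothetical `SKIPPED` set lists the only function words to drop. Both `SKIPPED` and `to_logical_form` are names introduced here for illustration.

```python
# Hypothetical list of function words to ignore (covers only these examples).
SKIPPED = {"is", "des"}

def to_logical_form(sentence):
    """Turn a subject-verb-object sentence into predicate(subject, object)."""
    words = [w for w in sentence.split() if w.lower() not in SKIPPED]
    # Assumption: exactly three content words remain, in S-V-O order;
    # any other sentence raises a ValueError on unpacking.
    subject, predicate, obj = words
    return f"{predicate}({subject}, {obj})."

for sentence in ("Frank is writing notes",
                 "François écrit des notes",
                 "Franz schreibt Notizen"):
    print(to_logical_form(sentence))
```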
Reference
1. Sentence: Pierre wrote notes
2. Logical representation: wrote(pierre, notes)
3. Real world: Pierre, Louis, Charlotte; operating systems, computational linguistics, Prolog programming
Referencing: each term of the logical representation is linked to an entity in the real world.
Ambiguity
Many analyses are ambiguous. This makes language processing difficult.
Ambiguity occurs in any layer: speech recognition, part-of-speech tagging, parsing, etc.
Example of an ambiguous phonetic transcription:
The boys eat the sandwiches
That may correspond to:
The boy seat the sandwiches; the boy seat this and which is; the buoy seat the sand which is
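The same kind of ambiguity can be imitated on written text by removing the spaces and listing every way to cut the string into known words. This is a sketch under assumptions: it works on letters rather than on a phonetic transcription, and the `LEXICON` set and `segmentations` function are hypothetical names introduced for illustration.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into words found in `lexicon`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results

# Hypothetical toy lexicon covering the start of the example.
LEXICON = {"the", "boy", "boys", "eat", "seat"}

for words in segmentations("theboyseat", LEXICON):
    print(" ".join(words))
```

With this lexicon, the string has two readings, "the boy seat" and "the boys eat". The number of readings can grow quickly as the lexicon and the input get larger.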
Models and Tools
Linguistics has produced an impressive set of theories and models
Language processing requires significant resources
Models and tools have matured. Resources are available.
Tools notably include finite-state automata, regular expressions, rewriting rules, logic, statistics, and machine learning.
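As an illustration of one of these tools, a regular expression can split a sentence into words and punctuation marks. This is a minimal sketch in Python using the standard `re` module. The pattern and the name `tokenize` are illustrative assumptions, and real tokenizers handle many more cases (abbreviations, numbers, clitics).

```python
import re

# A token is either a run of word characters or a single character that is
# neither a word character nor whitespace (for example, punctuation).
TOKEN = re.compile(r"\w+|[^\w\s]")

def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN.findall(text)

print(tokenize("The big cat ate the gray mouse."))
```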
The Carsim System: A Text-to-Scene Converter
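Texts → NLP engine → XML Templates → Java 3D animation program → 3D Animation

Text (excerpt, in French):
Véhicule B venant de ma gauche, je me trouve dans le carrefour, à faible vitesse environ 40 km/h, quand le véhicule B, percute mon véhicule, et me refuse la priorité à droite. Le premier choc atteint mon aile arrière gauche,
(Vehicle B coming from my left, I am in the intersection, at low speed, about 40 km/h, when vehicle B hits my vehicle and refuses me the right of way. The first impact hits my left rear wing,)

Template (excerpt):
// Static Objects
STATIC [
  ROAD
  TREE
]
// Dynamic Objects
DYNAMIC [
  VEHICLE [
    ID = vehicule b;
    INITDIRECTION = east;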
Dialogue: The Persona Project from Microsoft Research
A conversation with Peedy
Turn    Utterance
        [Peedy is asleep on his perch]
User:   Good morning, Peedy.
        [Peedy rouses]
Peedy:  Good morning.
User:   Let's do a demo.
        [Peedy stands up, smiles]
Peedy:  Your wish is my command, what would you like to hear?
User:   What have you got by Bonnie Raitt?
        [Peedy waves in a stream of notes, and grabs one as they rush by.]
Peedy:  I have "The Bonnie Raitt Collection" from 1990.
User:   Pick something from that.
Peedy:  How about "Angel from Montgomery"?
Dialogue: The Persona Project from Microsoft Research
User:   Sounds good.
        [Peedy drops note on pile]
Peedy:  OK.
User:   Play some rock after that.
        [Peedy scans the notes again, selects one]
Peedy:  How about "Fools in love"?
User:   Who wrote that?
        [Peedy cups one wing to his ear]
Peedy:  Huh?
User:   Who wrote that?
        [Peedy looks up, scrunches his brow]
Peedy:  Joe Jackson
User:   Fine.
        [Drops note on pile]
Persona System Architecture
Source: http://research.microsoft.com/research/pubs/view.aspx?pubid=439
Research Relevance
Large companies like Microsoft, Google, Yahoo, IBM, or Xerox conduct research in natural language processing.
The 7th European framework program (2007-2013) names six technology pillars in information technologies. Two of them are related to language processing:
Knowledge, cognitive and learning systems: semantic systems; capturing and exploiting knowledge embedded in web and multimedia content; bio-inspired artificial systems that perceive, understand, learn and evolve, and act autonomously; learning by convivial machines and humans based on a better understanding of human cognition.
Simulation, visualization, interaction and mixed realities: tools for innovative design and creativity in products, services and digital media, and for natural, language-enabled and context-rich interaction and communication.