Speech and Language Processing: Can We Use the Past to Predict the Future?
Where have we been and where are we going? Three types of answers will be
discussed: consistent progress, oscillations and discontinuities. Moore's
Law provides a convincing demonstration of consistent progress, when it
applies. Speech recognition error rates are declining by a factor of 10 per
decade; speech coding rates are declining by a factor of 2 per decade.
Unfortunately, fields do not always move in consistent directions.
Empiricism dominated the field in the 1950s, and was revived in the
1990s. Oscillations between Empiricism and Rationalism may be inevitable,
with the next revival of Rationalism coming in the 2010s, assuming
a 40-year cycle. Discontinuities are a third logical possibility. From
time to time, there will be fundamental changes that invalidate fundamental
assumptions. As petabytes become a commodity (in the 2010s), old apps like
data entry (dictation) will be replaced with new priorities like data
Dr. Kenneth Church's Biography
I moved to Microsoft Research in late 2003. Before that, I was the head of
a data mining department in AT&T Labs-Research (formerly AT&T Bell
Labs, Murray Hill, NJ). I received my BS, Masters and PhD from MIT in
computer science in 1978, 1980 and 1983, and immediately joined AT&T.
I have worked in many areas of computational linguistics including:
acoustics, speech recognition, speech synthesis, OCR, phonetics, phonology,
morphology, word-sense disambiguation, spelling correction, terminology,
translation, lexicography, information retrieval, compression, language
modeling and text analysis. I enjoy working with very large corpora such as
the Associated Press newswire (1 million words per week) and larger
datasets such as telephone call detail (1-10 billion records per month).
Dr. Patrick Hanks
Berlin-Brandenburgische Akademie der Wissenschaften
Prof. James Pustejovsky
Department of Computer Science
Common Sense about Word Meaning: Sense in Context
We present a new approach to determining the meaning of words in text,
which relies on assigning senses to the contexts within which words occur,
rather than to the words themselves. A preliminary version of this approach
is presented in Pustejovsky, Hanks and Rumshisky (2004, COLING). We argue
that word senses are not directly encoded in the lexicon of a language,
but rather that each word is associated with one or more stereotypical
syntagmatic patterns. Each pattern is associated with a meaning, which can
be expressed in a formal way as a resource for any of a variety of
Dr. Patrick Hanks's Biography
Patrick Hanks is one of Britain's leading lexicographers. He was
responsible for the first editions of Collins English Dictionary,
Cobuild, and the New Oxford Dictionary of English.
He is a leading corpus linguist. He has worked with John Sinclair on
corpus analysis, Ken Church on statistical methods in computational
linguistics, Yorick Wilks on preference semantics, and James Pustejovsky on
techniques for inferencing and disambiguation. His work in the philosophy
of language has been described by David Wiggins (Wykeham Professor of Logic
in the University of Oxford) as "the first really significant advance in
the handling of word meaning since the 17th-18th century."
He has led teams researching surnames and first names, and is currently
working with Kate Hardcastle, Ken Tucker, and others on a vast database of
the origins and distribution of American family names.
He is currently researching phraseology and idioms in German
and English and is a consultant to the Electronic Dictionary
of the German Language project at the Berlin-Brandenburg
Academy of Sciences.
Prof. James Pustejovsky's Biography
- computational linguistics,
- lexical semantics,
- compositional semantics,
- theories of categorization, commonsense metaphysics,
- automatic indexing techniques, and
- content-based information extraction.
Prof. Dr. Jan Odijk
UIL-OTS University of Utrecht
3512 JK, Utrecht
I will first sketch some background on the company ScanSoft. Next, I
will discuss ScanSoft's products and technologies, which include digital
imaging and OCR technology, automatic speech recognition (ASR) technology,
text-to-speech (TTS) technology, dialogue technology (including
multimodal dialogues), dictation technology and audio mining technology. I
will sketch the basic functionality of these technologies, give a global
overview of the components they are composed of, demonstrate some of them,
and illustrate the platform types on which they can be used.
Finally, I will sketch what is needed to develop such technologies,
focusing not only on data but also on required modules and
Prof. Dr. Jan Odijk's Biography
Jan Odijk is professor of language and speech technology at the University
of Utrecht. The focus of his research is on making grammars usable for
language and speech technology by developing approximations to them in
a systematic manner.
Jan Odijk worked at the University of Utrecht from 1982 to 1988, where he
carried out research in theoretical syntax and computational linguistics
(esp. machine translation). In 1988 he joined Philips Research
Laboratories, Eindhoven, where he carried out research into grammars and
lexicons for machine translation, and where from 1993 (at IPO) he worked
on the development of natural language and speech interfaces, esp. language
and speech generation.
In 1997 he joined Lernout & Hauspie Speech Products to become the senior
director of the linguistic resources division. Since December 2001 he
has held the same position at ScanSoft Belgium.