John Carroll, School of Informatics, University of Sussex, Falmer, Brighton BN1 9QJ, UK, J.A.Carroll@sussex.ac.uk
Abstract:
Much recent research in natural language parsing takes as input carefully crafted, edited text, often from newspapers. However, many real-world applications involve processing text which is not written carefully by a native speaker, is produced for an eventual audience of only one, and is in essence ephemeral. In this talk I will present a number of research and commercial applications of this type which I and collaborators are developing, in which we process text as diverse as mobile phone text messages, non-native language learner essays, and primary care medical notes. I will discuss the problems these types of text pose, and outline how we integrate information from parsing into applications.
John Carroll's Biography
John Carroll works in the area of intelligent computer processing of human language (natural language processing, or NLP). His current research is concerned with: practical natural language parsing, parser evaluation, acquisition of lexical information from corpora, automatic generation of text from semantic representations, sentiment analysis, and other applications of natural language processing to real-world tasks.
Christiane Fellbaum, Department of Computer Science, 35 Olden Street, Princeton University, Princeton, New Jersey 08540-5233, U.S.A., fellbaum@princeton.edu
Abstract:
The KYOTO (Knowledge-Yielding Ontologies for Transition-Based Organization) project is developing a platform for mining, enhancing and sharing knowledge among a wide range of users across different languages. We discuss the challenges of constructing a logically rigorous, language-independent knowledge representation that allows deep semantic search and is accessible to naive users from diverse linguistic backgrounds. The project web site is located at www.kyoto-project.eu.
Christiane Fellbaum's Biography
Christiane Fellbaum received a Ph.D. in Linguistics from Princeton University. She joined the Cognitive Science Laboratory and, together with George A. Miller, played an active role in developing WordNet. She is a co-founder and co-president of the Global WordNet Association, which guides the construction of lexical databases in many languages around the world. In 2001, funded by the Wolfgang-Paul Prize of the Humboldt Foundation, she directed the 'Kollokationen im Woerterbuch' project at the Berlin-Brandenburg Academy of Sciences. Currently a member of Princeton's Computer Science Department, she focuses on lexical semantics and computational linguistics.
Miroslav Novak, T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY 10598, USA, miroslav@us.ibm.com
Abstract:
The ASR decoder, as one of the fundamental components of an ASR system, has been evolving over the years to address the ever-increasing demand for larger domains and to exploit the availability of more powerful hardware. Though the basic search algorithm (i.e. Viterbi search) is relatively simple, implementing a decoder that can handle hundreds of thousands of words in the active vocabulary and hundreds of millions of n-grams in the language model in real time has never been trivial. Some of the design concepts used in the past to cope with the limitations of the available hardware are becoming relevant again today with the emergence of embedded platforms, where such limitations are similar to those of workstations in the early days of ASR. In this paper we will describe various basic design concepts encountered in different decoder implementations, with a focus on those which are relevant today given the fairly large spectrum of available hardware platforms.

