Skip navigation

Computational Linguistics

Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.

Based on an introductory course on natural-language semantics, this book provides an introduction to type-logical grammar and the range of linguistic phenomena that can be handled in categorial grammar. It also contains a great deal of original work on categorial grammar and its application to natural-language semantics. The author chose the type-logical categorial grammar as his grammatical basis because of its broad syntactic coverage and its strong linkage of syntax and semantics. Although its basic orientation is linguistic, the book should also be of interest to logicians and computer scientists seeking connections between logical systems and natural language.

The book, which stepwise develops successively more powerful logical and grammatical systems, covers an unusually broad range of material. Topics covered include higher-order logic, applicative categorial grammar, the Lambek calculus, coordination and unbounded dependencies, quantifiers and scope, plurals, pronouns and dependency, modal logic, intensionality, and tense and aspect. The book contains more mathematical development than is usually found in texts on natural language; an appendix includes the basic mathematical concepts used throughout the book.

Eugene Charniak breaks new ground in artificial intelligence research by presenting statistical language processing from an artificial intelligence point of view in a text for researchers and scientists with a traditional computer science background.

New, exacting empirical methods are needed to break the deadlock in such areas of artificial intelligence as robotics, knowledge representation, machine learning, machine translation, and natural language processing (NLP). It is time, Charniak observes, to switch paradigms. This text introduces statistical language processing techniques—word tagging, parsing with probabilistic context free grammars, grammar induction, syntactic disambiguation, semantic word classes, word-sense disambiguation—along with the underlying mathematics and chapter exercises.

Charniak points out that as a method of attacking NLP problems, the statistical approach has several advantages. It is grounded in real text and therefore promises to produce usable results, and it offers an obvious way to approach learning: "one simply gathers statistics."

Language, Speech, and Communication

An Introduction

The Formal Semantics of Programming Languages provides the basic mathematical techniques necessary for those who are beginning a study of the semantics and logics of programming languages. These techniques will allow students to invent, formalize, and justify rules with which to reason about a variety of programming languages. Although the treatment is elementary, several of the topics covered are drawn from recent research, including the vital area of concurency. The book contains many exercises ranging from simple to miniprojects.

Starting with basic set theory, structural operational semantics is introduced as a way to define the meaning of programming languages along with associated proof techniques. Denotational and axiomatic semantics are illustrated on a simple language of while-programs, and fall proofs are given of the equivalence of the operational and denotational semantics and soundness and relative completeness of the axiomatic semantics. A proof of Godel's incompleteness theorem, which emphasizes the impossibility of achieving a fully complete axiomatic semantics, is included. It is supported by an appendix providing an introduction to the theory of computability based on while-programs.

Following a presentation of domain theory, the semantics and methods of proof for several functional languages are treated. The simplest language is that of recursion equations with both call-by-value and call-by-name evaluation. This work is extended to lan guages with higher and recursive types, including a treatment of the eager and lazy lambda-calculi. Throughout, the relationship between denotational and operational semantics is stressed, and the proofs of the correspondence between the operation and denotational semantics are provided. The treatment of recursive types - one of the more advanced parts of the book - relies on the use of information systems to represent domains. The book concludes with a chapter on parallel programming languages, accompanied by a discussion of methods for specifying and verifying nondeterministic and parallel programs.