Information retrieval is the foundation for modern search engines. This text offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus—a multiuser open-source information retrieval system developed by one of the authors and available online—provides model implementations and a basis for student work.
Lessons from database research have been applied in academic fields ranging from bioinformatics to next-generation Internet architecture and in industrial uses including Web-based e-commerce and search engines. The core ideas in the field have become increasingly influential. This text provides both students and professionals with a grounding in database research and a technical context for understanding recent innovations in the field. The readings included treat the most important issues in the database area—the basic material for any DBMS professional.
In Ontological Semantics, Sergei Nirenburg and Victor Raskin introduce a comprehensive approach to the treatment of text meaning by computer. Arguing that being able to use meaning is crucial to the success of natural language processing (NLP) applications, they depart from the ad hoc approach to meaning taken by much of the NLP community and propose theory-based semantic methods.
As the World Wide Web continues to expand, it becomes increasingly difficult for users to obtain information efficiently. Because most search engines read format languages such as HTML or SGML, search results reflect formatting tags more than actual page content, which is expressed in natural language.
The growing interest in data mining is motivated by a common problem across disciplines: how does one store, access, model, and ultimately describe and understand very large data sets? Historically, different aspects of data mining have been addressed independently by different disciplines. This is the first truly interdisciplinary text on data mining, blending the contributions of information science, computer science, and statistics.
The idea of knowledge bases lies at the heart of symbolic, or "traditional," artificial intelligence. A knowledge-based system decides how to act by running formal reasoning procedures over a body of explicitly represented knowledge—a knowledge base. The system is not programmed for specific tasks; rather, it is told what it needs to know and expected to infer the rest.
Until recently, information systems have been designed around different business functions, such as accounts payable and inventory control. Object-oriented modeling, in contrast, structures systems around the data—the objects—that make up the various business functions. Because information about a particular function is limited to one place—to the object—the system is shielded from the effects of change. Object-oriented modeling also promotes better understanding of requirements, clear designs, and more easily maintainable systems.
With 10,000 entries, this dictionary is the most complete of its kind. It is a major contribution to more accurate sharing of scientific and technological information.
The dictionary is unique in providing a romanized transcription for each of the 10,000 Japanese terms. It promotes clear oral communication, whether one is using purely Japanese words or terms that have been borrowed from English but are pronounced somewhat differently by the Japanese.