The growing interest in data mining is motivated by a common problem across disciplines: how does one store, access, model, and ultimately describe and understand very large data sets? Historically, different aspects of data mining have been addressed independently by different disciplines. This is the first truly interdisciplinary text on data mining, blending the contributions of information science, computer science, and statistics.
The book consists of three sections. The first, foundations, provides a tutorial overview of the principles underlying data mining algorithms and their application. The presentation emphasizes intuition rather than rigor. The second section, data mining algorithms, shows how algorithms are constructed to solve specific problems in a principled manner. The algorithms covered include trees and rules for classification and regression, association rules, belief networks, classical statistical models, nonlinear models such as neural networks, and local "memory-based" models. The third section shows how all of the preceding analysis fits together when applied to real-world data mining problems. Topics include the role of metadata, how to handle missing data, and data preprocessing.
Advances in Knowledge Discovery and Data Mining brings together the latest research in statistics, databases, machine learning, and artificial intelligence that is part of the exciting and rapidly growing field of Knowledge Discovery and Data Mining. Topics covered include fundamental issues, classification and clustering, trend and deviation analysis, dependency modeling, integrated discovery systems, next generation database systems, and application case studies. The contributors include leading researchers and practitioners from academia, government laboratories, and private industry.
The last decade has seen explosive growth in the generation and collection of data. Advances in data collection, widespread use of bar codes for most commercial products, and the computerization of many business and government transactions have flooded us with data and generated an urgent need for new techniques and tools that can intelligently and automatically assist in transforming this data into useful knowledge. This book is a timely and comprehensive overview of the new generation of techniques and tools for knowledge discovery in data.