Tutorial 1 : 10 Data Mining Mistakes -- and How to Avoid Them |
John F. Elder IV (Chief Scientist, Elder Research, Inc.) |
Abstract |
This tutorial will reveal the top mistakes data analysts can make, from the simple to the subtle, using real-world (often humorous) stories. The topics will be presented from case studies of real projects and the (often overlooked) symptoms that suggested something might be amiss. |
Biography |
|
Tutorial 2 : Data Mining In Time Series Databases |
Eamonn Keogh (University of California, Riverside) |
Abstract |
In this tutorial we will review the state of the art in time series data mining. In addition to the ubiquitous classification and similarity search problems, we will also consider clustering, anomaly detection, visualization, motif discovery and other exciting tasks. The ideas presented will be motivated by case studies in domains as diverse as video surveillance, cardiology, text mining, space telemetry monitoring, handwriting indexing, query by humming and motion capture/animation. Rather than simply review previous work, we have taken the time to reimplement and compare most of the work in the literature. For example: we have |
Biography |
|
Tutorial 3 : Algorithmic Excursions in Data Streams |
Sudipto Guha (University of Pennsylvania) |
Abstract |
For many recent applications, the concept of a data stream is more appropriate than a data set. By nature, a stored data set is an appropriate model when significant portions of the data are queried repeatedly, and updates are small and/or relatively infrequent. In contrast, a data stream is a more appropriate model in scenarios where large volumes of data or updates arrive continuously and it is either unnecessary or impractical to store the data in some form of memory. Many applications naturally generate data streams as opposed to simple data sets. |
Biography |
|
Tutorial 4 : Data Grid Management Systems (DGMS) |
Arun Swaran Jagatheesan (University of California at San Diego) |
Abstract |
A data grid infrastructure facilitates a logical view of heterogeneous distributed resources that are shared between autonomous administrative domains. Data grids are being built around the world, as the next-generation data-handling infrastructures, for coordinated sharing of data and storage resources. A data grid infrastructure provides a location-independent logical namespace, consisting of persistent global identifiers for data resources, storage resources and users in an inter/intra organizational enterprise. Data Grid Management Systems (DGMS) provide services on the data grid infrastructure for inter/intra organizational information storage management. |
Biography |
|