Sebastian Lühr and Mihai Lazarescu (2008) Connectivity Based Stream Clustering Using Localised Density Exemplars. In Proc. Pacific-Asia Conference on Knowledge Discovery and Data Mining, volume 5012 of Lecture Notes in Artificial Intelligence, pages 983–984. Springer-Verlag.
- Download as PDF (1.1 MiB; © Springer-Verlag)
Abstract: Advances in data acquisition have allowed large data collections of millions of time-varying records in the form of data streams. The challenge is to effectively process the stream data with limited resources while maintaining sufficient historical information to define the changes and patterns over time. It is highly desirable to handle recurrent changes without requiring the re-learning of previously observed patterns. This paper describes an evidence-based approach that uses representative points to incrementally process stream data by using a graph-based method to cluster points based on connectivity and density. Critical cluster features are archived in repositories to allow the algorithm to cope with recurrent information and to provide a rich history of relevant cluster changes if a detailed analysis of past data is required. We have applied our algorithm to both synthetic and real-world data sets and present results that clearly show that our approach performs better than the currently established stream mining techniques: DenStream, HPStream and CluStream.