Knowledge Discovery and Data Mining for Time Series
The field of knowledge discovery and data mining (DM for short) deals with the extraction of interesting patterns or knowledge from often huge amounts of raw data. Patterns count as interesting if they are valid, novel, useful, and understandable; validity is usually regarded as the most important of these criteria.
Temporal data mining (TDM) addresses tasks such as segmentation, classification, clustering, forecasting, and indexing of time series, event sequences, or sections of such time series or sequences. Typical applications involve financial, biomedical, meteorological, or technical time series or sequences.
Efficient Time Series Modeling Techniques
In our work in the field of TDM we focus on a fusion of probabilistic modeling techniques with extremely fast polynomial least-squares approximation techniques (with E. Fuchs, Passau). These techniques allow for
The time complexity of these techniques is only linear with respect to the overall number of time series or – if they are applied on-line – constant for each new sample (observation). Therefore, these techniques are suitable for many real-time applications.
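The constant cost per new sample can be illustrated with a minimal sketch. It is not the update scheme used in our techniques; it assumes a plain monomial basis and a growing window, and the class name OnlinePolyFit and its methods are hypothetical. The fit keeps only running sums, so each new sample costs a fixed amount of work that depends on the polynomial degree but not on the number of samples seen so far.

```python
import numpy as np


class OnlinePolyFit:
    """Growing-window least-squares polynomial fit (illustrative sketch).

    Assumption: monomial basis and normal equations; this is numerically
    fragile for high degrees or large x values.
    """

    def __init__(self, degree):
        self.degree = degree
        # Running sums of x**k for k = 0 .. 2*degree
        self.power_sums = np.zeros(2 * degree + 1)
        # Running sums of y * x**k for k = 0 .. degree
        self.moment_sums = np.zeros(degree + 1)
        self.count = 0

    def update(self, x, y):
        """Add one sample; the cost does not depend on self.count."""
        powers = float(x) ** np.arange(2 * self.degree + 1)
        self.power_sums += powers
        self.moment_sums += y * powers[: self.degree + 1]
        self.count += 1

    def coefficients(self):
        """Solve the (degree+1) x (degree+1) normal equations.

        Returns coefficients in ascending order of power.
        Requires at least degree + 1 samples with distinct x values.
        """
        idx = np.arange(self.degree + 1)
        gram = self.power_sums[idx[:, None] + idx[None, :]]
        return np.linalg.solve(gram, self.moment_sums)


# Usage: feed samples one at a time, query the fit whenever needed.
fit = OnlinePolyFit(degree=2)
for i in range(10):
    x = i / 10
    fit.update(x, 1 + 2 * x + 3 * x * x)
print(fit.coefficients())
```

In this sketch the sums are never recomputed from the raw data, which is what keeps the per-sample cost constant. A basis of orthogonal polynomials would typically be preferable for numerical stability.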
With the new segmentation, representation, and similarity measurement techniques, new and extremely fast techniques for motif detection in time series and time series clustering, classification, or prediction become possible as well. It is also possible to extract understandable temporal rules for time series classification or anomaly detection in time series.
We are developing a framework for TDM which we call SwiftMiner. So far, it consists of modules such as SwiftSeg, SwiftMotif, and SwiftRule. We aim for broad applicability of the SwiftMiner approach, but we also investigate a few application scenarios in much more detail:
The shape space representation of time series can also be used to generate a large number of artificial but realistic time series that can be used to benchmark database systems (with H. Kosch, Passau).
The video shows the segmentation and representation of an ECG time series with SwiftSeg (in slow motion). Segmentation points are indicated by red vertical lines, segmentation polynomials are shown in green, and modeling polynomials in blue.
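The general idea of polynomial-based segmentation can be sketched as follows. This is not the SwiftSeg algorithm itself, only a simple growing-window scheme under stated assumptions: a polynomial is fitted to the current window, and a segmentation point is set as soon as the largest absolute residual exceeds a threshold. The function name segment and the parameters degree, max_error, and min_length are hypothetical, and the sketch refits the window from scratch at every step, so it does not have the constant per-sample cost described above.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def segment(y, degree=1, max_error=0.5, min_length=None):
    """Return the indices at which a new segment starts (illustrative sketch)."""
    y = np.asarray(y, dtype=float)
    if min_length is None:
        min_length = degree + 2  # one more point than an exact fit needs
    boundaries = []
    start = 0
    end = start + min_length  # window is y[start:end]
    while end <= len(y):
        x = np.arange(end - start)
        window = y[start:end]
        coeffs = P.polyfit(x, window, degree)
        residual = np.max(np.abs(P.polyval(x, coeffs) - window))
        if residual > max_error:
            # The newest sample breaks the fit: start a new segment there.
            start = end - 1
            boundaries.append(start)
            end = start + min_length
        else:
            end += 1  # grow the window by one sample
    return boundaries


# Usage: a rising ramp followed by a falling ramp.
series = list(range(10)) + list(range(8, -1, -1))
print(segment(series, degree=1, max_error=0.5))
```

The threshold max_error trades off the number of segments against approximation quality; a suitable value depends on the noise level of the signal.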