This is accompanied by a growing number of machine learning applications and growing volumes of data. Nevertheless, the capacities of processing systems, human supervisors, and domain experts remain limited in real-world applications. Furthermore, many applications require fast reaction to new situations, which means that initial predictive models need to be available even when little data has been collected. Approaches are therefore needed that optimise the whole learning process, including the interaction with human supervisors, processing systems, and data of various kinds arriving at different times: techniques for estimating the impact of additional resources (e.g. data) on the learning progress; techniques for the active selection of the information processed or queried; techniques for reusing knowledge across time, domains, or tasks by identifying similarities between them and adapting to changes; and techniques for making use of different types of information, such as labelled or unlabelled data, constraints, or domain knowledge. Such techniques are studied, for example, in the fields of adaptive, active, semi-supervised, and transfer learning. However, this is mostly done in separate lines of research. Combinations of these techniques in interactive and adaptive machine learning systems that can operate under various constraints, and thereby address the inherent real-world challenges of volume, velocity, and variability of data and data mining systems, are rarely reported. This combined tutorial and workshop therefore aims to bring together researchers and practitioners from these different areas and to stimulate research on interactive and adaptive machine learning systems as a whole.
This workshop aims to discuss techniques and approaches for optimising the whole learning process, including the interaction with human supervisors and processing systems. This covers adaptive, active, semi-supervised, and transfer learning techniques, as well as their combinations in interactive and adaptive machine learning systems. Our objective is to bridge the communities researching and developing these techniques and systems in machine learning and data mining. We therefore welcome contributions that present a novel problem setting, propose a novel approach, or report experience with the practical deployment of such a system and raise unsolved questions for the research community.
In particular, we welcome contributions that address aspects including, but not limited to:
|09:00 - 09:10||Welcome||Organizing Committee|
|09:10 - 09:50||Tutorial Part 1: Introduction to Stream Mining|
|09:50 - 10:40||Tutorial Part 2|
|Morning Coffee Break|
|11:00 - 11:40||Tutorial Part 3: Semi-Supervised and Transfer Learning|
|11:40 - 12:30||Tutorial Part 4: Evaluation, Applications and Emerging Trends|
|Spotlights on Poster Session|
|12:30 - 12:40||Short Paper 1: Probabilistic Expert Knowledge Elicitation of Feature Relevances in Sparse Linear Regression||Pedram Daee, Tomi Peltola, Marta Soare, and Samuel Kaski|
| ||Short Paper 2: Users behavioural inference with Markovian decision process and active learning||Firas Jarboui, Vincent Rocchisani, and Wilfried Kirchenmann|
| ||Short Paper 3: Multi-Arm Active Transfer Learning for Telugu Sentiment Analysis||Subba Reddy Oota, Vijayasaradhi Indurthi, Mounika Reddy Marreddy, Sandeep Sricharan Mukku, and Radhika Mamidi|
|Lunch Break + Poster Session|
|14:00 - 14:20||Talk 1: Probabilistic Active Learning with Structure-Sensitive Kernels||Dominik Lang, Daniel Kottke, Georg Krempl, and Bernhard Sick|
|14:20 - 14:40||Talk 2: Transfer learning for time series anomaly detection||Vincent Vercruyssen, Wannes Meert, and Jesse Davis|
|14:40 - 15:40||Invited Talk: Ensemble learning from data streams with active and semi-supervised approaches|
|Afternoon Coffee Break + Poster Session|
|16:00 - 16:20||Talk 3: Simulation of Annotators for Active Learning: Uncertain Oracles||Adrian Calma and Bernhard Sick|
|16:20 - 16:40||Talk 4: Interactive Anonymization for Privacy aware Machine Learning||Bernd Malle, Peter Kieseberg, and Andreas Holzinger|
|16:40 - 17:40||Panel Discussion||George Kachergis, Bartosz Krawczyk, Myra Spiliopoulou, and Jerzy Stefanowski|
This part starts with the classic stream mining paradigm. In this context, we discuss the challenges posed by non-stationarity and by limited processing, storage, and supervision capacities. We briefly summarize related techniques, e.g. for incremental processing, forgetting, and change detection. This part concludes with an overview of further challenges that are being investigated in state-of-the-art research.
In this part of the tutorial, we focus on techniques for optimising the interaction of a machine learning system with an oracle such as a human supervisor. We review active machine learning techniques, with a focus on adaptive active learning for evolving and streaming data. We discuss recent advances and conclude with an overview of open research questions in adaptive active machine learning.
This part of the tutorial addresses the problem of learning with incomplete or delayed supervision. We focus on learning with verification latency, and review techniques from change mining, semi-supervised learning, and (unsupervised) transfer learning in non-stationary environments. We conclude with an overview of open challenges.
This last part of the tutorial takes an integrative view of the previous parts, with a focus on industrial applications and open challenges of adaptive interactive mining systems as a whole. We briefly discuss the related issues of evaluation and deployment, applications, and reported challenges and solutions, and highlight potential directions for future research.
Developing efficient classifiers that can cope with big and streaming data, especially in the presence of so-called concept drift, is currently one of the primary research directions in the machine learning community. This presentation will be devoted to the importance of ensemble learning methods for handling drifting and online data. It has been shown that a collective decision can increase classification accuracy thanks to the mutually complementary competencies of the base learners. This premise holds if the ensemble consists of diverse and mutually complementary classifiers. In non-stationary environments, diversity may also be viewed as a changing context, which makes ensembles an excellent tool for handling data shifts. The main focus of the lecture will be on exploiting these advantages of ensemble learning for data stream mining on a budget. As streaming data is characterized by both massive volume and velocity, one cannot assume unlimited access to class labels. Instead, methods that reduce the number of label queries should be sought. Recent trends in combining active and semi-supervised learning with ensemble solutions, such as online Query by Committee or Self-Labeling Committees, will be presented. Additionally, the talk will discuss emerging challenges and future directions in this area.
Bartosz Krawczyk is an assistant professor in the Department of Computer Science, Virginia Commonwealth University, Richmond VA, USA, where he heads the Machine Learning and Stream Mining Lab (www.egr.vcu.edu).
georg.krempl (at) ovgu.de
University Magdeburg, Germany
vincent.lemaire (at) orange.com
Orange Labs, France
polikar (at) rowan.edu
Rowan University, USA
bsick (at) uni-kassel.de
University of Kassel, Germany
daniel.kottke (at) uni-kassel.de
University of Kassel, Germany
acalma (at) uni-kassel.de
University of Kassel, Germany