Data analyses provide insight into the internal structure of data. Machine learning algorithms derive knowledge from existing data, and this knowledge can then be used to create additional value in a given context.
Prior to the analysis, it is often unclear which algorithms will yield valuable results. Therefore, different algorithms with different parameters are often executed over several iterations, and the most promising result is then chosen. However, this procedure runs into considerable problems, especially today: because the amount of data is ever increasing, each iteration takes longer to execute.
Within the INTERACT project, selected aspects are investigated that reduce the execution time while still leading to valuable results. In particular, three aspects of an analysis process are examined more closely: (1) the mining algorithms to be executed, (2) the strategies to reduce the amount of data, and (3) the execution environments.
The micro project INTERACT was carried out within the Software Campus from 1 January 2018 until 30 November 2020.
- Fritz, M., Gang, S., and Schwarz, H. 2021. “Automatic Selection of Analytic Platforms with ASAP-DM,” to appear in Proceedings of the International Conference on Scientific and Statistical Database Management (SSDBM 2021).
- Fritz, M., Tschechlov, D., and Schwarz, H. 2021. “Efficient Exploratory Clustering Analyses with Qualitative Approximations,” in Proceedings of the International Conference on Extending Database Technology (EDBT 2021).
- Fritz, M., Behringer, M., and Schwarz, H. 2020. “LOG-Means: Efficiently Estimating the Number of Clusters in Large Datasets,” in Proceedings of 46th International Conference on Very Large Data Bases (VLDB 2020).
- Fritz, M., Tschechlov, D., and Schwarz, H. 2020. “Learning from past observations: Meta-Learning for Efficient Clustering Analyses,” in Proceedings of 22nd International Conference on Big Data Analytics and Knowledge Discovery (DaWaK 2020). Lecture Notes in Computer Science (Vol. 12393 LNCS), Springer, Cham, pp. 364–379.
- Fritz, M., and Schwarz, H. 2019. “Initializing k-means efficiently: Benefits for exploratory cluster analysis,” in Proceedings of 27th International Conference on Cooperative Information Systems (CoopIS 2019). Lecture Notes in Computer Science (Vol. 11877 LNCS), Springer, Cham, pp. 146–163.
- Fritz, M., Muazzen, O., Behringer, M., and Schwarz, H. 2019. “ASAP-DM: a framework for automatic selection of analytic platforms for data mining,” in Proceedings of 13th Symposium and Summer School On Service-Oriented Computing (SummerSoC 2019). SICS Software-Intensive Cyber-Physical Systems, Springer Berlin Heidelberg, pp. 1–13.
- Fritz, M., Behringer, M., and Schwarz, H. 2019. “Quality-driven early stopping for explorative cluster analysis for big data,” in Proceedings of 12th Symposium and Summer School On Service-Oriented Computing (SummerSoC 2018). SICS Software-Intensive Cyber-Physical Systems (34:2–3), pp. 129–140.