Due to the increasing need for machine learning in companies, more and more people are becoming self-taught data scientists, commonly referred to as citizen data scientists. In contrast to ML experts, they do not have a profound understanding of learning algorithms. Nevertheless, they can still develop software systems with machine learning on their own, albeit with a greater expenditure of time. These ML solutions comprise various software components, from data preparation to the ML algorithm and its hyperparameter configuration. To support citizen data scientists, there is a need for new approaches that simplify the development of software systems with machine learning. The core challenge is that citizen data scientists have to select and configure a combination of software components to obtain a usable, high-quality end-to-end solution. The large number of possible component combinations makes this process time-consuming and complex. Existing approaches such as AutoML try to automate the process, but they deliver results that are difficult to interpret and are thus unsuitable for citizen data scientists. This calls for new approaches that are more intuitive for this user group.
In the context of this project, several concepts have been developed to support citizen data scientists in specifying, configuring, and selecting software components for machine learning and data preparation. A central concept is AssistML, which uses metadata from a repository of existing solutions to recommend suitable combinations of software components for a new use case. In this way, AssistML relieves citizen data scientists of the burden of knowing and assessing all combinations and implementation details of software components. In addition, AssistML offers an explanation component that simplifies the interpretation and evaluation of the generated suggestions.
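The general idea of recommending component combinations from a metadata repository can be illustrated with a small Python sketch. It is not AssistML's actual method or interface: the `SolutionRecord` schema, the `similarity` heuristic, the `recommend` function, and the sample repository entries are all hypothetical and chosen only for illustration. The sketch ranks stored solutions by how closely their dataset metadata matches the new use case, weighted by their recorded accuracy, and attaches a short plain-language explanation to each suggestion.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SolutionRecord:
    """Metadata about one previously built ML solution (hypothetical schema)."""
    task: str                     # e.g. "classification" or "regression"
    n_rows: int                   # size of the dataset the solution was built on
    n_features: int               # number of input features of that dataset
    components: Tuple[str, ...]   # data preparation steps followed by the algorithm
    accuracy: float               # recorded quality of the solution, in [0, 1]


def similarity(record: SolutionRecord, task: str, n_rows: int, n_features: int) -> float:
    """Toy similarity in [0, 1]: 0 for a different task, else the mean of two size ratios."""
    if record.task != task:
        return 0.0
    row_ratio = min(record.n_rows, n_rows) / max(record.n_rows, n_rows, 1)
    feature_ratio = min(record.n_features, n_features) / max(record.n_features, n_features, 1)
    return (row_ratio + feature_ratio) / 2


def recommend(repository: List[SolutionRecord], task: str, n_rows: int,
              n_features: int, top_k: int = 2) -> List[Dict[str, object]]:
    """Return up to top_k component combinations, best first, each with an explanation."""
    scored = []
    for record in repository:
        sim = similarity(record, task, n_rows, n_features)
        if sim > 0:
            # Assumed ranking rule: similarity of the use case times recorded accuracy.
            scored.append((sim * record.accuracy, sim, record))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [
        {
            "components": record.components,
            "score": round(score, 3),
            "explanation": (
                f"Used for a similar {record.task} task "
                f"(similarity {sim:.2f}) with accuracy {record.accuracy:.2f}."
            ),
        }
        for score, sim, record in scored[:top_k]
    ]


# Illustrative repository contents; these entries are invented for the example.
REPOSITORY = [
    SolutionRecord("classification", 10_000, 20, ("StandardScaler", "RandomForest"), 0.91),
    SolutionRecord("classification", 500, 200, ("PCA", "SVM"), 0.88),
    SolutionRecord("regression", 10_000, 20, ("StandardScaler", "LinearRegression"), 0.80),
]

if __name__ == "__main__":
    for suggestion in recommend(REPOSITORY, "classification", n_rows=8_000, n_features=25):
        print(suggestion["components"], suggestion["score"], suggestion["explanation"])
```

The explanation string is deliberately simple: it states why a combination was suggested in terms a non-expert can check, which mirrors the role of an explanation component without claiming to reproduce it.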
This project was funded by the German Research Foundation (Deutsche Forschungsgemeinschaft - DFG) and the Ministry of Science, Research and Arts of the State of Baden-Württemberg as part of the Graduate School of Excellence advanced Manufacturing Engineering (GSaME). In addition, it received support from the Software Campus Initiative, which is funded by the German Federal Ministry of Education and Research (BMBF). The associated project in the Software Campus is GUACAMOLE.