Institute for Parallel and Distributed Systems (IPVS)

Data Engineering

Prof. Dr. rer. nat. Melanie Herschel

Latest news

  • Open positions: Contact us if you are interested in research assistant (HiWi) positions.
  • New team member: Nico Lässig will work on data management to enable adaptive design and manufacturing within EXC IntCDC.
  • ICDE 2021: one full paper accepted.
  • ER 2020: one paper accepted to the ER Forum.
  • Best Paper Award received at ADBIS 2020.
  • Information Systems journal paper accepted.


The field of Data Engineering encompasses technologies related to processing and transforming any kind of data into a useful format for further analysis. These data may for instance be structured data from enterprise databases, semi- or unstructured Web data, or streaming data in the context of the Internet of Things (IoT). The data engineering group in Stuttgart works on various steps of data engineering with the overarching goal to automatically, transparently, and responsibly refine data from its raw state into a state ready for use in various data analytics and data exploration applications.

Currently, we are particularly interested in algorithms and tools for data annotation, data cleaning, and data integration, as well as foundations and practical implementations of provenance management to trace complex data engineering processes. Another research focus of the group is languages, algorithms, and tools that support users in complex data processing through data exploration or process analysis solutions. Finally, we study data management techniques to empower fair, accountable, and transparent data analysis.

Below is a list of recent publications involving at least one author of the IPVS DE group. Full publication lists of team members (including work not published while affiliated with the IPVS) can be found on the individual team member pages.


A System Framework for Personalized and Transparent Data-Driven Decisions
Sarah Oppold, Melanie Herschel
International Conference on Advanced Information Systems Engineering (CAISE), Grenoble, France, 2020

Tracing nested data with structural provenance for big data analytics
Ralf Diestelkämper, Melanie Herschel
International Conference on Extending Database Technology (EDBT), Copenhagen, Denmark, 2020

Boosting Blocking Performance in Entity Resolution Pipelines: Comparison Cleaning using Bloom Filters
Leonardo Gazzarri, Melanie Herschel
International Conference on Extending Database Technology (EDBT), Copenhagen, Denmark, 2020


LuPe: A System for Personalized and Transparent Data-driven Decisions
Sarah Oppold, Melanie Herschel
International Conference on Information and Knowledge Management (CIKM), Beijing, China, 2019

Towards Integrating Collaborative Filtering in Visual Data Exploration Systems
Houssem Ben Lahmar, Melanie Herschel
European Conference on Advances in Databases and Information Systems (ADBIS), Bled, Slovenia, 2019

Capturing and querying structural provenance in Spark with Pebble
Ralf Diestelkämper, Melanie Herschel
ACM SIG Conference on the Management of Data (SIGMOD), Amsterdam, The Netherlands, 2019

Volume-based large dynamic graph analysis supported by evolution provenance
Valentin Bruder, Houssem Ben Lahmar, Marcel Hlawatsch, Steffen Frey, Michael Burch, Daniel Weiskopf, Melanie Herschel, Thomas Ertl
Multimedia Tools and Applications, Vol. 78, No. 23, 2019

Towards task-based parallelization for entity resolution
Leonardo Gazzarri, Melanie Herschel
Special Issue of the Springer Journal Software-Intensive Cyber-Physical Systems (SICS)

Query-based Why-not Explanations for Nested Data
Ralf Diestelkämper, Boris Glavic, Melanie Herschel, Seokki Lee
Workshop on Theory and Practice of Provenance (TaPP), Philadelphia, PA, USA

Structural summaries for visual provenance analysis
Houssem Ben Lahmar, Melanie Herschel
Workshop on Theory and Practice of Provenance (TaPP), Philadelphia, PA, USA

Advances in Database Technology - 22nd International Conference on Extending Database Technology, EDBT 2019, Lisbon, Portugal, March 26-29, 2019, Proceedings
Melanie Herschel, Helena Galhardas, Berthold Reinwald, Irini Fundulaki, Carsten Binnig, Zoi Kaoudi (eds.), 2019, ISBN 978-3-89318-081-3

Prediction of air pollution with machine learning
Christian Schmitz, Dhiren Devinder Serai, Tatiane Escobar Gava
Datenbanksysteme für Business, Technologie und Web (BTW 2019), 18. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 4.-8. März 2019, Rostock, Germany, Workshops


Prof. Dr. rer. nat.

Melanie Herschel

Head of Institute


Eva Strähle

