Latest news
- NFDI: Prof. Herschel appointed to DFG Expert Committee
- New team member: Welcome Aditya
- DECo@VLDB22: workshop paper accepted.
- VLDB 2022: paper accepted.
- Open positions: Looking for PhDs, Postdocs, HiWis
The field of Data Engineering encompasses technologies for processing and transforming any kind of data into a format useful for further analysis. These data may, for instance, be structured data from enterprise databases, semi-structured or unstructured Web data, or streaming data in the context of the Internet of Things (IoT). The data engineering group in Stuttgart works on various steps of data engineering with the overarching goal of automatically, transparently, and responsibly refining data from its raw state into a state ready for use in various data analytics and data exploration applications.
Currently, we are particularly interested in algorithms and tools for data annotation, data cleaning, and data integration, as well as in the foundations and practical implementations of provenance management to trace complex data engineering processes. Another research focus of the group is on languages, algorithms, and tools that support users in complex data processing through data exploration or process analysis solutions. Finally, we study data management techniques to empower fair, accountable, and transparent data analysis.
Below is a list of selected recent publications involving at least one author of the IPVS DE group. A full list is available on our publications page.
2023
Progressive Entity Resolution over Incremental Data.
Leonardo Gazzarri, Melanie Herschel
International Conference on Extending Database Technology (EDBT), 2023
2022
DyHealth: Making Neural Networks Dynamic for Effective Healthcare Analytics.
Kaiping Zheng, Shaofeng Cai, Horng Ruey Chua, Melanie Herschel, Meihui Zhang, Beng Chin Ooi
Proceedings of the VLDB Endowment (PVLDB), 15(12), 2022
Metrics and Algorithms for Locally Fair and Accurate Classifications using Ensembles.
Nico Lässig, Sarah Oppold, Melanie Herschel
Datenbank-Spektrum 22(1), 2022
2021
To Not Miss the Forest for the Trees - A Holistic Approach for Explaining Missing Answers over Nested Data
Ralf Diestelkämper, Seokki Lee, Melanie Herschel, Boris Glavic
ACM International Conference on the Management of Data (SIGMOD), Xi'an, Shaanxi, China, 2021
PACE: Learning Effective Task Decomposition for Human-in-the-loop Healthcare Delivery
Kaiping Zheng, Gang Chen, Melanie Herschel, Kee Yuan Ngiam, Beng Chin Ooi, Jinyang Gao
ACM International Conference on the Management of Data (SIGMOD), Xi'an, Shaanxi, China, 2021
End-to-end Task Based Parallelization for Entity Resolution on Dynamic Data
Leonardo Gazzarri, Melanie Herschel
IEEE International Conference on Data Engineering (ICDE), Chania, Crete, Greece, 2021
Using FALCES against bias in automated decisions by integrating fairness in dynamic model ensembles
Nico Lässig, Sarah Oppold, Melanie Herschel
Database Systems for Business, Technology, and Web (BTW), 2021
Houssem Ben Lahmar, Melanie Herschel
Information Systems, 95, 101620, 2021
2020
Distributed Tree-Pattern Matching in Big Data Analytics Systems
Ralf Diestelkämper, Melanie Herschel
In Proceedings of the Conference on Advances in Databases and Information Systems (ADBIS), Lyon, France, 2020
Towards task-based parallelization for entity resolution
Leonardo Gazzarri, Melanie Herschel
SICS Software-Intensive Cyber-Physical Systems, 35(1), 2020
Accountable Data Analytics Start with Accountable Data: The LiQuID Metadata Model
Sarah Oppold, Melanie Herschel
ER Forum, Demo and Posters 2020 Co-Located with International Conference on Conceptual Modeling (ER), 2020
A System Framework for Personalized and Transparent Data-Driven Decisions
Sarah Oppold, Melanie Herschel
International Conference on Advanced Information Systems Engineering (CAISE), Grenoble, France, 2020
Tracing nested data with structural provenance for big data analytics
Ralf Diestelkämper, Melanie Herschel
International Conference on Extending Database Technology (EDBT), Copenhagen, Denmark, 2020
Boosting Blocking Performance in Entity Resolution Pipelines: Comparison Cleaning using Bloom Filters
Leonardo Gazzarri, Melanie Herschel
International Conference on Extending Database Technology (EDBT), Copenhagen, Denmark, 2020
2019
LuPe: A System for Personalized and Transparent Data-driven Decisions
Sarah Oppold, Melanie Herschel
International Conference on Information and Knowledge Management (CIKM), Beijing, China, 2019
Towards Integrating Collaborative Filtering in Visual Data Exploration Systems
Houssem Ben Lahmar, Melanie Herschel
European Conference on Advances in Databases and Information Systems (ADBIS), Bled, Slovenia, 2019
Capturing and querying structural provenance in Spark with Pebble
Ralf Diestelkämper, Melanie Herschel
ACM International Conference on the Management of Data (SIGMOD), Amsterdam, The Netherlands, 2019
Volume-based large dynamic graph analysis supported by evolution provenance
Valentin Bruder, Houssem Ben Lahmar, Marcel Hlawatsch, Steffen Frey, Michael Burch, Daniel Weiskopf, Melanie Herschel, Thomas Ertl
Multimedia Tools and Applications, Vol. 78, No. 23, 2019
Query-based Why-not Explanations for Nested Data
Ralf Diestelkämper, Boris Glavic, Melanie Herschel, Seokki Lee
Workshop on Theory and Practice of Provenance (TaPP), Philadelphia, PA, USA, 2019
Structural summaries for visual provenance analysis
Houssem Ben Lahmar, Melanie Herschel
Workshop on Theory and Practice of Provenance (TaPP), Philadelphia, PA, USA, 2019
Advances in Database Technology - 22nd International Conference on Extending Database Technology, EDBT 2019, Lisbon, Portugal, March 26-29, 2019, Proceedings
Melanie Herschel, Helena Galhardas, Berthold Reinwald, Irini Fundulaki, Carsten Binnig, Zoi Kaoudi
OpenProceedings.org 2019, ISBN 978-3-89318-081-3
Prediction of air pollution with machine learning
Christian Schmitz, Dhiren Devinder Serai, Tatiane Escobar Gava
Datenbanksysteme für Business, Technologie und Web (BTW 2019), 18. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme" (DBIS), 4.-8. März 2019, Rostock, Germany, Workshops
Our group contributes to the curricula of the different study programs at bachelor and master level offered by the department of computer science by offering lectures, seminars, projects, and thesis topics in the broad area of data management, data engineering, and data science.
- Overview of current and upcoming courses and course registration in C@MPUS
- Further details on teaching activities (general course descriptions, project and thesis topics, ...)
Contact

Melanie Herschel
Prof. Dr. rer. nat., Head of Institute

Eva Strähle
M.A., Secretary