Machine learning models are commonly used for decision support. Ideally, the resulting decisions should be impartial, unbiased, and fair. However, machine learning models are far from perfect, for instance due to bias introduced by imperfect training data or poor feature selection. While efforts to develop better models are ongoing and should continue, we also acknowledge that many applications will continue to rely on imperfect models. But what if we could provably rely on the “best” model for an individual or a group of individuals and transparently communicate the risks and weaknesses that apply?
In light of this question, we propose a system framework that optimizes the choice of model for specific subgroups of the population or even individual persons, relying on metadata sheets (MDS) for data and models. To achieve transparency, the framework additionally captures data that explains the choices made and the model's results at different levels of detail to different stakeholders.
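The metadata sheets (MDS) listed under Resources below describe, among other things, how each trained model performs on different subgroups of the data. The following Python sketch only illustrates the general idea of selecting the best model per subgroup from such sheets; the JSON field names (`model_id`, `subgroup_metrics`, `accuracy`, `fairness`) and the equal weighting of accuracy and fairness are assumptions for illustration, not the project's actual MDS schema or selection strategy.

```python
import json
from pathlib import Path


def load_mds(path: Path) -> dict:
    """Load a metadata sheet (MDS) from a JSON file."""
    with path.open() as f:
        return json.load(f)


def best_model_per_subgroup(sheets: list[dict]) -> dict[str, str]:
    """For each subgroup, pick the model whose MDS reports the best
    combined score (equal weighting of accuracy and fairness assumed)."""
    best: dict[str, tuple[str, float]] = {}
    for mds in sheets:
        model_id = mds["model_id"]  # assumed field name
        for subgroup, metrics in mds["subgroup_metrics"].items():  # assumed field name
            score = 0.5 * metrics["accuracy"] + 0.5 * metrics["fairness"]
            if subgroup not in best or score > best[subgroup][1]:
                best[subgroup] = (model_id, score)
    return {sg: model for sg, (model, _) in best.items()}


if __name__ == "__main__":
    # File names taken from the Resources table below; the JSON fields
    # accessed above are illustrative assumptions, not the released schema.
    sheets = [load_mds(Path(p)) for p in
              ("adult_m1_linearSVMMDS.json", "adult_m2_logRegMDS.json")]
    print(best_model_per_subgroup(sheets))
```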
Publications
- Lässig, N., & Herschel, M. (2024). FALCC: Efficiently performing locally fair and accurate classifications. International Conference on Extending Database Technology (EDBT). https://openproceedings.org/2024/conf/edbt/paper-59.pdf
- Lässig, N., Nies, O., & Herschel, M. (2024). FairCR - An Evaluation and Recommendation System for Fair Classification Algorithms. Proceedings of the International Conference on Data Engineering (ICDE).
- Lässig, N. (2023). Towards an AutoML System for Fair Classifications. International Conference on Data Engineering (ICDE), 3913–3917. https://doi.org/10.1109/ICDE55515.2023.00380
- Oppold, S., & Herschel, M. (2022). Trust in data engineering: reflection, framework, and evaluation methodology. International Workshop on Data Ecosystems (DEco) - VLDB Workshops. https://ceur-ws.org/Vol-3306/paper1.pdf
- Oppold, S., & Herschel, M. (2022). Provenance-based explanations: are they useful? International Workshop on the Theory and Practice of Provenance (TAPP), 2:1–2:4. https://doi.org/10.1145/3530800.3534529
- Lässig, N., Oppold, S., & Herschel, M. (2022). Metrics and Algorithms for Locally Fair and Accurate Classifications using Ensembles. Datenbank-Spektrum, 22(1), Article 1. https://doi.org/10.1007/s13222-021-00401-y
- Lässig, N., Oppold, S., & Herschel, M. (2021). Using FALCES against bias in automated decisions by integrating fairness in dynamic model ensembles. Proceedings of Database Systems for Business, Technology, and Web (BTW). https://doi.org/10.18420/btw2021-08
- Oppold, S., & Herschel, M. (2020). A System Framework for Personalized and Transparent Data-Driven Decisions. Proceedings of the International Conference on Advanced Information Systems Engineering (CAiSE), 153–168. https://doi.org/10.1007/978-3-030-49435-3_10
- Oppold, S., & Herschel, M. (2020). Accountable Data Analytics Start with Accountable Data: The LiQuID Metadata Model. ER Forum, Demo and Posters 2020 Co-Located with International Conference on Conceptual Modeling (ER), 59–72. http://ceur-ws.org/Vol-2716/paper5.pdf
- Oppold, S., & Herschel, M. (2019). LuPe: A System for Personalized and Transparent Data-driven Decisions. Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2905–2908. https://doi.org/10.1145/3357384.3357857
Resources
Title | File |
---|---|
LiQuID supplemental material - Model overview | Information_Overview.pdf |
LiQuID supplemental material - XML Schema | LiQuID.xsd |
Information on the Study on Human Attitudes towards Critical Programs | Study_Attitudes_Critical_Programs.pdf |
LiQuID supplemental material - Accountability workload | Workload.pdf |
MDS for Adult Dataset - Training data | adultData_d1_MDS.json |
MDS for Adult Dataset - Test data | adultData_d2_MDS.json |
MDS for gradient boosted tree model m10 | adult_m10_gradBoostMDS.json |
MDS for SVM model m1 | adult_m1_linearSVMMDS.json |
MDS for logistic regression model m2 | adult_m2_logRegMDS.json |
MDS for decision tree model m3 | adult_m3_decTreeMDS.json |
MDS for random forest model m4 | adult_m4_randForestMDS.json |
MDS for gradient boosted tree model m5 | adult_m5_gradBoostMDS.json |
MDS for SVM model m6 | adult_m6_linearSVMMDS.json |
MDS for logistic regression model m7 | adult_m7_logRegMDS.json |
MDS for decision tree model m8 | adult_m8_decTreeMDS.json |
MDS for random forest model m9 | adult_m9_randForestMDS.json |
MDS for model ensemble me1 | adult_me1MDS.json |
MDS for model ensemble me2 | adult_me2MDS.json |
MDS for model ensemble me3 | adult_me3MDS.json |
MDS for German Credit Dataset | germanCreditDataMDS.json |
MDS for German Credit Dataset in Spark readable format | germanCreditDataReadableMDS.json |
MDS for logistic regression model m2 | germanCredit_m2_logRegMDS.json |
MDS for decision tree model m3 | germanCredit_m3_decTreeMDS.json |
MDS for random forest model m4 | germanCredit_m4_randForestMDS.json |
MDS for gradient boosted tree model m5 | germanCredit_m5_gradBoostMDS.json |
MDS for SVM model m6 | germanCredit_m6_linearSVMMDS.json |
MDS for logistic regression model m7 | germanCredit_m7_logRegMDS.json |
MDS for model ensemble me1 | germanCredit_me1MDS.json |
MDS for model ensemble me2 | germanCredit_me2MDS.json |
MDS for model ensemble me3 | germanCredit_me3MDS.json |
MDS for model ensemble me4 | germanCredit_me4MDS.json |
MDS for model ensemble me5 | germanCredit_me5MDS.json |