Machine learning models are commonly used for decision support. Ideally, the resulting decisions should be impartial, unbiased, and fair. However, machine learning models are far from perfect, for example because of bias introduced by imperfect training data or poor feature selection. While efforts to develop better models are under way and should continue, we also acknowledge that many applications will keep relying on imperfect models. But what if we could provably rely on the “best” model for an individual or a group of individuals and transparently communicate the risks and weaknesses that apply?
In light of this question, we propose a system framework that optimizes the choice of model for specific subgroups of the population, or even for individual persons, relying on metadata sheets for data and models. At the same time, to achieve transparency, the framework captures the data needed to explain both the choices made and the model results, at different scales and to different stakeholders.
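The idea of choosing a model per subgroup from metadata sheets, while recording why it was chosen, can be sketched as follows. This is a minimal illustration only, not the framework's implementation: the `ModelSheet` class, its fields, the `select_model` function, the subgroup labels, and all accuracy values are hypothetical and do not reflect the actual MDS schema or any measured results.

```python
from dataclasses import dataclass


@dataclass
class ModelSheet:
    """Hypothetical, simplified stand-in for a model metadata sheet (MDS)."""
    model_id: str
    # Assumed field: accuracy recorded separately for each subgroup.
    accuracy_by_group: dict[str, float]
    # Assumed field: accuracy over the whole evaluation set.
    overall_accuracy: float


def select_model(sheets, group):
    """Pick the sheet with the highest recorded accuracy for `group`.

    Falls back to the overall accuracy when a sheet has no entry for the
    group. Returns the chosen sheet and a record explaining the choice.
    """
    def score(sheet):
        return sheet.accuracy_by_group.get(group, sheet.overall_accuracy)

    best = max(sheets, key=score)
    explanation = {
        "group": group,
        "chosen_model": best.model_id,
        "score": score(best),
        # Scores of all candidates, kept so the choice can be audited later.
        "candidates": {s.model_id: score(s) for s in sheets},
    }
    return best, explanation


# Made-up example values, for illustration only.
sheets = [
    ModelSheet("m1", {"age<30": 0.81, "age>=30": 0.88}, 0.85),
    ModelSheet("m2", {"age<30": 0.86, "age>=30": 0.84}, 0.85),
]
best, why = select_model(sheets, "age<30")
print(best.model_id, why["candidates"])
```

Returning the explanation record alongside the chosen model keeps the selection auditable, which is the transparency aspect described above. A real system would likely weigh further criteria, such as fairness metrics, rather than accuracy alone.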
Publications
- Oppold, S., & Herschel, M. (2020). A System Framework for Personalized and Transparent Data-Driven Decisions. Proceedings of the International Conference on Advanced Information Systems Engineering (CAiSE).
- Oppold, S., & Herschel, M. (2019). LuPe: A System for Personalized and Transparent Data-driven Decisions. Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM), pp. 2905–2908. ACM. http://dblp.uni-trier.de/db/conf/cikm/cikm2019.html#OppoldH19
Resources
Title | File |
---|---|
LiQuID supplemental material - Model overview | Information_Overview.pdf |
LiQuID supplemental material - XML Schema | LiQuID.xsd |
LiQuID supplemental material - Accountability workload | Workload.pdf |
MDS for Adult Dataset - Training data | adultData_d1_MDS.json |
MDS for Adult Dataset - Test data | adultData_d2_MDS.json |
MDS for gradient boosted tree model m10 | adult_m10_gradBoostMDS.json |
MDS for SVM model m1 | adult_m1_linearSVMMDS.json |
MDS for logistic regression model m2 | adult_m2_logRegMDS.json |
MDS for decision tree model m3 | adult_m3_decTreeMDS.json |
MDS for random forest model m4 | adult_m4_randForestMDS.json |
MDS for gradient boosted tree model m5 | adult_m5_gradBoostMDS.json |
MDS for SVM model m6 | adult_m6_linearSVMMDS.json |
MDS for logistic regression model m7 | adult_m7_logRegMDS.json |
MDS for decision tree model m8 | adult_m8_decTreeMDS.json |
MDS for random forest model m9 | adult_m9_randForestMDS.json |
MDS for model ensemble me1 | adult_me1MDS.json |
MDS for model ensemble me2 | adult_me2MDS.json |
MDS for model ensemble me3 | adult_me3MDS.json |
MDS for German Credit Dataset | germanCreditDataMDS.json |
MDS for German Credit Dataset in Spark readable format | germanCreditDataReadableMDS.json |
MDS for logistic regression model m2 | germanCredit_m2_logRegMDS.json |
MDS for decision tree model m3 | germanCredit_m3_decTreeMDS.json |
MDS for random forest model m4 | germanCredit_m4_randForestMDS.json |
MDS for gradient boosted tree model m5 | germanCredit_m5_gradBoostMDS.json |
MDS for SVM model m6 | germanCredit_m6_linearSVMMDS.json |
MDS for logistic regression model m7 | germanCredit_m7_logRegMDS.json |
MDS for model ensemble me1 | germanCredit_me1MDS.json |
MDS for model ensemble me2 | germanCredit_me2MDS.json |
MDS for model ensemble me3 | germanCredit_me3MDS.json |
MDS for model ensemble me4 | germanCredit_me4MDS.json |
MDS for model ensemble me5 | germanCredit_me5MDS.json |