Title

面向机器学习模型可解释性的反事实样本生成

Alternative Title
GENERATION OF COUNTERFACTUAL SAMPLES FOR THE INTERPRETABILITY OF MACHINE LEARNING MODELS
Author
Name pinyin
YUAN Yidong
School number
12132910
Degree
Master's
Discipline
0701 Mathematics
Subject category of dissertation
07 Science
Supervisor
徐匆
Mentor unit
Department of Statistics and Data Science
Publication Years
2023-04-28
Submission date
2023-06-26
University
Southern University of Science and Technology
Place of Publication
Shenzhen
Abstract
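Counterfactual explanation is one approach to machine learning interpretability: it explains a model's prediction by generating a set of counterfactual samples that achieve a desired output. Current counterfactual explanation methods, whether generative or optimization-based, use the structural equations of a structural causal model (SCM) to preserve and explain the relationships among variables and thereby obtain feasible counterfactual samples. In practice, however, a complete SCM is difficult to obtain. This thesis first takes two datasets whose complete SCMs are known, assumes incorrect SCMs, and uses evaluation metrics common in counterfactual explanation to study how such misspecification affects counterfactual explanation methods. The experiments show that several evaluation metrics of the generated counterfactual samples decline to varying degrees, and that the causal preservation score, which measures the feasibility of counterfactual samples, is affected most. Second, we find that existing counterfactual explanation methods cannot generate counterfactual samples quickly and accurately when the feature dimension is high, and cannot handle multi-class classification problems. To address the degradation of existing methods when the correct SCM is unavailable, this thesis constructs an approximate feasibility constraint based on the discriminator of a generative adversarial network (GAN) to better preserve the causal relationships among variables, thereby improving the causal preservation score of the generated counterfactual samples. Finally, for multi-class classification problems, this thesis builds a new counterfactual sample generation method based on the generator of a GAN. Experimental results show that the samples generated by this method satisfy feasibility and other conditions and perform well across the evaluation metrics.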
Keywords
Language
Chinese
Training classes
Independent training
Enrollment Year
2021
Year of Degree Awarded
2023-06

Academic Degree Assessment Subcommittee
Mathematics
Domestic book classification number
O212.1
Data Source
Manual submission
Document Type
Thesis
Identifier
http://kc.sustech.edu.cn/handle/2SGJ60CL/543954
Department
Department of Statistics and Data Science
Recommended Citation
GB/T 7714
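YUAN Yidong (袁宜东). Generation of Counterfactual Samples for the Interpretability of Machine Learning Models [D]. Shenzhen: Southern University of Science and Technology, 2023.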

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.