Requirements for Trustworthy Artificial Intelligence –
A review
Davinder Kaur, Suleyman Uslu and Arjan Durresi
Department of Computer and Information Science,
Indiana University and Purdue University, IN, USA
{davikaur,suslu}@iu.edu,
[email protected]
Abstract. The field of algorithmic decision-making, particularly artificial
intelligence (AI), has been changing drastically. With the availability of huge
amounts of data and the increase in processing power, AI systems are used in a vast
number of high-stakes applications. It is therefore important to make these
systems reliable and trustworthy. Different approaches have been proposed to
make these systems trustworthy. In this paper, we review these
approaches and summarize them based on the principles proposed by the
European Union for trustworthy AI. This review provides an overview of the
different principles that are important to make AI trustworthy.
1 Introduction
In today's world, algorithmic decision-making and artificial intelligence (AI) play a
crucial role in our day-to-day lives. The area of automated decision-making
using machines is not new. However, decision-making nowadays is highly data-driven
and complex, and the decisions made by machines leave a profound impact on our
society. To give an estimate of this impact, the International Data Corporation (IDC)
estimates that spending on AI systems will reach $97.9 billion in 2023, a 28.4%
compound annual growth rate over the period 2018-2023 [1]. These numbers show
how AI is impacting our society by making decisions in almost every aspect of our
lives.
Different kinds of statistical tools, artificial intelligence algorithms, and machine
learning models are used to make decisions in all kinds of applications, such as
healthcare, government, business, and the judicial and political spheres. These
advancements in decision-making have led to fast growth in every sector. Decisions
made by artificial intelligence and machine learning algorithms can now beat some of
the best human players, serve as our personal assistants, support medical diagnostics,
and power automated customer support for companies, among much more. With such
widespread applications and impact, it is especially important to make sure that all
these systems on which we rely so much are reliable and trustworthy.
AI now has the power to analyze huge amounts of data and make predictions based
on them. However, these systems are so complex and opaque that it is difficult to judge
and interpret their decisions as fair and trustworthy, and there are no set standards or
mechanisms established to govern and test these systems. It has been found that these
systems can behave unfairly and lead to dangerous consequences. A recidivism
algorithm used across the country has been shown to be biased against Black
defendants [2], and a recruitment algorithm used by a big corporation was found to
be biased against women [3], among many other examples. These examples show that
decisions made by machines can go rogue and can have life-critical consequences. So,
it is important to design, develop, implement, and oversee these systems very carefully.
With the growing need to make these systems reliable and trustworthy, researchers
have proposed different solutions. Some have proposed making the data used for
training AI systems unbiased; some have proposed explainable and interpretable
methods that make AI systems easy for users to understand and interpret. Some
researchers have proposed oversight methods to keep a check on AI systems, and
others have proposed methods that enable collaborative intelligence, using
both humans and machines in decision-making. All these proposed solutions involve
humans at different levels of the AI lifecycle, and they share one common objective:
ensuring that AI systems behave as promised and creating a notion of
trust toward AI among its users.
In this paper, we discuss the different aspects that are important to make AI
decisions acceptable and trustworthy, the policies and guidelines required to govern
the working of these systems, and how human intervention is important in this
changing era of AI. This paper is organized as follows. Section 2 presents the
foundational concepts and preliminary background work in the field of trustworthy
AI. Section 3 presents a review of the latest developments in the field. Section 4
discusses technical challenges and future directions. Finally, Section 5 concludes the
paper.
2 Background and Foundational Concepts
In this section, we discuss the problems with traditional AI and the key concepts of
AI. This section also discusses the key principles and guidelines that should be
considered while designing, developing, implementing, and overseeing these systems.
2.1 Need for Trustworthy AI
The field of AI has a major impact on our day-to-day lives. With the availability of
huge amounts of data, high computational power, and efficient algorithms, AI has
given us many useful solutions that benefit our society. However, along with these
benefits, AI also raises some concerns. With all these advancements, AI has become
too complex for humans to understand and control. There is a need for mechanisms to
oversee the decisions made by AI so that they remain trustworthy and within ethical
guidelines. This is only possible if the machines or algorithms making these decisions
are fair and understandable by the designers designing them, the users using them, and
the policy makers making laws to govern them. All these concerns related to AI create
fear among users, which in turn decreases their trust in the system.
Before looking at the guidelines proposed for making AI trustworthy, let us look at
the problems and risks related to present AI systems. Stephen Hawking once warned
that AI's impact could be cataclysmic unless its rapid development is controlled [4]. AI
systems can be dangerous and harmful if strict measures are not taken in designing,
developing, implementing, and overseeing them. In today's world, almost all
sectors are utilizing the powers of AI systems in decision-making and in
analyzing huge amounts of data. But these powers of AI do not always yield good
results. Many of these AI systems have failed and shown dangerous consequences. For
example, a self-driving car killed a pedestrian because its self-driving system decided
not to take any action after detecting the pedestrian on the road [5]. An AI chatbot
became racist after being corrupted by Twitter trolls [6]. The COMPAS recidivism
algorithm used by judges across the nation has been shown to be biased against Black
people [2]. These are some of the examples that show how AI can be untrustworthy
and dangerous if its development is not controlled. So, it is important to make sure
that these AI systems do not cause any kind of harm to humankind.
2.2 Requirements to Make AI Trustworthy
Artificial intelligence is used to make decisions in high-stakes applications like
healthcare, transportation, the judicial system, and many more. With the increase in the
use of AI systems in decision-making, it becomes very important to develop guidelines
and policies that ensure that AI will not cause any intentional or unintentional harm,
both to society and to the users using it. AI is designed by us, and it is our
responsibility to make sure it is used only for good [7]. Several researchers and experts
in this field have proposed different guidelines and policies to make AI trustworthy.
The European Union proposed four ethical principles (respect for human autonomy,
prevention of harm, fairness, and explicability) and seven key requirements (human
agency and oversight, technical robustness, privacy and data governance,
transparency, diversity and fairness, societal well-being, and accountability) to make
AI trustworthy [8]. The authors of [9] consider explainability, integrity, conscious
development, reproducibility, and regulations important to make AI trustworthy. The
authors of [10] reviewed the guidelines proposed by different organizations and
research institutes to make AI trustworthy; they found that, despite the many guidelines
available, it is difficult to reach a consensus about what properties make AI ethical
and trustworthy.
Following are the properties that are important to make an AI system trustworthy:
Accuracy and Robustness: Accuracy refers to a model's ability to correctly predict
outcomes, generating few false positives and false negatives (see the sketch after this
list). Robustness refers to a model's ability to perform accurately under uncertain
conditions.
Non-Discrimination: Non-discrimination refers to a model's ability to treat all
users equally, without discriminating against any section of society. This means the
absence of any type of bias and discrimination.
Explainability: Explainability enables the users of a model to correctly understand
its working. This property helps users correctly predict the outcome for a given input
and understand the reasons that could lead to model failure.
Transparency: Transparency provides users with a clear picture of the model. It
allows users to clearly understand the model by seeing whether the model has been
tested, what criteria it has been tested on, whether the inputs and outputs of the model
make sense to them, and whether they clearly understand the decisions made by the
model.
Accountability: Accountability refers to a model's ability to justify its decisions to
the users of the system. This includes taking responsibility for all the decisions made,
whether they were good or caused errors and unexpected results.
Integrity: Integrity requires that the model output results or make decisions within
set parameters. These parameters can be operational, ethical, or technical, and can
differ across applications.
Reproducibility: Reproducibility ensures that every decision made by the system
can be reproduced if the same input parameters and conditions are provided to the
system (see the sketch after this list).
Privacy: Privacy means that the model should protect the data on which it is trained
and the identity of the users using it.
Security: Security ensures that the model is secure from outside attacks that could
alter the decisions made by the system.
Regulations: Governments and policy makers should develop laws and guidelines
to govern the development and operation of AI systems.
Human Agency and Oversight: This is the most important property; it enforces
that AI systems should always remain under human control to prevent harm.
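Two of these properties, accuracy and reproducibility, can be checked or enforced
mechanically. The following minimal Python sketch, purely illustrative and with
function names of our own choosing, counts the false positives and false negatives
behind an accuracy figure and fixes the random seeds so that a run can be repeated
under the same inputs and conditions:

  import random
  import numpy as np

  def accuracy_report(y_true, y_pred):
      # Accuracy improves as false positives and false negatives decrease.
      fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
      fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives
      return {"accuracy": float(np.mean(y_true == y_pred)),
              "false_positives": fp, "false_negatives": fn}

  def set_seeds(seed=42):
      # Reproducibility: fix the sources of randomness so that the same
      # inputs and conditions yield the same decisions.
      random.seed(seed)
      np.random.seed(seed)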
3 Review
Researchers have proposed a vast number of solutions focusing on different properties
of trustworthy AI. In this review, we map these properties to the four principles (the
principle of respect for human autonomy, the principle of prevention of harm, the
principle of fairness, and the principle of explicability) introduced by the European
Union [11] to make AI ethical, lawful, and robust. Figure 1 shows this mapping of
properties to the principles of trustworthy AI.
Fig. 1. Mapping of different properties to the principles of trustworthy AI.
3.1 The Principle of Respect for Human Autonomy
This is the most important principle in designing trustworthy AI. It ensures that AI
systems are designed to complement and empower human cognitive abilities
instead of replacing them. This new era of AI requires collaborative thinking,
where humans and machines work together toward a common goal [12]. Making
humans and machines work together helps to reduce incorrect and undesired results
and to avoid accidents. The design of AI systems should be human-centric,
which means that humans should be involved at different levels of the AI lifecycle
[11]. They should be involved in the planning, design, development, and oversight
phases of the AI system, based on the application requirements. Humans should be at
the center to set limits, flag errors made by machines, override wrong decisions, and
help improve the AI system by providing feedback. Human involvement is important
to keep machine decisions within moral and ethical compliance. The European Union
[13] has proposed guidelines on how humans should be involved in AI decision-
making:
o For high-stakes applications, decisions made by the AI system should only
become effective once they have been examined and authenticated by human
experts. For example, in medical AI systems, where a wrong decision could lead
to dangerous consequences, doctors should validate the decisions made by the AI
system, based on their expertise and experience, before implementing them.
o For applications that require AI decisions to take effect immediately, there
should be a way for humans to intervene, review the AI decision, and, if needed,
override it. For example, in a loan-approval AI system, if an application is
rejected by the AI system, it should be possible for a human loan expert to review
the application again and change the decision if needed.
o Humans should be able to oversee the working of AI systems and be able to stop
or interrupt them if they think the AI system is not working appropriately or its
decisions are no longer safe. For example, in autonomous vehicles, if a sensor
fails or the vehicle is not operating properly, the driver should be able to take over
the controls of the vehicle.
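These three guidelines share a common pattern: an AI decision passes through a
human gate whose strictness depends on the stakes of the application. Below is a
minimal sketch of such a gate; the confidence threshold and the reviewer callback are
assumptions of this illustration, not part of the guidelines in [13]:

  from dataclasses import dataclass

  @dataclass
  class Decision:
      label: str         # the AI system's proposed decision
      confidence: float  # the system's confidence in [0, 1]

  def human_in_the_loop(decision, reviewer, threshold=0.9, high_stakes=False):
      # High-stakes decisions are always validated by a human expert;
      # otherwise only low-confidence decisions are routed for review.
      # The reviewer may confirm or override the AI's proposal.
      if high_stakes or decision.confidence < threshold:
          return reviewer(decision)  # human validates or overrides
      return decision.label          # decision takes effect immediately

In a loan-approval setting, for instance, rejected applications would be routed through
the reviewer so that a loan expert can change the decision if needed.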
Several methods have been proposed for human-machine collaboration, and they
show how human involvement can increase the trust in and accuracy of AI systems.
The authors of [14] proposed an analyst-in-the-loop AI system for intrusion detection,
which takes feedback from security analysts to decrease the false negatives generated
by the system; this improved the system's detection rate threefold. The authors of [12]
showed how AI can screen huge debit and credit card logs to flag questionable
transactions that can then be evaluated by humans.
Some researchers have proposed a collaboration mechanism for human-machine
interaction based on a trust framework [15]. The authors of [16][17][18] used human-
feedback-based decision-making systems for resource allocation in the food-energy-
water (FEW) sector. The authors of [19] proposed a fake-user detection system for
social networks that takes community knowledge into account alongside a machine
learning algorithm. The authors of [20] proposed a human-machine collaboration
mechanism to govern the interaction between police and machines for crime hotspot
detection, to facilitate greater accuracy.
All these proposed solutions show that combining the strengths of humans and
machines helps to improve accuracy and decrease the harm caused by AI systems,
which in turn makes AI systems trustworthy. So, this principle enforces that AI
systems should empower humans, not replace them.
3.2 The Principle of Prevention of Harm
This principle ensures that AI systems do not cause any unintentional or intentional
harm to humans or society. It also guarantees that AI systems operate in a safe and
secure environment without causing any kind of harm to anybody, and that they are
reliable in decision-making tasks. Many elements should be taken into consideration
while designing and implementing AI systems so that they do not cause any type of
harm and can behave reliably. These elements are discussed below:
Accuracy and Robustness: This property ensures that AI systems have high
accuracy and are robust. The accuracy of the system should be above a certain
threshold for reliable decision-making. The system should be robust, that is, it should
be able to work under adversarial conditions and handle errors. And the results or
decisions made by the AI system should be reproducible when provided with the same
input and similar conditions. Researchers have proposed different methods to deal
with adversaries and make AI systems accurate. The authors of [21] proposed a
feature-squeezing method that decreases the complexity of the input space, hence
making the system less prone to adversarial examples. The authors of [22] proposed
injecting adversarial examples into the training set to make the system robust.
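To make the feature-squeezing idea of [21] concrete, the sketch below reduces the bit
depth of the input and flags the input as adversarial when the model's prediction
changes markedly; the predict_proba interface and the threshold value are
assumptions of this sketch:

  import numpy as np

  def squeeze_bit_depth(x, bits=4):
      # Quantize features in [0, 1] to 2**bits levels, shrinking the
      # input space an adversary can exploit. x is a batch of inputs.
      levels = 2 ** bits - 1
      return np.round(x * levels) / levels

  def looks_adversarial(model, x, threshold=0.5):
      # A large disagreement between the predictions on the original and
      # the squeezed input suggests an adversarial example.
      p_orig = model.predict_proba(x)
      p_squeezed = model.predict_proba(squeeze_bit_depth(x))
      return np.abs(p_orig - p_squeezed).sum(axis=1) > threshold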
Accountability: This property deals with being responsible and accountable for all
the good and bad decisions made by AI systems [23]. As algorithms cannot take
responsibility for their decisions, the designers of AI systems should take
responsibility for their operation through proper testing, auditing, and oversight
frameworks. Accountability also involves designing laws and regulations for the
controlled development of AI systems. Several researchers have proposed auditing
and testing methods to prevent the harm AI systems can cause. The authors of [23]
proposed an internal auditing framework for algorithmic auditing to keep a check on
the development lifecycle of AI systems; the auditing can be done by experts at each
step of the development process and helps to prevent and mitigate harm before it even
occurs. The authors of [24] explained the importance of community involvement in
designing algorithms to address algorithmic harm. The authors of [25] explain
different types of accountability and how different levels of users can be held
accountable for the decisions made by an AI system, discussing accountability from a
socio-economic perspective.
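One building block that such auditing frameworks rely on is a persistent record of
every decision the system makes. Below is a minimal sketch of such an audit trail;
the file format and field names are our own, purely illustrative:

  import json
  import time

  def logged_predict(model, features, log_path="decisions.log"):
      # Record each input and decision so that the system's outputs can
      # later be justified, audited, and traced back (accountability).
      decision = model.predict([features])[0]
      entry = {"timestamp": time.time(),
               "input": [float(v) for v in features],
               "decision": int(decision)}
      with open(log_path, "a") as f:
          f.write(json.dumps(entry) + "\n")
      return decision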
Privacy and Security: Privacy of an AI system deals with the protection of the data
on which the system is trained, the identity of its users, and the internal workings and
intellectual property of the system, which, if exposed, could lead to dangerous
consequences. Security of an AI system deals with protecting the system from outside
attacks that could interfere with or disturb its operation. Different methods have been
proposed to enforce the privacy and security of AI systems. The authors of [26]
proposed a pipeline for data protection when two or more agencies are involved in the
development of an AI system. The authors of [27] discuss different types of attacks
that can be launched against AI systems and the measures that should be taken to
prevent them. The author of [28] discusses different privacy laws for data protection
in research. The authors of [29] proposed a method of ignoring and forgetting
malicious data in the training phase so that an AI system can reclaim its security when
attacked.
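The idea behind [29] can be reduced to "drop the poisoned samples and retrain". The
actual method is far more efficient than retraining from scratch, but a naive baseline
looks like the sketch below, where model_factory is an assumed callable that builds a
fresh, untrained model:

  import numpy as np

  def forget_and_retrain(model_factory, X, y, malicious_idx):
      # Naive unlearning baseline: remove the flagged malicious training
      # samples and retrain so that the model "forgets" them.
      mask = np.ones(len(X), dtype=bool)
      mask[malicious_idx] = False
      return model_factory().fit(X[mask], y[mask])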
All these properties help to prevent the unintentional and intentional harm that AI
systems can cause.
3.3 The Principle of Fairness
The principle of fairness makes sure that the decisions made by AI systems are not
biased or discriminatory toward any users. As AI systems are used in a wide range of
applications, discriminatory behavior makes a system unfair and hence decreases trust
in it. This principle ensures that AI systems treat all users equally, without favoring
any particular section of society. AI systems should be obligated to uphold moral and
ethical values. These systems are supposed to ease the decision-making process, but if
not designed and implemented properly, they can lead to bias and unfairness. So, it is
very important to make these systems fair and unbiased. Before looking at the
solutions, let us look at the reasons for unfairness and bias.
Different types of Bias: AI systems can suffer from different types of bias. The
authors of [30][31] discussed different reasons for unfairness. One main reason for the
unfairness of an AI system is that the data on which it is trained is biased or skewed,
that is, the data does not represent a clear picture of reality. For example, the ImageNet
dataset [32], which is widely used by the computer vision community, does not have a
fair distribution of the geodiversity of people, hence causing bias in the systems that
use it. Another reason for an algorithm to behave in a biased way is that underlying
stereotypes are present in the data. For example, an AI system may predict that men
are more suitable for engineering jobs than women because, over the years, men have
held a higher percentage of engineering jobs, and this stereotype makes the training
data biased. A further source of bias is the algorithm itself; this can happen when the
algorithm tries to maximize its accuracy over the training data. There can be other
sources of bias as well [33], such as when the people collecting the data, designing the
system, or interacting with the system are biased against a particular section of
society. So, measures should be taken to make AI systems fair, ethical, and inclusive.
To make AI systems fair and unbiased, several methods and techniques have been
proposed. The authors of [34] proposed a test-generation technique to detect all the
combinations of input attributes that can discriminate against an individual based on
gender, race, ethnicity, etc. The authors of [35] proposed a third-party rating system to
detect bias using sets of biased and unbiased data. The authors of [36] proposed a
subsampling technique that ensures that the subsamples used for training are both fair
and diverse. Facebook researchers proposed a data-traceability technique to detect
bias using radioactive data labeling [37]. The authors of [38] proposed the vetting of
algorithms by a multidisciplinary team of experts to make algorithms unbiased. The
authors of [39] designed an open-source toolkit for using fairness metrics and
algorithms in an industrial setting. All these proposed solutions help to ensure fairness
in AI systems.
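As a concrete example of what such tools measure, the demographic parity difference,
one of the standard fairness metrics implemented in toolkits such as [39], compares
positive-prediction rates across groups; the binary group encoding below is an
assumption of this sketch:

  import numpy as np

  def demographic_parity_difference(y_pred, group):
      # Difference in positive-prediction rates between two groups
      # (encoded 0 and 1); a value near 0 indicates the model does not
      # favor one section of society over the other.
      rate_0 = y_pred[group == 0].mean()
      rate_1 = y_pred[group == 1].mean()
      return abs(rate_0 - rate_1)

  # Example: approve (1) / reject (0) decisions for two groups.
  y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
  group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
  print(demographic_parity_difference(y_pred, group))  # 0.5: strongly skewed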
3.4 The Principle of Explicability
The principle of explicability deals with providing explainability and interpretability
for opaque AI systems. With the increase in the complexity of AI models, they have
become black boxes that are difficult to understand and interpret. The principle of
explicability ensures that the working of AI systems can be openly communicated to
the different stakeholders who are directly or indirectly affected by their decisions
[40]. This principle makes the decision-making process transparent, hence increasing
users' trust in the system, and it enables users to correctly understand the reasons that
led to a particular decision [41]. Explainability also helps policy makers better
understand the system so that they can make appropriate laws, and it helps the
developers of the system to detect the causes of errors and make the system more
accurate.
Different approaches have been proposed to make AI systems explainable and
interpretable. Some approaches, known as integrated approaches, integrate the
explanation mechanism into the AI development lifecycle, while post-hoc approaches
treat AI systems as black boxes and build an interpretable model on top of them. Some
explainability approaches are global, explaining the working of the whole model,
while local approaches explain a particular decision made by the system. All these
approaches have one thing in common: making the AI system transparent and
understandable to different types of users.
A lot of work has been done in the field of explainability and interpretability; some of
these approaches are discussed here. The authors of [42][43] proposed post-hoc
methods of explanation that build a proxy model on top of the machine learning model
and provide explanations by highlighting the important input attributes that led to a
particular decision. The authors of [44] also proposed a post-hoc method, which
generates diverse counterfactual explanations to provide explainability for the model.
The approaches of [45][46] consider the internal workings of the system, taking into
account the internal representations of the AI system to provide explanations. The
authors of [47] proposed an action-based influence model that provides explanations
based on how humans reason about and explain real-world problems. Another popular
technique is explanation through visualization, which highlights the areas of an image
that led to the model's prediction [48][49].
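To make the proxy-model idea of [42] concrete, the sketch below fits a proximity-
weighted linear surrogate around a single instance and returns its coefficients as local
attribute importances. It is a simplified caricature of such methods; the sampling scale,
the proximity kernel, and the predict_proba interface are assumptions of this
illustration:

  import numpy as np
  from sklearn.linear_model import Ridge

  def local_explanation(predict_proba, x, n_samples=500, scale=0.1, seed=0):
      # Post-hoc, local explanation: perturb the instance, query the
      # black-box model, and fit a proximity-weighted linear surrogate.
      rng = np.random.default_rng(seed)
      X_pert = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
      y_pert = predict_proba(X_pert)[:, 1]  # black-box queries
      dist = np.linalg.norm(X_pert - x, axis=1)
      weights = np.exp(-(dist ** 2) / (2 * scale ** 2))  # proximity kernel
      surrogate = Ridge(alpha=1.0).fit(X_pert, y_pert, sample_weight=weights)
      return surrogate.coef_  # local importance of each input attribute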
All these principles of trustworthy AI, if followed properly, ensure the reliability of
the AI system and hence increase trust in the system.
4 Technical Challenges and Future Directions
A lot of research has been done, and is still ongoing, to make AI systems trustworthy.
However, some technical challenges can hinder the development of trustworthy
AI. One of the main challenges is the lack of clear requirements and standards for
making AI trustworthy. The definitions of the principles and properties are still
vague, and there can be conflicts between the principles in various application
domains [11]. For example, following the principle of explicability can violate the
principle of prevention of harm: the more interpretable and transparent a model is, the
more prone it is to outside attacks. Hence there needs to be a tradeoff between these
principles based on the application requirements, and there is a need for strict laws
that can govern the working of AI systems. Another challenge is that a solution that
worked for one problem may not work for another. For example, an explanation
provided to the developers of a system may not make sense to users with a
non-technical background. Hence more context-specific solutions are needed. There is
also a need to involve multidisciplinary teams of experts in developing AI systems. In
a nutshell, the area of trustworthy AI is new, and a lot of research is still needed to
make AI systems reliable and trustworthy.
5 Conclusion
With the increase in the adoption of artificial intelligence in various application
domains, it becomes particularly important to make these systems reliable and
trustworthy. Different types of approaches have been developed to make these
systems accurate, robust, fair, explainable and safe. In this review we have
summarized these approaches and provided some future directions.
Acknowledgments. This work was partially supported by the National Science
Foundation under Grant No. 1547411 and by the U.S. Department of Agriculture
(USDA) National Institute of Food and Agriculture (NIFA) (Award Number
2017-67003-26057) via an interagency partnership between USDA-NIFA and the
National Science Foundation (NSF) on the research program Innovations at the Nexus
of Food, Energy and Water Systems.
References
1. International Data Corporation (IDC): "Worldwide Spending on Artificial Intelligence Systems Will Be Nearly $98 Billion in 2023, According to New IDC Spending Guide" (2019). Available: https://www.idc.com/getdoc.jsp?containerId=prUS45481219
2. Angwin, J., et al.: "Machine bias." ProPublica (2016). https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
3. Dastin, J.: "Amazon scraps secret AI recruiting tool that showed bias against women." Reuters, San Francisco, CA, October 9, 2018
4. Thomas, M.: "Six dangerous risks of Artificial Intelligence." Builtin, January 14, 2019
5. Levin, S., Wong, J.C.: "Self-driving Uber kills Arizona woman in first fatal crash involving pedestrian." The Guardian, March 19, 2018
6. Schlesinger, A., O'Hara, K.P., Taylor, A.S.: "Let's talk about race: identity, chatbots, and AI." Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (2018)
7. Rossi, F.: "Building trust in artificial intelligence." Journal of International Affairs 72(1), 127-134 (2018)
8. Goodman, B., Flaxman, S.: "European Union regulations on algorithmic decision-making and a 'right to explanation'." AI Magazine 38(3), 50-57 (2017)
9. Joshi, N.: "How we can build Trustworthy AI." Forbes, July 30, 2019
10. Jobin, A., Ienca, M., Vayena, E.: "The global landscape of AI ethics guidelines." Nature Machine Intelligence 1(9), 389-399 (2019)
11. Smuha, N.A.: "The EU approach to ethics guidelines for trustworthy artificial intelligence." CRi - Computer Law Review International (2019)
12. Daugherty, P.R., Wilson, H.J.: Human + Machine: Reimagining Work in the Age of AI. Harvard Business Review Press, Boston, MA, USA (2018)
13. European Commission: "White paper on artificial intelligence - a European approach to excellence and trust" (2020)
14. Veeramachaneni, K., et al.: "AI^2: training a big data machine to defend." 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC) (2016)
15. Ruan, Y., Zhang, P., Alfantoukh, L., Durresi, A.: "Measurement theory-based trust management framework for online social communities." ACM Trans. Internet Technol. 17(2), Article 16 (2017)
16. Uslu, S., et al.: "Control theoretical modeling of trust-based decision making in food-energy-water management." Conference on Complex, Intelligent, and Software Intensive Systems. Springer, Cham (2020)
17. Uslu, S., et al.: "Trust-based decision making for food-energy-water actors." International Conference on Advanced Information Networking and Applications. Springer, Cham (2020)
18. Uslu, S., et al.: "Trust-based game-theoretical decision making for food-energy-water management." International Conference on Broadband and Wireless Computing, Communication and Applications. Springer, Cham (2019)
19. Kaur, D., Uslu, S., Durresi, A.: "Trust-based security mechanism for detecting clusters of fake users in social networks." Workshops of the International Conference on Advanced Information Networking and Applications. Springer, Cham (2019)
20. Kaur, D., et al.: "Trust-based human-machine collaboration mechanism for predicting crimes." International Conference on Advanced Information Networking and Applications. Springer, Cham (2020)
21. Xu, W., Evans, D., Qi, Y.: "Feature squeezing: detecting adversarial examples in deep neural networks." Proceedings of the 2018 Network and Distributed System Security Symposium (2018)
22. Goodfellow, I.J., Shlens, J., Szegedy, C.: "Explaining and harnessing adversarial examples." arXiv preprint arXiv:1412.6572 (2014)
23. Raji, I.D., et al.: "Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing." Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (2020)
24. Katell, M., et al.: "Toward situated interventions for algorithmic equity: lessons from the field." Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (2020)
25. Wieringa, M.: "What to account for when accounting for algorithms: a systematic literature review on algorithmic accountability." Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (2020)
26. Mehri, V.A., Ilie, D., Tutschku, K.: "Privacy and DRM requirements for collaborative development of AI applications." Proceedings of the 13th International Conference on Availability, Reliability and Security (2018)
27. He, Y., et al.: "Towards privacy and security of deep learning systems: a survey." arXiv preprint arXiv:1911.12562 (2019)
28. Hintze, M.: "Science and privacy: data protection laws and their impact on research." Wash. JL Tech. & Arts 14, 103 (2018)
29. Cao, Y., Yang, J.: "Towards making systems forget with machine unlearning." 2015 IEEE Symposium on Security and Privacy, San Jose, CA, pp. 463-480 (2015). doi: 10.1109/SP.2015.35
30. Ragot, M., Martin, N., Cojean, S.: "AI-generated vs. human artworks. A perception bias towards artificial intelligence?" Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (2020)
31. Brown, A.: "Biased algorithms learn from biased data: 3 kinds of biases found in AI datasets." Forbes, February 7, 2020
32. Stock, P., Cisse, M.: "ConvNets and ImageNet beyond accuracy: understanding mistakes and uncovering biases." Proceedings of the European Conference on Computer Vision (ECCV) (2018)
33. Mehrabi, N., et al.: "A survey on bias and fairness in machine learning." arXiv preprint arXiv:1908.09635 (2019)
34. Agarwal, A., et al.: "Automated test generation to detect individual discrimination in AI models." arXiv preprint arXiv:1809.03260 (2018)
35. Srivastava, B., Rossi, F.: "Towards composable bias rating of AI services." Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society (2018)
36. Celis, L.E., et al.: "How to be fair and diverse?" arXiv preprint arXiv:1610.07183 (2016)
37. Sablayrolles, A., et al.: "Radioactive data: tracing through training." arXiv preprint arXiv:2002.00937 (2020)
38. Lepri, B., et al.: "Fair, transparent, and accountable algorithmic decision-making processes." Philosophy & Technology 31(4), 611-627 (2018)
39. Bellamy, R.K.E., et al.: "AI Fairness 360: an extensible toolkit for detecting and mitigating algorithmic bias." IBM Journal of Research and Development 63(4/5) (2019)
40. Mueller, S.T., et al.: "Explanation in human-AI systems: a literature meta-review, synopsis of key ideas and publications, and bibliography for explainable AI." arXiv preprint arXiv:1902.01876 (2019)
41. Wang, D., et al.: "Designing theory-driven user-centric explainable AI." Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (2019)
42. Ribeiro, M.T., Singh, S., Guestrin, C.: "'Why should I trust you?' Explaining the predictions of any classifier." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016)
43. Zeiler, M.D., Fergus, R.: "Visualizing and understanding convolutional networks." European Conference on Computer Vision. Springer, Cham (2014)
44. Mothilal, R.K., Sharma, A., Tan, C.: "Explaining machine learning classifiers through diverse counterfactual explanations." Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (2020)
45. Zhang, Q., Zhu, S.-C.: "Visual interpretability for deep learning: a survey." Frontiers of Information Technology & Electronic Engineering 19(1), 27-39 (2018)
46. Kim, B., et al.: "Interpretability beyond feature attribution: quantitative testing with concept activation vectors (TCAV)." International Conference on Machine Learning (2018)
47. Madumal, P., et al.: "Explainable reinforcement learning through a causal lens." arXiv preprint arXiv:1905.10958 (2019)
48. Zeiler, M.D., Fergus, R.: "Visualizing and understanding convolutional networks." European Conference on Computer Vision. Springer, Cham (2014)
49. Mahendran, A., Vedaldi, A.: "Understanding deep image representations by inverting them." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015)