Our total inventory of values on AI in health from the 115 articles in our sample amounts to 100 different values. Table one [to be published in a data repository] shows each of these values, listing the number of papers in our sample that discussed each value, in descending order of frequency. Due to the sheer number of values, we cannot discuss each in this review. Instead, we present a selection of values according to their frequency of mention, divided into three categories: 1) the ten most highly discussed values in our sample, 2) some broad values often discussed in medical AI ethics conversations, and 3) values that we found to be unique or particularly relevant to the domain of medical AI and our sample. We include this last category because consideration of these values may signify key underdiscussed ethical issues for medical AI, and may point toward new or emerging values in the field.
We would also like to describe two resources that [will be available once published in a data repository, to be done between pre-print submission and journal submission]. First, a table offers a list of the values found in the review, along with relevant information about each. Second, a table describes relevant metadata about each paper. These resources may be useful for further research regarding particular values (especially those not described in the remainder of the article) and for finding papers in our sample relevant to a particular inquiry.
Before turning to a discussion of specific values, we want to make a note about the nature of this empirical inquiry. In analyzing over 100 papers and finding almost 100 values, this paper must leave out many interesting discussions and points made about values. We take the discussions below to offer important introductory points about each value; in reality, one could easily perform a standalone literature review of any one value found in this review. What we hope to provide is a zoomed-out map of values in this domain, which means forgoing many interesting details encountered as we performed the review. What values are out there in the medical AI literature? What do they usually mean, and how do they tend to be discussed? These are the kinds of questions we hope to answer. Moreover, values are not always explicitly mentioned, even when they are heavily implicated or underlie a certain point. Conversely, values are sometimes mentioned offhand, without the paper being dedicated to any rigorous point about that value. In cases of the former, we may not have picked up on strong undertones or implications. In cases of the latter, mapping the literature based on mere mention would not have been useful, in that it would cloud the few papers dedicated to interesting points about a value with the many papers that merely mention it. And finally, values are complex, often ambiguous, and sometimes hard to differentiate from non-values. If the reader takes nothing else from this review, we hope that she takes away the fact that value inquiry is anything but explicit and simple, and that the conceptual work has only begun.
3.2.1 Highly Discussed Values
This section describes and relays key points of discussion for the ten most-mentioned moral values. We remain as objective and descriptive as we are able, leaving interpretation and commentary to the discussion section. For a table offering information on each value found in our scoping review, readers of the published version may consult the data repository publication of values.
3.2.1.1 Trust
We identified trust as the most discussed value in our collection of papers: 52 papers put forth at least some notable claim regarding trust or trustworthiness. This is no surprise; trustworthiness is a broad property largely synonymous with being ethically sufficient. As we will see, “trustworthy” is a somewhat thin term—in the VSD context, it does not refer to a particular feature or aspect of a technology, but rather to whether the concerns of those using the technology are addressed.
A number of definitions of trust were employed in the sample. Many understand cases of trust as those in which an agent chooses to be vulnerable by giving some power to someone (or something) else: “a psychological state comprising the intention to accept vulnerability based on positive expectations of the intentions and behaviors of another” [74], directly quoting [75]. Under this notion, trust is giving power to someone (or something) as a result of a positive evaluation of that person or thing. Liu et al. [76] look at multiple definitions of trust, highlighting again how definitions of trust center on a willingness to be vulnerable in light of considerations about the trusted individual or technology. Skirting vulnerability, Sendak et al. use “trust” to refer to “a belief held by individuals that the system in place is appropriate and accurate” [5]. Common to each definition is that trust depends upon some kind of belief in the adequacy of that which is being trusted to perform actions that impact oneself, usually in light of considerations about attributes or features of the trusted object.
Because a technology is not going to be used if it is not trusted, papers frequently discuss what enables trust in a given technology. One point at which there is disagreement concerns whether transparency (or interpretability or explainability) is required for patients and practitioners to trust medical AI. La Rosa and Danks note that trust can be attained in two ways: on the one hand, “trust can be grounded in reliability, in the sense of the trustee being predictable by the trustor” (“behavioral trust”) [32]. On the other hand, “trust can be grounded in an understanding of the ‘mechanisms’ […] by which the behavior or actions are generated” (“understanding trust”) [32]. If “behavioral trust” is attainable and sufficient, it would seem that “understanding trust” is not necessary, and medical AI can become trustworthy while remaining opaque. Given that there is general concern regarding medical AI, and given that many scholars find transparency to build trust (for a short list of sources, see [1]), there are reasons to claim that decreasing opacity is pro tanto a route to trust. Even so, alternative pathways to trust—such as accuracy or reliability—are sometimes defended over explainability or transparency [5, 34].
A number of other moral values were noted as means to trust. Cohen et al. (2020) relay seven requirements for trust: “’(1) human agency and oversight, (2) technical robustness and safety, (3) privacy and data governance, (4) transparency, (5) diversity, non-discrimination and fairness, (6) societal and environmental well-being, and (7) accountability’” [77, 78]. If stakeholders feel that their autonomy is called into question, this can also be a barrier to trust [45, 79]. In a survey with stakeholders, trust was associated with “reliability, professionality, expertise, sensitivity, (legal) rules, keeping promises and the security of medical and personal data” [74].
Though trust is mostly an “end” value to which other values are means, there are certain contexts in which trust forms a “means” value. For example, when collecting and using data, (informed) consent is only given when the individual trusts that to which she is consenting [80]. Interestingly, medical AI may be unduly trusted in some cases. Domain experts have been observed to hastily rely on AI agent judgments even when instructed to think critically about each judgment [81]. AI may be trusted too readily because of its (at least seeming) objectivity; AI health algorithms replace subjective and fallible human judgments with objective ones based on rigorous data—or so it seems [27]. If sufficiently accurate and powerful, AI has the potential to replace trust with certainty [82].
3.2.1.2 Privacy
Privacy was the second most discussed moral value in the literature. In its strictest sense, it is often defined as “the right to be left alone” [83] [84]. Different understandings include “The right to control over personal information, including the ability to exercise control over the information about oneself,” “The right to personhood, including the protection of one’s personality, individuality, and dignity,” and many others [84] citing [85]. Even just under the first definition, privacy can be split into three notions: 1) freedom from influence or observation from other people (relational privacy), 2) control over personal data (informational privacy), and 3) freedom from surveillance [83]. Scholars argue that what privacy entails depends on the context in which privacy is being considered, and that privacy is the “appropriate flow of personal information” for a given task or goal [86] [74].
Privacy is a key value because the importance of the privacy of health data is almost uncontested, yet AI demands access to health data, and the more health data is handled, the more potential there is for breaches of privacy [64]. One study determines privacy to be the “most important socio-ethical risk” associated with a particular algorithmic health technology, “digital twins” [70]. Simultaneously, the authors note that “patients who have received a cancer diagnosis are less inclined to think about privacy or comfort,” which means that even if privacy is a top concern, it is not a concern for all patients equally. Positions on privacy differ notably among stakeholders when surveyed empirically in papers in the sample, with some far more concerned than others [70, 87].
Security of data is a key means to privacy in cases where sensitive information is used. Transparency regarding how data is collected, stored, and processed also fosters privacy by ensuring that data is treated acceptably and in proportion to the task at hand [88]. While privacy is often a data concern, in different contexts privacy means something completely different. Take, for example, van Wynsberghe’s example of the “wee-bot,” and its ability to increase the corporeal privacy afforded to patients needing assistance in urine sample collection scenarios by replacing human help with robot help [60]. This looks quite distinct from the sense in which privacy is invoked regarding one’s personal health data. In this way, the value of privacy can range from data privacy to bodily privacy and beyond, depending on the context.
3.2.1.3 Transparency
Transparency is often used as an umbrella term for many other concepts, if not as a synonym for them. These terms include explainability, explicability, understandability, intelligibility, interpretability, and more. Generally, these terms refer to specific ways of seeing into what is often described as the “black box”—the hidden mechanisms by which an algorithm transforms input into output. We may input something to an algorithm, and it may return a result, but the process by which it attains that result is opaque. With algorithms as complicated as they must be for AI to function, it is hard to see—let alone know—what the algorithm is doing with the input, what data it uses in its processes, and why it gives the result that it does. Transparency thus works against AI’s natural opacity. Writing about the transparency rights respected by the EU General Data Protection Regulation, Strikwerda et al. (2022) note that it “grants data subjects the right to receive concise, easily accessible and easily understandable information about the processing of their data and their rights of the matter” [74]. Zhang et al. (2022) argue that transparency “generally has two meanings: one is the comprehensibility of design logic and the other is the present-at-hand state of tool use,” and go on to propose a further notion of transparency “as the intentional and cognitive content that the human actor can offer to the human-machine relationship, which can be distinguished by the machine’s and fully utilized in the human-machine relationship” [48]. Ferrario (2022) borrows the definition of transparency as “algorithmic procedures that make the inner workings of a black box algorithm interpretable to humans” [89], directly quoting [90].
Transparency plays a key role in enabling other values. Scholars tend to argue that transparency is important fundamentally as a means to other valuable ends, not in itself [74]. Transparency is important for identifying and addressing errors, biases, and failures [91]. Transparency is also correlated with willingness to trust the algorithm [2, 34, 92, 93]. Moreover, if patients or doctors are entitled to understand those technologies impacting their health then transparency will be important as a means for this understanding [2, 94]. Further, transparency has been argued to enable accountability and responsibility [2, 79, 95]. It is also argued to be important for justice, avoiding harm, professional integrity, public benefit [2], informed consent, liability [34], credibility [96], and even accuracy and robustness [58].
It is worth noting that transparency is not a new value in healthcare. Patients usually want transparency from the professionals involved in their healthcare [34]. But transparency is simultaneously a perennial difficulty in healthcare. Some scholars argue that demands for explainability and interpretability are becoming disproportionately high in comparison to other domains of medicine. Sendak et al. note that the human body is itself a so-called “black box,” whose internal processes we often do not fully understand [5]. Further, we do not always require exact knowledge of the mechanisms by which certain drugs and treatments are effective. Using the case of aspirin, which they claim was used for 70 years before its “pharmacological mechanisms” were understood, Zhang and Zhang (2023) argue that “some believe doctors may be able to use some black box models in clinical practice as long as there is sufficient evidence that these models are reliable” [92, citing 94]. Scholars point to these facts to argue that we may be asking more of medical AI than we do of other medical technologies.
Though we generally avoid a discussion of value tradeoffs here, it is worth quickly discussing one that stands apart from the rest: that between transparency (or explainability, interpretability, etc.) and accuracy (or performance, power, etc.). Though not universally seen as genuinely in conflict, the dilemma between transparency/interpretability and accuracy was far and away the one most discussed by papers in the review [18, 27, 30, 34, 72, 78, 96, 98]. There is some evidence that, despite the seeming importance of transparency, stakeholders actually care more about effectiveness than transparency. In the words of Konig et al., “The results from our studies on two algorithms used in the public sector strongly suggest that when citizens must make trade-offs between transparency and stakeholder involvement in algorithm design, and the algorithm’s effectiveness, they clearly prioritize the latter” [31].
Finally, there is debate over whether a new bioethical principle along the lines of transparency is needed in light of AI’s opacity. One paper in the sample argues that a fifth bioethical principle, explicability, which would account for the contemporary demand to see into the “black box,” is unnecessary (at least in the domain of radiology) because the familiar principles already lead to explicability when needed [99]. Similarly, according to Ursin et al., explicability follows from the bioethical principles of justice and respect for human autonomy, such that a technology is unjust insofar as it is not understood by patients [72].
We have taken transparency to be the broadest term for values that go against AI’s “black box” tendencies. The value is especially interesting because it acts as a means to numerous important values, seems to work against at least one in most cases (accuracy), and is one of the more controversial values in the medical AI context.
3.2.1.4 Autonomy
The principle of autonomy in bioethics can be understood as “the right that individuals have to make free and meaningful decisions about their treatments” [64]. Alternatively, autonomy is generally “the ability to live one’s life and make one’s own decisions according to one’s own values and rules, without being restricted by anyone else” [69]. Finally, autonomy can refer to “an individual’s capacity to self-determine” [100]. Different definitions will be important for different use cases, but it is worth noting that the first definition above leaves out concerns regarding autonomy that are impacted by health and healthcare but not directly connected to decisions about treatments. In technologies that “re-able,” for example, autonomy is supported by empowering individuals to independently perform actions that they value [61].
Concerns regarding autonomy in the context of medical AI arise not only with regard to patient treatment decisions, but also with regard to responsibility, data privacy, and informed consent [61, 64, 92, 100, 101]. Especially if AI is developed to assist in patients’ decisions regarding treatment options, it is argued that autonomy is threatened by algorithms [3]. To avoid a return to paternalism in medicine, the patients’ own priorities and values must be taken into account when using the algorithm [3, 102]. Scholars have diverse viewpoints on the impact that AI will have on patient autonomy. On the one hand, McDougall argues that “involving AI in ranking treatment options poses a significant threat to patient autonomy,” and further, “Unless AI systems are carefully designed to be value-flexible […], we risk a shift back to more paternalistic medicine in a different guise” [103]. On the other hand, AI may increase autonomy and freedom by reducing the time and energy patients have to spend obtaining medical care [70], or by empowering patients with a better understanding of their health and so a greater autonomy in life more broadly [104].
3.2.1.5 Freedom from Bias
As with privacy, concerns regarding bias stem from considerations about the data on which medical AI depends. Just as AI has the potential to eliminate the subjective and unintentional biases held by clinicians, it can also perpetuate biases that exist in the training data [27, 105, 106]. Smits et al. claim that bias arises when a technology “systematically and unfairly discriminate[s] against certain individuals or groups of individuals in favor of others” [68], directly quoting [107].
Bias is often understood in terms of fairness [84, 101, 108, 109]. AI models are biased or unfair if they make inaccurate judgments for specific populations, specifically along the lines of gender or race [2, 110]. Because data may fail to be representative for particular groups, AI health algorithms run the risk of perpetuating or even furthering health disparities [96]. Because of this, representativeness and diversity are key values for the data going into medical AI [96]. Accessibility, moreover, has been pointed out as an issue of bias: if medical AI is more accessible for those with higher levels of privilege, then bias on the basis of social determinants of health is perpetuated [2]. Freedom from bias can also be understood in terms of justice [92, 108] and lack of discrimination [38, 108]. Scholars often argue that, if our goal is freedom from bias (cf. [108]), then we will need transparency, interpretability, explainability, or the like in order to combat bias in the algorithm [71, 91, 96, 111, 112].
3.2.1.6 Justice
Justice is a prominent bioethical principle [79]. In our sample, justice is often understood in terms such as distributive justice [74], rights-based justice and legal justice [79], and procedural or interactional justice [108]. Umbrello et al. 2021 relay Floridi et al. 2018’s three notions of justice as a method of defining fairness: correcting wrongs already done, benefiting people in a shared or sharable way, and keeping new wrongs from coming to be [110, citing 111]. In short, “justice” carried many senses across the papers in our sample, with individual papers sometimes focusing on particular notions.
In the context of medical AI, justice is commonly invoked when inequity, unfairness, or discrimination is a concern [2, 45, 61, 70, 79, 106, 108, 112, 115]. In our sample, other words often used in similar ways as “justice” include “equality,” “fairness,” “diversity,” “nondiscrimination,” “inclusion,” “access,” and “objectivity” [45, 61, 101, 115]. In the best case, medical AI promotes equity by removing human bias in favor of accurate, objective, unbiased judgments [46]; in worse cases, medical AI perpetuates or exacerbates inequities through bias or inaccessibility. According to some scholars, justice means equitable distribution of not just benefits but also harms, risks, and costs [45, 79]. And for some, justice demands that salient explanations be given in certain contexts [99]. Justice is thus mostly uncontroversial at a general level, though what the specifics entail may be more controversial.
3.2.1.7 Safety
Safety is defined by a paper in the sample as the lack of danger or risk [116]. Buruk et al. 2020 understand safety in terms of accuracy and security [101]: an algorithm can only be as safe as it is accurate, and stakeholders whose safety depends on the security of their data are put at risk when that security is lacking. Safety was also mentioned in relation to non-maleficence [45] and the avoidance of harm [96].
The safety of data was a reiterated point of concern, with scholars noting the importance of securing data and digital platforms [22, 23]. In different use cases, though, safety can mean something entirely different: with algorithms that drive robots, drones, or other autonomous agents, the physical safety of the patient is the kind of safety in question [45, 46].
Safety is important for an algorithm to be trustworthy, but scholars also note a connection between safety and transparency. As Abramoff et al. 2022 note regarding diagnostic algorithms, an important chain of values centers on safety: because a lack of transparency can lead to failures to recognize bias, and because bias is a safety concern, the desire for safety leads to a demand that the algorithm be understandable (at least by the physician) [109]. Similarly, interpretability is advocated by some authors as important for safety [4]. We do not yet know all the ways in which medical AI is unsafe [117]. Given safety’s importance for trustworthiness, non-maleficence, and more, calls for robust safety assessments (such as in [118]) make a lot of sense.
3.2.1.8 Explainability
Though conceptually similar to transparency, explainability usually refers to a more specific way of being transparent. One way of understanding explainability is as a “characteristic of an AI-driven system allowing a person to reconstruct why a certain AI came up with the presented positions” [6]. Alternatively, explainable AI is “Auditable, comprehensible, and intelligible by human beings at varying levels of comprehension and expertise” [27] directly citing [119]. Or, explainability can be understood as providing explanations where explanation means “an interface between human and system that accurately approximates the model of the system and is comprehensible to the human” [7]. Like many values, though, explainability has no universally agreed upon definition [76].
To further precisify kinds of explanations, van der Waa et al. delineate three types: “1) confidence explanations (explain how confident the agent is), 2) feature attributions (explain which observations are attributed to an agent’s decision), and 3) contrastive explanations (explain why the agent made a certain decision over another)” [120]. Or, explanations can be divided four ways: system logic and reasoning, system reliability, information sources, and personalization [121]. Such distinctions are useful because different kinds of explanations may be more or less important in different contexts.
Explainability is taken to be a value that enables functions similar to those of transparency. For example, some scholars argue that explainability helps in the effort to be free of bias [111, 122]. According to Kempt et al., explainability has four goods: “preconditional good,” “instrumental good,” “conditional good,” and “democratic good” [111]. It may also be the case that what we are after when we look for explainability is contestability, because we need to know when we have reason not to accept the algorithm’s result—for reasons of bias, inaccuracy, or the like [123]. Not surprisingly, explainability is often seen as a key means to trust [7, 32, 76, 94, 96, 121]. According to some, though, what stakeholders ultimately want to know for trust is that the algorithm’s outputs are valid—not necessarily a comprehensive understanding of the algorithm’s processes [63]. If this is so, then explainability may be neither necessary nor sufficient for trust, provided there is some way of knowing validity without explanations. It is largely assumed that explainability (or interpretability) and accuracy are at odds with each other in AI algorithms, but some argue that this is not necessarily the case [34, 124]. Some scholars are optimistic that explainability can be achieved with little or no cost to accuracy, while others are skeptical that explaining AI is an attainable goal [96].
In explainability we see a particular kind of transparency, pertaining to the communicability and comprehensibility of the processes by which an algorithm arrives at its judgment. Like transparency, explainability enables key values, but perhaps at the cost of values such as accuracy.
3.2.1.9 Accountability
Accountability is often discussed in parallel with responsibility and liability. To distinguish accountability from responsibility, scholars note that accountability is how responsibility is made visible and put into action [45]. Some articles build an accountable party into their definition of accountability, or at least restrict their discussions to the accountability of developers. According to Sendak et al., accountability is “the way in which technologists and designers can be held responsible for the performance of a system…” [5]. In the broader VSD context, Friedman and Hendry claim that accountability “refers to properties that ensure that the actions of a person, people, or institution may be traced uniquely to the person, people, or institution” [40]. In this sense, accountability connects what happens to who (or what?) does it.
Especially in cases where AI will impact a diagnosis or treatment, ambiguity arises regarding who the responsible party is—and this ambiguity may call into question our very notions of accountability and responsibility in medical scenarios [27]. AI used to achieve decisions, predictions, or classifications interrupts the “direct accountability” that existed before AI support, wherein providers were directly accountable for judgments [30].
It is argued that transparency is a means to accountability, and that the desire for accountability is a motivating force to oppose black box AI [94, 98, 101, 115; cf. 5]. One paper compares AI to an X-Ray machine to illustrate the kind of understanding practitioners need in order to be accountable and informed users of their tools. A doctor does not need to understand the exact mechanics of the X-Ray machine; instead, she only needs to know the “performance characteristics” and limitations of the machine [2]. In this way, the inner functioning of an algorithm need not be entirely transparent in order for the practitioner to be an accountable user, but must instead meet a standard of understanding. Other important means to accountability include traceability, controllability, and the reporting of adverse impacts [101].
Questions about accountability often concern who should be held accountable when an algorithm is wrong, or when a clinician (wrongly) turns away from the output of an algorithm—the clinician, the developer, the data provider, or someone else [82]? A robot may be at fault in the sense of being the cause of some harm, but it cannot be punished, and therefore cannot be held responsible in any punitive sense beyond being adjusted or removed from operation [47]. Still, it might be the case that we determine that AI can be independently and solely liable [96]. Can it thus be accountable? Questions like these remain for medical AI.
3.2.1.10 Accuracy
All is for naught if the algorithms are not accurate. So, uncontroversially, accuracy is a value for medical AI in line with its purpose to improve medical care and patient outcomes [20]. Along these lines, accuracy is an important means value to trustworthiness [5]—and has even been said to be able to replace trust with certainty, if sufficiently high [125]. There is empirical evidence that accuracy is of paramount importance to stakeholders [126].
What is controversial is the accuracy vs. transparency tradeoff, and when to sacrifice one for the other. On the side of accuracy, some scholars suggest that an AI is suitably accurate if it rivals the accuracy of clinicians, or is clinically acceptable [25, 91]. When choosing between accuracy and transparency, some scholars tend to endorse accuracy [108]. Others tend to endorse transparency, because algorithms are fundamentally tools that practitioners use without delegating responsibility to them [72]. Accuracy is also often at odds with certain values because accurate algorithms require large amounts of data, and this data is often treated as private or confidential [24]. So, though accuracy is of key import, it faces tradeoffs against values such as transparency and privacy.