
Cognition, Technology & Work (2024) 26:435–455

https://doi.org/10.1007/s10111-024-00765-7

ORIGINAL ARTICLE

Understanding the influence of AI autonomy on AI explainability levels in human-AI teams using a mixed methods approach
Allyson I. Hauptman1 · Beau G. Schelble1 · Wen Duan1 · Christopher Flathmann1 · Nathan J. McNeese1

Received: 4 December 2023 / Accepted: 17 April 2024 / Published online: 18 May 2024
This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2024

Abstract
An obstacle to effective teaming between humans and AI is the agent's "black box" design. AI explanations have proven benefits, but few studies have explored the effects that explanations can have in a teaming environment with AI agents operating at heightened levels of autonomy. To address this research gap, we conducted two complementary studies, an experiment and participatory design sessions, investigating the effect that varying levels of AI explainability and AI autonomy have on participants' perceived trust in, and competence of, an AI teammate. The results of the experiment were counter-intuitive: participants perceived the lower explainability agent as both more trustworthy and more competent. The participatory design sessions further revealed how a team's need to know influences when and what teammates need explained to them by AI teammates. Based on these findings, several design recommendations were developed for the HCI community to guide how AI teammates should share decision information with their human counterparts, considering the careful balance between trust and competence in human-AI teams.

Keywords Human-AI teaming · Adaptive autonomy · Explainable AI · Artificial intelligence

* Allyson I. Hauptman, [email protected]
Beau G. Schelble, [email protected]
Wen Duan, [email protected]
Christopher Flathmann, [email protected]
Nathan J. McNeese, [email protected]
1 School of Computing, Clemson University, 821 McMillan Rd., Clemson, SC 29631, USA

1 Introduction

Modern advances in artificial intelligence (AI) continue to enable the creation of AI agents that can operate with increasingly higher levels of autonomy (LOA) (Chen et al. 2022). These higher LOA center around agents capable of performing tasks from start to finish with minimal human input and direct control (O'Neill et al. 2020; Parasuraman et al. 2000a), which enable AI agents to fulfill independent roles in a variety of teams, organizations, and task environments (McNeese et al. 2018; Wilson and Daugherty 2018). Consequentially, AI agents, in many situations, have become more than tools used by the team, but rather part of the team (O'Neill et al. 2020; McNeese et al. 2018). These new human-AI teams are able to leverage the technical strengths of AI and present humans and organizations with the ability to overcome existing struggles with all-human teams, such as operating in data-intensive and geographically distant contexts (Nyre-Yu et al. 2019; Chen 2023). While the unique information processing capabilities of AI make the prospect of these teammates new and exciting, their use also comes with unique challenges for teams.

AI agents capable of taking on independent team roles can operate with less human monitoring and control. Still, in complex environments involving elevated levels of uncertainty and risk, this lack of human oversight can lead to disastrous outcomes (Pedreschi et al. 2019; Suzanne Barber et al. 2000). This is because as systems execute decisions more independently, human situational awareness of the system's decisions decreases (Wickens et al. 2010). This issue is exacerbated by human distrust of AI systems that make decisions within a "black box" algorithm, which hides what and how the AI is processing information to make its decisions (Castelvecchi 2016). In response to this, methods for AI to provide explanations for their decisions have been developed
as one way to reduce the mystery of highly autonomous AI's "black box" decision-making nature (Shin 2021a; Weitz et al. 2019). However, there is a trade-off between too much and too little explanation (Dhanorkar et al. 2021). While explanations provide teammates with detailed information to better understand the rationale and intention behind the AI's decision, sometimes too much information can lead to cognitive overload and an inability for humans to focus on their own tasks, which can significantly frustrate a team's ability to work interdependently (Wang et al. 2019). Additionally, AI agent explanations must fit the communication needs of their human teammates (Stowers et al. 2021), which vary based explicitly on the team's working environment (Jarrahi et al. 2022). This means that the specific information that an AI agent communicates in its explanations is also extremely important to consider.

Previous research on AI autonomy has found that it would be beneficial if AI teammates were capable of operating at multiple levels of autonomy, based on changing tasks and environments (Hauptman et al. 2022; Zieba et al. 2010). The established benefits of dynamic autonomy levels raise the question of whether AI teammates should also possess different levels of explainability. There is already evidence to support the idea that explainability should not be a static feature, as human-computer interaction (HCI) research has found that AI needs to explain itself differently based upon what and to whom it is communicating (Dhanorkar et al. 2021). This is especially important for human-AI teams (HATs) because humans want the AI to adapt its interaction behaviors to be as helpful as it can be while keeping humans knowledgeable of essential information (Liao et al. 2020). In fact, research shows just the perception of AI as adaptive can increase human performance (Kosch et al. 2023). Despite robust research into how to make AI algorithms more transparent and explainable to the user (Larsson and Heintz 2020; Waltl and Vogl 2018; Hussain et al. 2021), there have been increased calls for more research into the content and frequency of explanations that humans need while interacting with an AI agent (Weber et al. 2015; Schoenherr et al. 2023).

Explainability and autonomy levels substantially contribute to trust development and growth in human-AI teams. The ability to understand an AI agent's capabilities and decisions is fundamental to a human's notion of its trustworthiness (Jacovi et al. 2021; Caldwell et al. 2022). This is because it allows them to predict the AI's future behavior (Jacovi et al. 2021). In fact, research into explainable agents in human-machine teaming has shown that explanations can substantially increase human teammate trust in the robot's decisions (Wang et al. 2016). Previous research on information needs shows that human interactions with technology affect the information they will perceive, accept, and trust from that technology, particularly in teams (Huvila et al. 2022). However, individuals' information needs may not be static and constant as they interact with technology. For instance, increasing familiarity with a specific technology eliminates the need to understand every detail of how it works (Hauptman et al. 2022). Additionally, the degree to which a human is "in the loop" of AI's decision-making process may fundamentally change how much and what information humans need to know and, in turn, change how they interact with and perceive the AI (Abbass 2019). Despite research into how AI explainability affects human behaviors, little is known about the relationship between how much an AI teammate explains and how much autonomy it exhibits in executing its tasks. Lower autonomy systems must generally communicate more with humans due to the requirement for human input in their decisions. Thus, AI that provides a high or low level of explainability may also be perceived by a human teammate as even more or less autonomous. In order to investigate this relationship, this study explores the following research questions:

RQ1: How does teaming with an AI agent with a high or low level of explainability affect the human teammates' perceived trust and competence of the AI at both a low and high level of autonomy?

RQ2: How should the content of AI explanations change as the AI teammate's autonomy level changes?

Given the complex and context-dependent nature of teaming and explainability requirements, this research takes a mixed methods approach, utilizing two studies to answer the above research questions. In the first study, we conducted a 2x2 (LOA x Explainability Level) online networking experiment to examine the effects of different LOAs and AI explainability levels on participants' perceived trust and competency of their AI teammates. Then, in the second study, we held participatory design sessions with twelve of those participants in order to further understand the explainability needs and desires of human teammates for AI agents with varying LOAs. The identified dimensions of the dynamic relationship between the levels of autonomy and explainability of AI teammates are heavily grounded in both the participants' professional experiences and interactions with the AI in these studies. The resulting discussion and design recommendations provide an empirical starting point for the HCI community to model and understand the optimal explainability levels for AI teammates operating with different autonomy levels. This greatly contributes to the body of human-AI teaming literature as the community seeks to envision and design artificial agents that can work closely with and support humans in complex team environments.
2 Related work

In this section, we will lay the groundwork for our studies, beginning with the need for and types of AI explanations, followed by levels of AI autonomy. Finally, we will articulate the research gaps that motivate our research.

2.1 AI explanations

Previously, AI models have often been described as a black box into which information is simply input; the box "does its magic" and produces some form of output (Xu et al. 2019). Research has shown that these black-box models can have significant negative impacts when AI is used in complex situations (Cohen et al. 2021), such as the inability to track where something went wrong (Yu and Alì 2019). Some within the AI community have indicated a distinct lack of work into the ethics surrounding AI design (Slota et al. 2022). Cohen and colleagues found that minor mistakes in the training phase often led to severe issues with the model that could be relatively difficult to find and understand because of the model's lack of explanation (Cohen et al. 2021). Additionally, evaluations of medical AI technologies have demonstrated that black-box AI agents hinder their use and effectiveness due to ethical concerns (Duan et al. 2019). Opaque AI can have major negative implications for the humans with whom it interacts. For instance, research on AI-enabled recommender systems showed that opaque recommendations could decrease user self-confidence (Shin 2021a). In response to these challenges, a quickly growing area of research is ways to design AI to better explain its reasoning and actions to humans (Xu et al. 2019; de Lemos and Grześ 2019; Pokam et al. 2019).

The research on explainable systems is expanding at such a rate that multiple reviews in the HCI (Speith 2022; Mueller et al. 2019) and computer science (Vilone and Longo 2020; Das and Rad 2020) communities have recently proposed new methods for organizing the subject. While these reviews focus widely on how the AI itself should be designed, they lack a human-centered approach to AI explanations. A recent study on the role of information exchange in designing explainable systems argued that the current trend towards using AI techniques to explain AI is insufficient, and the explanation recipients need to be more involved in how AI explanations are created and given (Xie et al. 2022). There are various reasons for this need, including the importance of effective human-centered AI explanations in building trust in AI algorithms and overcoming gaps in AI transparency (Shin 2021a). User-centered explanation solutions attempt to alleviate these issues by developing AI that explains not only what it did but also why it did it in ways a human would understand (Wang et al. 2019). In regards to the what, the AI must provide its output in a readable manner, a concept often referred to as interpretability (Lipton 2018). Research shows that this interpretability encourages user trust in AI algorithms (Shin 2021b). As a function of that interpretability, the audience must be able to grasp what the output means, referred to as the agent's understandability (Joyce et al. 2023). While these terms often overlap, interpretability refers to the AI's ability to explain an abstract concept, while understandability refers to the AI's ability to make it understandable to an end-user (Vilone and Longo 2020). Both of these aspects contribute to the delivery of an effective AI explanation (Marcinkevičs and Vogt 2020). This gap in considering how the explanations provided by an AI teammate are received by a human teammate is a driving motivation behind this research. This is why the AI explanations in the high explainability condition in the first study include what information the AI considered in accomplishing its task.

Explainability exists on a spectrum regarding the type and amount of explanations that the AI can provide. For instance, Dazeley and colleagues organized Levels of AI Explanation into a pyramid based on human psychological needs (Dazeley et al. 2021). Other researchers have classified an AI's level of explainability based upon the AI's algorithms and capabilities (Arrieta et al. 2020; Sokol and Flach 2020). Most of these descriptions fall into two main categories: low-level vs. high-level explainability models. Low-level XAI gives basic information about its decision, potentially displaying the algorithm(s) behind it or giving a brief description of what it is supposed to do or the results it found. High-level XAI gives more detailed explanations of the entire process, including its decision logic (Sanneman and Shah 2020; Miller 2019). This is arguably an essential step for an AI teammate because the degree to which humans understand an AI agent can greatly affect their acceptance and trust of it (Xu et al. 2019; Bansal et al. 2021). Some explainability research has articulated this as
the stakeholder variable, a concept stating that, because the goal of explanations is to satisfy the expectations and goals of a stakeholder, that stakeholder's perceptions of the explanations are important (Langer et al. 2021).

Explanations are only as valuable as they are understood and accepted by the persons receiving them (Dazeley et al. 2021). In approaching what may be considered a "low" vs. a "high" level of explainability, we turn to the point of view of research done by Lombrozo and colleagues, which suggests that humans perceive the level of explainability to be higher when the explanations communicate more events in the most coherent manner (Lombrozo 2006). This follows the research of Dazeley and colleagues, who found that the more contextual information the explanations include, the higher their value to the persons receiving the explanation (Dazeley et al. 2021). It also reflects human-AI research that shows increased acceptance of AI-generated communications that appear more human-like (Shin 2022). This implies that very high-level explanations from an AI teammate should be frequent, human-readable, and provided within the context of the team activity. This study utilizes those principles in designing high-level explanations for the AI teammate in the experiment.

The introduction of AI explanations directly addresses a variety of the damaging pitfalls brought about by the black-box nature of AI agents. Some of these pitfalls, as discussed above, lead directly to decreased trust in and acceptance of AI decisions (Zhou and Chen 2019). This is why understanding the effects and design implications of explainability in HATs is so important, as trust in AI's explanations is a key part of its acceptance by the humans with whom it interacts (Ehsan and Riedl 2019). Still, we cannot assume that increasing the level of explainability from a low-level to a high-level model will directly lead to increased trust and performance. In a study of human-agent teaming in Minecraft, Paleja and colleagues found that while AI teammate explainability led to greater situational awareness and increased performance for novices, it did not equate directly to increased performance for more experienced individuals (Paleja et al. 2021). In fact, when the AI's explanations evolved to include a full decision tree, the novice participants experienced cognitive overload (Paleja et al. 2021). The literature clearly shows a need to strike the right balance between an AI teammate's explanations and its human teammates' cognitive capabilities in order to promote intra-team trust and performance in HATs (Nakahashi and Yamada 2021), particularly in complex and high-risk environments (Ha et al. 2020).

2.2 Levels of autonomy

Addressing the various levels of autonomy (LOA) for AI in human-AI teaming is the final concept necessary to motivate the current research, as it coincides directly with the need for XAI. Artificial agents can be programmed to operate with different levels of autonomous behavior. In order to categorize these levels, autonomy researchers have adapted the LOA (Parasuraman et al. 2000b) into three categories of autonomy: no autonomy, partial agent autonomy, and high agent autonomy (O'Neill et al. 2020). AI that requires human input to perform any decision or action is not, according to the literature, actually autonomous, as it performs no independent role (O'Neill et al. 2020). Agents with partial and high LOAs are capable of taking on independent functions that not only define their autonomous behavior but also make them capable of taking on independent team roles (O'Neill et al. 2020), making them inherently more integral to the team than a simple tool.

Teams operate in dynamic, complex environments that change over time, and AI teammates need to be able to change their behavior and capabilities to match such changes (Suzanne Barber et al. 2000). This might mean that AI teammates may need to change their LOA over time, a concept known as adaptive autonomy (McGee and McGregor 2016). Furthermore, teammates not only need to adapt to their environments but also to their human teammates (McNeese et al. 2018; Richards and Stedmon 2017). This concept heavily motivates these studies' inquiries into the intersection of autonomy and explainability. If AI teammates need to adapt their autonomy levels in order to fulfill their team role while simultaneously adapting to the needs of their human teammates, then the explanations they provide to those human teammates may likewise need to adapt as their autonomy levels change.

Our review of the existing literature on autonomy and explainability levels in human-AI teaming presents a couple of intriguing research gaps. Previous work indicates that the black-box design of AI agents frustrates the human ability to understand and trust an AI teammate's decisions (von Eschenbach 2021), but to what degree that frustration varies as AI autonomy varies is uncertain. Additionally, despite this recorded frustration, there is also evidence that higher-level explainability models come with negative consequences and do not always lead to increased trust and performance (Paleja et al. 2021). To address these gaps, we designed two complementary studies that jointly provide a systematic understanding of the relationship between humans' nuanced needs for explainability and their AI teammate's level of autonomy.
3 Study 1

3.1 Methods

The experiment conducted for Study 1 specifically explores the effect of AI autonomy level and explainability on humans' perceptions of competence and trust in their AI teammates. Study 1 utilized a mixed 2 (AI Autonomy Level: Low, High) x 2 (AI Explainability Level: Low, High) experimental design, with the autonomy level of the AI agent manipulated between-subjects and the AI explainability level manipulated within-subjects. Participants teamed up with a single AI teammate to complete the Cisco network simulation program Packet Tracer. These human-AI team dyads completed two iterations of the Packet Tracer activity (described below). In the following section, we will overview the procedures for developing and implementing the experimental platform and performing the experiment with the participants.

3.1.1 Networking task

Study 1 used Cisco's educational networking simulation program, Packet Tracer, one of the most widely used visual learning methods for computer networking (Janitor et al. 2010). This program permits users to simulate the physical cabling of networking devices and the software configuration of the devices, making it ideal for the current study, as it allows for multiple networking tasks that could be performed by both the human and the AI team members simultaneously. It also showcased a very realistic human-AI teaming scenario, where an AI agent can quickly execute computer commands while a human accomplishes the physical tasks of which an AI agent is incapable. All four conditions of the task were selected for beginner-level participants, such that they would all be equally challenging and time-consuming for participants. Packet Tracer is also a lightweight program, meaning we could place the program on a virtual machine that our participants could log into from anywhere in the world. Screenshots of the virtual platform participants engaged with are shown in Fig. 1.

Fig. 1  Screenshot of the experiment platform. Participants are presented with the proper network devices and cables and are responsible for selecting and moving the blue console cable between devices for the AI to access the right device

All four tasks focused on the setup and configuration of a small-scale local network. During the experiment, participants played the part of the Physical Network Tech, responsible for powering the devices and moving physical cables, which in Packet Tracer equates to the participant dragging and dropping the cables between the correct devices. Meanwhile, the AI agent played the role of Software Tech, responsible for all the actual device configurations. This job role also helped minimize the effect of lower experience levels, as the tasks participants needed to perform were relatively simple and easy to learn through a short practice exercise. Pilot sessions for this study showed that a practice exercise prior to the start of the actual experiment was indeed extremely helpful to inexperienced participants, and so all participants did a practice exercise with the first author at
the start of their virtual session to ensure they understood their tasks and how to communicate with their AI teammate.

3.1.2 AI teammate

The AI teammate implemented in Study 1 utilized the Wizard of Oz methodology (WoZ), a common technique within the HCI community (Kelley 2018). This technique enables researchers to simulate more advanced design features like AI teammate communication to garner insights regarding AI teammates of the future. The virtual platform further supported this technique, as the participants did not know that the chats they had with the AI were actually being conducted by a confederate researcher following a pre-made script developed throughout several piloting sessions to ensure accuracy and applicability. All communication with the participants from the researchers occurred using a separate chat to maintain the script.

Between-subjects manipulation: autonomy level Exit interviews from the pilot sessions showed that because the AI exists only as a chat agent, an explicit permission phrase was the best option to effectively delineate autonomy levels to participants. Specifically, in the "Low Autonomy" condition, the confederate playing the AI teammate asked permission for all actions taken during the exercises and could not move forward in the task without the participant granting that permission. Alternatively, in the "High Autonomy" condition, the confederate informed the participant what the AI would do but did not ask or require their permission to perform the action. For both conditions, the confederate had a set of predetermined responses to any questions posed to the agent by a participant that were in line with the AI agent's supposed autonomy level.

Within-subjects manipulation: explainability Finally, the pilot sessions also informed the design of the AI explainability manipulations by tying them to the Packet Tracer task. In particular, pilot participants wanted the explanations to be contextually tied to the interface. As such, the "High Explainability" AI teammate was defined by the AI opening the console with the commands it used to program the networking device and explaining why it used the commands, whereas the "Low Explainability" AI was distinguished by the AI simply telling the participant when it was starting and completing a task. These are clear differences in the amount of information the AI teammate provided to the participant in terms of both content and frequency. Exit interviews from the pilot sessions indicated that the console display needed to be included in the "High Explainability" condition for participants to perceive the manipulation as expected. This visual form of explanation has been shown to allow participants to better calibrate trust in AI (Liu et al. 2023).
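To make the two manipulations concrete, the sketch below shows one way a confederate's pre-made script could be organized around a permission phrase (autonomy) and a console echo plus rationale (explainability). This is a minimal illustration written for this article, not the materials actually used in the study; the condition labels follow the paper, but the TaskStep structure, the message templates, and the example IOS-style command strings are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TaskStep:
    """One configuration action the 'Software Tech' AI performs."""
    description: str     # e.g., "configure the router's LAN interface"
    commands: list[str]  # console commands the AI reports running
    rationale: str       # why those commands are needed

def chat_message(step: TaskStep, autonomy: str, explainability: str) -> str:
    """Compose the scripted chat line for a given condition pair.

    autonomy:       "low"  -> ask permission before acting; "high" -> inform only.
    explainability: "low"  -> start/finish notices only;
                    "high" -> echo the console commands and explain why.
    """
    if autonomy == "low":
        intro = f"May I {step.description}? I will wait for your go-ahead."
    else:  # high autonomy: no permission phrase, just a notification
        intro = f"I am going to {step.description} now."

    if explainability == "low":
        return intro  # a later "Task complete." notice would follow

    # High explainability: include the console display and the reasoning.
    console = "\n".join(f"  > {cmd}" for cmd in step.commands)
    return f"{intro}\nConsole:\n{console}\nWhy: {step.rationale}"

# Hypothetical example step for a small local network
step = TaskStep(
    description="configure the router's LAN interface",
    commands=["interface GigabitEthernet0/0",
              "ip address 192.168.1.1 255.255.255.0",
              "no shutdown"],
    rationale="the LAN devices need a default gateway on this subnet",
)
print(chat_message(step, autonomy="low", explainability="high"))
```

In the actual study the confederate selected responses by hand from the piloted script; the point of the sketch is only that the permission phrase operationalizes the autonomy manipulation while the console echo and rationale operationalize the explainability manipulation.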
Table 1  Participant demographic information
Participants: 44 (M_Age = 34.63)
Gender: 25 men, 19 women, 0 non-binary/third gender, 0 other
Ethnicity: 31 Caucasian, 4 Black/African-American, 1 American Indian/Alaska Native, 6 Asian, 2 other
At least some information technology/networking experience: 37

3.1.3 Participants

Following approval from the Clemson University Institutional Review Board, the current study recruited 44 participants, with 19 identifying as women and the rest identifying as men. The average age of participants was 34.63 (for additional demographic information, see Table 1). Based on an a priori power analysis with an effect size of 0.13, in order to meet power, this experiment required a minimum sample size of 42 total participants, which was achieved. Participants were recruited using email solicitation and snowball sampling of individuals with experience in information technology and/or computer science disciplines. This inclusion criterion was implemented to help control for the potential confound of subject matter expertise by recruiting participants with generally equal levels of knowledge and aptitude for the Packet Tracer task, which was achieved as shown in Table 1. However, significant experience in networking work was not an explicit inclusion criterion, as the Packet Tracer task included training on the specific topics necessary to complete the task successfully and was designed for beginner knowledge levels.

3.1.4 Procedure

When participants agreed to participate in the experiment, they received an email with the task and descriptions of the exercises they would perform. They also received instructions for logging into the virtual machine using Chrome Remote Desktop. In case the participant was not familiar with the Packet Tracer program, a tutorial video was also included. Five minutes before their designated time, they received the access code for the machine and the link to the survey. Prior to beginning the experiment, participants completed the pre-task survey, which covered informed consent and demographic information.

After completing the pre-task survey, participants went on to complete a training period. The training period included written instructions with illustrations that described the task
and how to complete it with their AI teammate, defining the two roles and their interdependencies. This written training period was followed up with a live training phase where participants engaged in a live practice round of the task with their AI teammate and could ask the researcher questions should they have any. Once the training session was completed, the participants were randomly assigned to one of the two between-subjects conditions of either low or high AI teammate autonomy. All participants performed two exercises, one with an AI agent with High Explainability and one with an agent with Low Explainability. Half of the participants received the High Explainability condition first, and half received the Low Explainability condition first. This counter-balancing minimized the effect of participants having increased comfort and understanding with the exercises in the second condition they received and helped mitigate any potential spill-over effects between within-subjects conditions.

Following this training, participants began interfacing with their AI teammate for the first exercise. Upon the completion of the first exercise, participants were prompted to complete the next part of the survey, which measured their perceptions of the agent as trustworthy and competent for that exercise. Once complete, they conducted the second exercise, followed by the remaining portions of the survey. Once the survey was complete, the remote session was terminated. Participants received a debrief message following the completion of the survey explaining the intent of the exercise and the use of the WoZ setup, and thanking them for their time.

3.1.5 Measures

Human-AI interaction research has shown that the perceptions that users have of an AI agent are vital to the design of explainable systems, as these perceptions directly affect the user's acceptance and trust in the agent (Shin 2020). Thus, the main measures utilized in this study were self-reported perceptions of the participants on 5-point Likert scales, as described in the following paragraphs.

Trust in the agent Human trust in their AI teammates is integral to both their acceptance of the AI and the team's overall performance (Costa et al. 2018; Centeio Jorge et al. 2022). For this reason, trust was measured in the post-task survey after each interaction exercise with the AI agent using a 3-item 5-point Likert scale (1=strongly disagree, 5=strongly agree). Items included "The autonomous agent I worked with was trustworthy," "The autonomous agent I worked with could be trusted to complete the assigned tasks," and "I did not feel the need to monitor the autonomous agent's actions," and the scale had a reliability of α = .80. These questions were based on the outcomes of trust defined by Lumineau (Lumineau 2017) and adapted from previous use in human-AI teaming research (Schelble et al. 2022a, b).

Perceived competency of the agent Perceived competency was measured in the post-task survey after each interaction exercise with the AI agent using a 3-item 5-point Likert scale (1=strongly disagree, 5=strongly agree) adapted by the authors based on similar perception-of-competence scales utilized in AI research (Gieselmann and Sassenberg 2023). Items included "The autonomous agent I worked with was competent at its role," "The autonomous agent I worked with was capable of completing its assigned tasks," and "The autonomous agent I worked with was capable of joint problem solving," and the scale had a reliability of α = .79.

Perceived awareness of the AI's actions Situation awareness is an important human factor for human-centered AI design (Chignell et al. 2023) and needed to be included in some fashion. We based this perceptual awareness measure on Tier 1 of the three-tier situational awareness model, where an entity must accurately perceive their surroundings (Endsley 1995). Thus, perceived situational awareness was measured in the post-task survey after each interaction exercise with the AI agent using a 1-item 5-point Likert scale (1=strongly disagree, 5=strongly agree) developed by the authors. The item stated, "I felt aware of the actions my autonomous teammate was taking."

Understanding of the AI's actions The second tier in the three-tiered model of situational awareness addresses an entity's ability to make sense of what they perceive, or understand their surroundings (Endsley 1995). Thus, we measured the participant's understanding of the AI's actions in the post-task survey after each interaction exercise with the AI agent using a 1-item 5-point Likert scale (1=strongly disagree, 5=strongly agree) developed by the authors. The item stated, "I understood why my autonomous teammate took certain actions." The final element of the three-tiered situational awareness model was explored in the participatory design sessions in Study 2.
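As a reading aid, the snippet below shows how the internal consistency reported for the multi-item scales (e.g., α = .80 for trust) can be computed from item responses using the open-source pingouin library. The paper does not state which software was used, and the data frame and column names here are hypothetical.

```python
import pandas as pd
import pingouin as pg  # pingouin provides cronbach_alpha()

# Hypothetical survey export: one row per participant per exercise,
# with the three trust items rated on the 1-5 Likert scale.
responses = pd.DataFrame({
    "trust_1": [4, 5, 3, 4, 2, 5],
    "trust_2": [4, 4, 3, 5, 2, 5],
    "trust_3": [3, 5, 2, 4, 1, 4],
})

# Cronbach's alpha and its 95% confidence interval for the 3-item scale
alpha, ci95 = pg.cronbach_alpha(data=responses)
print(f"Cronbach's alpha = {alpha:.2f}, 95% CI = {ci95}")

# A scale score per response (the item mean) would then feed the ANOVAs below.
responses["trust"] = responses[["trust_1", "trust_2", "trust_3"]].mean(axis=1)
```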
3.2 Results

To address the stated research questions, a series of 2 (Level of Autonomy: Low, High) x 2 (AI Explainability: Low, High) mixed model ANOVAs were conducted on participants' survey responses after each teaming experience. The level of autonomy was treated as a between-subjects factor, while AI explainability was analyzed as a within-subjects factor. The following sub-sections review analyses on trust, perceived competence, perceived awareness, and understanding of the AI teammate, concluding with an analysis of the chat data to reveal participants' objective need for AI explainability during the task. The following results address RQ1, which sought to investigate how increases in the information given by AI explanations change humans' perception of their AI teammates at varying LOA.
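For readers who want to run this kind of analysis themselves, the sketch below fits one 2 x 2 mixed-model ANOVA of the type described above using the open-source pingouin library. The paper does not report which statistics package was used; the file name and column names are hypothetical, and the same call would be repeated for each dependent variable.

```python
import pandas as pd
import pingouin as pg

# Hypothetical tidy data: one row per participant per within-subjects condition.
# 'autonomy' is the between-subjects factor, 'explainability' the within-subjects
# factor, and 'trust' the 1-5 scale score for that exercise.
df = pd.read_csv("study1_scores.csv")  # columns: pid, autonomy, explainability, trust

aov = pg.mixed_anova(
    data=df,
    dv="trust",              # repeat for competence, awareness, understanding
    within="explainability",
    between="autonomy",
    subject="pid",
)
# F ratios, uncorrected p-values, and partial eta-squared per effect
print(aov[["Source", "F", "p-unc", "np2"]])
```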
3.2.1 Trust in the AI teammate

The main effect of AI teammate autonomy level on trust in the AI teammate was non-significant (F(1, 42) = 0.30, p = 0.59, η² = 0.01). However, the main effect of AI teammate explainability on trust in the AI was significant (F(1, 42) = 4.42, p = 0.04, η² = 0.10; see Fig. 2), and this was a medium-sized effect (Cohen 1988). Specifically, participants trusted the high-explainability AI teammate less (M = 1.77, SE = 0.14) than they trusted the low-explainability AI teammate (M = 1.96, SE = 0.14). Lastly, the interaction effect between autonomy and explainability levels was non-significant (F(1, 42) = 0.57, p = 0.45, η² = 0.01).

Fig. 2  Trust in the AI on a scale of 1 to 5, by the level of autonomy for high and low explainability (error bars represent standard error of the mean)

This result shows that the participants felt the AI teammate that explained all of the actions it took in completing its team tasks through the chat was less trustworthy than the AI that only told them when it was starting and completing a task. This result suggests that in some teams the additional communication from the AI teammate, possibly due to communication overload, is actually counterproductive to building trust. It indicates that humans working in human-AI teams want information from their AI teammates only at appropriate intervals. In this case, the additional information hurt the participants' trust in the AI when it came during the task itself.

3.2.2 Perceived competence of the AI teammate

There was no significant main effect of AI teammate autonomy level on participants' perceived competence of the AI (F(1, 42) = 1.50, p = 0.23, η² < 0.01). However, there was a significant main effect of AI teammate explainability level on perceived competence (F(1, 42) = 5.73, p = 0.02, η² = 0.12; see Fig. 3), and this was a medium-sized effect (Cohen 1988). Specifically, participants rated the high explainability AI as significantly less competent (M = 1.67, SE = 0.11) than they rated the low explainability AI (M = 1.84, SE = 0.11). Lastly, the interaction effect between AI teammate autonomy and explainability level was non-significant (F(1, 42) = 0.19, p = 0.67, η² < .01).

Fig. 3  Perceived competency of the AI on a scale of 1 to 5, by the level of autonomy for high and low explainability (error bars represent standard error of the mean)

These results show that the low explainability AI teammate was perceived as significantly more competent at completing its task work than the high explainability AI teammate. These results provide insight into RQ1 by showing that participants associated less explainability from the AI with a higher level of competence. This result also presents further support for the previous finding on trust. Specifically, it is intriguing that participants felt the less explainable AI was both more competent and more trustworthy than the high explainability AI. This shows that increasing AI explainability is not always appropriate or helpful for teaming. This disconnect between the XAI movement and these results sets up a notable example of how adaptivity may be useful not only in autonomy levels but also in explainability levels, especially when it comes to complex social contexts like teaming.

3.2.3 Perceived awareness of AI teammate actions

The main effect of AI teammate autonomy level on participants' awareness of AI actions was non-significant (F(1, 42) = 3.15, p = 0.08, η² < 0.01). Furthermore, the main effect of AI teammate explainability level on awareness was also non-significant (F(1, 42) = 1.99, p = 0.17, η² = 0.01). The interaction effect between AI teammate autonomy and explainability levels on awareness was also non-significant (F(1, 42) = 0.11, p = 0.75, η² < 0.01) (see Fig. 4).
Fig. 4  Awareness of AI actions on a scale of 1 to 5, by the level of autonomy for high and low explainability (error bars represent standard error of the mean)

While participants' awareness of their AI teammate's actions was not significantly affected by the explainability or autonomy level of the AI teammate, values for awareness were higher for participants in the high autonomy condition, a result that should be considered in future work.

3.2.4 Understanding of the AI teammate

The main effect of AI teammate autonomy level on participants' understanding of the AI teammate was non-significant (F(1, 42) = 2.40, p = 0.13, η² < 0.01). Additionally, the main effect of AI teammate explainability level on understanding was non-significant (F(1, 42) = 1.80, p = 0.19, η² = .01). Lastly, the interaction effect between AI teammate autonomy level and explainability level was non-significant (F(1, 42) = 0.13, p = 0.71, η² < 0.01) (see Fig. 5).

Fig. 5  Understanding of AI actions on a scale of 1 to 5, by the level of autonomy for high and low explainability (error bars represent standard error of the mean)

These results show that participants in this experiment did not feel that increased explainability or autonomy significantly affected their understanding of the AI teammate's actions. One reason for this may be that the task the participants were asked to perform was familiar to the majority of the participant population, which was targeted for having IT and/or networking experience. It is worth noting that, based on the chat log data, only 14 percent of participants requested any additional explanation from the AIs. This suggests that trust and understanding are not closely tied together when it comes to AI explanations and that increasing one will not directly cause an increase in the other. This discrepancy emphasizes that other factors, such as the content explored in Study 2, are important considerations for designing AI that best supports both human teammate trust and understanding.

4 Study 2

While Study 1 explored how the amount of information that an AI teammate provides at different LOAs affects human teammate perceptions (RQ1), it did not address what information AI teammates should communicate. The following section details the methods and results of Study 2, which encompassed the two qualitative participatory design sessions and the exploration of RQ2.

4.1 Methods

In order to further understand and expand upon the results of Study 1, twelve of the study's participants were recruited to participate in one of two participatory design sessions. Such sessions have been shown to produce realistic, innovative design solutions within the HCI community (Thieme et al. 2023). These participatory design sessions took place over Zoom after the experiment's completion. In this way, participants had a fresh idea of what kinds of teaming scenarios and roles an AI teammate might occupy and the information it would need to provide to human teammates. Study 2 utilized a similar IT networking scenario for the participants in order to explore the content of an AI teammate's explanations.

4.1.1 Participants

All participants who completed Study 1 were asked if they would be willing to participate in a participatory design session relating to the experiment. For our sessions, we decided that a flexible, conversational workshop would most benefit our focus on the needs of the human teammate (Weber et al. 2015). We aimed to schedule five to seven participants per
session, as we were sensitive to the fact that should the group become too large, it is easy for a few individuals to dominate the conversation (Weber et al. 2015). We provided the initial twenty-three volunteers with the time slots of the sessions, and through this schedule ended up with twelve total participants, five for Session 1 and seven for Session 2. The demographics of these participants are reflected in Table 2.

Table 2  Participatory design session participants
Session  Gender  Age  Occupation  Ethnicity
1  Woman  34  Cyber security  American Indian
1  Man  33  Cyber security  Asian
1  Man  29  Network engineering  White
1  Man  33  Software development  White
1  Man  33  Cyber security  White
2  Woman  67  Insurance sales  White
2  Woman  27  Graduate student  White
2  Woman  26  Graduate student  White
2  Man  27  Software development  White
2  Man  69  Electrical engineering (Ret.)  White
2  Woman  66  IT project management (Ret.)  White
2  Man  33  Copyright design  White

4.1.2 Design scenario and questions

Prior to the sessions, the lead author conducted a pilot session with three individuals who provided feedback on the details of the scenario and procedure for the session. The primary outcome from the pilot session was the switch to Google Jamboard for collaboration between participants, whereas, in the pilot, the group had utilized a shared Google Doc. Both participatory design sessions were conducted over Zoom, with the lead author directing the session. For each session, the scenario, the design questions, and the schedule for the session were described to participants. The scenario reflected that of the experiment, in which the participants played the role of IT professionals for an IT help team on a university campus. Their AI teammate was in charge of making software configuration changes to devices connected to the campus network as needed. The participants were told that the agent progressively changed to lower levels of autonomy throughout the incident response cycle, according to a previous study on adaptive AI in incident response (Hauptman et al. 2022). Specifically, the AI teammate would begin the incident response at close to full autonomy and decrease to partial autonomy as the incident response cycle entered the containment phase. Participants were presented with three Design Questions that incorporated what, how, and when AI teammates should explain their decisions to the team:

DQ1: What would you want/need your AI teammate to explain?
DQ2: How would you want/need it communicated (textual, visual, audible, physical methods)?
DQ3: When does the amount of explainability increase/decrease?

4.1.3 Session procedure

Participants received a Zoom invitation and Jamboard link 10 min prior to the session. Once all participants were logged on, the lead author reviewed the consent to the study and received verbal agreement to record the session. After initiating the recording, the lead author reviewed the scenario and design questions and answered any questions from the participants before entering into the semi-structured session. Participants were then asked to individually brainstorm their answers to the design questions (how much and in what way they would want their AI teammate to explain its work, and under what conditions that would vary) on the Jamboard. Once all the participants announced their completion, the group came together and discussed their thoughts. Throughout this process, the participants changed the notes on the Jamboards, the final products of which were used for analysis. The sessions concluded with the first author reviewing the design questions and the group answers to the questions, and inviting any additional or closing comments.

4.1.4 Qualitative analysis

At the conclusion of the session, the entire Zoom session was automatically transcribed by the Zoom software. The first author then reviewed both recordings by hand to fix transcription errors and provided these copies to the other authors for analysis. The Zoom transcripts and Jamboards were coded using a thematic coding process (Gavin 2008; Braun and Clarke 2012), following which the authors conducted axial coding to develop the main themes prevalent in the data (Scott and Medaugh 2017). This reflexive process permitted the study data to guide the analysis (Blair 2015). First, the authors coded the transcripts line by line. Next, these codes were grouped into like categories. Finally, the authors combined groups into larger themes that related to the design and research questions. Once these themes were developed, they were considered in concert with the quantitative data from the experiment in order to determine how they did or did not help explain the results of the study, as well as to determine the main themes in how an adaptive AI teammate should functionally explain itself to its teammates. The participatory design sessions were vital to uncovering this portion of the design recommendations, as
they allowed the participants to think through situations and interaction methods beyond what was presented in the experiment. Indeed, as we will show in the following sections, the sessions revealed several important considerations for designing an explainable AI teammate with differing levels of autonomy and explanation.

4.2 Results

The participatory design sessions provided two main artifacts for analysis: the Jamboards, shown in Figs. 6 and 7, and the session transcripts, both of which help answer the three DQs presented to the participants. DQ1 and DQ2 are nested under the study's RQ1 and DQ3 under RQ2. While the experiment provided some significant data for answering RQ1, it provided no significant data in terms of RQ2. In contrast, by not specifically probing the participants about autonomy level, the sessions led participants to bring it into their discussions and provide us with ample data that addressed RQ2. From the data collected between the two sessions, we identified seven main themes that the participants agreed upon in regard to the three design questions. In this section, we will use the participants' written and verbal comments to illustrate these themes in detail.

Fig. 6  Participatory design session 1 Jamboard. This session was much more talkative, and the notes were more constructed after the fact. Participants emphasized the importance of building trust and competence in an AI through increased explainability early on in its incorporation into the team

Fig. 7  Participatory design session 2 Jamboard. This group of participants spent a lot of time getting their thoughts on the board before the discussion. This group emphasized the importance of the adaptability of the AI teammate in terms of changing its explainability and autonomy based on risk and team experience

4.2.1 DQ1: Explanations of confidence and situation

Both sessions first focused on the what aspect of AI explanations, or the main contents that human teammates would need an AI teammate to explain to them. Two main themes emerged from the PD sessions as the most important for an AI teammate to explain: 1) the decision logic behind, and the confidence in, the AI teammate's decisions; and 2) contents (i.e., terminologies, situation descriptions) that help align teammates' knowledge and understanding of the shared task and situation.

Early in both sessions, participants expressed the desire to see the logic that an AI teammate used to make a decision, believing that it was far more important for the AI to explain the logic path behind its decisions, as opposed to what specific tasks it was doing:

"I'm primarily concerned with the specific data points that the AI used to make a determination, as well as the logic that it used" (Male, 27, Software Development).

"I would want a step-by-step indicator of the logic" (Female, 26, Graduate Student).

As the participants explained above, explanations of the AI's logic path would show the team that the AI possessed the
proper data points to be confident in its decision. This element of confidence became an essential topic of discussion that participants reiterated throughout the sessions as the concept of AI autonomy levels came into play. Participants desired for the AI to explain its confidence in its decision through some form of a visual indicator and to associate confidence level with the AI's autonomy level. As the participant details in the following quote, she would be more comfortable with an AI operating at a high autonomy level when she can see that its confidence level is high:

"In cases where something is very routine, or where it's something new for the AI, going back to the confidence indicator mentioned, it has a set point, some number as configured of confidence, then it can go ahead and do it, and just report the outcomes" (Female, 66, IT Project Management).

It would help people understand and dynamically adjust whether or not the AI was or should be operating at the appropriate autonomy level if it was able to explain its decisions in terms of confidence levels. For instance, if an AI teammate was operating at a high autonomy level during routine operations and it encountered a new situation in which its confidence level in its decision logic dropped, an explanation of its now lower confidence would communicate to the team that it should be operating at a lower autonomy level, because humans might want to oversee and intervene in its less confident decisions.
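The sketch below is one literal reading of this participant suggestion, written purely for illustration: an agent compares its decision confidence to a team-configured set point, and that comparison drives both the autonomy mode it adopts and the explanation it posts to the team. The threshold value, function name, and message wording are all hypothetical and are not part of the study materials.

```python
CONFIDENCE_SET_POINT = 0.85  # the "set point, some number as configured" from the quote

def act_or_escalate(action: str, confidence: float) -> str:
    """Decide between acting autonomously and dropping to a lower autonomy level,
    explaining the decision and its confidence to the team either way."""
    if confidence >= CONFIDENCE_SET_POINT:
        # Routine case: proceed at high autonomy and simply report the outcome.
        return (f"Completed '{action}' autonomously "
                f"(confidence {confidence:.0%}, above the {CONFIDENCE_SET_POINT:.0%} set point).")
    # Novel or uncertain case: signal lower confidence and invite oversight.
    return (f"Holding '{action}' for review: confidence is {confidence:.0%}, "
            f"below the {CONFIDENCE_SET_POINT:.0%} set point. "
            "Please approve, adjust, or take over this decision.")

print(act_or_escalate("re-image lab workstation 12", confidence=0.97))
print(act_or_escalate("quarantine the campus mail server", confidence=0.55))
```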
Another important aspect of explainability is that it serves to ensure a shared understanding. For instance, explanations provided to the team should allow team members to align their use and understanding of the meanings of terminologies:

"I really want to make sure our words are similar for execution because words are really important, and I want us to be on the same page" (Female, 34, Cyber Security).

Indeed, one term could have various meanings depending on the context and disciplinary background. Conversely, individuals from different backgrounds might use different words or terms to mean the same thing. Not sharing the same vocabulary will lead to miscommunication, hindering team effectiveness and efficiency. The need for shared terminology is all the more crucial to generating shared understanding, especially within multi-functional, multi-disciplinary teams. Participants wanted the AI not only to explain the terms it uses itself, to accommodate team members who are not familiar with them, but also to be able to detect and explain those used by human team members. The above
participant used the example of a "computer security survey." While a survey means an examination for computer scientists, psychologists immediately associate it with questionnaires. To quickly align team members' understanding of what actions the AI intends to perform with the survey, the AI needs to account for all the team members' knowledge and background in its explanation.

In addition to aligning team members' understanding of terminology, participants emphasized the importance of aligning team members' awareness of their shared task and situation through the AI's explanation:

"It provides everyone else with awareness about what's going on so that they can make their own analysis" (Male, 27, Software Development).

In discussing the needed level of explanation and autonomy, the above participant desired the AI to align everyone's understanding and awareness of their shared situation such that the human teammates can leverage the information and explanation provided by the AI to make their own analysis and judgment to check against that of the AI. The greater the awareness the AI's explanation provides human teammates, the better and more efficiently the latter can give their "informed" input to the AI's decisions and actions.

4.2.2 DQ2: Communication through existing channels and norms

Next, the session conversations turned toward how AI teammates should provide explanations to teammates. These conversations revealed the need for seamless integration of the AI explanations into the team via existing communication platforms and channels (without creating a new one), while fitting the AI's communication style and modality into the team (or organizational) culture:

"Whatever kind of thing the rest of the team is currently collaborating on, the ability to seamlessly kind of add that to whatever the AI reports on" (Female, 34, Cyber Security).

Participants felt that an important aspect of the AI being part of the team, as opposed to just a tool used by the team, is that it communicated in line with the communication methods the team itself uses. The examples of a team Slack channel and Skype were both mentioned in this regard. Similarly, participants explained that the formal explanations of an AI teammate's action should also align with how human teammates document their actions. Within the context of the session scenario, participants suggested that the AI submit its explanations to whatever ticketing system the campus uses for its IT issues:

"Whatever recommendation that the agent may have is submitted directly into the ticketing system, so that way as you approve it and stuff, it's logged in the same fashion as everybody else's work" (Male, 33, Cyber Security).

"People are used to seeing certain data to make decisions, and they are used to seeing the data in a certain form" (Male, 33, Copyright Design).

What both of these participants further emphasized in these quotes and discussion is that, mainly when AI teammates are at lower autonomy levels and need humans to make a formal decision based upon their explanations, the recommendations and requests to act need to be submitted in the same format and via the same platform the team uses to consider all of its decisions. In this way, teammates can assess and make good decisions in the same manner they do for their personal tasks.

In addition to AI explanations integrating into team platforms, participants also emphasized the importance of AI explanation methods fitting into team culture:

"I feel like the way that the AI is presented with this team should be in conjunction with the way the team interacts with each other" (Female, 27, Graduate Student).

Participants put this into the perspective of the modality of team interaction. Teams that meet daily in person would respond better to an AI teammate that can communicate through a physical platform, a visual or physical interface. Teams that communicate primarily through online collaboration platforms would respond best to AI teammates with an account that communicate in line with that platform. Most importantly, as this participant aptly summarizes, whatever modality the AI utilizes to provide its explanations to the team needs to be either a collaborative or representative decision:

"It has to be a collaborative decision between you and your teammates of like who, what kind of entity would I feel most comfortable with having on my team" (Female, 26, Graduate Student)

4.2.3 DQ3: One-size explanations do not fit all

The third and final design question that participants considered in the participatory design sessions was under what conditions, if any, they would want the explanation types of the AI teammates to change. In addressing this question, participants were almost all opposed to the idea that an AI teammate's amount of explanation and its autonomy level
follow a simple linear relationship; instead, the amount and the timing of AI teammate explanations should be based on the human teammates' moment-by-moment need to know. As the following participant emphasized, the AI needs to be able to make certain assumptions about what its teammates already know in order not to overload them with excessive and redundant explanations:

"I think that regardless of what the agent is communicating, I think it needs to be very efficient responses, not too wordy because that could be a lot as well, so I feel like the agent has to assume the user has some level of expertise" (Female, 27, Graduate Student).

There is a trade-off between explaining too much and too little. On the one hand, the AI should be as brief as possible so as not to annoy everyone by explaining every single thing it does. Conversely, the AI needs to provide enough information and explanation that everyone on the team can make proper sense of it. This requires the AI to make accurate assumptions about what its human teammates already know in order to provide an appropriate level of explanation. While humans can make relatively correct assumptions about what other humans know (Fussell and Krauss 1992), AI has yet to be equipped with such an ability.

Additionally, the amount a person knows, and thus the amount of explanation needed, depends on various factors such as their disciplinary background and experience. The AI certainly would not need to explain "NLP" to a computer scientist but would need to do so for lay people or someone newly onboarded. Furthermore, as team members aggregate information during their interaction and collaboration, the need to explain previously mentioned ideas should decrease. Another participant echoed this, stating that repeated over-explanations by the AI can quickly lead to human frustration:

"You don't want to have a situation where it asks, and I've told you once now, and I gave you new ground, I told you a second time, and if you ask the same stupid question the third time, you know, I'm going to be pissed" (Male, 69, Electrical Engineering).
Another aspect that affected the humans' need to know was the risk and complexity of the AI's task. Participants indicated that the higher risk and/or more complex an action an AI teammate would take, the more they would need the AI to explain its decision to the rest of the team.

"Explainability increases as both risk and complexity of the action increases" (Male, 29, Network Engineering).

Similar to the idea of the confidence indicator, a value should be assigned to the risk associated with a task. With higher levels of risk, the AI needs to explain more to the team. This aligns with teamwork research that shows the need for increased communication between team members as task risk increases (Leonard et al. 2004).

As participants considered the third design question, they were again asked to consider how the AI's autonomy level played into when and how much explanation is needed, if at all. The collective opinion was that textual, written explanations need to increase as an AI's autonomy level increases, the main reason being auditing:

"So, you can very easily reconstruct whatever issue happened from there, so it makes the entire post-analysis process a lot easier" (Female, 67, Insurance Sales).

Participants discussed the reality that both people and AI make mistakes, or their actions result in unintended consequences, so there may be a need to go back and understand an action that an AI took, particularly when it operated at a heightened autonomy level with less human input into the decision. Ultimately, a human team leader will always be responsible for the actions an AI takes under their lead. For this reason, as an AI teammate operates with less and less human oversight and input, it is even more important that the explanations of its actions be provided in an auditable, textual manner that its teammates can consult both during and after it takes action.

5 Discussion

In this study, we sought to explore how changes in an AI teammate's level of explainability and level of autonomy individually and jointly affect a human teammate's trust in and perception of the AI. To do this, we conducted a WoZ experiment in which each participant worked with an AI teammate with varying levels of explainability to complete IT networking tasks, after which we invited some of the participants to take part in participatory design sessions to further clarify the design implications of the experiment. With regard to RQ1, concerning how human teammate perceptions change as AI explainability increases, the discussion will address how, in some teaming situations, increasing explainability is actually counterproductive to creating a trustworthy, competent AI teammate. This is because human teammates want AI communications only at appropriate intervals, such that they do not interfere with their own tasks. With regard to RQ2, concerning how these perceptions change as an AI's LOA changes, it will also address how LOA itself does not affect a teammate's explainability needs. In fact, the explainability needs of a human teammate may actually contribute to selecting the optimal LOAs for the AI teammate.
5.1 Explainability and the need to know

The experiment and the participatory design studies presented in this paper jointly provide interesting insights into how explainability and autonomy levels dynamically influence human perceptions of AI teammates and suggest important considerations for the information science community. The 2x2 experiment produced counter-intuitive results (e.g., the high explainability agent was perceived as less competent and less trustworthy) that required additional insight in order to understand why lower explainability AI teammates were perceived as more trustworthy and competent. The participatory design sessions revealed four main factors that influence the team's "need to know," which determines the level of explainability needed from an AI teammate. These four factors are 1) the AI's confidence in its decision logic, 2) the absolute task complexity and the task complexity relative to human familiarity, 3) risk, and 4) the ability to audit the AI's explanations.

The idea of a confidence indicator was prevalent in the participatory design sessions, most evident in the session 1 Jamboard shown in Fig. 6. The participants emphasized that an indicator of the AI's confidence in its decision logic can provide them with an easy and efficient heuristic for trusting that it is doing the right thing. This makes sense given the reasons why AI teammates are attractive: they can handle large sets of data and computational workloads beyond human capability (Duan et al. 2019). Because the AI should possess superior processing capabilities, it is logical that a human teammate would prefer to be told how confident the AI teammate is in its decision rather than having to make sense of its explanations of those heavy computations, a language barrier that drives other communication needs such as natural language processing (Zhuang et al. 2017). This indicator would show a human teammate how much input the AI needs from them, and it mirrors recent HCI research showing that humans desire explanations that help them collaborate better, as opposed to explanations of just what the AI is doing (Kim et al. 2023).
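As a purely illustrative sketch of such an indicator (the thresholds and messages below are hypothetical design choices, not the behavior of the wizarded agent in our experiment), a lightweight rule can translate the AI's self-reported confidence into how much input it requests from its human teammates.

def confidence_indicator(confidence: float) -> str:
    """Map the AI's confidence in its decision logic to a simple signal for the team."""
    if confidence >= 0.90:
        return "green: proceeding autonomously; explanation logged for later audit"
    if confidence >= 0.70:
        return "yellow: proceeding, but summarizing the rationale in the team channel"
    return "red: pausing and requesting human input before acting"

# Example: confidence drops after the agent encounters an unfamiliar situation.
print(confidence_indicator(0.95))
print(confidence_indicator(0.55))

The heuristic value for teammates comes from the signal itself rather than from the underlying computation, which is exactly the preference participants expressed.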
The second aspect that participants identified as affecting their need to know was task complexity relative to a human teammate's familiarity and experience with the task. Research on all-human teams has shown that task complexity heavily influences individual perceptions of teammates and intra-team trust (Choi and Cho 2019), and recent studies of human-robot teams have shown a similar degradation of trust as task complexity increases (Krausman et al. 2022; Zhang et al. 2023). Our study supports a relationship between task complexity and the perceptions of teammates in human-AI teams. The fact that the task complexity in our experiment was not very high made our participants feel that the AI teammate with lower explainability was more competent, because a task this simple does not require much explanation. Additionally, this notion of task complexity relative to a human teammate's familiarity and experience with the task differs from absolute task complexity, as it must account for an individual's knowledge of and experience with the task, which might differ across teammates and can change moment by moment. This aligns with current HCI research showing that the role AI plays in supporting a human should consider user expertise and task complexity, and that intelligent systems need to be capable of discriminating between different users (Buckland and Florian 1991). As our participants pointed out, a thorough explanation is desirable only the first time it is needed; it becomes annoying and even detrimental to the team dynamic if the AI repeatedly explains the same thing regardless of whether it is needed. However, newer teammates or newer tasks require more explanation from the AI teammate, as the human teammates are still trying to understand what the AI is doing and why under these new conditions and, consequently, how its actions affect the actions they themselves are taking. This is in line with explainable AI research into question-based explainability, which has shown that users of different experience levels will have differing explainability needs based upon the questions they are likely to ask (or not ask) (Liao et al. 2020). Therefore, the AI teammate's explainability should be tailored to the human teammates' moment-by-moment needs. The more complex the task is to human teammates (due to lack of prior knowledge or experience), the greater the need to know.

Third, the risk level associated with the task is just as important. Participants discussed the risk of the AI agent's actions as a defining factor in the team's need to know because, as risk increases, the actions that an AI teammate takes are more likely to affect the actions of the team at large. High-risk actions may bear additional considerations, such as the ethical concern of having too little human reasoning involved in the decision process (Shneiderman 2020; Tolmeijer et al. 2022). Indeed, the risk decision literature shows that in evaluating decisions made under uncertain conditions, two of the most vital questions to ask are 1) what are the potential impacts of the decision, and 2) is the decision ethically good (Ersdal and Aven 2008)? Our session participants discussed this in terms of how one would expect human teammates to pause and take extra care to explain to the team and their leaders what they intend to do and why in a high-risk situation. The riskier the decision, the more explanation and consideration it requires.

The fourth factor that affects the team's need to know is the ability to return to and audit an AI agent's explanations. In discussing how teams would need to receive explanations from an AI teammate at different autonomy levels, the participants repeatedly returned to the idea of integrating the AI's explanations into the team's existing communication and auditing platforms, particularly in design session two, shown in Fig. 7. In the sessions, there
was often a tug of war for participants between not wanting to be distracted by their AI teammates and not trusting them to always make the right decision. This clash was fueled by the reality that if an AI teammate makes a wrong decision or the team performs poorly, then at the end of the day, it is the human teammates who will be held responsible. This concept of human accountability is widely considered the first principle of AI ethics (Lim and Kwon 2021). Thus, there is a requirement for human teammates to understand what decisions an AI teammate made and why, not only before but also during and after the fact. Participants explained that the easier these explanations are to review and audit, the more comfortable they would be with a more autonomous teammate. This concept can be considered an extension of the auditing processes placed on human employees and teammates to decrease insider threats (Colwill 2009). Organizations require humans to document and explain their actions in a recordable format, such that if there is a question of their trustworthiness or recklessness, teammates and superiors can review those explanations of their actions. Likewise, it would be prudent to enforce similar auditing mechanisms for autonomous teammates. This reinforces previous HCI research that has established the need for traceability of an AI agent's decisions in order to ensure there is proper accountability for any repercussions of its actions (Lim and Kwon 2021).

This desire to receive communications only at certain times helps explain our experiment results, where our participants actually rated the low explainability agent as more trustworthy and competent. As the AI provides more explanations to a human teammate during a team task, it appears to want more human input into its actions. Thus, even though participants would like to be able to review the AI teammate's actions, its constant communications decrease its apparent independence. The factors identified above may be key to resolving this issue, as they identify the most important times for, and types of, explanations that AI teammates should provide. Targeted, adaptive explanations promote trust in the AI by providing reasoning to human teammates when they need it most, without overloading or annoying them with information unrelated to their own team role. Combined, these four factors represent the team's need to know an explanation from an AI teammate in media res (in the middle of things). This need to know can also be used to inform the optimal levels of autonomy for an AI teammate, as we describe in the following design recommendations.

5.2 Design recommendations

We now present two important design recommendations that should be incorporated into the future development of AI teammates. The first of these recommendations focuses on the concept of adaptive explainability based on a team's need to know. The second recommendation is that AI teammates can operate at higher LOAs when certain explainability conditions are met.

5.2.1 AI teammates should exhibit adaptive explainability based on their team's need to know

The first recommendation derived from the results of these studies is that AI teammates need to assess a task's complexity and risk level and use these values to determine the team's need to know. This need to know, as discussed in the previous section, determines the level of explainability the AI teammate provides over time. In terms of the scenario utilized in the participatory design sessions, the AI teammate could read an incident ticket submitted to the team and determine how complex its response actions would need to be and the risk of those actions negatively impacting the campus network. Using a predefined decision matrix to aggregate these values (as suggested by one of our participants), the AI teammate would know, going into the incident, the appropriate level of explainability to provide to its teammates in responding to this specific incident ticket. Additionally, the AI's user interface should display a visual indicator of this aggregated complexity-risk value. This informs the rest of the team what level of explainability to expect from their teammate over the course of the task. For instance, if the team interacts with the autonomous teammate over a team chat channel, there could be an explainability icon next to its avatar indicating this value. This recommendation has several implications for the HCI community, particularly in regard to AI interface design: it specifically charges AI designers to create interfaces that allow for side-by-side displays of AI explanations and this complexity-risk value.
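A minimal version of such a predefined decision matrix could look like the sketch below. It is illustrative only; the two-level scales, labels, and icon mapping are hypothetical design choices rather than a validated instrument.

EXPLAINABILITY_MATRIX = {
    # (task complexity, risk) -> level of explainability the AI should provide
    ("low",  "low"):  "minimal",
    ("low",  "high"): "detailed",
    ("high", "low"):  "moderate",
    ("high", "high"): "detailed",
}

ICONS = {"minimal": "[.]", "moderate": "[..]", "detailed": "[...]"}

def assess_need_to_know(complexity: str, risk: str) -> tuple:
    """Aggregate task complexity and risk into the team's need to know,
    along with the indicator icon shown next to the AI's avatar in chat."""
    level = EXPLAINABILITY_MATRIX[(complexity, risk)]
    return level, ICONS[level]

# Example: a routine, low-impact incident ticket.
level, icon = assess_need_to_know(complexity="low", risk="low")
print(f"Explainability level: {level}; indicator: {icon}")

In practice, the matrix would be defined together with the team and revisited as tasks, membership, and risk tolerance change.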
5.2.2 AI teammates should have higher autonomy when they can provide high levels of written, archival explainability

When AI teammates possess lower levels of autonomy, human teammates have more time to consider and process the explanations, ask for more details if necessary, and take notes of the explanations as needed. In other words, there is ample time for humans to understand why their AI teammate is doing something and how that is going to affect the team. When AI teammates operate with higher levels of autonomy, the time to understand the AI's decision is shortened, and teammates may need to go back and consult those explanations during or after a task. These explanations are key to providing a degree of accountability over the AI (Raji et al. 2020). For this reason, written, archival
explanations can allow AI teammates to operate at higher levels of autonomy, because their explanations are available to the team for an extended period of time. An example of this within this study's context would be if the AI provides explanations to the team through an online ticketing system. When the software agent is connected to a network and able to submit these explanations to the system as it conducts its actions, it could possess higher levels of autonomy, because the rest of the team is able to consult the explanations in the system at any time. If the autonomous teammate is deployed on a computer system that is not currently connected to the network and can only communicate through a chat client resident on that volatile system, it would need to operate at a lower level of autonomy, as the longevity of its explanations likely depends on the awareness and note-taking of one of its teammates. In this way, the ability of the autonomous teammate to provide archival explanations serves as a guiding factor in determining its optimal autonomy level(s). For the HCI community, this is an important design recommendation that requires organizations to consider not only the tasks that an AI teammate will perform but also when and where it will perform them. AI utilized for more than one team or operating environment may need to be capable of communicating on several platforms, such that it can be adjusted to use whichever method is preferred by the team at the time. It would then also need to be deployable at multiple levels of autonomy, based upon the types of explanations selected.
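The sketch below illustrates this gating logic under simplified assumptions of our own (the numeric autonomy levels and the notion of an "archival channel" are placeholders): the agent's requested level of autonomy is capped unless a written, auditable record of its explanations is available to the team.

def max_autonomy_level(archival_channel_available: bool, requested_level: int) -> int:
    """Cap the AI teammate's level of autonomy (1 = recommend only,
    5 = act independently) based on whether its explanations can be written
    to a persistent, auditable record such as the team's ticketing system."""
    cap = 5 if archival_channel_available else 2
    return min(requested_level, cap)

# Connected to the ticketing system: higher autonomy is permissible.
print(max_autonomy_level(archival_channel_available=True, requested_level=5))   # 5
# Only a volatile local chat is available: stay at a lower level.
print(max_autonomy_level(archival_channel_available=False, requested_level=5))  # 2

The specific cap values would be set by the team and its organization; the point is that archival explainability, not the task alone, bounds how independently the agent may act.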
5.3 Limitations and future work

The research presented in this paper has limitations that should be considered when interpreting these results. First, the autonomous teammate with whom our participants interacted during the experiment communicated purely over text. The cognitive effort to process textual explanations may be greater than that needed to hear explanations delivered through auditory methods. Future research should further explore the effects of different forms of explanations, such as auditory and pictorial, as research shows that there are significant advantages to using explanation methods that cater more specifically to individual learning needs (Wolf and Ringland 2020). Second, our studies involved only one autonomous teammate, and it will be important to assess how different team compositions of more autonomous and human teammates affect how humans perceive the explanations of their autonomous teammates, as human-AI teaming research has already shown that team composition affects a variety of teaming factors (Schelble et al. 2022a). For instance, while consulting multiple text explanations from a single AI teammate may be helpful to the team, three such teammates providing real-time textual explanations may overwhelm the team's cognitive capabilities while members focus on their own tasks, a further factor to consider in equipping agents with adaptive explainability. In terms of our participants, it is worth noting that our participatory design sample contained only two ethnic minorities, thus presenting a largely white perspective on the questions. A more diverse pool may present additional important findings. Finally, the experiment and design sessions in this paper focused on a single, relatively low-risk task and environment. Because the results of this study indicate that task complexity and risk have a strong bearing on optimal explainability levels, it will be important for the community to study the effects of varying task complexity and risk.

6 Conclusion

As humans and autonomous agents collaborate and work more independently as teammates, the explanations that autonomous agents provide to their human counterparts become more and more critical. In this study, we showed that the level and type of explainability an artificially intelligent agent provides significantly affect the team's perceived competence of and trust in that agent. Counter-intuitively, the participants in our teaming scenario perceived the AI agent with a lower level of explainability as more trustworthy and competent than one with a high level of explainability. Our participatory design sessions helped explore this paradox and guided our creation of two crucial design recommendations for the HCI community concerning the details, frequency, and modality of AI explanations, centered on the concept of adaptive explainability. These recommendations enable the information science community to model and design adaptable AI agents that humans can perceive as capable, trustworthy teammates at varying levels of autonomy.

Appendix A

Data analysis of the significant effects of AI explainability level (Figs. 8, 9, 10, 11, 12 and 13).
Fig. 8 F statistics for trust model

Fig. 9 Pairwise comparisons of explainability levels on trust in the AI

Fig. 10 Effect of explainability levels on trust in the AI

Fig. 11 F statistics for competence model

Fig. 12 Pairwise comparisons of explainability levels on perceived competency of the AI
Fig. 13 Effect of explainability levels on perceived competency of the AI

Funding Open access funding provided by the Carolinas Consortium.

Data availability Anonymized data can be made available upon request by contacting the first author.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Abbass HA (2019) Social integration of artificial intelligence: functions, automation allocation logic and human-autonomy trust. Cogn Comput 11(2):159–171
Arrieta AB, Díaz-Rodríguez N, Del Ser J, Bennetot A, Tabik S, Barbado A, García S, Gil-López S, Molina D, Benjamins R et al (2020) Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and challenges toward responsible ai. Inform Fusion 58:82–115
Bansal G, Wu T, Zhou J, Fok R, Nushi B, Kamar E, Ribeiro MT, Weld D (2021) Does the whole exceed its parts? The effect of ai explanations on complementary team performance. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–16
Blair E (2015) A reflexive exploration of two qualitative data coding techniques. J Methods Meas Soc Sci 6(1):14–29
Braun V, Clarke V (2012) Thematic analysis
Buckland MK, Florian D (1991) Expertise, task complexity, and artificial intelligence: A conceptual framework. J Am Soc Inform Sci 42(9):635–643
Caldwell S, Sweetser P, O'Donnell N, Knight MJ, Aitchison M, Gedeon T, Johnson D, Brereton M, Gallagher M, Conroy D (2022) An agile new research framework for hybrid human-ai teaming: Trust, transparency, and transferability. ACM Trans Inter Intell Syst 12(3):1–36
Castelvecchi D (2016) Can we open the black box of ai? Nature News 538(7623):20
Centeio Jorge C, Tielman ML, Jonker CM (2022) Artificial trust as a tool in human-ai teams. In Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction, 1155–1157
Chen Z (2023) Collaboration among recruiters and artificial intelligence: removing human prejudices in employment. Cogn Technol Work 25(1):135–149
Chen J, Sun J, Wang G (2022) From unmanned systems to autonomous intelligent systems. Engineering 12:16–19
Chignell M, Wang L, Zare A, Li J (2023) The evolution of hci and human factors: Integrating human and artificial intelligence. ACM Trans Comp Human Inter 30(2):1–30
Choi O-K, Cho E (2019) The mechanism of trust affecting collaboration in virtual teams and the moderating roles of the culture of autonomy and task complexity. Comp Human Behav 91:305–315
Cohen SN, Snow D, Szpruch L (2021) Black-box model risk in finance. arXiv preprint arXiv:2102.04757
Cohen J (1988) Statistical power analysis for the behavioral sciences. Academic Press, New York
Colwill C (2009) Human factors in information security: The insider threat-who can you trust these days? Inform Secur Tech Report 14(4):186–196
Costa AC, Fulmer CA, Anderson NR (2018) Trust in work teams: An integrative review, multilevel model, and future directions. J Organ Behav 39(2):169–184
Das A, Rad P (2020) Opportunities and challenges in explainable artificial intelligence (xai): A survey. arXiv preprint arXiv:2006.11371
Dazeley R, Vamplew P, Foale C, Young C, Aryal S, Cruz F (2021) Levels of explainable artificial intelligence for human-aligned conversational explanations. Artif Intell 299:103525
de Lemos R, Grześ M (2019) Self-adaptive artificial intelligence. In 2019 IEEE/ACM 14th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), IEEE, 155–156
Dhanorkar S, Wolf CT, Qian K, Xu A, Popa L, Li Y (2021) Who needs to know what, when?: Broadening the explainable ai (xai) design space by looking at explanations across the ai lifecycle. In Designing Interactive Systems Conference 2021:1591–1602
Duan Y, Edwards JS, Dwivedi YK (2019) Artificial intelligence for decision making in the era of big data-evolution, challenges and research agenda. Int J Inf Technol 48:63–71
Ehsan U, Riedl M (2019) On design and evaluation of human-centered explainable ai systems. Glasgow'19
Endsley MR (1995) Toward a theory of situation awareness in dynamic systems. Human Factors 37(1):32–64
Ersdal G, Aven T (2008) Risk informed decision-making and its ethical basis. Reliab Eng Syst Saf 93(2):197–205
Fussell SR, Krauss RM (1992) Coordination of knowledge in communication: effects of speakers' assumptions about what others know. J Pers Soc Psychol 62(3):378
Gavin H (2008) Thematic analysis. Understanding research methods and statistics in psychology, 273–282
Gieselmann M, Sassenberg K (2023) The more competent, the better? The effects of perceived competencies on disclosure towards conversational artificial intelligence. Social Sci Comp Rev 41(6):2342–2363
Ha T, Kim S, Seo D, Lee S (2020) Effects of explanation types and perceived risk on trust in autonomous vehicles. Trans Res Part F 73:271–280
Hauptman AI, Schelble BG, McNeese NJ, Madathil KC (2022) Adapt and overcome: Perceptions of adaptive autonomous agents for human-ai teaming. Computers in Human Behavior, 107451
Hussain F, Hussain R, Hossain E (2021) Explainable artificial intelligence (xai): An engineering perspective. arXiv preprint arXiv:2101.03613
Huvila I, Enwald H, Eriksson-Backa K, Liu Y-H, Hirvonen N (2022) Information behavior and practices research informing information systems design. J Assoc Inf Sci Technol 73(7):1043–1057
Jacovi A, Marasović A, Miller T, Goldberg Y (2021) Formalizing trust in artificial intelligence: Prerequisites, causes and goals of human trust in ai. In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, 624–635
Janitor J, Jakab F, Kniewald K (2010) Visual learning tools for teaching/learning computer networks: Cisco networking academy and packet tracer. In 2010 Sixth international conference on networking and services, IEEE, 351–355
Jarrahi MH, Lutz C, Boyd K, Oesterlund C, Willis M (2022) Artificial intelligence in the work context
Joyce DW, Kormilitzin A, Smith KA, Cipriani A (2023) Explainable artificial intelligence for mental health through transparency and interpretability for understandability. Digital Med 6(1):6
Kelley JF (2018) Wizard of oz (woz) a yellow brick journey. J Usability Stud 13(3):119–124
Kim SS, Watkins EA, Russakovsky O, Fong R, Monroy-Hernández A (2023) "Help me help the ai": Understanding how explainability can support human-ai interaction. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1–17
Kosch T, Welsch R, Chuang L, Schmidt A (2023) The placebo effect of artificial intelligence in human-computer interaction. ACM Trans Comp Human Inter 29(6):1–32
Krausman A, Neubauer C, Forster D, Lakhmani S, Baker AL, Fitzhugh SM, Gremillion G, Wright JL, Metcalfe JS, Schaefer KE (2022) Trust measurement in human-autonomy teams: Development of a conceptual toolkit. ACM Transactions on Human-Robot Interaction
Langer M, Oster D, Speith T, Hermanns H, Kästner L, Schmidt E, Sesing A, Baum K (2021) What do we want from explainable artificial intelligence (xai)? A stakeholder perspective on xai and a conceptual model guiding interdisciplinary xai research. Artif Intell 296:103473
Larsson S, Heintz F (2020) Transparency in artificial intelligence. Internet Policy Rev 9(2):10
Leonard M, Graham S, Bonacum D (2004) The human factor: the critical importance of effective teamwork and communication in providing safe care. BMJ Quality Safety 13(suppl 1):i85–i90
Liao QV, Gruen D, Miller S (2020) Questioning the ai: informing design practices for explainable ai user experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–15
Lim JH, Kwon HY (2021) A study on the modeling of major factors for the principles of ai ethics. In DG.O2021: The 22nd Annual International Conference on Digital Government Research, 208–218
Lipton ZC (2018) The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue 16(3):31–57
Liu J, Marriott K, Dwyer T, Tack G (2023) Increasing user trust in optimisation through feedback and interaction. ACM Trans Comp Human Inter 29(5):1–34
Lombrozo T (2006) The structure and function of explanations. Trends Cogn Sci 10(10):464–470
Lumineau F (2017) How contracts influence trust and distrust. J Manage 43(5):1553–1577
Marcinkevičs R, Vogt JE (2020) Interpretability and explainability: A machine learning zoo mini-tour. arXiv preprint arXiv:2012.01805
McGee ET, McGregor JD (2016) Using dynamic adaptive systems in safety-critical domains. In Proceedings of the 11th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, 115–121
McNeese NJ, Demir M, Cooke NJ, Myers C (2018) Teaming with a synthetic teammate: Insights into human-autonomy teaming. Human Factors 60(2):262–273
Miller T (2019) Explanation in artificial intelligence: Insights from the social sciences. Artif Intell 267:1–38
Mueller ST, Hoffman RR, Clancey W, Emrey A, Klein G (2019) Explanation in human-ai systems: A literature meta-review, synopsis of key ideas and publications, and bibliography for explainable ai. arXiv preprint arXiv:1902.01876
Nakahashi R, Yamada S (2021) Balancing performance and human autonomy with implicit guidance agent. Front Artif Intell 4:142
Nyre-Yu M, Gutzwiller RS, Caldwell BS (2019) Observing cyber security incident response: qualitative themes from field research. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 63, SAGE Publications Sage CA: Los Angeles, CA, 437–441
O'Neill T, McNeese N, Barron A, Schelble B (2020) Human-autonomy teaming: A review and analysis of the empirical literature. Human Factors, page 0018720820960865
Paleja R, Ghuy M, Ranawaka Arachchige N, Jensen R, Gombolay M (2021) The utility of explainable ai in ad hoc human-machine teaming. Adv Neural Inform Process Syst 34:610–623
Parasuraman R, Sheridan TB, Wickens CD (2000) A model for types and levels of human interaction with automation. IEEE Trans Syst Man Cybern Part A 30(3):286–297
Pedreschi D, Giannotti F, Guidotti R, Monreale A, Ruggieri S, Turini F (2019) Meaningful explanations of black box ai decision systems. In Proceedings of the AAAI conference on artificial intelligence 33:9780–9784
Pokam R, Debernard S, Chauvin C, Langlois S (2019) Principles of transparency for autonomous vehicles: first results of an experiment with an augmented reality human-machine interface. Cogn Technol Work 21:643–656
Raji ID, Smart A, White RN, Mitchell M, Gebru T, Hutchinson B, Smith-Loud J, Theron D, Barnes P (2020) Closing the ai accountability gap: Defining an end-to-end framework for internal algorithmic auditing. In Proceedings of the 2020 conference on fairness, accountability, and transparency, 33–44
Richards D, Stedmon A (2017) Designing for human-agent collectives: display considerations. Cogn Technol Work 19:251–261
Sanneman L, Shah JA (2020) A situation awareness-based framework for design and evaluation of explainable ai. In International Workshop on Explainable, Transparent Autonomous Agents and Multi-Agent Systems, Springer, 94–110
Schelble BG, Flathmann C, McNeese NJ, Freeman G, Mallick R (2022a) Let's think together! Assessing shared mental models, performance, and trust in human-agent teams. Proceedings of the ACM on Human-Computer Interaction, 6(GROUP):1–29
Schelble BG, Lopez J, Textor C, Zhang R, McNeese NJ, Pak R, Freeman G (2022b) Towards ethical ai: Empirically investigating dimensions of ai ethics, trust repair, and performance in human-ai teaming. Human Factors, page 00187208221116952
Schoenherr JR, Abbas R, Michael K, Rivas P, Anderson TD (2023) Designing ai using a human-centered approach: Explainability and accuracy toward trustworthiness. IEEE Trans Technol Soc 4(1):9–23
Scott C, Medaugh M (2017) Axial coding. The international encyclopedia of communication research methods 10:9781118901731
Shin D (2020) User perceptions of algorithmic decisions in the personalized ai system: Perceptual evaluation of fairness, accountability, transparency, and explainability. J Broadcasting Electron Media 64(4):541–565
Shin D (2021) The effects of explainability and causability on perception, trust, and acceptance: Implications for explainable ai. Int J Human Comp Stud 146:102551
Shin D (2021) Why does explainability matter in news analytic systems? Proposing explainable analytic journalism. J Stud 22(8):1047–1065
Shin D (2022) The perception of humanness in conversational journalism: An algorithmic information-processing perspective. New Media Soc 24(12):2680–2704
Shneiderman B (2020) Bridging the gap between ethics and practice: guidelines for reliable, safe, and trustworthy human-centered ai systems. ACM Trans Interactive Intell Syst 10(4):1–31
Slota SC, Fleischmann KR, Greenberg S, Verma N, Cummings B, Li L, Shenefiel C (2022) Locating the work of artificial intelligence ethics. J Assoc Inform Sci Technol 74:311–322
Sokol K, Flach P (2020) Explainability fact sheets. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM
Speith T (2022) A review of taxonomies of explainable artificial intelligence (xai) methods. In 2022 ACM Conference on Fairness, Accountability, and Transparency, 2239–2250
Stowers K, Brady LL, MacLellan C, Wohleber R, Salas E (2021) Improving teamwork competencies in human-machine teams: Perspectives from team science. Front Psychol 1669
Suzanne Barber K, Goel A, Martin CE (2000) Dynamic adaptive autonomy in multi-agent systems. J Exp Theor Artif Intell 12(2):129–147
Thieme A, Hanratty M, Lyons M, Palacios J, Marques RF, Morrison C, Doherty G (2023) Designing human-centered ai for mental health: Developing clinically relevant applications for online cbt treatment. ACM Trans Comp Human Inter 30(2):1–50
Tolmeijer S, Christen M, Kandul S, Kneer M, Bernstein A (2022) Capable but amoral? Comparing ai and human expert collaboration in ethical decision making. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 1–17
Vilone G, Longo L (2020) Explainable artificial intelligence: a systematic review. arXiv preprint arXiv:2006.00093
von Eschenbach WJ (2021) Transparency and the black box problem: Why we do not trust ai. Philos Technol 34(4):1607–1622
Waltl B, Vogl R (2018) Increasing transparency in algorithmic-decision-making with explainable ai. Datenschutz und Datensicherheit-DuD 42(10):613–617
Wang N, Pynadath DV, Hill SG (2016) The impact of pomdp-generated explanations on trust and performance in human-robot teams. In Proceedings of the 2016 international conference on autonomous agents & multiagent systems, 997–1005
Wang D, Yang Q, Abdul A, Lim BY (2019) Designing theory-driven user-centric explainable ai. In Proceedings of the 2019 CHI conference on human factors in computing systems, 1–15
Weber S, Harbach M, Smith M (2015) Participatory design for security-related user interfaces. Proc, USEC, 15 pp
Weitz K, Schiller D, Schlagowski R, Huber T, André E (2019) "Do you trust me?" Increasing user-trust by integrating virtual agents in explainable ai interaction design. In Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, 7–9
Wickens CD, Li H, Santamaria A, Sebok A, Sarter NB (2010) Stages and levels of automation: An integrated meta-analysis. In Proceedings of the human factors and ergonomics society annual meeting, vol. 54, Sage Publications Sage CA: Los Angeles, CA, 389–393
Wilson HJ, Daugherty PR (2018) Collaborative intelligence: Humans and ai are joining forces. Harvard Business Review 96(4):114–123
Wolf CT, Ringland KE (2020) Designing accessible, explainable ai (xai) experiences. ACM SIGACCESS Access Comput 125:1–1
Xie SL, Gao Y, Han R (2022) Information resilient society in an ai world-is xai sufficient? Proc Assoc Inform Sci Technol 59(1):522–526
Xu F, Uszkoreit H, Du Y, Fan W, Zhao D, Zhu J (2019) Explainable ai: A brief survey on history, research areas, approaches and challenges. In CCF international conference on natural language processing and Chinese computing, Springer, 563–574
Yu R, Alì GS (2019) What's inside the black box? ai challenges for lawyers and researchers. Legal Inform Manage 19(1):2–13
Zhang Y, Li Z, Guo H, Wang L, Chen Q, Jiang W, Fan M, Zhou G, Gong J (2023) "I am the follower, also the boss": Exploring different levels of autonomy and machine forms of guiding robots for the visually impaired. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1–22
Zhou J, Chen F (2019) Towards trustworthy human-ai teaming under uncertainty. In IJCAI 2019 workshop on explainable AI (XAI)
Zhuang Y-T, Wu F, Chen C, Pan Y-H (2017) Challenges and opportunities: from big data to knowledge in ai 2.0. Front Inform Technol Electron Eng 18(1):3–14
Zieba S, Polet P, Vanderhaegen F, Debernard S (2010) Principles of adjustable autonomy: a framework for resilient human-machine cooperation. Cogn Technol Work 12(3):193–203

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
