0% found this document useful (0 votes)

63 views44 pages

How Should My Chatbot Interact

How should my chatbot interact

Uploaded by

TempLuigi

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

63 views44 pages

How Should My Chatbot Interact

How should my chatbot interact

Uploaded by

TempLuigi

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 44

How should my chatbot interact?

A survey on human-chatbot interaction design

ANA PAULA CHAVES, Northern Arizona University, USA and Federal University of Technology–Paraná, Brazil
MARCO GEROSA, Northern Arizona University, USA
The growing popularity of chatbots has brought new needs for HCI since it has changed the patterns of human interactions with
computers. The conversational aspect of the interaction increases the necessity for chatbots to present social behaviors that are habitual
in human-human conversations. In this survey, we argue that chatbots should be enriched with social characteristics that are coherent
arXiv:1904.02743v1 [cs.HC] 4 Apr 2019

with users’ expectations, ultimately avoiding frustration and dissatisfaction. We bring together the literature on disembodied, text-based
chatbots to derive a conceptual model of social characteristics for chatbots. We analyzed 58 papers from various domains to understand
how social characteristics can benefit the interactions and identify the challenges and strategies to designing them. Additionally, we
discussed how characteristics may influence one another. Our results provide relevant opportunities to both researchers and designers
to advance human-chatbot interactions.

Additional Key Words and Phrases: chatbots, social characteristics, human-chatbot interaction

1 INTRODUCTION
Chatbots are computer programs that interact with users in natural language [Shawar and Atwell 2007]. The origin
of the chatbots concept dates back to 1950 [Turing 1950]. ELIZA [Weizenbaum 1966] and A.L.I.C.E. [Wallace 2009]
are examples of early chatbots technologies, where the main goal was to mimic human conversations. Over the years,
the chatbot concept has evolved. Today, chatbots may have characteristics that distinguish one agent from the others,
which resulted in several synonyms, such as multimodal agents, chatterbots, and conversational interfaces. In this
survey, we use the term “chatbot” to refer to a disembodied conversational agent that held a natural language conversation
via text-based environment to either engage the user in a general-purpose or task-oriented conversation.
Chatbots are changing the patterns of interactions between humans and computers [Følstad and Brandtzæg 2017].
Many instant messenger tools, such as Skype, Facebook Messenger, and Telegram provide platforms to develop and
deploy chatbots, who either engage with users in general conversations or help them solve domain specific tasks
[Dale 2016]. As the messaging tools become platforms, traditional websites and apps are providing space for this new
form of human-computer interaction (HCI) [Følstad and Brandtzæg 2017]. For example, in the 2018 F8 Conference,
Facebook announced having 300K chatbots active on Facebook Messenger [Boiteux 2019]. The BotList website indexes
thousands of chatbots for education, entertainment, games, health, productivity, travel, fun, and several other categories.
The growth of chatbot technology is changing how companies engage with their customers [Brandtzaeg and Følstad
2018; Gnewuch et al. 2017], students engage with their learning groups [Hayashi 2015; Tegos et al. 2016], and patients
self-monitor the progress of their treatment [Fitzpatrick et al. 2017], among many other applications.
However, chatbots still fail to meet users’ expectations [Brandtzaeg and Følstad 2018; Jain et al. 2018b; Luger and
Sellen 2016; Zamora 2017]. While many studies on chatbots design focus on improving chatbots’ functional performance
Authors’ addresses: Ana Paula Chaves, Northern Arizona University, Flagstaff, AZ, USA, Federal University of Technology–Paraná, Campo Mourão,
Brazil, [email protected]; Marco Gerosa, Northern Arizona University, Flagstaff, AZ, USA, [email protected].

© 2018 Association for Computing Machinery.

Manuscript submitted to ACM

Manuscript submitted to ACM 1

2 Chaves and Gerosa

and accuracy (see e.g. [Jiang and E Banchs 2017; Maslowski et al. 2017]), the literature has consistently suggested that
chatbots’ interactional goals should also include social capabilities [Jain et al. 2018b; Liao et al. 2018]. According to
the Media Equation theory [Reeves and Nass 1996], people naturally respond to social situations when interacting
with computers [Fogg 2003; Nass et al. 1994]. As chatbots are designed to interact with users in a way that mimics
person-to-person conversations, new challenges in HCI arise [Følstad and Brandtzæg 2017; Nguyen and Sidorova
2018]. [Neururer et al. 2018] state that making a conversational agent acceptable to users is primarily a social, not only
technical, problem to solve. In fact, studies on chatbots have shown that people prefer agents who: conform to gender
stereotypes associated with tasks [Forlizzi et al. 2007]; self-disclose and show reciprocity when recommending [Lee
and Choi 2017]; and demonstrate a positive attitude and mood [Thies et al. 2017]. When chatbots do not meet these
expectations, the user may experience frustration and dissatisfaction [Luger and Sellen 2016; Zamora 2017].
Although chatbots’ social characteristics have been explored in the literature, this knowledge is spread across several
domains in which chatbots have been studied, such as customer services, education, finances, and travel. In the HCI
domain, some studies focus on investigating the social aspects of human-chatbot interactions (see, e.g., [Ciechanowski
et al. 2018; Ho et al. 2018; Lee and Choi 2017]). However, most studies focus on a single or small set of characteristics (e.g.,
[Mairesse and Walker 2009; Schlesinger et al. 2018]); in other studies, the social characteristics emerged as secondary,
exploratory results (e.g., [Tallyn et al. 2018; Toxtli et al. 2018]). It has become difficult to find evidence regarding what
characteristics are important for designing a particular chatbot, and what research opportunities exist in the field. A lack
of studies bring together the social characteristics that influence the way users perceive and behave toward chatbots.
To fill this gap, this survey compiles research initiatives for understanding the impact of chatbots’ social characteristics
on the interaction. We bring together literature that is spread across several research areas. From our analysis of 58
scientific studies, we derive a conceptual model of social characteristics, aiming to help researchers and designers
identify what characteristics are relevant to their context and how their design choices influence the way humans
perceive the chatbots. The research question that guided our investigation was: What chatbot social characteristics
benefit human interaction and what are the challenges and strategies associated with them?
To answer this question, we discuss why designing a chatbot with a particular characteristic can enrich the human-
chatbot interaction. Our results can both provide insight into whether the characteristic is desirable for a particular
chatbot, as well as inspire researchers’ further investigations. In addition, we discuss the interrelationship among the
identified characteristics. We stated 22 propositions about how social characteristics may influence one another. In the
next section, we present an overview of the studies included in this survey.

2 OVERVIEW OF THE SURVEYED LITERATURE

The literature presents no coherent definition of chatbots; thus, to find relevant studies we used a search string that
includes the synonyms chatbots, chatterbots, conversational agents, conversational interfaces, conversational systems,
conversation systems, dialogue systems, digital assistants, intelligent assistants, conversational user interfaces, and
conversational UI. We explicitly left out studies that relate to embodiment (e.g., ECA, multimodal, robots, eye-gaze,
gesture), and speech input mode (e.g., speech-based, speech-recognition, voice-based). We did not include the term
social bots, because it refers to chatbots that produce content for social networks such as Twitter [Ferrara et al. 2016].
We did not include personal assistants either, since this term consistently refers to commercially available, voice-based
assistants such as Google Assistant, Amazon Alexa, Apple Siri, and Microsoft Cortana. We decided to not include terms
that relate to social characteristics/traits, because most studies do not explicitly label their results as such.
After filtering the search results, we had 58 remaining studies. Most of the selected studies are recent publications
Manuscript submitted to ACM
How should my chatbot interact? 3

(less than 10 years). The publication venues include the domains of human-computer interactions (25 papers), learning
and education (9 papers), information and interactive systems (8 papers), virtual agents (5 papers), artificial intelligence
(3 papers), and natural language processing (3 papers). We also found papers from health, literature & culture, internet
science, computer systems, communication, and humanities (1 paper each). Most papers (59%) focus on task-oriented
chatbots. General purpose chatbots reflect 33% of the surveyed studies. Most general purpose chatbots (16 out of 19)
are designed to handle topic-unrestricted conversations. The most representative specific-domain is education, with 9
papers, followed by customer services, with 5 papers. See the Appendix A (Supplemental Materials) for the complete
list of topics.
We analyzed the papers by searching for chatbot behavior or attributed characteristics that influence the way users
perceive it and behave toward it. Noticeably, the characteristics and categories are seldom explicitly pointed to in the
literature, so the conceptual model was derived using a qualitative coding process inspired in methods such as Grounded
Theory [Auerbach and Silverstein 2003] (open coding stage). For each study (documents), we selected relevant statements
from the paper (quotes) and labeled them as a characteristic (code). After coding all the studies, a second researcher
reviewed the produced set of characteristics and discussion sessions were performed to identify characteristics that
could be merged, renamed, or removed. At the end, the characteristics were grouped into the categories, depending on
whether the characteristic relates to the chatbot’s virtual representation, conversational behavior, or social protocols.
Finally, the quotes for each characteristic were labeled as references to benefits, challenges, or strategies.
We derived a total of 11 social characteristics, and grouped them into three categories: conversational intelligence,
social intelligence, and personification. The next section describes the derived conceptual model.

3 CHATBOTS SOCIAL CHARACTERISTICS

This section describes the identified social characteristics grouped into categories. As Table 1 depicts, the category
conversational intelligence includes characteristics that help the chatbot manage interactions. Social intelligence
focuses on habitual social protocols, while personification refers to the chatbot perceived identity and personality
representations. In the following subsections, we independently describe the identified social characteristics, and
point to the relationships to other characteristics when relevant. Then, we summarize the relationship among the
characteristics in Section 4. For each category, a table with an overview of the included studies is provided in the
supplementary materials (Appendix A). The supplementary materials also include a table for each social characteristic,
listing the studies associated with the reported benefits, challenges, and strategies. Finally, the supplementary materials
also highlight five constructs that can be used to assess whether social characteristics are reaching the intended design
goals.

3.1 Conversational Intelligence

Conversational intelligence enables the chatbot to actively participate in the conversation and to demonstrate
awareness of the topic discussed, the evolving conversational context, and the dialogue flow. Therefore, conversational
Intelligence refers to the ability of a chatbot to effectively converse beyond the technical capability of achieving
a conversational goal [Jain et al. 2018b]. In this section, we discuss social characteristics related to conversational
intelligence, namely: proactivity (18 studies), conscientiousness (11 studies), and communicability (6 studies). Most of
the studies rely on data that comes from the log of the conversations, interviews, and questionnaires. The questionnaires
are mostly Likert-scales, and some of them include subjective feedback. Most studies analyzed the interaction with real
chatbots, although Wizard of Oz (WoZ) settings are also common. In WoZ, participants believe to be interacting with
Manuscript submitted to ACM
4 Chaves and Gerosa

Social Characteristics Benefits Challenges Strategies

[B1] to provide additional information [C1] timing and relevance [S1] to leverage conversational context

Conversational Intelligence
[B2] to inspire users and to keep the con- [C2] privacy [S2] to select a topic randomly
Proactivity versation alive
[B3] to recover from a failure [C3] users’ perception of being controlled
[B4] to improve conversation productiv-
ity
[B5] to guide and engage users
[B1] to keep the conversation on track [C1] to handle task complexity [S1] conversational flow
Conscientiousness [B2] to demonstrate understanding [C2] to harden the conversation [S2] visual elements
[B3] to hold a continuous conversation [C3] to keep the user aware of the chatbot’s [S3] confirmation messages
context
[B1] to unveil functionalities [C1] to provide business integration [S1] to clarify the purpose of the chatbot
[B2] to manage the users’ expectations [C2] to keep visual elements consistent with [S2] to advertise the functionality and suggest
Communicability
textual inputs the next step
[S3] to provide a help functionality
[B1] to appropriately respond to harass- [C1] to deal with unfriendly users [S1] emotional reactions
ment
Damage control
[B2] to deal with testing [C2] to identify abusive utterances [S2] authoritative reactions
[B3] to deal with lack of knowledge [C3] to balance emotional reactions [S3] to ignore the user’s utterance and change
the topic
Social Intelligence

[S4] conscientiousness and communicability

[S5] to predict users’ satisfaction
[B1] to adapt the language dynamically [C1] to decide on how much to talk Not identified
Thoroughness
[B2] to exhibit believable behavior [C2] to be consistent
[B1] to increase human-likeness [C1] to deal with face-threatening acts [S1] to engage in small talk
Manners
[C2] to end a conversation gracefully [S2] to adhere turn-taking protocols
[B1] to avoid stereotyping [C1] to avoid alienation Not identified
Moral agency
[B2] to enrich interpersonal relationships [C2] to build unbiased training data and algo-
rithms
[B1] to enrich interpersonal relationships [C1] to regulate affective reactions [S1] to use social-emotional utterances
Emotional intelligence [B2] to increase engagement [S2] to manifest conscientiousness
[B3] to increase believability [S3] reciprocity and self-disclosure
[B1] to enrich interpersonal relationships [C1] privacy [S1] to learn from and about the user
Personalization [B2] to provide unique services [S2] to provide customizable agents
[B3] to reduce interactional breakdowns [S3] visual elements
[B1] to increase engagement [C1] to avoid negative stereotypes [S1] to design and elaborate on a persona
Personification

Identity
[B2] to increase human-likeness [C2] to balance the identity and the technical
capabilities
[B1] to exhibit believable behavior [C1] to adapt humor to the users’ culture [S1] to use appropriate language
Personality
[B2] to enrich interpersonal relationships [C2] to balance the personality traits [S2] to have a sense of humor

Table 1. Conceptual model of chatbots social characteristics

a chatbot when, in fact, a person (or wizard) pretends to be the automated system [Dahlbäck et al. 1993]. Only two
papers did not evaluate a particular type of interaction because they were based on a literature review [Gnewuch et al.
2017] and surveys with chatbot’s users in general [Brandtzaeg and Følstad 2017]. See the supplementary materials for
details (Appendix A).

3.1.1 Proactivity. Proactivity is the capability of a system to autonomously act on the user’s behalf [Salovaara and
Oulasvirta 2004] to reduce the amount of human effort to complete a task [Tennenhouse 2000]. In human-chatbot
conversations, a proactive behavior enables a chatbot to share initiative with the user, contributing to the conversation in
a more natural way [Morrissey and Kirakowski 2013]. Chatbots may manifest proactivity when they initiate exchanges,
suggests new topics, provide additional information, or formulate follow-up questions. In this survey, we found 18 papers
that report either chatbots with proactive behavior or implications of manifesting a proactive behavior. Proactivity
(also addressed as “intervention mode”) was explicitly addressed in seven studies [Avula et al. 2018; Chaves and Gerosa
2018; Dyke et al. 2013; Hayashi 2015; Liao et al. 2016; Schuetzler et al. 2018; Tegos et al. 2016]. In most of the studies,
however, proactivity emerged either as an exploratory result, mostly from post-intervention interviews and user’s
feedback [Duijvelshoff 2017; Jain et al. 2018b; Morrissey and Kirakowski 2013; Portela and Granell-Canut 2017; Shum
et al. 2018; Thies et al. 2017], or as a strategy to attend to domain-specific requirements (e.g., monitoring, and guidance)
[Fitzpatrick et al. 2017; Mäurer and Weihe 2015; Silvervarg and Jönsson 2013; Tallyn et al. 2018; Toxtli et al. 2018].
The surveyed literature evidences several benefits of offering proactivity in chatbots:
[B1] to provide additional, useful information: literature reveals that proactivity in chatbots adds value to
interactions [Avula et al. 2018; Morrissey and Kirakowski 2013; Thies et al. 2017]. Investigating evaluation criteria for
Manuscript submitted to ACM
How should my chatbot interact? 5

chatbots, [Morrissey and Kirakowski 2013] asked users of a general purpose chatbot to rate the chatbots’ naturalness
and report in what areas they excel. Both statistical and qualitative results confirm that taking the lead and suggesting
specialized information about the conversation theme correlates to chatbots’ naturalness. [Thies et al. 2017] corroborates
this result; in post-intervention interviews, ten out of 14 users mentioned they preferred a chatbot that takes the lead
and volunteers to provide additional information such as useful links and song playlists. In a WoZ study, [Avula et al.
2018] investigated whether proactive interventions of a chatbot contribute to a collaborative search in a group chat.
The chatbot either elicits or infers needed information from the collaborative chat and proactively intervenes in the
conversation by sharing useful search results. The intervention modes were not significantly different from each other,
but both intervention modes resulted in a statistically significant increase of enjoyment and decrease of effort when
compared to the same task with no chatbot interventions. Moreover, in a post-intervention, open-ended question, 16
out of 98 participants self-reported positive perceptions about the provided additional information.
[B2] to inspire users, and keep the conversation alive: proactively suggesting and encouraging new topics have
been shown useful to both inspire users [Avula et al. 2018; Chaves and Gerosa 2018] and keep the conversation alive
[Silvervarg and Jönsson 2013]. Participants in the study conducted by [Avula et al. 2018] self-reported that the chatbot’s
suggestions helped them to get started (7 mentions) and gave them ideas about topics to search for (4 mentions). After
iteratively evaluating prototypes for a chatbot in an educational scenario, [Silvervarg and Jönsson 2013] concluded that
proactively initiating topics makes the dialogue more fun and reveals topics the chatbot can talk about. The refined
prototype also proactively maintains the engagement by posing a follow-up when the student had not provided an
answer to the question. [Schuetzler et al. 2018] hypothesized that including follow-up questions based on the content of
previous messages will result in higher perceived partner engagement. The hypothesis was supported, with participants
in the dynamic condition rating the chatbot as more engaging. In an ethnographic data collection [Tallyn et al. 2018],
users included photos in their responses to add information about their experience; 85% of these photos were proactively
prompted by the chatbot. This result shows that prompting the user for more information stimulates them to expand
their entries. [Chaves and Gerosa 2018] also observed that chatbots’ proactive messages provided insights about the
chatbots’ knowledge, which potentially helped the conversation to continue. In this paper, we call the strategies to
convey the chatbot’s knowledge and capabilities as communicability, and we discuss it in Section 3.1.3.
[B3] to recover the chatbot from a failure: in [Portela and Granell-Canut 2017] and [Silvervarg and Jönsson
2013], proactivity is employed to naturally recover from a failure. In both studies, the approach was to introduce a new
topic when the chatbot failed to understand the user or could not find an answer, preventing the chatbot from getting
stuck and keeping the conversation alive. Additionally, in [Silvervarg and Jönsson 2013], the chatbot inserted new
topics when users are either abusive or non-sensical. We call the strategies to handle failure and abusive behavior as
damage control, and we discuss this characteristic in Section 3.2.1.
[B4] to improve conversation productivity: in task-oriented interactions, such as searching or shopping, proac-
tivity can improve the conversation productivity [Jain et al. 2018b]. In interviews with first-time users of chatbots, [Jain
et al. 2018b] found that chatbots should ask follow-up questions to resolve and maintain the context of the conversation
and reduce the search space until achieving the goal. [Avula et al. 2018] found similar results for collaborative search;
28 out of 98 participants self-reported that chatbot’s proactive interventions saved collaborators time.
[B5] to guide and engage users: in particular domains, proactivity helps chatbots to either guide users or establish
and monitor users’ goals. In [Fitzpatrick et al. 2017], the chatbot assigns a goal to the user and proactively prompts
motivational messages and reminders to keep the user engaged in the treatment. [Mäurer and Weihe 2015] suggest that
a decision-making coach chatbot needs to lead the interaction to guiding the user toward a decision. In ethnographic
Manuscript submitted to ACM
6 Chaves and Gerosa

data collection [Tallyn et al. 2018], the chatbot prompts proactive messages that guide the users on what information
they need to report. [Toxtli et al. 2018] evaluates a chatbot that manage tasks in a workplace. Proactive messages are
used to check whether the team member has concluded the tasks, and then report the outcome to the other stakeholders.
In the educational context, proactivity is used to develop tutors that engage the students and facilitate learning. In
[Hayashi 2015], the tutor chatbot was designed to provide examples of how other students made explanations about a
topic. The network analysis of the learner’s textual inputs shows that students used more key terms and provided more
important messages when receiving feedback about other group members. In [Hayashi 2015], [Dyke et al. 2013], and
[Tegos et al. 2016] the chatbots prompt utterances to encourage the students to reason about a topic. In all three studies,
the chatbot condition provided better learning outcomes and increased students’ engagement in the discussions.
The surveyed papers also highlight challenges of providing proactive interactions, such as obtaining timing and
relevance, privacy, and the user’s perception of being controlled.
[C1] timing and relevance: untimely and irrelevant proactive messages may compromise the success of the
interaction. [Portela and Granell-Canut 2017] states that untimely turn-taking behavior was perceived as annoying,
negatively affecting emotional engagement. [Liao et al. 2016] and [Chaves and Gerosa 2018] reported that proactivity can
be disruptive. [Liao et al. 2016] investigated proactivity in a workspace environment, hypothesizing that the perceived
interruption of agent proactivity negatively affects users’ opinion. The hypothesis was supported, and the authors found
that what influence the sense of interruption is the general aversion to unsolicited messages, regardless of whether
it comes from a chatbot or a colleague. [Chaves and Gerosa 2018] showed that proactively introducing new topics
resulted in a high number of ignored messages. The analysis of the conversation log reviewed that either the new
topics were not relevant, or it was not the proper time to start a new topic. [Silvervarg and Jönsson 2013] also reported
annoyance when a chatbot introduces repetitive topics.
[C2] privacy: in a work-related, group chat, [Duijvelshoff 2017] observed privacy concerns regarding the chatbot
“reading” the employees’ conversations to act proactively. During a semi-structured interview, researchers presented a
mockup of the chatbot to employees from two different enterprises and collected perceptions of usefulness, intrusiveness,
and privacy. Employees reported the feeling that the chatbot represented their supervisors’ interests, which conveyed
as sense of workplace surveillance. Privacy concerns may result in under-motivated users, discomfort about disclosing
information, and lack of engagement [Duijvelshoff 2017].
[C3] user’s perception of being controlled: proactivity can be annoying when the chatbot conveys the impression
of trying to control the user. [Tallyn et al. 2018] report that seven out of 13 participants reported irritation with the
chatbot; one of the most frequent reasons was due to the chatbot directing them to specific places. For the task
management chatbot, [Toxtli et al. 2018] reported to have adapted the follow-up questions approach to place questions
in a time negotiated with the user. In a previous implementation, the chatbot checked the status of the task twice a day,
which participants considered too frequent and annoying.
The surveyed literature also reveals two strategies to provide proactivity: leveraging the conversational context and
randomly selecting a topic. [S1] Leveraging the conversational context is the most frequent strategy [Avula et al.
2018; Chaves and Gerosa 2018; Duijvelshoff 2017; Shum et al. 2018; Thies et al. 2017], in which proactive messages
relate to contextual information provided in the conversation to increase the usefulness of interventions [Avula et al.
2018; Duijvelshoff 2017; Shum et al. 2018]. [Shum et al. 2018] argue that general purpose, emotional aware chatbots
should recognize user’s interests and intents from the conversational context to proactively offer comfort and relevant
services. In [Duijvelshoff 2017], the chatbot leverages conversational context to suggest new topics and propose to
add documents or links to assist employees in a work-related group chat. The chatbots studied by [Chaves and Gerosa
Manuscript submitted to ACM
How should my chatbot interact? 7

2018] introduce new topics based on keywords from previous utterances posted in the chat. According to [Shum et al.
2018], leveraging the context can also help to smoothly guide the user to a target topic. One surveyed paper [Portela
and Granell-Canut 2017] proposes a chatbot that [S2] selects a topic randomly but also observes that the lack of
context is the major problem of this approach. Contextualized proactive interventions also suggest that the chatbot is
attentive to the conversation, which conveys conscientiousness, which is discussed in the next section.

3.1.2 Conscientiousness. Conscientiousness is a chatbot’s capacity to demonstrate attentiveness to the conversation

at hand [Duijst 2017; Dyke et al. 2013]. It enables a chatbot to follow the conversational flow, show understanding about
the context, and interpret each utterance as a meaningful part of the whole conversation [Morrissey and Kirakowski
2013]. In this survey, we found 11 papers that reported findings related to conscientiousness for chatbot design. Four
studies explicitly investigated influences of conscientiousness for chatbots [Ayedoun et al. 2017; Coniam 2008; Jain et al.
2018a; Schuetzler et al. 2018]. In the remaining studies, conscientiousness emerged in exploratory findings. In [Dyke
et al. 2013], conscientiousness emerged from the analysis of conversational logs, while [Gnewuch et al. 2017] elicited
conscientiousness aspects as a requirement for chatbot design when surveying the literature on chatbots customer
services. In the remaining studies [Brandtzaeg and Følstad 2017; Duijst 2017; Jain et al. 2018b; Morrissey and Kirakowski
2013; Tallyn et al. 2018], conscientiousness issues were self-reported by the users in post-intervention interviews and
subjective feedback to open-ended questions.
The surveyed literature evidenced benefits of designing conscientious chatbots:
[B1] to provide meaningful answers: some chatbots use simplistic approaches, like pattern matching rules based
on keywords or template phrases applied in the last user utterance, to find the most appropriate response [Abdul-Kader
and Woods 2015; Ramesh et al. 2017]. However, as the chatbot does not interpret the meaning and the intent of users’
utterances, the best-selected response may still sound irrelevant to the conversation [Coniam 2008; Duijst 2017; Dyke
et al. 2013]. As shown by [Duijst 2017], when a chatbot does not interpret the meaning of users’ utterances, users
show frustration and the chatbot’s credibility is compromised. This argument is supported by [Dyke et al. 2013] when
studying chatbots to facilitate collaborative learning. The authors proposed a chatbot that promotes Academically
Productive Talk moves. Exploratory results show that the chatbot performed inappropriate interventions, which was
perceived as a lack of attention to the conversation. In [Jain et al. 2018b], participants complained that some chatbots
seemed “completely scripted,” ignoring user’s inputs that did not fall into the script. In this case, users needed to adapt
their inputs to match the chatbot script to be understood, which resulted in dissatisfaction. Besides avoiding frustration,
[Schuetzler et al. 2018] showed that conscientiousness also influences chatbots’ perceived humanness and social presence.
The authors invited participants to interact with a chatbot that shows an image and asks the users to describe it. To
perform the task, the chatbot could either asks the same generic follow-up questions each time (nonrelevant condition)
or responded with a follow-up question related to the last participant’s input (relevant condition), demonstrating
attention to the information provided by the user. Statistical analysis of survey rates supported that the relevant
condition increased chatbots’ perceived humanness and social presence. In [Ayedoun et al. 2017], to motivate the
students to communicate in a second language, the proposed chatbot interprets users’ input to detect breakdowns and
react using an appropriate communication strategy. In interviews, participants reported “appreciating the help they
got from the chatbot to understand and express what they have got to say” [Ayedoun et al. 2017]. This study was later
extended to the context of ECAs (see [Ayedoun et al. 2018] for details).
[B2] to hold a continuous conversation: a conversation with a chatbot should maintain a “sense of continuity
over time” [Jain et al. 2018b] to demonstrate that the chatbot is undertaking efforts to track the conversation. To do so, it
Manuscript submitted to ACM
8 Chaves and Gerosa

is essential to maintain the topic. When evaluating the naturalness of a chatbot, [Morrissey and Kirakowski 2013] found
that maintaining a theme is convincing, while failure to do so is unconvincing. Furthermore, based on the literature on
customer support chatbots, [Gnewuch et al. 2017] argue that comfortably conversing on any topic related to the service
offering is a requirement for task-oriented chatbots. [Coniam 2008] reviewed popular chatbots for practicing second
language. The author showed that most chatbots in this field cannot hold continuous conversations, since they are
developed to answer the user’s last input. Therefore, they did not have the sense of topic, which resulted in instances
of inappropriate responses. When the chatbots could change the topic, they could not sustain it afterward. Showing
conscientiousness also requires the chatbot to understand and track the context, which is particularly important in
task-oriented scenarios. In [Jain et al. 2018b], first-time users stressed positive experience with chatbots that retained
information from previous turns. Two participants also expected the chatbots to retain this context across sessions,
thus reducing the need for extra user’s input per interaction. Keeping the context across sessions was highlighted as a
strategy to convey personalization and empathy (see Sections 3.2.6 and 3.2.5).
[B3] to steer the conversation toward a productive direction: in task-oriented interactions, a chatbot should
understand the purpose of the interaction and strive to conduct the conversation toward this goal in an efficient,
productive way [Ayedoun et al. 2017; Duijst 2017]. [Brandtzaeg and Følstad 2017] show that productivity is the key
motivation factor for using chatbots (68% of the participants mentioned it as the main reason for using chatbots).
First-time users in [Jain et al. 2018b] self-reported that interacting with chatbots should be more productive than using
websites, phone apps, and search engines. In this sense, [Duijst 2017] compared the user experience when interacting
with a chatbot for solving either simple or complex tasks in a financial context. The authors found that, for complex
tasks, to keep the conversation on track, the user must be aware of the next steps or why something is happening. In
the educational context, [Ayedoun et al. 2017] proposed a dialogue management approach based on communication
strategies to enrich a chatbot with the capability to express its meaning when faced with difficulties. Statistical results
show that the communication strategies, combined with affective backchannel (which is detailed in Section 3.2.5), are
effective in motivating students to communicate and maintain the task flow. Thirty-two participants out of 40 reported
that they preferred to interact with a chatbot with these characteristics. Noticeably, the chatbot’s attentiveness to the
interactional goal may not be evident to the user if the chatbot passively waits for the user to control the interaction.
Thus, conscientiousness relates to proactive ability, as discussed in Section 3.1.1.
Nevertheless, challenges in designing conscientious chatbots are also evident in the literature:
[C1] to handle task complexity: as the complexity of tasks increases, more turns are required to achieve a goal;
hence, more mistakes may be made. This argument was supported by both [Duijst 2017] and [Dyke et al. 2013], where
the complexity of the task compromised the experience and satisfaction in using the chatbot. [Duijst 2017] also highlight
that complex tasks require more effort to correct eventual mistakes. Therefore, it is an open challenge to design flexible
workflows, where the chatbot recovers from failures and keeps the interaction moving productively toward the goal,
despite potential misunderstandings [Gnewuch et al. 2017]. Recovering from failure is discussed in Section 3.2.1.
[C2] to harden the conversation: aiming to assure the conversational structure–and to hide natural language
limitations–chatbots are designed to restrict free-text inputs from the user [Duijst 2017; Jain et al. 2018b; Tallyn et al.
2018]. However, limiting the choices of interaction may convey a lack of attention to the users’ inputs. In [Duijst
2017], one participant mentioned the feeling of “going through a form or a fixed menu.” According to [Jain et al. 2018b],
participants consider the chatbot’s understanding of free-text input as a criterion to determine whether it can be
considered a chatbot, since chatbots are supposed to chat. In the ethnographic data collection study, [Tallyn et al. 2018]
reported that eight out of ten participants described the interaction using pre-set responses as too restrictive, although
Manuscript submitted to ACM
How should my chatbot interact? 9

they fulfilled the purpose of nudging participants to report their activities. Thus, the challenge lies in how to leverage
the benefits of suggesting predefined inputs without limiting conversational capabilities.
[C3] to keep the user aware of the chatbot’s context: a chatbot should provide a way to inform the user of the
current context, especially for complex tasks. According to [Jain et al. 2018a], context can be inferred from explicit
user input, or assumed based data from previous interactions. In both cases, user and chatbot should be on the same
page about the chatbot’s contextual state [Jain et al. 2018a], giving the users the opportunity to clarify possible
misunderstandings [Gnewuch et al. 2017]. [Jain et al. 2018b] highlighted that participants reported negative experience
when finding “mismatching between chatbot’s real context and their assumptions of the chatbot context.”
We identified three strategies used to provide understanding to a chatbot, as following:
[S1] conversation workflow: designing a conversational blueprint helps to conduct the conversation strictly and
productively to the goal [Duijst 2017]. However, [Gnewuch et al. 2017] argue that the workflow should be flexible to
handle both multi-turn and one-turn, question-answer interactions; besides, it should be unambiguous in order for
users to efficiently achieve their goals. In addition, [Duijst 2017] discuss that the workflow should make it easy to
fix mistakes; otherwise, the users need to restart the workflow, which leads to frustration. In [Ayedoun et al. 2017],
the conversation workflow included communicative strategies to detect a learner’s breakdowns and pitfalls. In that
study, when the student does not respond, the chatbot uses a comprehension-check question to detect whether the
student understood what was said. Then, it reacts to the user’s input by adopting one of the proposed communication
strategies (e.g., asking for repetition or simplifying the previous sentence). The conversation workflow could also allow
the chatbot to be proactive. For example, participants in [Jain et al. 2018b] suggested that proactive follow-up questions
would anticipate the resolution of the context, reducing the effort required from the user to achieve the goal.
[S2] visual elements: user-interface resources–such as quick replies, cards, and carousels–are used to structure the
conversation and reduce issues regarding understanding [Duijst 2017; Jain et al. 2018b; Tallyn et al. 2018]. Using these
resources, the chatbot shows the next possible utterances [Jain et al. 2018b] and conveys the conversational workflow
step-by-step [Duijst 2017; Tallyn et al. 2018]. Visual elements are also used to show the user what the chatbot can (or
cannot) do. This is another conversational characteristic, which will be discussed in the Section 3.1.3.
[S3] context window: to keep the user aware of the current chatbot’ context, [Jain et al. 2018a] developed a chatbot
for shopping that shows a context window on the side of the conversation. In this window, the user can click on specific
attributes and change them to fix inconsistencies. A survey showed that the chatbot outperformed a default chatbot
(without the context window) for the mental demand and effort constructs. However, when the chatbots are built in
third-party apps (e.g., Facebook Messenger), an extra window may not be possible.
[S4] confirmation messages: a conversation workflow may include confirmation messages to convey the chatbots’
context to the user [Jain et al. 2018a]. In [Duijst 2017], when trying to block a stolen credit card, a confirmation message
is used to verify the given personal data. In [Ayedoun et al. 2017], confirmation messages are used as a communicative
strategy to check whether the system understanding about a particular utterance matches what the learner actually
meant. Balancing the number of confirmation messages (see [Duijst 2017]) and the right moment to introduce them
into the conversation flow is still under-investigated.
The surveyed literature supports that failing to demonstrate understanding about the users’ individual utterances,
the conversational context, and the interactional goals results in frustration and loss of credibility. However, most of the
results are exploratory findings; there is a lack studies to investigate the extent to which the provided strategies influence
users’ behavior and perceptions. In addition, conscientiousness is by itself a personality trait; the more conscientiousness
a chatbot manifests, the more it can be perceived as attentive, organized, and efficient. The relationship between
Manuscript submitted to ACM
10 Chaves and Gerosa

conscientiousness and personality is highlighted in Section 4.

3.1.3 Communicability. Interactive software is communicative by its nature, since users achieve their goals by
exchanging messages with the system [Prates et al. 2000; Sundar et al. 2016]. In this context, communicability is defined
as the capacity of a software to convey to users its underlying design intent and interactive principles [Prates et al. 2000].
Providing communicability helps users to interpret the codes used by designers to convey the interactional possibilities
embedded in the software [De Souza et al. 1999], which improves system learnability [Grossman et al. 2009]. In the
chatbot context, communicability is, therefore, the capability of a chatbot to convey its features to users [Valério et al.
2017]. The problematic around chatbots’ communicability lies in the nature of the interface: instead of buttons, menus,
and links, chatbots unveil their capabilities through the conversational turns, one sentence at a time [Valério et al. 2017],
bringing new challenges in the system learnability field. The lack of communicability may lead users to give up on
using the chatbot when they cannot understand the available functionalities and how to use them [Valério et al. 2017].
In this survey, we found six papers that describes communicability, although investigating communicability is the
main purpose of only one [Valério et al. 2017]. Conversational logs revealed communicability needs in three studies
[Lasek and Jessa 2013; Liao et al. 2018]; in the other three studies [Duijst 2017; Gnewuch et al. 2017; Jain et al. 2018b]
communicability issues were self-reported by the users in post-intervention interviews and subjective feedback.
The surveyed literature reports two main benefits of communicability for chatbots:
[B1] to unveil functionalities: while interacting with chatbots, users may not know that a desired functionality is
available or how to use it [Jain et al. 2018b; Valério et al. 2017]. Most participants in [Jain et al. 2018b]’s study mentioned
that they did not understand the functionalities of at least one of the chatbots and none of them mentioned searching
for the functionalities in other sources (e.g., Google search or the chatbot website) rather than exploring options during
the interaction. In a study about playful interactions in a work environment, [Liao et al. 2018] observed that 22% of
the participants explicitly asked the chatbot about its capabilities (e.g., “what can you do?”), and 1.8% of all the users’
messages were ability-check questions. In a study about hotel chatbots, [Lasek and Jessa 2013] verified that 63% of the
conversations were initiated by clicking an option displayed in the chatbot welcome message. A semiotic inspection of
news-related chatbots [Valério et al. 2017] evidenced that communicability strategies are effective in providing clues
about the chatbot’s features and ideas about what to do and how.
[B2] to manage users’ expectations: [Jain et al. 2018b] observed that when first-time users do not understand
chatbots’ capabilities and limitations, they have high expectations and, consequently, end up more frustrated when
the chatbots fail. Some participants blamed themselves for not knowing how to communicate and gave up. In [Liao
et al. 2018], quantitative results evidenced that ability-check questions can be considered signals of users struggling
with functional affordances. Users posed ability-check questions after encountering errors as a means of establishing a
common ground between the chatbot’s capabilities and their own expectations. According to the authors, ability-check
questions helped users to understand the system and reduce uncertainty [Liao et al. 2018]. In [Duijst 2017], users also
demonstrated the importance of understanding chatbots’ capabilities in advance. Since the tasks related to financial
support, users expected the chatbot to validate the personal data provided and to provide feedback after completing the
task (e.g., explaining how long it would take for the credit card to be blocked). Therefore, communicability helps users
to gain a sense of which type of messages or functionalities a chatbot can handle.
The surveyed literature also highlights two challenges of providing communicability:
[C1] to provide business integration: communicated chatbots’ functionalities should be performed as much as
possible within the chat interface [Jain et al. 2018b]. Chatbots often act as an intermediary between users and services.
Manuscript submitted to ACM
How should my chatbot interact? 11

In this case, to overcome technical challenges, chatbots answer users’ inputs with links to external sources, where
the request will be addressed. First-time users expressed dissatisfaction with this strategy in [Jain et al. 2018b]. Six
participants complained that the chatbot redirected them to external websites. According to [Gnewuch et al. 2017],
business integration is a requirement for designing chatbots, so that the chatbot can solve the users’ requests without
transferring the interaction to another user interface.
[C2] to keep visual elements consistent with textual inputs: in the semiotic engineering evaluation, [Valério
et al. 2017] observed that some chatbots responded differently depending on whether the user accesses a visual element
in the user-interface or types the desired functionality in the text-input box, even if both input modes result in the
same utterance. This inconsistency produces the user’s feeling of misinterpreting the affordances, which has a negative
impact on the system learnability.
As an outcome of the semiotic inspection process, [Valério et al. 2017] present a set of strategies to provide communi-
cability. Some of them are also emphasized in other studies, as follows:
[S1] to clarify the purpose of the chatbot: First-time users in [Jain et al. 2018b] highlighted that a clarification
about the chatbots’ purpose should be placed in the introductory message. [Gnewuch et al. 2017] found similar inference
from the literature on customer services chatbots. The authors argue that providing an opening message with insights
into the chatbots’ capabilities while not flooding the users with unnecessary information is a requirement for chatbots
design. In addition, a chatbot could give a short tour throughout the main functionalities at the beginning of the first
sessions [Valério et al. 2017].
[S2] to advertise the functionality and suggest the next step: when the chatbot is not able to answer the user,
or when it notices that the user is silent, it may suggest available features to stimulate the user to engage [Jain et al.
2018b; Valério et al. 2017]. In [Jain et al. 2018b], six participants mentioned that they appreciated when the chatbot
suggested responses, for example by saying “try a few of these commands: ...” [Jain et al. 2018b]. [Valério et al. 2017]
shows that chatbots use visual elements, such as cards, carousel, and menus (persistent or not) to show contextualized
clues about the next answer, which both fulfills communicability purpose and spares users from having to type.
[S3] to provide a help functionality: chatbots should recognize a “help” input from the user, so it can provide
instructions on how to proceed [Valério et al. 2017]. [Jain et al. 2018b] reported users that highlighted this functionality
as useful for the reviewed chatbots. Also, results from [Liao et al. 2018] shows that chatbots should be able to answer
ability checking questions (e.g., “what can you do?” or “can you do [functionality]?”).
The literature states the importance of communicating chatbots’ functionality to the success of the interaction.
Failing to provide communicability leads the users to frustration and they often give up when they do not know how
to proceed. The literature on interactive systems has highlighted the system learnability as the most fundamental
component for usability [Grossman et al. 2009], and an easy-to-learn system should lead the user to perform well, even
during their initial interactions. Thus, researchers in the chatbots domain can leverage the vast literature on systems
learnability to identify metrics and evaluation methodologies, as well as proposing new forms of communicability
strategies that reduce the learnability issues in chatbots interaction. Communicability may also be used as a strategy to
avoid mistakes (damage control), which will be discussed in Section 3.2.1.

In summary, the conversational intelligence category includes characteristics that help a chatbot to perform a proactive,
attentive, and informative role in the interaction. The highlighted benefits relate to how a chatbot manages the conversation to
make it productive, interesting, and neat. To achieve that, designers and researchers should care for the timing and relevance of

Manuscript submitted to ACM

12 Chaves and Gerosa

provided information, privacy, interactional flexibility, and consistency.

3.2 Social Intelligence

Social Intelligence refers to the ability of an individual to produce adequate social behavior for the purpose of
achieving desired goals [Björkqvist et al. 2000]. In the HCI domain, the Media Equation theory [Reeves and Nass 1996]
posit that people react to computers as social actors. Hence, when developing chatbots, it is necessary to account for
the socially acceptable protocols for conversational interactions [Walker 2009; Wallis and Norling 2005]. Chatbots
should be able to respond to social cues during the conversation, accept differences, and manage conflicts [Salovey and
Mayer 1990] as well as be empathic and demonstrate caring [Björkqvist et al. 2000], which ultimately increase chatbots’
authenticity [Neururer et al. 2018]. In this section, we discuss the social characteristics related to social intelligence,
namely: damage control (12 papers), thoroughness (13 papers), manners (10 papers), moral agency (6 papers), emotional
intelligence (13 papers), and personalization (11 papers). Although the focus of the investigations is diverse, we found
more studies where the focus of the investigation relates to a specific social characteristic, particularly moral agency
and emotional intelligence, when compared to the conversational intelligence category.

3.2.1 Damage control. Damage control is the ability of a chatbot to deal with either conflict or failure situations.
Although the Media Equation theory argues that humans socially respond to computers as they respond to other people
[Reeves and Nass 1996], the literature has shown that interactions with conversational agents are not quite equal to
human-human interactions [Luger and Sellen 2016; Mou and Xu 2017; Shechtman and Horowitz 2003]. When talking
to a chatbot, humans are more likely to harass [Hill et al. 2015], test the agent’s capabilities and knowledge [Wallis
and Norling 2005], and feel disappointed with mistakes [Jain et al. 2018b; Mäurer and Weihe 2015]. When a chatbot
does not respond appropriately, it may encourage the abusive behavior [Curry and Rieser 2018] or disappoint the user
[Jain et al. 2018b; Mäurer and Weihe 2015], which ultimately leads the conversation to fail [Jain et al. 2018b]. Thus,
it is necessary to enrich chatbots with the ability to recover from failures and handle inappropriate talk in a socially
acceptable manner [Jain et al. 2018b; Silvervarg and Jönsson 2013; Wallis and Norling 2005].
In this survey, we found 12 studies that discuss damage control as a relevant characteristic for chatbots, two of which
focus on conflict situations, such as testing and flaming [Silvervarg and Jönsson 2013; Wallis and Norling 2005]. In the
remaining studies [Curry and Rieser 2018; De Angeli et al. 2001a; Duijst 2017; Gnewuch et al. 2017; Jain et al. 2018b;
Jenkins et al. 2007; Lasek and Jessa 2013; Liao et al. 2018; Mäurer and Weihe 2015; Toxtli et al. 2018], needs for damage
control emerged from the analysis of conversational logs and users’ feedback.
The surveyed literature highlights the following benefits of providing damage control in chatbots:
[B1] to appropriately respond to harassment: chatbots are more likely to be exposed to profanity than humans
would be [Hill et al. 2015]. When analyzing conversation logs from hotel chatbots, [Lasek and Jessa 2013] observed that
4% of the conversations contained vulgar, indecent, and insulting vocabulary, and 2.8% of all statements were abusive.
Qualitative evaluation reveals that the longer the conversations last, the more users are encouraged to go beyond the
chatbots main functions. In addition, sexual expressions represented 1.8% of all statements. The researchers suggested
that having a task-oriented conversation with a company representative and not allowing small talk contributed to
inhibiting the users to use profanity. However, a similar number was found in a study with general purpose chatbots
[Curry and Rieser 2018]. When analyzing a corpus from the Amazon Alexa Prize 2017, the researchers estimated that
about 4% of the conversations included sexually explicit utterances. [Curry and Rieser 2018] used utterances from this
Manuscript submitted to ACM
How should my chatbot interact? 13

corpus to harass a set of state-of-art chatbots and analyze the responses. The results show that chatbots respond to
harassment in a variety of ways, including nonsensical, negative, and positive responses. However, the authors highlight
that the responses should align with the chatbot’s goal to avoid encouraging the behavior or reinforcing stereotypes.
[B2] to deal with testing: abusive behavior is often used to test chatbots’ social reactions [Lasek and Jessa 2013;
Wallis and Norling 2005]. During the evaluation of a virtual guide to the university campus [Wallis and Norling 2005], a
participant answered the chatbot’s introductory greeting with “moron,” likely hoping to see how the chatbot would
answer. [Wallis and Norling 2005] argue that handling this type of testing helps the chatbots to establish limits and
resolve social positioning. Other forms of testing were highlighted in [Silvervarg and Jönsson 2013], including sending
random letters, repeated greetings, laughs and acknowledgments, and posing comments and questions about the
chatbot’s intellectual capabilities. When analyzing conversations with a task management chatbot, [Liao et al. 2018]
observed that casually testing the chatbots’ “intelligence” is a manifestation of seeking satisfaction. In [Jain et al. 2018b],
first-time users appreciated when the chatbot successfully performed tasks when the user expected the chatbot to fail,
which shows that satisfaction is influenced by the ability to provide a clever response when the user tests the chatbot.
[B3] to deal with lack of knowledge: chatbots often fail in a conversation due to lack of either linguistic or world
knowledge [Wallis and Norling 2005]. Damage control enables the chatbot to admit the lack of knowledge or cover up
cleverly [Jain et al. 2018b]. When analyzing the log of a task management chatbot, [Toxtli et al. 2018] found out that
the chatbot failed to answer 10% of the exchanged messages. The authors suggest that the chatbot should be designed
to handle novel scenarios when the current knowledge is not enough to answer the requests. In some task-oriented
chatbots, though, the failure may not be caused by a novel scenario, but by an off-topic utterance. In the educational
context, [Silvervarg and Jönsson 2013] observed that students posted off-topic utterances when they did not know
what topics they could talk about, which led the chatbot to fail rather than help the users to understand its knowledge.
In task-oriented scenarios, the lack of linguistic knowledge may lead the chatbots to get lost in the conversational
workflow [Gnewuch et al. 2017], compromising the success of the interaction. [Mäurer and Weihe 2015] demonstrated
that dialogue-reference errors (e.g., user’s attempt to correct a previous answer or jumping back to an earlier question)
are one of the major reasons for failing dialogues and they mostly resulted from chatbots misunderstandings.
The literature also reveals some challenges to provide damage control:
[C1] to deal with unfriendly users: [Silvervarg and Jönsson 2013] argue that users that want to test and find the
system’s borders are likely to never have a meaningful conversation with the chatbot no matter how sophisticated it is.
Thus, there is an extent to which damage control strategies will be effective to avoid testing and abuse. In [De Angeli et al.
2001a], the authors observed human tendencies to dominate, be rude, and infer stupidity, which they call “unfriendly
partners.” After an intervention where users interacted with a chatbot for decision-making coaching, [Mäurer and
Weihe 2015] evaluated participants’ self-perceived work and cooperation with the system. The qualitative results show
that cooperative users are significantly more likely to give a higher rating for overall evaluation and decision efficiency.
The qualitative analysis of the conversation log reveals that a few interactions failed because the users’ motivations
were curiosity and mischief rather than trying to solve the decision problem.
[C2] to identify abusive utterances: several chatbots are trained on “clean” data. Because they do not understand
profanity or abuse, they may not recognize a statement as harassment, which makes it difficult to adopt answering
strategies [Curry and Rieser 2018]. [Curry and Rieser 2018] shows that data-driven chatbots often provide non-coherent
responses to harassment. Sometimes, these responses conveyed the impression of flirtatious or counter-aggression.
Providing means to identify an abusive utterance is important to adopt damage control strategies.
[C3] to adequate the response to the context: [Wallis and Norling 2005] argue that humans negotiate a conflict
Manuscript submitted to ACM
14 Chaves and Gerosa

and social positioning well before reaching abuse. In human-chatbot interactions, however, predicting users’ behavior
toward the chatbots in a specific context to develop the appropriate behavior is a challenge to overcome. Damage control
strategies need to be adapted to both the social situation and the intensity of the conflict. For example, [Curry and
Rieser 2018] showed that being evasive about sexual statements may convey the impression of flirtatious, which would
not be an acceptable behavior for a customer assistant or a tutor chatbot. In contrast, adult chatbots are supposed to
flirt, so encouraging behaviors are expected in some situations. [Wallis and Norling 2005] argue that when the chatbot
is not accepted as part of the social group it represents, it is discredited by the user, leading the interaction to fail. In
addition, designing chatbots with too strong reactions may lead to ethical concerns [Wallis and Norling 2005]. For
[Björkqvist et al. 2000], choosing between peaceful or aggressive reactions in conflict situations is optional for socially
intelligent individuals. Enriching chatbots with the ability to choose between the options is a challenge.
Damage control strategies depend on the type of failure and the target benefit, as following:
[S1] emotional reactions: [Wallis and Norling 2005] suggest that when faced with abuse, a chatbot could be seen
to take offense and respond in kind or to act hurt. The authors argue that humans might feel inhibited about hurting
the pretended feelings of a machine if the machine is willing to hurt human’s feelings too [Wallis and Norling 2005].
If escalating the aggressive behavior is not appropriate, the chatbot could withdraw from the conversation [Wallis
and Norling 2005] to demonstrate that the user’s behavior is not acceptable. In [De Angeli et al. 2001a], the authors
discuss that users appeared to be uncomfortable and annoyed whenever the chatbot pointed out any defect in the user
or reacted to aggression, as this behavior conflict the user’s perceived power relations. This strategy is also applied
in [Silvervarg and Jönsson 2013], where abusive behavior may lead the chatbot to stop responding until the student
changes the topic. [Curry and Rieser 2018] categorized responses from state-of-the-art conversational systems in a
pool of emotional reactions, both positive and negative. The reactions include humorous responses, chastising and
retaliation, and evasive responses as well as flirtation, and play-along utterances. To provide an emotional reaction,
emotional intelligence is also required. This category is presented in Section 3.2.5.
[S2] authoritative reactions: when facing testing or abuse, chatbots can communicate consequences [Silvervarg
and Jönsson 2013] or call for the authority of others [Toxtli et al. 2018; Wallis and Norling 2005]. In [Wallis and Norling
2005], although the wizard acting as a chatbot was conscientiously working as a campus guide, she answered a bogus
caller with “This is the University of Melbourne. Sorry, how can I help you?” The authors suggest that the wizard was
calling on the authority of the university to handle the conflict, where being part of a recognized institution places the
chatbot in a stronger social group. In [Silvervarg and Jönsson 2013], when students recurrently harass the chatbot, the
chatbot informs the student that further abuse will be reported to the (human) teacher (although the paper does not
clarify whether the problem is, in fact, escalated to a human). [Toxtli et al. 2018] and [Jenkins et al. 2007] also suggest
that chatbots could redirect users’ problematic requests to a human attendant in order to avoid conflict situations.
[S3] to ignore the user’s utterance and change the topic: [Wallis and Norling 2005] argue that ignoring abuse
and testing is not a good strategy because it could encourage more extreme behaviors. It also positions the chatbot
as an inferior individual, which is particularly harmful in scenarios where the chatbot should demonstrate a more
prominent or authoritative social role (e.g., a tutor). However, this strategy has been found in some studies to handle
lack of knowledge. When iteratively developing a chatbot for an educational context, [Silvervarg and Jönsson 2013]
proposed to initiate a new topic in one out of four user’s utterances that the chatbot did not understand.
[S4] conscientiousness and communicability: successfully implementing conscientiousness and communicability
may prevent errors; hence, strategies to provide these characteristics can also be used for damage control. In [Silvervarg
and Jönsson 2013], when users utter out-of-scope statements, the chatbot could make it clear what topics are appropriate
Manuscript submitted to ACM
How should my chatbot interact? 15

to the situation. For task-oriented scenarios, where the conversation should evolve toward a goal, [Wallis and Norling
2005] argue that the chatbot can clarify the purpose of the offered service when facing abusive behavior, bringing the
user back to the task. [Jain et al. 2018b] showed that describing chatbot’s capabilities after failures in the dialog was
appreciated by first-time users. In situations where the conversational workflow is susceptible to failure, [Gnewuch et al.
2017] discuss that posing confirmation messages avoids trapping the users in the wrong conversation path. Participants
in [Duijst 2017] also suggested back buttons as a strategy to fix mistakes in the workflow. In addition, the exploratory
results about the user interface showed that having visual elements such as quick replies prevent errors, since they
keep the users aware of what to ask and the chatbot is more likely to know how to respond [Duijst 2017].
[S5] to predict users’ satisfaction: chatbots should perceive both explicit and implicit feedback about users’
(dis)satisfaction [Liao et al. 2018]. To address this challenge, [Liao et al. 2018] invited participants to send a “#fail”
statement to express dissatisfaction. The results show that 42.4% of the users did it at least once, and the number of
complaints and flaming for the proposed chatbot was significantly lower than the baseline. However, the amount
of implicit feedback was also significant, which advocates for predicting user’s satisfaction from the conversation.
The most powerful conversational acts to predict user satisfaction in that study was the agent ability-check types of
questions, see discussion in Communicability section) and the explicit feedback #fail, although closings and off-topic
requests were also significant in predicting frustration. Although these results are promising, more investigation is
needed to identify other potential predictors of users’ satisfaction in real-time, in other to provide appropriate reaction.
Damage control strategies have different levels of severity. Deciding what strategy is adequate to the intensity of
the conflict is crucial [Wallis and Norling 2005]. The strategies can escalate in severity in case that the conflict is not
solved. For example, [Silvervarg and Jönsson 2013] uses a sequence of clarification, suggesting a new topic, and asking
a question about the new topic. In case of abuse, the chatbot calls for authority after two attempts of changing topics.
According to [Wallis and Norling 2005], humans also fail in conversations; they misunderstand what their partner
says and do not know things that are assumed as common knowledge by others. Hence, it is unlikely that chatbots
interactions will evolve to be conflict-free. That said, damage control intends to avoid escalating the conflicts and
manifest an unexpected behavior. In this sense, politeness can be used as a strategy to minimize the effect of lack
of knowledge (see Section 3.2.3), managing the conversation despite the possible mistakes. Regarding interpersonal
conflicts, the strategies are in line with the theory on human-human communication, which includes non-negotiation,
emotional appeal, personal rejection, and emphatic understanding [Fitzpatrick and Winke 1979]. Further research on
damage control can evaluate the adoption of human-human strategies in human-chatbots communication.

3.2.2 Thoroughness. Thoroughness is the ability of a chatbot to be precise regarding how it uses language to express
itself [Morrissey and Kirakowski 2013]. In traditional user interfaces, user communication takes place using visual
affordances, such as buttons, menus, or links. In a conversational interface, language is the main tool to achieve the
communicative goal. Thus, chatbots should coherently use language that portrays the expected style [Mairesse and
Walker 2009]. When the chatbot is not consistent about how they use language, or use unexpected patterns of language
(e.g., excessive formality), the conversation may sound strange to the user, leading to frustration. We found 13 papers
that report the importance of thoroughness in chatbots design, three of which investigate how patterns of language
influence users’ perceptions and behavior toward the chatbots [Duijst 2017; Hill et al. 2015; Mairesse and Walker 2009].
[Gnewuch et al. 2017] and [Morris 2002] suggest design principles that include concerns about language choices. Log of
conversations revealed issues regarding thoroughness in two studies [Coniam 2008; Jenkins et al. 2007]. In the remaining
papers, thoroughness emerged from interviews and users’ subjective feedback [Chaves and Gerosa 2018; Kirakowski
Manuscript submitted to ACM
16 Chaves and Gerosa

et al. 2009; Morrissey and Kirakowski 2013; Tallyn et al. 2018; Thies et al. 2017; Zamora 2017].
We found two benefits of providing thoroughness:
[B1] to adapt the language dynamically: chatbots utterances are often pre-recorded by the chatbot designer
[Mairesse and Walker 2009]. On the one hand, this approach produces high quality utterances; on the other hand, it
reduces flexibility since the chatbot is not able to adapt the tone of the conversation based on individual users and
conversational context. When analyzing interactions with a customer representative chatbot, [Jenkins et al. 2007]
observed that the chatbot proposed synonyms to keywords, and the repetition of this vocabulary led the users to imitate
it. [Hill et al. 2015] observed a similar tendency to matching language style. The authors compared human-human
conversations with human-chatbots conversations regarding language use. They found that people use, indeed, fewer
words per message and a more limited vocabulary with chatbots. However, a deeper investigation revealed that the
human-interlocutors were actually matching the patterns of language use with the chatbot, who sent fewer words per
message. When interacting with a chatbot that uses many emojis and letter reduplication [Thies et al. 2017], participants
reported a draining experience, since the chatbot’s energy was too high to match with. These outcomes show that
adapting the language to the interlocutor is a common behavior for humans, and so chatbots would benefit from
manifesting it. In addition to the interlocutor, chatbots should adapt their language use to the context which they are
implemented and adopt appropriate linguistic register [Gnewuch et al. 2017; Morrissey and Kirakowski 2013]. In the
customer services domain, [Gnewuch et al. 2017] state that chatbots are expected to fulfil the role of a human, hence,
they should produce language that corresponds to the represented service provider. In the financial scenario [Duijst
2017], some participants complained about the use of emojis in a situation of urgency (blocking a stolen credit card).
[B2] to exhibit believable behavior: because people associate social qualities to machines [Reeves and Nass
1996], chatbots are deemed to be below standard when users see them “acting as a machine” [Jenkins et al. 2007].
When analyzing the naturalness of chatbots, [Morrissey and Kirakowski 2013] found that the formal grammatical
and syntactical abilities of a chatbot are the biggest discriminators between good and poor chatbots (the other factors
being conscientiousness, manners, and proactivity). The authors highlight that chatbots should use grammar and spelling
consistently. [Coniam 2008] discusses that, even with English as Second Language (ESL) learners, basic grammar
errors, such as pronouns confusion, diminish the value of the chatbot. In addition, [Morris 2002] states that believable
chatbots need also to display unique characters through linguistic choices. In this sense, [Mairesse and Walker 2009]
demonstrated that personality can be expressed by language patterns. The authors proposed a computational framework
to produce utterances to manifest a target personality. The utterances were rated by experts in personality evaluation and
statistically compared against utterances produced by humans who manifest the target personality. The outcomes show
that a single utterance can manifest a believable personality when using the appropriate linguistic form. Participants in
[Jenkins et al. 2007] described some interactions as “robotic” when the chatbot repeated the keywords in the answers,
reducing the interaction naturalness. Similarly, in [Tallyn et al. 2018], participants complained about the “inflexibility”
of the pre-defined, handcrafted chatbot’s responses and expressed the desire for it to talk “more as a person.”
Regarding the challenges, the surveyed literature shows the following:
[C1] to decide on how much to talk: in [Jenkins et al. 2007], some participants described the chatbot’s utterances
as not having enough details, or being too generic; however, most of them appreciated finding answers in a sentence
rather than in a paragraph. Similarly, [Gnewuch et al. 2017] argue that simple questions should not be too detailed
while important transactions require more information. In three studies [Chaves and Gerosa 2018; Duijst 2017; Zamora
2017], participants complained about information overload and inefficiency caused by big blocks of texts. Balancing the
granularity of information with the sentence length is a challenge to be overcome.
Manuscript submitted to ACM
How should my chatbot interact? 17

[C2] to be consistent: chatbots should not combine different language styles. For example, in [Duijst 2017], most
users found it strange that emojis were combined with a certain level of formal contact. When analyzing the critical
incidents about an open-domain interaction, [Kirakowski et al. 2009] found that participants criticized when chatbots
used more formal language or unusual vocabulary since general-purpose chatbots focus on casual interactions.
Despite the highlighted benefits, we did not find strategies to provide thoroughness. [Morris 2002] proposed a
rule-based architecture where the language choices consider the agent’s personality, emotional state, and beliefs about
the social relationship among the interlocutors. However, they did not provide evidence of whether the proposed models
produced the expected outcome. Although the literature in computational linguistics has proposed algorithms and
statistical models to manipulate language style and matching (see e.g., [Prabhumoye et al. 2018; Zhang et al. 2018b]), to
the best of our knowledge, these strategies have not been evaluated in the context of chatbots social interactions.
This section shows that linguistic choices influence users’ perceptions of chatbots. Computer-mediated communi-
cation (CMC) field has a vast literature that shows language variation according to the media and its effect in social
perceptions (see e.g. [Baron 1984; Walther 2007]). Similarly, researchers in sociolinguistic fields [Conrad and Biber 2009]
have shown that language choices are influenced by personal style, dialect, genre, and register. For chatbots, the results
presented in [Mairesse and Walker 2009] are promising, demonstrating that automatically generated language can
manifest recognizable traits. Thus, further research in chatbot’s thoroughness could leverage CMC and sociolinguistics
theories to provide strategies that lead language to accomplish its purpose for a particular interactional context.

3.2.3 Manners. Manners refer to the ability of a chatbot to manifest polite behavior and conversational habits
[Morrissey and Kirakowski 2013]. Although individuals with different personalities, from different cultures may have
different notions of what is considered polite (see e.g., [Watts 2003]), politeness can be more generally applied as rapport
management [Brown 2015], where interlocutors strive to control the harmony between people in discourse. A chatbot
can manifest manners by adopting speech acts such as greetings, apologies, and closings [Jain et al. 2018b]; minimizing
impositions [Tallyn et al. 2018; Toxtli et al. 2018], and making interactions more personal [Jain et al. 2018b]. Manners
potentially reduces the feeling of annoyance and frustration that may lead the interaction to fail [Jain et al. 2018b].
We identified ten studies that report manners, one of which directly investigate this characteristic [Wallis and Norling
2005]. In some studies [Chaves and Gerosa 2018; Liao et al. 2018; Mäurer and Weihe 2015; Toxtli et al. 2018], manners
were observed in the analysis of conversational logs, where participants talked to the chatbot in polite, human-like
ways. Users’ feedback and interviews revealed users expectations regarding chatbots politeness and personal behavior
[Jain et al. 2018b; Jenkins et al. 2007; Kirakowski et al. 2009; Kumar et al. 2010; Morrissey and Kirakowski 2013].
The main benefit of providing manners is [B1] to increase human-likeness. Manners is highlighted in the literature
as a way to turn chatbots conversations into a more natural, convincing interaction [Kirakowski et al. 2009; Morrissey
and Kirakowski 2013]. In an in-the-wild data collection, [Toxtli et al. 2018] observed that 93% of the participants
used polite words (e.g., “thanks” or “please”) with a task management chatbot at least once, and 20% always talked
politely to the chatbot. Unfortunately, the chatbot evaluated in that study was not prepared to handle these protocols
and ultimately failed to understand. When identifying incidents from their own conversational logs with a chatbot
[Kirakowski et al. 2009], several participants identified greetings as a human-seeming characteristic. The users also
found convincing when the chatbot appropriately reacts to social cues statements, such as “how are you?”-types of
utterances. Using this result, [Morrissey and Kirakowski 2013] later suggested that greetings, apologies, social niceties,
and introductions are significant constructs to measure chatbot’s naturalness. In [Jenkins et al. 2007], the chatbot used
exclamation marks at some points and frequently offered sentences available on the website, vaguely human-like. In
Manuscript submitted to ACM
18 Chaves and Gerosa

the feedback, participants described the chatbot as rude, impolite, and cheeky.
The surveyed literature highlights two challenges to convey manners:
[C1] to deal with face-threatening acts: Face-Threatening Acts (FTA) are speech acts that threaten, either
positive or negatively, the “face” of an interlocutor [Brown and Levinson. 1987]. Politeness strategies in human-human
interactions are adopted to counteract the threat when an FTA needs to be performed [Brown and Levinson. 1987].
In [Wallis and Norling 2005], the authors discuss that the wizard performing the role of the chatbot used several
politeness strategies to counteract face threats. For instance, when she did not recognize a destination, instead of
providing a list of possible destinations, she stimulated the user to keep talking until they volunteered the information.
In chatbots design, in contrast, providing a list of options to choose is a common strategy. For example, in [Toxtli
et al. 2018], the chatbot was designed to present the user with a list of pending tasks when it did not know what task
the user was reporting as completed, although the authors acknowledged that it resulted in an unnatural interaction.
Although adopting politeness strategies is natural for humans and people usually do not consciously think about them,
implementing them for chatbots is challenging due to the complexity of identifying face-threatening acts. For example,
in the decision-making coach scenario, [Mäurer and Weihe 2015] observed that users tend to utter straightforward
and direct agreements while most of the disagreements contained modifiers that weakened their disagreement. The
adoption of politeness strategies to deal with face-threatening acts is still under-investigated in the chatbots literature.
[C2] to end a conversation gracefully: [Jain et al. 2018b] discuss that first-time users expected human-like
conversational etiquette from the chatbots, specifically introductory phrases and concluding phrases. Although several
chatbots perform well in the introduction, the concluding phrases are less explored. Most of the participants reported
being annoyed with chatbots that do not end a conversation [Jain et al. 2018b]. [Chaves and Gerosa 2018] also highlight
that chatbots need to know when the conversation ends. In that scenario, the chatbot could recognize a closing statement
(the user explicitly says “thank you” or “bye”); however, it would not end the conversation otherwise. Users that stated
a decision, but kept receiving more information from the chatbot, reported feeling confused and undecided afterward.
Thus, recognizing the right moment to end the conversation is a challenge to overcome.
The strategies highlighted in the surveyed literature for providing manners are the following:
[S1] to engage in small talk: [Liao et al. 2018] and [Kumar et al. 2010] point out that even task-oriented chatbots
engage in small talk. When categorizing the utterances from the conversational log, the authors found a significant
number of messages about the agent status (e.g., “what are you doing?”), opening and closing sentences as well as
acknowledgment statements (“ok,” “got it”). [Jain et al. 2018b] also observed that first-time users included small talk
in the introductory phrases. According to [Liao et al. 2018], these are common behaviors in human-human chat
interface, and chatbots would likely benefit from anticipating these habitual behaviors and reproducing them. However,
particularly for task-oriented chatbots, it is important to control the small talk to avoid off-topic conversations and
harassment, as discussed in Sections 3.2.1 and 3.1.2.
[S2] to adhere to turn-taking protocols: [Toxtli et al. 2018] suggest that chatbots should adopt turn-taking
protocols to know when to talk. Participants who received frequent follow-up questions from the task management
chatbot about their pending tasks perceived the chatbot as invasive. Literature in chatbot development proposes
techniques to improve chatbots’ turn-taking capabilities (see e.g., [Brown and Levinson. 1987; Candello et al. 2018; de
Bayser et al. 2017]), which can be explored as a mean of improving perceived chatbot’s manners.
Although the literature emphasizes that manners are important to approximate chatbots interactions to human
conversational protocols, this social characteristic is under-investigated in the literature. Conversational acts such as
greetings and apologies are often adopted (e.g., [Jain et al. 2018b; Jenkins et al. 2007; Mäurer and Weihe 2015]), but there is
Manuscript submitted to ACM
How should my chatbot interact? 19

a lack of studies on the rationality around the strategies and the relations with politeness models used in human-human
social interactions [Wallis and Norling 2005]. In addition, the literature points out needs for personal conversations
(e.g., addressing the users by name), but we did not find studies that focus on this type of strategies. CMC is by itself
more impersonal than face-to-face conversations [Walther 1992, 1996]; even so, current online communication media
has been successfully used to initiate, develop, and maintain interpersonal relationships [Walther 2011]. Researchers
can learn from human behaviors in CMC and adopt similar strategies to produce more personal conversations.

3.2.4 Moral agency. Machine moral agency refers to the ability of a technology to act based on social notions of
right and wrong [Banks 2018]. The lack of this ability may lead to cases such as Tay, the Microsoft’s Twitter chatbot
that became racist, sexist, and harasser in a few hours [Neff and Nagy 2016]. The case raised concerns on what makes
an artificial agent (im)moral. Whether machines can be considered (moral) agents is widely discussed in the literature
(see e.g., [Allen et al. 2006; Himma 2009; Parthemore and Whitby 2013]). In this survey, the goal is not to argue about
criteria to define a chatbot as moral, but to discuss the benefits of manifesting a perceived agency [Banks 2018] and
the implications of disregarding chatbots’ moral behavior. Hence, for the purpose of this survey, moral agency is a
manifested behavior that may be inferred by a human as morality and agency [Banks 2018].
We found six papers that address moral agency.[Banks 2018] developed and validated a metric for perceived moral
agency in conversational interfaces, including chatbots. In four studies, the authors investigated the ability of chatbots
to handle conversations where the persistence of gender [Brahnam and De Angeli 2012; De Angeli and Brahnam 2006]
and race stereotypes [Marino 2014; Schlesinger et al. 2018] may occur. In [Shum et al. 2018], moral agency is discussed
as a secondary result, where the authors discuss the impact of generating biased responses on emotional connection.
The two main reported benefits of manifesting perceived moral agency are the following:
[B1] to avoid stereotyping: chatbots are often designed with anthropomorphized characteristics (see Section 3.3),
including gender, age, and ethnicity identities. Although the chatbot’s personification is more evident in embodied
conversational agents, text-based chatbots may also be assessed by their social representation, which risks building
or reinforcing stereotypes [Marino 2014]. [Marino 2014] and [Schlesinger et al. 2018] argues that chatbots are often
developed using language registers [Marino 2014] and cultural references [Schlesinger et al. 2018] of the dominant
culture. In addition, a static image (or avatar) representing the agent may convey social grouping [Nowak and Rauh 2005].
When the chatbot is positioned in a minority identity group, it exposes the image of that group to the judgment and
flaming, which is frequent in chatbots interactions [Marino 2014]. For example, [Marino 2014] discusses the controversies
caused by a chatbot designed to answer questions about Caribbean Aboriginals culture: its representation as a Caribbean
Amerindian individual created an unintended context for stereotyping, where users projected the chatbot’s behavior as
a standard for people from the represented population. Another example is the differences in sexual discourse between
male- and female-presenting chatbots. [Brahnam and De Angeli 2012] found that female-presenting chatbots are the
object of implicit and explicit sexual attention and swear words more often than male-presenting chatbots. [De Angeli
and Brahnam 2006] show that sex talks with the male chatbot were rarely coercive or violent; his sexual preference was
often questioned, though, and he was frequently propositioned by reported male users. In contrast, the female character
received violent sexual statements, and she was threatened with rape five times in the analyzed corpora. In [Brahnam
and De Angeli 2012], when the avatars were presented as black adults, references to race often deteriorated into racist
attacks. Manifesting moral agency may, thus, prevent from obnoxious user interactions. In addition, moral agency may
prevent the chatbot itself from being biased or disrespectful to humans. [Schlesinger et al. 2018] argue that the lack of
context about the world does not redeem the chatbot from the necessity of being respectful with all the social groups.
Manuscript submitted to ACM
20 Chaves and Gerosa

[B2] to enrich interpersonal relationships: in a study on how interlocutors perceive conversational agents’
moral agency, [Banks 2018] hypothesized that perceived morality may influence a range of motivations, dynamics, and
effects of human-machine interactions. Based on this claim, the authors evaluated whether goodwill, trustworthiness,
willingness to engage, and relational certainty in future interactions are constructs to measure perceived moral agency.
Statistical results showed that all the constructs correlate with morality, which suggests that manifesting moral agency
can enrich interpersonal relationship with chatbots. Similarly, [Shum et al. 2018] suggest that to produce interpersonal
responses, chatbots should be aware of inappropriate information and avoid generating biased responses.
However, the surveyed literature also reveals challenges of manifesting moral agency:
[C1] to avoid alienation: in order to prevent a chatbot from reproducing hate-speech, or abusive talk, most chatbots
are built over “clean” data, where specific words are removed from their dictionary [De Angeli and Brahnam 2006;
Schlesinger et al. 2018]. These chatbots have no knowledge of those words and their meaning. Although this strategy is
useful to prevent unwanted behavior, it does not manifest agency, but alienates the chatbot of the topic. [De Angeli and
Brahnam 2006] show that the lack of understanding about sex-talk does not prevent the studied chatbot from harsh
verbal abuse, or even from being encouraging. From [Schlesinger et al. 2018], one can notice that the absence of racist
specific words did not prevent the chatbot Zo from uttering discriminatory exchanges. As a consequence, manifesting
moral agency requires a border understanding of the world rather than alienation, which is still an open challenge.
[C2] to build unbiased algorithms and training data: as extensively discussed in [Schlesinger et al. 2018],
machine learning algorithms and corpus-based language generation are biased toward the available training datasets.
Hence, moral agency relies on data that is biased in its nature, producing unsatisfactory results from an ethical perspective.
In [Shum et al. 2018], the authors propose a framework for developing social chatbots. The authors highlight that the
core-chat module should follow ethical design to generate unbiased, non-discriminative responses, but they do not
discuss specific strategies for that. Building unbiased training datasets and learning algorithms that connect the outputs
with individual, real-world experiences, therefore, are challenges to overcome.
Despite the relevance of moral agency to the development of socially intelligent chatbots, we did not find strategies
to address the issues. [Schlesinger et al. 2018] advocate for developing diversity-conscious databases and learning
algorithms that account for ethical concerns; however, the paper focuses on outlining the main research branches and
call the community of designers to adopt new strategies. As discussed in this section, research on perceived moral
agency is still necessary, in order to develop chatbots whose social behavior is inclusive and respectful.

3.2.5 Emotional Intelligence. Emotional intelligence is a subset of social intelligence that allows an individual to
appraise and express feelings, regulate affective reactions, and harness the emotions to solve a problem [Salovey and
Mayer 1990]. Although chatbots do not have genuine emotions [Wallis and Norling 2005], there are considerable
discussions about the role of manifesting (pretended) emotions in chatbots [Ho et al. 2018; Shum et al. 2018; Wallis and
Norling 2005]. An emotionally intelligent chatbot can recognize and control users’ feelings and demonstrate respect,
empathy, and understanding, improving the relationship between them [Li et al. 2017; Salovey and Mayer 1990].
We identified 13 studies that report emotional intelligence. Unlikely the previously discussed categories, most studies
on emotional intelligence focused on understanding the effects of chatbots’ empathy and emotional self-disclosure
[Ayedoun et al. 2017; Dohsaka et al. 2014; Fitzpatrick et al. 2017; Ho et al. 2018; Kumar et al. 2010; Lee and Choi 2017;
Miner et al. 2016; Morris 2002; Portela and Granell-Canut 2017; Shum et al. 2018]. Only three papers highlighted
emotional intelligence as an exploratory outcome [Jenkins et al. 2007; Thies et al. 2017; Zamora 2017], where needs for
emotional intelligence emerged from participants subjective feedback and post-intervention surveys.
Manuscript submitted to ACM
How should my chatbot interact? 21

The main reported benefits of developing emotionally intelligent chatbots are the following:
[B1] to enrich interpersonal relationships: the perception that the chatbot understands one’s feelings may create
a sense of belonging and acceptance [Ho et al. 2018]. [Ayedoun et al. 2017] propose that chatbots for second language
studies should use congratulatory, encouraging, sympathetic, and reassuring utterances to create a friendly atmosphere to
the learner. The authors statistically demonstrated that affective backchannel, combined with communicative strategies
(see Section 3.1.2) significantly increased learners’ confidence and desire to communicate while reducing anxiety. In
another educational study, [Kumar et al. 2010] evaluated the impact of chatbot’s affective moves on being friendly
and achieving social belonging. Qualitative results show that affective moves significantly improve the perception
of amicability, and marginally increased social belonging. According to [Wallis and Norling 2005], when a chatbot’s
emotional reaction triggers a social response from the user, then the chatbot had achieved group membership and the
users’ sympathy. [Dohsaka et al. 2014] proposed a chatbot that uses empathic and self-oriented emotional expressions
to keep users engaged in quiz-style dialog. The survey results revealed that empathic expressions significantly improved
user satisfaction. In addition, the empathic expressions also improved the user ratings of the peer agent regarding
intimacy, compassion, amiability, and encouragement. Although [Dohsaka et al. 2014] did not find effect of chatbot’s
self-disclosure on emotional connection, [Lee and Choi 2017] found that self-disclosure and reciprocity significantly
improved trust and interactional enjoyment. In [Fitzpatrick et al. 2017], seven participants reported that the best thing
about their experience with the therapist chatbot was the perceived empathy. Five participants highlighted that the
chatbot demonstrated attention to their users’ feelings. In addition, the users referred the chatbot as “he,” “a friend,” “a
fun little dude,” which demonstrates that the empathy was directed to the personification of the chatbot. In another
mental health care study, [Miner et al. 2016] found that humans are twice as likely to mirror negative sentiment from a
chatbot than from a human, which is a relevant implication for therapeutic interactions. In [Zamora 2017], participants
reported that some content is embarrassing to ask another human, thus, talking to a chatbot would be easier due to the
lack of judgement. [Ho et al. 2018] measured users’ experience in conversations with a chatbot compared to a human
partner as well as the amount of intimacy disclosure and cognitive reappraisal. Participants in the chatbots condition
experienced as many emotional, relational, and psychological benefits as participants who disclosed to a human partner.
[B2] to increase engagement: [Shum et al. 2018] argue that longer conversations (10+ turns) are needed to meet
the purpose of fulfilling the needs of affection and belonging. Therefore, the authors defined conversation-turns per
session as a success metric for chatbots, where usefulness and emotional understanding are combined. In [Dohsaka
et al. 2014], empathic utterances for the quiz-style interaction significantly increase the number of users’ messages per
hint for both answers and non-answers utterances (such as feedback about the success/failure). This result shows that
empathic utterances encouraged the users to engage and utter non-answers statements. [Portela and Granell-Canut
2017] compared the possibility of emotional connection between a classical chatbot and a pretended chatbot, simulated
in a WoZ experiment. Quantitative results showed that the WoZ condition was more engaging, since it resulted in
conversations that lasted longer, with a higher number of turns. The analysis of the conversational logs revealed the
positive effect of the chatbot manifesting social cues and empathic signs as well as touching on personal topics.
[B3] to increase believability: [Morris 2002] argue that adapting chatbots language to their current emotional
state, along with their personality and social role awareness, results in more believable interactions. The authors
propose that conversation acts should reflect the pretended emotional status of the agent; the extent to which the acts
impact on emotion depends however on the agent’s personality (e.g., its temperament or tolerance). Personality is an
anthropomorphic characteristic and is discussed in Section 3.3.2.
Although emotional intelligence is the goal of several studies, [C1] regulating affective reactions is still a challenge.
Manuscript submitted to ACM
22 Chaves and Gerosa

The chatbot presented in [Kumar et al. 2010] was designed to mimic the patterns of affective moves in human-human
interactions. Nevertheless, the chatbot has shown an only marginally significant increase in social belonging, when
compared to the same interaction with a human partner. Conversational logs revealed that the human tutor performed
a significantly higher number of affective moves in that context. In [Jenkins et al. 2007], the chatbot was designed to
present emotive-like cues, such as exclamation marks, and interjections. The participants rated the degree of emotions
in chatbot’s responses negatively. In [Thies et al. 2017], the energetic chatbot was reported as having an enthusiasm
“too high to match with.” In contrast, the chatbot described as an “emotional buddy” was reported as being “overly
caring.” [Ho et al. 2018] state that chatbot’s empathic utterances may be seen as pre-programmed and inauthentic.
Although their results revealed that the partners’ identity (chatbot vs. person) had no effect in the perceived relational
and emotional experience, the chatbot condition was a WoZ setup. The wizards were blind to whether users thought
they were talking to a chatbot or a person, which reveals that identity does not matter if the challenge of regulating
emotions is overcome.
The chatbots literature also report some strategies to manifest emotional intelligence:
[S1] using social-emotional utterances: affective utterances toward the user are a common strategy to demon-
strate emotional intelligence. [Ayedoun et al. 2017], [Kumar et al. 2010], and [Dohsaka et al. 2014] suggest that affective
utterances improve the interpersonal relationship with a tutor chatbot. In [Ayedoun et al. 2017], the authors propose
affective backchannel utterances (congratulatory, encouraging, sympathetic, and reassuring) to motivate the user to
communicate in a second language. The tutor chatbot proposed in [Kumar et al. 2010] uses solidarity, tension release,
and agreement utterances to promote its social belonging and acceptance in group chats. [Dohsaka et al. 2014] propose
empathic utterances to express opinion about the difficulty or ease of a quiz, and feedback on success and failure.
[S2] to manifest conscientiousness: providing conscientiousness may affect the emotional connection between
humans and chatbots. In [Portela and Granell-Canut 2017], participants reported the rise of affection when the chatbot
remembered something they have said before, even if it was just the user’s name. Keeping track of the conversation
was reported as an empathic behavior and resulted in mutual affection. [Shum et al. 2018] argue that a chatbot needs
to combine usefulness with emotion, by asking questions that help to clarify the users’ intentions. They provide an
example where a user asks the time, and the chatbot answered “Cannot sleep?” as an attempted to guide the conversation
to a more engaging direction. Adopting this strategy requires the chatbot to handle users’ message understanding,
emotion and sentiment tracking, session context modeling, and user profiling [Shum et al. 2018].
[S3] reciprocity and self-disclosure: [Lee and Choi 2017] hypothesized that a high level of self-disclosure and
reciprocity in communication with chatbots would increase trust, intimacy, and enjoyment, ultimately improving user
satisfaction and intention to use. They performed a WoZ, where the assumed chatbot was designed to recommend
movies. Results demonstrated that reciprocity and self-disclosure are strong predictors of rapport and user satisfaction.
In contrast, [Dohsaka et al. 2014] did not find any effect of self-oriented emotional expressions in the users’ satisfaction
or engagement (the number of utterances per hint). More research is needed to understand the extent to which this
strategy produces positive impact on the interaction.
The literature shows that emotional intelligence is widely investigated, with particular interest from education and
mental health care domains. Using emotional utterances in a personalized, context relevant way is still a challenge.
Researchers in chatbots emotional intelligence can learn from emotional intelligence theory [Gross 1998; Salovey and
Mayer 1990] to adapt the chatbots utterances to match the emotions expressed in the dynamic context. Adaption to the
dynamic context also improves the sense of personalized interactions, which is discussed in the next section.

Manuscript submitted to ACM

How should my chatbot interact? 23

3.2.6 Personalization. Personalization refers to the ability of a technology to adapt its functionality, interface,
information access, and content to increase its personal relevance to an individual or a category of individuals [Fan
and Poole 2006]. In the chatbots domain, personalization may increase the agents’ social intelligence, since it allows a
chatbot to be aware of situational context and to adapt dynamically its features to better suit individual needs [Neururer
et al. 2018]. Grounded on robots and artificial agents’ literature, [Liao et al. 2016] argue that personalization can improve
rapport and cooperation, ultimately increasing engagement with chatbots. Although some studies (see e.g., [Fan and
Poole 2006; Liao et al. 2016; Zhang et al. 2018a]) also relate personalization to the attribution of personal qualities such
as personality, we discuss personal qualities in the Personification category. In this section, we focus on the ability to
adapt the interface, content, and behavior to the users’ preferences, needs, and situational context.
We found 11 studies that report personalization. Three studies pose personalization as a research goal [Duijst 2017;
Liao et al. 2016; Shum et al. 2018]. In most of the studies, though, personalization was observed in exploratory findings. In
six studies, personalization emerged from the analysis of interviews and participants self-reported feedback [Duijvelshoff
2017; Jenkins et al. 2007; Neururer et al. 2018; Portela and Granell-Canut 2017; Tallyn et al. 2018; Thies et al. 2017]. In
two studies [Lasek and Jessa 2013; Toxtli et al. 2018], needs for personalization emerged from the conversational logs.
The surveyed literature highlighted three benefits of providing personalized interactions:
[B1] to enrich interpersonal relationships: [Duijvelshoff 2017] state that personalizing the amount of personal
information a chatbot can access and store is required to establish a relation of trust and reciprocity in workplace
environments. In [Neururer et al. 2018], interviews with 12 participants ended up in a total of 59 statements about
how learning from experience promotes chatbot’s authenticity. [Shum et al. 2018] argue that chatbots whose focus is
engagement need to personalize the generation of responses for different users’ backgrounds, personal interests, and
needs in order to serve their needs for communication, affection, and social belonging. In [Portela and Granell-Canut
2017], participants expressed the desire for the chatbot to provide different answers to different users. Although [Duijst
2017] has found no significant effect of personalization on the user experience with the financial assistant chatbot, the
study applies personalization as the ability of giving empathic responses according to the users’ issues, where emotional
intelligence plays a role. Interpersonal relationship can also be enriched by adapting the chatbots’ language to match
the user’s context, energy, and formality; the ability of appropriately using language is discussed in Section 3.2.2.
[B2] to provide unique services: providing personalization increases the value of provided information [Duijst
2017]. In the ethnography data collection study [Tallyn et al. 2018], eight participants reported dissatisfaction about the
chatbot generic guidance to specific places. Participants self-reported that the chatbot should use their current location
to direct them to places more conveniently located, and ask for participants interests and preferences aiming at directing
them to areas that meet their needs. When exploring how teammates used a task-assignment chatbot, [Toxtli et al.
2018] found that the use of the chatbot varied depending on the participants’ levels of hierarchy. Similarly, qualitative
analysis of perceived interruption in a workplace chat [Liao et al. 2016] suggest that interruption is likely associated
with users’ general aversion to unsolicited messages at work. Hence, the authors argue that chatbot’s messages should
be personalized to the user’s general preference. [Liao et al. 2016] also found that users with low social-agent orientation
emphasize the utilitarian value of the system, while users with high social-agent orientation see the system as a
humanized assistant. This outcome advocates to the need of personalizing the interaction to individual users’ mental
model. In [Thies et al. 2017], participants reported preference for a chatbot that remember their details, likes and
dislikes, and preferences, and use the information to voluntarily make useful recommendations. In [Jain et al. 2018b],
two participants also expected chatbots to retain context from previous interactions to improve recommendations.
[B3] to reduce interactional breakdowns: in HCI, personalization is used to customize the interface to be more
Manuscript submitted to ACM
24 Chaves and Gerosa

familiar to the user [Fan and Poole 2006]. When evaluating visual elements (suck as quick replies) compared to typing
the responses, [Lasek and Jessa 2013] observed that users that start the interaction by clicking an option are more likely
to continue the conversation if the next exchange also has visual elements as optional affordances. In contrast, users
who typed are more likely to abandon the conversation when they are facing options to click. Thus, chatbots should
adapt their interface to users’ preferred input methods. In [Jenkins et al. 2007], one participant suggested that the choice
of text color and font size should be customizable. [Duijst 2017] also observed that participants faced difficulties with
small letters, and concluded that adapting the interface to provide accessibility also needs to be considered.
According to the surveyed literature, the main challenge regarding personalization is [C1] privacy. To enrich the
efficiency and productivity of the interaction, a chatbot needs to have memory of previous interactions as well as learn
user’s preferences and disclosed personal information [Thies et al. 2017]. However, as [Duijvelshoff 2017] and [Thies
et al. 2017] suggest, collecting personal data may lead to privacy concerns. Thus, chatbots should showcase transparent
purpose and ethical standard [Neururer et al. 2018]. [Thies et al. 2017] also suggest that there should be a way to inform
a chatbot that something in the conversation is private. Similarly, participants in [Zamora 2017] reported that personal
data and social media content may be inappropriate topics for chatbots because they can be sensitive. These concerns
may be reduced if a chatbot demonstrates care about privacy [Duijvelshoff 2017].
The reported strategies to provide personalization in chatbots interactions are the following:
[S1] to learn from and about the user: [Neururer et al. 2018] state that chatbots should present strategies to
learn from cultural, behavioral, personal, conversational, and contextual interaction data. For example, the authors
suggest using Facebook profile information to build knowledge about users’ personal information. [Thies et al. 2017]
also suggest that the chatbot should remember user’s preferences disclosed in previous conversations. In [Shum et al.
2018], the authors propose an architecture where responses are generated based on a personalization rank that applies
users’ feedback about their general interests and preferences. When evaluating the user’s experience with a virtual
assistant chatbot, [Zamora 2017] found 16 mentions to personalized interactions, where participants showed needs for
a chatbot to know their personal quirks and to anticipate their needs.
[S2] to provide customizable agents: [Liao et al. 2016] suggest that users should be able to choose the level of the
chatbot’s attributes, for example, the agent’s look and persona. By doing so, users with low social-agent orientation
could use a non-humanized interface, which would better represent their initial perspective. This differentiation could
be the first signal to personalize further conversation, such as focusing on more productive or playful interactions.
Regarding chatbots’ learning capabilities, in [Duijvelshoff 2017], interviews with potential users revealed that users
should be able to manage what information the chatbot know about them and to decide whether the chatbot can learn
from previous interactions or not. If the user prefers a more generic chatbot, then it would not store personal data,
potentially increasing the engagement with more resistant users. [Thies et al. 2017] raise the possibility of having an
“incognito” mode for chatbots or ask for the chatbot to forget what was said in previous utterances.
[S3] visual elements: [Tallyn et al. 2018] adopted quick replies as a mean to the chatbot to tailor its subsequent
questions to the specific experience the participant had reported. As discussed in Section 3.1.2, quick replies may be
seen as restrictive from an interactional perspective; however, conversation logs showed that the tailored questions
prompted the users to report more details about their experience, which is important in ethnography research.
Both the benefits and strategies identified from the literature are in line with the types of personalization proposed by
[Fan and Poole 2006]. Therefore, further investigations in personalization can leverage the knowledge from interactive
systems (e.g., [Fan and Poole 2006; Thomson 2005]) to adapt personalization strategies and handling the privacy concern.

Manuscript submitted to ACM

How should my chatbot interact? 25

In summary, the social intelligence category includes characteristics that help a chatbot to manifest an adequate social
behavior, by managing conflicts, using appropriate language, displaying manners and moral agency, sharing emotions, and
handling personalized interactions. The benefits relate to resolving social positioning and recovering from failures, as well as
increasing believability, human-likeness, engagement, and rapport. To achieve that, designers and researchers should care about
privacy, emotional regulation issues, language consistency, and identification of failures and inappropriate content.

3.3 Personification
In this section, we discuss the influence of identity projection on human-chatbot interaction. Personification refers to
assigning personal traits to non-human agents, including physical appearance, and emotional states [Fan and Poole
2006]. In the HCI field, researchers argue that using a personified character in the user interface is a natural way to
support the interaction [Koda 2003]. Indeed, the literature shows that (i) users can be induced to behave as if computers
were humans, even when they consciously know that human attributes are inappropriate [Nass et al. 1993]; and (ii) the
more human-like a computer representation is, the more social people’s responses are [Gong 2008].
Chatbots are, by definition, designed to have at least one human-like trait: the (human) natural language. Although
research on personification is obviously more common in Embodied Conversational Agents field, [De Angeli et al.
2001a] claim that a chatbot’s body can be created through narrative without any visual help. According to [De Angeli
2005], talking to a machine affords it a new identity. In this section, we divided the social characteristics that reflect
personification into identity (16 papers) and personality (12 papers). In this category, we found several studies where
part of the main investigation relates to the social characteristics. See the supplementary materials for details (Appendix
A).

3.3.1 Identity. Identity refers to the ability of an individual to demonstrate belonging to a particular social group
[Stets and Burke 2000]. Although chatbots do not have the agency to decide what social group they want to belong,
designers attribute identity to them, intentionally or not, when they define the way a chatbot talks or behaves [Cassell
2009]. The identity of a partner (even if only perceived) gives rise to new processes, expectations, and effects that affect
the outcomes of the interaction [Ho et al. 2018]. Aspects that convey the chatbots’ identity include gender, age, language
style, and name. Additionally, chatbots may have anthropomorphic, zoomorphic, or robotic representations. Some
authors include identity aspects in the definition of personality (see, e.g., [Shum et al. 2018]). We distinguish these two
characteristics, where identity refers to the appearance and cultural traits while personality focus on behavioral traits.
We found 16 studies that discuss identity issues, ten of which have identity as part of their main investigation [Araujo
2018; Candello et al. 2017; Ciechanowski et al. 2018; Corti and Gillespie 2016; De Angeli and Brahnam 2006; De Angeli
et al. 2001a; Jenkins et al. 2007; Liao et al. 2018; Marino 2014; Schlesinger et al. 2018]. In two studies, the authors argue
on the impact of identity on the interaction based on the literature [Brandtzaeg and Følstad 2018; Gnewuch et al. 2017].
In three studies [Neururer et al. 2018; Silvervarg and Jönsson 2013; Thies et al. 2017; Toxtli et al. 2018], qualitative
analysis of conversational logs revealed that participants put efforts into understanding aspects of the chatbots’ identity.
The identified benefits of attributing identity to a chatbot are the following:
[B1] to increase engagement: when evaluating signals of playful interactions, [Liao et al. 2018] found that agent-
oriented conversations (asking about agent’s traits and status) are consistent with the tendency to anthropomorphize
the agent and engage in chit-chat. In the educational scenario, [Silvervarg and Jönsson 2013] also observed questions
about agent’s appearance, intellectual capacities, and sexual orientation, although the researchers considered these
Manuscript submitted to ACM
26 Chaves and Gerosa

questions inappropriate for the context of tutoring chatbots. When comparing human-like vs. machine-like language
style, greetings, and chatbot’s framing, [Araujo 2018] noticed that using informal language, having a human name, and
using greetings associated with human communication resulted in significantly higher scores for adjectives like likeable,
friendly, and personal. In addition, framing the agent as “intelligent” also had a slightly influence in users’ scores.
[B2] to increase human-likeness: some attributes may convey a perceived human-likeness. [Araujo 2018] showed
that using a human-like language style, name, and greetings resulted in significantly higher scores for naturalness.
The chatbot’s framing influenced the outcomes when combined with other anthropomorphic clues. When evaluating
different typefaces for a financial adviser chatbot, [Candello et al. 2017] found that users perceive machine-like typefaces
as more chatbot-like, although they did not find strong evidence of handwriting-like typefaces conveying humanness.
The surveyed literature also highlights challenges regarding identity:
[C1] to avoid negative stereotypes: when engaging in a conversation, interlocutors base their behavior on
common ground (joint knowledge, background facts, assumptions, and beliefs that participants have of each other (see
[De Angeli et al. 2001a]). Common ground reflects stereotypical attributions that chatbots should be able to manage as
the conversation evolves [De Angeli 2005]. In [De Angeli et al. 2001a], the authors discuss that chatbots for company
representations are often personified as attractive human-like women acting as spokespeople for their companies,
while men chatbots tend to have a more important position, such as a virtual CEO. [De Angeli and Brahnam 2006]
state that the agent self-disclosure of gender identity opens possibilities to sex talk. The authors observed that the
conversations mirror stereotyped male/female encounters, and the ambiguity of the chatbot’s gender may influence the
exploration of homosexuality. However, fewer instances were observed of sex-talk with the chatbot personified as a
robot, which demonstrates that the gender identity may lead to the stereotypical attributions. When evaluating the
effect of gender identity on disinhibition, [Brahnam and De Angeli 2012] showed that people spoke more often about
physical appearance and used more swear and sexual words with the female-presenting chatbot, and racist attacks were
observed in interactions with chatbots represented as a black person. The conversation logs from [Jenkins et al. 2007]
also show instances of users attacking the chatbot persona (a static avatar of a woman pointing to the conversation box).
[Marino 2014] and [Schlesinger et al. 2018] also highlight that race identity conveys not only the physical appearance,
but all the socio-cultural expectations about the represented group (see discussion in Section 3.2.4). Hence, designers
should care about the impact of attributing an identity to chatbots in order to avoid reinforcing negative stereotypes.
[C2] to balance the identity and the technical capabilities: literature comparing embodied vs. disembodied
conversational agents have contradictory results regarding the relevance of a human representation. For example, in
the context of general-purpose interactions, [Corti and Gillespie 2016] show that people demonstrate more efforts
toward establishing common ground with the agent when they are represented as fully human; in contrast, when
evaluating a website assistant chatbot, [Ciechanowski et al. 2018] show that simpler text-based chatbots with no visual,
human identity resulted in lesser uncanny effect and less negative affect. Overly humanized agents create a higher
expectation on users, which eventually leads to more frustration when the chatbot fails [Gnewuch et al. 2017]. When
arguing on why chatbots fail, [Brandtzaeg and Følstad 2018] advocate for balancing human versus robot aspects, where
“too human” representations may lead to off-topic conversations and overly robotic interactions may lack personal
touch and flexibility. When arguing on the social presence conveyed by deceptive chatbots, [De Angeli 2005] state that
extreme anthropomorphic features may generate cognitive dissonance. The challenge, thus, lies on designing a chatbot
that provides appropriate identity cues, corresponding to their capabilities and communicative purpose, in order to
convey the right expectation and minimize negative discomforts caused by over personification.
Regarding the strategies, the surveyed literature suggests [S1] to design and elaborate on a persona. Chatbots
Manuscript submitted to ACM
How should my chatbot interact? 27

should have a comprehensive persona and answer agent-oriented conversations with a consistent description of itself
[Liao et al. 2018; Neururer et al. 2018]. For example, [De Angeli 2005] discuss that Eliza, the psychotherapist chatbot, and
Parry, a paranoid chatbot, have behaviors that are consistent with the stereotypes associated with the professional and
personal identities, respectively. [Toxtli et al. 2018] suggest that designers should explicitly build signals of the chatbot
personification (either machine- or human-like), so the users can have the right expectation about the interaction. When
identity aspects are not explicit, users try to establish common ground. In [Liao et al. 2018] and [Silvervarg and Jönsson
2013], many of the small talk with the chatbot related to the chatbot’s traits and status. In [De Angeli et al. 2001a],
the authors observed many instances of Alice’s self-references to “her” artificial nature. These references triggered
the users to reflect on their human-condition (self-categorization process), resulting in exchanges about their species
(either informational or confrontational). Similar results were observed by [Thies et al. 2017], as participants engaged
in conversations about the artificial nature of the agent. Providing the chatbot with the ability to describe its personal
identity helps to establish the common ground, and hence, enrich the interpersonal relationship [De Angeli et al. 2001a].
Chatbots may be designed to deceive users about its actual identity, pretending to be a human [De Angeli 2005]. In
this case, the more human the chatbot sounds, the more successful it is. In many cases, however, there is no need to
engage in deception and the chatbots can be designed to represent an elaborated persona. Researchers can explore
social identity theory [Brown 2000; Stets and Burke 2000] regarding to ingroup bias, power relations, homogeneity, and
stereotyping, in order to design chatbots with identity traits that reflect their expected social position [Harré et al. 2003].

3.3.2 Personality. Personality refers to personal traits that help to predict someone’s thinking, feeling, and behaving
[McCrae and Costa Jr 1997]. The most accepted set of traits is called Five-Factor model (or Big Five model) [Goldberg
1990; McCrae and Costa Jr 1997]), which describes personality in five dimensions (extraversion, agreeableness, conscien-
tiousness, neuroticism, and openness). However, personality can also refer to other dynamic, behavioral characteristics,
such as temperament and sense of humor [Thorson and Powell 1993; Zuckerman et al. 1993]. In the chatbots domain,
personality refers to the set of traits that determines the agent’s interaction style, describes its character, and allows the
end-user to understand its general behavior [De Angeli et al. 2001a]. Chatbots with consistent personality are more
predictable and trustable [Shum et al. 2018]. According to [De Angeli et al. 2001b], unpredictable swings in chatbot’s
attitudes can disorient the users and create a strong sense of discomfort. Thus, personality ensures that a chatbot
displays behaviors that stand in agreement with the users’ expectations in a particular context [Petta and Trappl 1997].
We found 12 studies that report personality issues for chatbots. In some studies, personality was investigated in
reference to the Big Five model [Mairesse and Walker 2009; Morris 2002; Sjödén et al. 2011], while two studies focused on
sense of humor [Meany and Clark 2010; Ptaszynski et al. 2010]. Three studies investigated the impact of the personality
of tutor chatbots on students’ engagement [Ayedoun et al. 2017; Kumar et al. 2010; Sjödén et al. 2011]. [Thies et al.
2017] compared users’ preferences regarding pre-defined personalities. In the remaining studies, [Brandtzaeg and
Følstad 2017; Jain et al. 2018b; Portela and Granell-Canut 2017; Shum et al. 2018] personality concerns emerged from
the qualitative analysis of the interviews, users’ subjective feedback, and literature reviews [Meany and Clark 2010].
The surveyed literature revealed two benefits of attributing personality to chatbots:
[B1] to exhibit believable behavior: [Morris 2002] states that chatbots should have a personality, defined by the
Five Factor model plus characteristics such as temperament and tolerance, in order to build utterances using linguistic
choices that coheres with these attributions. When evaluating a joking chatbot, [Ptaszynski et al. 2010] compared the
naturalness of the chatbot’s inputs and the chatbot’s human-likeness compared to a no-joking chatbot. The joking
chatbot scored significantly higher in both constructs. [Portela and Granell-Canut 2017] also showed that sense of
Manuscript submitted to ACM
28 Chaves and Gerosa

humor humanizes the interactions, since humor was one of the factors that influenced the naturalness for the WoZ
condition. [Mairesse and Walker 2009] demonstrated that manipulating language to manifest a target personality
produced moderately natural utterances, with a mean rating of 4.59 out of 7 for the personality model utterances.
[B2] to enrich interpersonal relationships: chatbots personality can make the interaction more enjoyable
[Brandtzaeg and Følstad 2017; Jain et al. 2018b]. In the study from [Brandtzaeg and Følstad 2017], the second most
frequent motivation for using chatbots, pointed out by 20% of the participants, was entertainment. The authors argue
that the chatbot being fun is important even when the main purpose is productivity; according to participants, the
chatbot’s “fun tips” enrich the user experience. This result is consistent with the experience of first-time users [Jain et al.
2018b], where participants relate better with chatbots who have consistent personality. [Thies et al. 2017] show that
witty banter and casual, enthusiastic conversations help to make the interaction effortless. In addition, a few participants
enjoyed the chatbot with a caring personality, who was described as a good listener. In [Shum et al. 2018] and [Sjödén
et al. 2011], the authors argue that a consistent personality helps the chatbot to gain the users’ confidence and trust.
[Sjödén et al. 2011] state that tutor chatbots should display appropriate posture, conduct, and representation, which
include being encouraging, expressive, and polite. Accordingly, other studies students desire for chatbots with positive
agreeableness and extraversion [Ayedoun et al. 2017; Kumar et al. 2010; Sjödén et al. 2011]. Outcomes consistently
suggest that students prefer a chatbot that is not overly polite, but has some attitude. Agreeableness seems to play a
critical role, helping the students to be encouraged and to deal with difficulties. Noticeably, agreeableness requires
emotional intelligence to be warm and sympathetic in appropriate circumstances (see Section 3.2.5).
The reviewed literature pointed out two challenges regarding personality:
[C1] to adapt humor to the users’ culture: sense of humor is highly shaped by cultural environment [Ruch 1998].
[Ptaszynski et al. 2010] discusses a Japanese chatbot who uses puns to create funny conversations. The authors state
that puns are one of the main humor genres in that culture. However, puns are restricted to the culture and language
they are built, with low portability level. Thus, the design challenge lies on personalizing chatbots’ sense of humor to
the target users’ culture and interests or, alternatively, designing cross-cultural kinds of humor. The ability to adapt to
the context and users’ preference is discussed in Section 3.2.6.
[C2] to balance personality traits: [Thies et al. 2017] observed that the users prefer the proactive, productive,
witty chatbot. However, they also would like to add traits such as caring, encouraging, exciting. In [Mairesse and Walker
2009], the researchers intentionally generated utterances to reflect extreme personalities; as a result, they observed that
some utterances sounded unnatural because humans’ personality is a continuous phenomenon, rather than discrete.
[Sjödén et al. 2011] also points out that, although personality is consistent, moods and states of mind constantly vary.
Thus, balancing the predictability of the personality and the expected variation is a challenge to overcome.
We also identified strategies to design chatbots that manifest personality:
[S1] to use appropriate language: [Shum et al. 2018] and [Morris 2002] suggest that the chatbot language should
be consistently influenced by its personality. Both studies propose that chatbots architecture should include a persona-
based model that encode the personality and influence the response generation. The framework proposed by [Mairesse
and Walker 2009] shows that it is possible to automatically manipulate language features to manifest a particular
personality based on the Big Five model. The Big-Five model is a relevant tool because it can be assessed using validated
psychological instruments [McCrae and Costa Jr 1987]. Using this model to represent the personality of chatbots was
also suggested by [Morris 2002] and [Sjödén et al. 2011]. [Jain et al. 2018b] discussed that the chatbot personality should
match its domain. Participants expected the language used by the news chatbot to be professional, while they expected
the shopping chatbot to be casual and humorous. The ability to use consistent language is discussed in Section 3.2.2.
Manuscript submitted to ACM
How should my chatbot interact? 29

[S2] to have a sense of humor: literature highlights humor as a positive personality trait [Meany and Clark 2010].
In [Jain et al. 2018b], ten participants mentioned enjoyment when the chatbots provided humorous and highly diverse
responses. The authors found occurrences of the participants asking for jokes and being delighted when the request
was supported. [Brandtzaeg and Følstad 2017] present similar results when arguing that humor is important even for
task-oriented chatbots when the user is usually seeking for productivity. For casual conversations, [Thies et al. 2017]
highlight that timely, relevant, and clever wit is a desired personality trait. In [Ptaszynski et al. 2010], the joker chatbot
was perceived as more human-like, knowledgeable and funny, and participants felt more engaged.
Personality for artificial agents has been studied for a while in the Artificial Intelligence field [Elliott 1994; Petta and
Trappl 1997; Rousseau and Hayes-Roth 1996]. Thus, further investigations on chatbots’ personality can leverage models
for personality and evaluate how they contribute to believability and rapport building.

In summary, the personification category includes characteristics that help a chatbot to manifest personal and behavioral
traits. The benefits relate to increasing believability, human-likeness, engagement, and interpersonal relationship, which is in line
with the benefits of social intelligence. However, unlike the social intelligence category, designers and researchers should
focus on attributing recognizable identity and personality traits that are consistent with users’ expectations and the chatbot’s
capabilities. In addition, it is important to care about adaptation to users’ culture and reducing the effects of negative stereotypes.

4 INTERRELATIONSHIPS AMONG THE CHARACTERISTICS

In Section 3, we organized the social characteristics into discrete groups. However, we discussed several instances of
characteristics influencing each other, or being used as a strategy to manifest one another. In this section, we describe
these relations in a theoretical framework, depicted in Figure 1. Boxes represent the social characteristics and the colors
group them into their respective categories. The axes represent the 22 propositions we derived from the literature.
According to the surveyed literature, proactivity influences the perceived personality (P1) [Thies et al. 2017], since
recommending and initiating topics may manifest higher levels of extraversion. When the proactive messages are based
on the context, proactivity increases perceived conscientiousness (P2) [Thies et al. 2017], since the proactive messages
may demonstrate attention to the topic. Proactivity supports communicability (P3) [Chaves and Gerosa 2018; Silvervarg
and Jönsson 2013; Valério et al. 2017], since a chatbot can proactively communicate its knowledge and provide guidance;

Fig. 1. Interrelationship among social characteristics

Manuscript submitted to ACM
30 Chaves and Gerosa

in addition, proactivity supports damage control (P4) [Silvervarg and Jönsson 2013], since a chatbot can introduce new
topics when the user either is misunderstood, try to break the system, or send an inappropriate message.
Conscientiousness is by itself a dimension of the Big-Five model; hence, conscientiousness influences the perceived
personality (P5) [Goldberg 1990]. Higher levels of context management, goal-orientation, and understanding increase the
chatbots’ perceived efficiency, organization, and commitment [Dyke et al. 2013]. Conscientiousness manifests emotional
intelligence (P6) since retaining information from previous turns and being able to recall them show empathy [Jain
et al. 2018b]. In addition, conscientiousness manifests personalization (P7) [Jain et al. 2018b; Thies et al. 2017] because a
chatbot can remember individual preferences within and across sessions.
Emotional intelligence influences the perceived personality (P8), since chatbots’ personality traits affect the intensity
of the emotional reactions [Morris 2002; Thies et al. 2017]. Agreeableness is demonstrated through consistent warm
reactions such as encouraging and motivating [Ayedoun et al. 2017; Kumar et al. 2010; Shum et al. 2018; Sjödén et al.
2011]. Some personality traits require personalization (P9) to adapt to the interlocutors’ culture and interests [Ptaszynski
et al. 2010]. Besides, personalization benefits identity (P10), since the users’ social-agent orientation may require a
chatbot to adapt the level of engagement in small talk and the agent’s visual representation [Jenkins et al. 2007; Liao
et al. 2016]. Personalization also improves emotional intelligence (P11), since a chatbot should dynamically regulate the
affective reactions to the interlocutor [Shum et al. 2018; Thies et al. 2017]. Emotional intelligence improves perceived
manners (P12), since the lack of emotional intelligence may lead to the perception of impoliteness [Jenkins et al. 2007].
Conscientiousness facilitates damage control (P13), since the attention to the workflow and context may increase the
ability to recover from a failure without restarting the workflow [Duijst 2017; Dyke et al. 2013; Gnewuch et al. 2017; Jain
et al. 2018a]. Communicability facilitates damage control (P14), since it teaches the user how to communicate, reducing
the numbers of mistakes [Duijst 2017; Gnewuch et al. 2017; Jain et al. 2018b; Silvervarg and Jönsson 2013; Wallis and
Norling 2005]. In addition, suggesting how to interact can reduce frustration after failure scenarios [Liao et al. 2018].
Personalization manifest thoroughness (P15) [Duijst 2017; Gnewuch et al. 2017; Hill et al. 2015; Thies et al. 2017],
since chatbots can adapt their language use to the conversational context and the interlocutor’s expectations. When
the context requires dynamic variation [Gnewuch et al. 2017; Thies et al. 2017], thoroughness may reveal traits of
the chatbot’s identity (P16) [Marino 2014; Schlesinger et al. 2018]. As demonstrated by [Mairesse and Walker 2009],
thoroughness also reveals personality (P17).
Manners influence conscientiousness (P18) [Wallis and Norling 2005], since it can be applied as a strategy to politely
refuse off-topic requests and to keep the conversation on track. Manners also influence damage control (P19) [Curry
and Rieser 2018; Wallis and Norling 2005], because it can help a chatbot to prevent verbal abuse and reduce the negative
effect of lack of knowledge. Both moral agency (P20) and emotional intelligence (P21) improve damage control because
they provide the ability to appropriately respond to abuse and testing [Silvervarg and Jönsson 2013; Wallis and Norling
2005]. Identity influences moral agency (P22), since identity representations require the ability to prevent a chatbot
from building or reinforcing negative stereotypes [Brahnam and De Angeli 2012; Marino 2014; Schlesinger et al. 2018].

5 RELATED SURVEYS
Previous studies have reviewed the literature on chatbots. Several surveys discuss chatbot’s urgency [Dale 2016; Pereira
et al. 2016] and their potential application for particular domains, which include education [Deryugina 2010; Rubin
et al. 2010; Satu et al. 2015; Shawar and Atwell 2007; Winkler and Söllner 2018], business [Deryugina 2010; Shawar
and Atwell 2007], health [Fadhil 2018; Laranjo et al. 2018], information retrieval and e-commerce [Shawar and Atwell
2007]. Other surveys focus on technical design techniques [Ahmad et al. 2018; Deshpande et al. 2017; Masche and
Manuscript submitted to ACM
How should my chatbot interact? 31

Le 2017; Ramesh et al. 2017; Thorne 2017; Walgama and Hettige 2017; Winkler and Söllner 2018], such as language
generation models, knowledge management, and architectural challenges. Although [Augello et al. 2017] discuss social
capabilities of chatbots, the survey focuses on the potential of available open source technologies to support these skills,
highlighting technical hurdles rather than social ones.
We found three surveys [Ferman 2018; Pereira et al. 2016; Radziwill and Benton 2017] that include insights about
social characteristics of chatbots, although none of them focus on this theme. [Pereira et al. 2016] investigated chatbots
that “mimic conversation rather than understand it,” and review the main technologies and ideas that support their
design, while [Ferman 2018] focuses on identifying best practices for developing script-based chatbots. [Radziwill and
Benton 2017] review the literature on quality issues and attributes for chatbots. The supplementary materials include a
table that shows the social characteristics covered by each survey (Appendix A). These related surveys also point out
technical characteristics and attributes that are outside the scope of this survey.

6 LIMITATIONS
This research has some limitations. Firstly, since this survey focused on disembodied, text-based chatbots, the literature
on embodied and speech-based conversational agents was left out. We acknowledge that studies that include these
attributes can have relevant social characteristics for chatbots, especially for characteristics that could be highly
influenced by physical representations, tone, accent, and so forth (e.g. identity, politeness, and thoroughness). However,
embodiment and speech could also bring new challenges (e.g., speech-recognition or eye-gazing), which are out of the
scope of this study and could potentially impact the users experience with the chatbots. Secondly, since the definition of
chatbot is not consolidated in the literature and chatbots have been studied in several different domains, some studies
that include social aspects of chatbots may have not be found. To account for that, we adopted several synonyms in
our research string and used Google Scholar as search engine, which provides a fairly comprehensive indexing of
the literature in more domains. Finally, the conceptual model of social characteristics was derived through a coding
process inspired in qualitative methods, such as Grounded Theory. Like any qualitative coding methods, it relies on
the researchers’ subjective assessment. To mitigate this threat, the researchers discussed the social characteristics
and categories during in-person meetings until reaching consensus, and the conceptual framework along with the
relationship among characteristics were derived considering outcomes explicitly reported in the surveyed studies.

7 CONCLUSION
In this survey, we investigated the literature on disembodied, text-based chatbots to answer the question “What chatbot
social characteristic benefit human interactions and what are the challenges and strategies associated with them?”. Our
main contribution is the conceptual model of social characteristics, from which we can derive conclusions about several
research opportunities. Firstly, we point out several challenges to overcome in order to design chatbots that manifest
each characteristic. Secondly, further research may focus on assessing the extent to which the identified benefits are
perceived by the users and influence users’ satisfaction. Finally, further investigations may propose new strategies to
manifest particular characteristics. In this sense, we highlight that we could not identify strategies to manifest moral
agency and thoroughness, although strategies for several other characteristics are also under-investigated. We also
discussed the relationship among the characteristics. Our results give important references for helping designers and
researchers find opportunities to advance the human-chatbot interactions field.
Manuscript submitted to ACM
32 Chaves and Gerosa

REFERENCES
Sameera A Abdul-Kader and JC Woods. 2015. Survey on chatbot design techniques in speech conversation systems. IJACSA 6, 7 (2015).
Nahdatul Akma Ahmad, Mohamad Hafiz Che, Azaliza Zainal, Muhammad Fairuz Abd Rauf, and Zuraidy Adnan. 2018. Review of Chatbots Design
Techniques. International Journal of Computer Applications 181, 8 (Aug. 2018), 7–10.
Colin Allen, Wendell Wallach, and Iva Smit. 2006. Why machine ethics? IEEE Intelligent Systems 21, 4 (2006), 12–17.
Theo Araujo. 2018. Living up to the chatbot hype: The influence of anthropomorphic design cues and communicative agency framing on conversational
agent and company perceptions. Comput. Hum. Behav. 85 (2018), 183–189.
Carl Auerbach and Louise B Silverstein. 2003. Qualitative data: An introduction to coding and analysis. NYU press.
Agnese Augello, Manuel Gentile, and Frank Dignum. 2017. An Overview of Open-Source Chatbots Social Skills. In INSCI. Springer, 236–248.
Sandeep Avula, Gordon Chadwick, Jaime Arguello, and Robert Capra. 2018. SearchBots: User Engagement with ChatBots during Collaborative Search. In
Proceedings of the 2018 Conference on Human Information Interaction&Retrieval. ACM, 52–61.
Emmanuel Ayedoun, Yuki Hayashi, and Kazuhisa Seta. 2017. Can Conversational Agents Foster LearnersẂillingness to Communicate in a Second
Language?: effects of communication strategies and affective backchannels. In Proceedings of the 25th International Conference on Computers in
Education, W Chen et al. (Eds.).
Emmanuel Ayedoun, Yuki Hayashi, and Kazuhisa Seta. 2018. Adding Communicative and Affective Strategies to an Embodied Conversational Agent to
Enhance Second Language LearnersâĂŹ Willingness to Communicate. International Journal of Artificial Intelligence in Education (2018), 1–29.
Jaime Banks. 2018. A Perceived Moral Agency Scale: Development and Validation of a Metric for Humans and Social Machines. Comput. Hum. Behav.
(2018).
Naomi S Baron. 1984. Computer mediated communication as a force in language change. Visible language 18, 2 (1984), 118.
Kaj Björkqvist, Karin Österman, and Ari Kaukiainen. 2000. Social intelligence-empathy=aggression? Aggression and violent behavior 5, 2 (2000), 191–200.
Marion Boiteux. 2019. Messenger at F8 2018. Retrieved October 18, 2019 from https://bit.ly/2zXVnPH. Messenger Developer Blog.
Sheryl Brahnam and Antonella De Angeli. 2012. Gender affordances of conversational agents. Interacting with Computers 24, 3 (2012), 139–153.
Petter Bae Brandtzaeg and Asbjørn Følstad. 2017. Why people use chatbots. In INSCI. Springer, 377–392.
Petter Bae Brandtzaeg and Asbjørn Følstad. 2018. Chatbots: changing user needs and motivations. Interactions 25, 5 (2018), 38–43.
Penelope Brown. 2015. Politeness and language. In IESBS, 2nd ed. Elsevier, 326–330.
Penelope Brown and Stephen C. Levinson. 1987. Politeness: Some universals in language usage. Vol. 4. Cambridge university press.
Rupert Brown. 2000. Social identity theory: Past achievements, current problems and future challenges. Eur. J. Soc. Psychol. 30, 6 (2000), 745–778.
Heloisa Candello, Claudio Pinhanez, and Flavio Figueiredo. 2017. Typefaces and the Perception of Humanness in Natural Language Chatbots. In Proc. of
the SIGCHI CHI Conference. ACM, 3476–3487.
Heloisa Candello, Claudio Pinhanez, Mauro Carlos Pichiliani, Melina Alberio Guerra, and Maira Gatti de Bayser. 2018. Having an Animated Coffee with a
Group of Chatbots from the 19 th Century. In Proc. of the SIGCHI CHI Conference (Extended Abstract). ACM, D206.
Justine Cassell. 2009. Social practice: Becoming enculturated in human-computer interaction. In Int Conf on UAHCI. Springer, 303–313.
Ana Paula Chaves and Marco Aurelio Gerosa. 2018. Single or Multiple Conversational Agents?: An Interactional Coherence Comparison. In Proc. of the
SIGCHI CHI Conference. ACM, 191.
Leon Ciechanowski, Aleksandra Przegalinska, Mikolaj Magnuski, and Peter Gloor. 2018. In the shades of the uncanny valley: An experimental study of
human–chatbot interaction. Future Generation Computer Systems (2018).
David Coniam. 2008. Evaluating the language resources of chatbots for their potential in English as a second language learning. ReCALL 20, 1 (2008),
99–116.
Susan Conrad and Douglas Biber. 2009. Register, genre, and style. Cambridge University Press.
Kevin Corti and Alex Gillespie. 2016. Co-constructing intersubjectivity with artificial conversational agents: people are more likely to initiate repairs of
misunderstandings with agents represented as human. Comput. Hum. Behav. 58 (2016), 431–442.
Amanda Cercas Curry and Verena Rieser. 2018. # MeToo Alexa: How Conversational Systems Respond to Sexual Harassment. In Proceedings of the Second
ACL Workshop on Ethics in Natural Language Processing. 7–14.
Nils Dahlbäck, Arne Jönsson, and Lars Ahrenberg. 1993. Wizard of Oz studies–why and how. Knowledge-based systems 6, 4 (1993), 258–266.
Robert Dale. 2016. The return of the chatbots. Natural Language Engineering 22, 5 (2016), 811–817.
Antonella De Angeli. 2005. To the rescue of a lost identity: Social perception in human-chatterbot interaction. In Virtual Agents Symposium. 7–14.
Antonella De Angeli and Sheryl Brahnam. 2006. Sex stereotypes and conversational agents. Proc. of Gender and Interaction: real and virtual women in a
male world (2006).
Antonella De Angeli, Graham I Johnson, and Lynne Coventry. 2001a. The unfriendly user: exploring social reactions to chatterbots. In Proc. of the CAHD.
467–474.
Antonella De Angeli, Paula Lynch, and Graham Johnson. 2001b. Personifying the e-Market: A Framework for Social Agents.. In Interact. 198–205.
Maíra Gatti de Bayser, Paulo Rodrigo Cavalin, Renan Souza, Alan Braz, Heloisa Candello, Claudio S. Pinhanez, and Jean-Pierre Briot. 2017. A Hybrid
Architecture for Multi-Party Conversational Systems. CoRR arXiv/1705.01214 (2017).
Clarisse S De Souza, Raquel O Prates, and Simone DJ Barbosa. 1999. A method for evaluating software communicability. PUC-RioInf 1200 (1999), 11–99.
OV Deryugina. 2010. Chatterbots. Scientific and Technical Information Processing 37, 2 (2010), 143–147.

Manuscript submitted to ACM

How should my chatbot interact? 33

Aditya Deshpande, Alisha Shahane, Darshana Gadre, Mrunmayi Deshpande, and Prachi M Joshi. 2017. A survey of various chatbot implementation
techniques. International Journal of Computer Engineering and Applications, Special Issue XI (May 2017).
Kohji Dohsaka, Ryota Asai, Ryuichiro Higashinaka, Yasuhiro Minami, and Eisaku Maeda. 2014. Effects of conversational agents on activation of
communication in thought-evoking multi-party dialogues. IEICE TRANSACTIONS on Information and Systems 97, 8 (2014), 2147–2156.
Daniëlle Duijst. 2017. Can we Improve the User Experience of Chatbots with Personalisation. Master’s thesis. University of Amsterdam.
Willem Duijvelshoff. 2017. Use-Cases and Ethics of Chatbots on Plek: a Social Intranet for Organizations. In Workshop On Chatbots And Artificial
Intelligence.
Gregory Dyke, Iris Howley, David Adamson, Rohit Kumar, and Carolyn Penstein Rosé. 2013. Towards academically productive talk supported by
conversational agents. In Intelligent Tutoring Systems, Cerri S.A., Clancey W.J., Papadourakis G., and Panourgia K. (Eds.). Springer, 459–476.
Clark Elliott. 1994. Research problems in the use of as allow Artificial Intelligence model of personality and emotion. In AAAI-94 Proc.
Ahmed Fadhil. 2018. Can a Chatbot Determine My Diet?: Addressing Challenges of Chatbot Application for Meal Recommendation. CoRR arXiv:1802.09100
(2018).
Haiyan Fan and Marshall Scott Poole. 2006. What is personalization? Perspectives on the design and implementation of personalization in information
systems. Journal of Organizational Computing and Electronic Commerce 16, 3-4 (2006), 179–202.
Maria Ferman. 2018. Towards Best Practices for Chatbots. Master’s thesis. Universidad Villa Rica.
Emilio Ferrara, Onur Varol, Clayton Davis, Filippo Menczer, and Alessandro Flammini. 2016. The rise of social bots. Commun. ACM 59, 7 (2016), 96–104.
Kathleen Kara Fitzpatrick, Alison Darcy, and Molly Vierhile. 2017. Delivering cognitive behavior therapy to young adults with symptoms of depression
and anxiety using a fully automated conversational agent (Woebot): a randomized controlled trial. JMIR mental health 4, 2 (2017).
Mary Anne Fitzpatrick and Jeff Winke. 1979. You always hurt the one you love: Strategies and tactics in interpersonal conflict. Commun. Q. 27, 1 (1979),
3–11.
B.J. Fogg. 2003. Computers as persuasive social actors. In Persuasive Technology, B.J. Fogg (Ed.). Morgan Kaufmann, San Francisco, Chapter 5, 89 – 120.
Asbjørn Følstad and Petter Bae Brandtzæg. 2017. Chatbots and the new world of HCI. interactions 24, 4 (2017), 38–42.
Jodi Forlizzi, John Zimmerman, Vince Mancuso, and Sonya Kwak. 2007. How interface agents affect interaction between humans and computers. In Proc.
of the Conf. on DPPI. ACM, 209–221.
Ulrich Gnewuch, Stefan Morana, and Alexander Maedche. 2017. Towards designing cooperative and social conversational agents for customer service. In
Proc. of the ICIS.
Lewis R Goldberg. 1990. An alternative" description of personality": the big-five factor structure. J. Pers. Soc. Psychol. 59, 6 (1990), 1216.
Li Gong. 2008. How social is social responses to computers? The function of the degree of anthropomorphism in computer representations. Comput.
Hum. Behav. 24, 4 (2008), 1494–1509.
James J Gross. 1998. The emerging field of emotion regulation: an integrative review. Review of general psychology 2, 3 (1998), 271.
Tovi Grossman, George Fitzmaurice, and Ramtin Attar. 2009. A survey of software learnability: metrics, methodologies and guidelines. In Proc. of the
SIGCHI CHI Conference. ACM, 649–658.
Charlotte N Gunawardena and Frank J Zittle. 1997. Social presence as a predictor of satisfaction within a computer-mediated conferencing environment.
American journal of distance education 11, 3 (1997), 8–26.
Rom Harré, Fathali M Moghaddam, Fathali Moghaddam, et al. 2003. The self and others: Positioning individuals and groups in personal, political, and
cultural contexts. Greenwood Publishing Group.
Yugo Hayashi. 2015. Social Facilitation Effects by Pedagogical Conversational Agent: Lexical Network Analysis in an Online Explanation Task. Proc. of
the IEDMS (2015).
Jennifer Hill, W Randolph Ford, and Ingrid G Farreras. 2015. Real conversations with artificial intelligence: A comparison between human–human online
conversations and human–chatbot conversations. Comput. Hum. Behav. 49 (2015), 245–250.
Kenneth Einar Himma. 2009. Artificial agency, consciousness, and the criteria for moral agency: What properties must an artificial agent have to be a
moral agent? Ethics and Information Technology 11, 1 (2009), 19–29.
Annabell Ho, Jeff Hancock, and Adam S Miner. 2018. Psychological, Relational, and Emotional Effects of Self-Disclosure After Conversations With a
Chatbot. Journal of Communication (2018).
Mohit Jain, Ramachandra Kota, Pratyush Kumar, and Shwetak N Patel. 2018a. Convey: Exploring the Use of a Context View for Chatbots. In Proc. of the
SIGCHI CHI Conference. ACM, 468.
Mohit Jain, Pratyush Kumar, Ramachandra Kota, and Shwetak N Patel. 2018b. Evaluating and Informing the Design of Chatbots. In Proc. of the SIGCHI
DIS. ACM, 895–906.
Marie-Claire Jenkins, Richard Churchill, Stephen Cox, and Dan Smith. 2007. Analysis of user interaction with service oriented chatbot systems. In Int.
Conf. on Hum. Comput. Interact. Springer, 76–83.
Ridong Jiang and Rafael E Banchs. 2017. Towards Improving the Performance of Chat Oriented Dialogue System. In Proc. of the IALP. IEEE.
Jurek Kirakowski, Anthony Yiu, et al. 2009. Establishing the hallmarks of a convincing chatbot-human dialogue. In Human-Computer Interaction. InTech.
Tomoko Koda. 2003. User reactions to anthropomorphized interfaces. IEICE TRANSACTIONS on Information and Systems 86, 8 (2003), 1369–1377.
Rohit Kumar, Hua Ai, Jack L Beuth, and Carolyn P Rosé. 2010. Socially capable conversational tutors can be effective in collaborative learning situations.
In International Conference on Intelligent Tutoring Systems. Springer, 156–164.
Liliana Laranjo, Adam G Dunn, Huong Ly Tong, Ahmet Baki Kocaballi, Jessica Chen, Rabia Bashir, Didi Surian, Blanca Gallego, Farah Magrabi, Annie YS
Manuscript submitted to ACM
34 Chaves and Gerosa

Lau, et al. 2018. Conversational agents in healthcare: a systematic review. J. Am. Med. Inform. Assoc. 25, 9 (2018), 1248–1258.
Mirosława Lasek and Szymon Jessa. 2013. Chatbots for Customer Service on Hotels’ Websites. Information Systems in Management 2, 2 (2013), 146–158.
SeoYoung Lee and Junho Choi. 2017. Enhancing user experience with conversational agent for movie recommendation: Effects of self-disclosure and
reciprocity. Int J Hum Comput Stud. 103 (2017), 95–105.
Yanran Li, Hui Su, Xiaoyu Shen, Wenjie Li, Ziqiang Cao, and Shuzi Niu. 2017. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. In IJCNLP.
Vera Q Liao, Matthew Davis, Werner Geyer, Michael Muller, and N Sadat Shami. 2016. What can you do?: Studying social-agent orientation and agent
proactive interactions with an agent for employees. In Proc. of the SIGCHI DIS. ACM, 264–275.
Vera Q Liao, Muhammed Masud Hussain, Praveen Chandar, Matthew Davis, Marco Crasso, Dakuo Wang, Michael Muller, Sadat N Shami, and Werner
Geyer. 2018. All Work and no Play? Conversations with a Question-and-Answer Chatbot in the Wild. In Proc. of the SIGCHI CHI Conference, Vol. 13.
Ewa Luger and Abigail Sellen. 2016. Like Having a Really Bad PA: The Gulf between User Expectation and Experience of Conversational Agents. In Proc.
of the SIGCHI CHI Conference. ACM, 5286–5297.
François Mairesse and Marilyn A Walker. 2009. Can Conversational Agents Express Big Five Personality Traits through Language?: Evaluating a
Psychologically-Informed Language Generator. Cambridge & Sheffield, United Kingdom: University of Sheffield.
Mark C Marino. 2014. The racial formation of chatbots. CLCWeb: Comparative Literature and Culture 16, 5 (2014), 13.
Julia Masche and Nguyen-Thinh Le. 2017. A Review of Technologies for Conversational Systems. In Int. Conf. on Computer Science, Applied Mathematics
and Applications. Springer, 212–225.
Irina Maslowski, Delphine Lagarde, and Chloé Clavel. 2017. In-the-wild chatbot corpus: from opinion analysis to interaction problem detection. In
International Conference on Natural Language and Speech Processing.
Daniel Mäurer and Karsten Weihe. 2015. Benjamin Franklin’s decision method is acceptable and helpful with a conversational agent. In Intelligent
Interactive Multimedia Systems and Services. Springer, 109–120.
Robert R McCrae and Paul T Costa Jr. 1987. Validation of the five-factor model of personality across instruments and observers. J. Pers. Soc. Psychol. 52, 1
(1987), 81.
Robert R McCrae and Paul T Costa Jr. 1997. Personality trait structure as a human universal. American psychologist 52, 5 (1997), 509.
Michael M Meany and Tom Clark. 2010. Humour Theory and Conversational Agents: An Application in the Development of Computer-based Agents.
International Journal of the Humanities 8, 5 (2010).
Adam Miner, Amanda Chow, Sarah Adler, Ilia Zaitsev, Paul Tero, Alison Darcy, and Andreas Paepcke. 2016. Conversational Agents and Mental Health:
Theory-Informed Assessment of Language and Affect. In Proc. of the Int. Conf. on HAI. ACM, 123–130.
Thomas William Morris. 2002. Conversational agents for game-like virtual environments. In AAAI Spring Symposium. 82–86.
Kellie Morrissey and Jurek Kirakowski. 2013. ’Realness’ in Chatbots: Establishing Quantifiable Criteria. In Int. Conf. on Hum. Comput. Interact. Springer,
87–96.
Yi Mou and Kun Xu. 2017. The media inequality: Comparing the initial human-human and human-AI social interactions. Comput. Hum. Behav. 72 (2017),
432–440.
Tatsuya Narita and Yasuhiko Kitamura. 2010. Persuasive conversational agent with persuasion tactics. In International Conference on Persuasive Technology.
Springer, 15–26.
Clifford Nass, Jonathan Steuer, Ellen Tauber, and Heidi Reeder. 1993. Anthropomorphism, agency, and ethopoeia: computers as social actors. In Proc. of
the INTERACT ’93 and CHI ’93. ACM, 111–112.
Clifford Nass, Jonathan Steuer, and Ellen R Tauber. 1994. Computers are social actors. In Proc. of the SIGCHI CHI Conference. ACM, 72–78.
Gina Neff and Peter Nagy. 2016. Automation, algorithms, and politics| talking to bots: symbiotic agency and the case of Tay. Int. J. Commun. 10 (2016), 17.
Mario Neururer, Stephan Schlögl, Luisa Brinkschulte, and Aleksander Groth. 2018. Perceptions on Authenticity in Chat Bots. Multimodal Technologies
and Interaction 2, 3 (2018), 60.
Quynh N Nguyen and Anna Sidorova. 2018. Understanding user interactions with a chatbot: a self-determination theory approach. In AMCIS–ERF.
Kristine L Nowak and Christian Rauh. 2005. The influence of the avatar on online perceptions of anthropomorphism, androgyny, credibility, homophily,
and attraction. Journal of Computer-Mediated Communication 11, 1 (2005), 153–178.
Heather L OB́rien and Elaine G Toms. 2008. What is user engagement? A conceptual framework for defining user engagement with technology. Journal
of the American society for Information Science and Technology 59, 6 (2008), 938–955.
Joel Parthemore and Blay Whitby. 2013. What makes any agent a moral agent? Reflections on machine consciousness and moral agency. Int. J. Mach.
Consciousness 5, 02 (2013), 105–129.
Maria João Pereira, Luísa Coheur, Pedro Fialho, and Ricardo Ribeiro. 2016. Chatbots’ Greetings to Human-Computer Communication. CoRR
arXiv:1609.06479 (2016).
Paolo Petta and Robert Trappl. 1997. Why to create personalities for synthetic actors. In Creating Personalities for Synthetic Actors. Springer, 1–8.
Manuel Portela and Carlos Granell-Canut. 2017. A new friend in our smartphone?: observing interactions with chatbots in the search of emotional
engagement. In Proc. of the Int. Conf. on Hum. Comput. Interact. ACM, 48.
Tom Postmes, Russell Spears, Khaled Sakhel, and Daphne De Groot. 2001. Social influence in computer-mediated communication: The effects of anonymity
on group behavior. Personality and Social Psychology Bulletin 27, 10 (2001), 1243–1254.
Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, and Alan W Black. 2018. Style Transfer Through Back-Translation. In Proc. of the 56th ACL.
Raquel O Prates, Clarisse S de Souza, and Simone DJ Barbosa. 2000. Methods and tools: a method for evaluating the communicability of user interfaces.
Manuscript submitted to ACM
How should my chatbot interact? 35

interactions 7, 1 (2000), 31–38.

Michal Ptaszynski, Pawel Dybala, Shinsuke Higuhi, Wenhan Shi, Rafal Rzepka, and Kenji Araki. 2010. Towards Socialized Machines: Emotions and Sense
of Humour in Conversational Agents. In Web Intelligence and Intelligent Agents. InTech.
Nicole M Radziwill and Morgan C Benton. 2017. Evaluating Quality of Chatbots and Intelligent Conversational Agents. CoRR arXiv:1704.04579 (2017).
Kiran Ramesh, Surya Ravishankaran, Abhishek Joshi, and K Chandrasekaran. 2017. A Survey of Design Techniques for Conversational Agents. In
International Conference on Information, Communication and Computing Technology. Springer, 336–350.
Bertram H Raven. 1964. Social influence and power. Technical Report. CALIFORNIA UNIV LOS ANGELES.
Byron Reeves and Clifford Nass. 1996. The Media Equation: How people treat computers, television, and new media like real people and places. CSLI
Publications and Cambridge university press.
Daniel Rousseau and Barbara Hayes-Roth. 1996. Personality in synthetic agents. Technical Report. Citeseer.
Victoria L Rubin, Yimin Chen, and Lynne Marie Thorimbert. 2010. Artificially intelligent conversational agents in libraries. Library Hi Tech 28, 4 (2010),
496–522.
Willibald Ruch. 1998. The sense of humor: Explorations of a personality characteristic. Vol. 3. Walter de Gruyter.
Antti Salovaara and Antti Oulasvirta. 2004. Six modes of proactive resource management: a user-centric typology for proactive behaviors. In Proceedings
of the third Nordic conference on Human-computer interaction. ACM, 57–60.
Peter Salovey and John D Mayer. 1990. Emotional intelligence. Imagination, cognition and personality 9, 3 (1990), 185–211.
Md Shahriare Satu, Md Hasnat Parvez, et al. 2015. Review of integrated applications with AIML based chatbot. In 1st Int. Conf. on ICCIE. IEEE, 87–90.
Ari Schlesinger, Kenton P O’Hara, and Alex S Taylor. 2018. Let’s Talk About Race: Identity, Chatbots, and AI. In Proc. of the SIGCHI CHI Conference. ACM,
315.
Ryan M Schuetzler, G Mark Grimes, and Justin Scott Giboney. 2018. An Investigation of Conversational Agent Relevance, Presence, and Engagement, In
Americas Conference on Information Systems 2018 Proceedings. Americas’ Conference on Information Systems.
Bayan Abu Shawar and Eric Atwell. 2007. Chatbots: are they really useful?. In LDV Forum, Vol. 22. 29–49.
Nicole Shechtman and Leonard M Horowitz. 2003. Media inequality in conversation: how people behave differently when interacting with computers
and people. In Proc. of the SIGCHI CHI Conference. ACM, 281–288.
John Short, Ederyn Williams, and Bruce Christie. 1976. The social psychology of telecommunications. (1976).
Heung-yeung Shum, Xiao-dong He, and Di Li. 2018. From Eliza to XiaoIce: challenges and opportunities with social chatbots. Front. Inf. Technol. Electron.
Eng. 19, 1 (2018), 10–26.
Annika Silvervarg and Arne Jönsson. 2013. Iterative Development and Evaluation of a Social Conversational Agent. In 6th IJCNPL. Japan, 1223–1229.
Björn Sjödén, Annika Silvervarg, Magnus Haake, and Agneta Gulz. 2011. Extending an Educational Math Game with a Pedagogical Conversational Agent:
Facing Design Challenges. In Interdisciplinary Approaches to Adaptive Learning. A Look at the Neighbours. Springer, 116–130.
Jan E Stets and Peter J Burke. 2000. Identity theory and social identity theory. Social psychology quarterly (2000), 224–237.
S Shyam Sundar, Saraswathi Bellur, Jeeyun Oh, Haiyan Jia, and Hyang-Sook Kim. 2016. Theoretical importance of contingency in human-computer
interaction: effects of message interactivity on user engagement. Commun. Res. 43, 5 (2016), 595–625.
Ella Tallyn, Hector Fried, Rory Gianni, Amy Isard, and Chris Speed. 2018. The Ethnobot: Gathering Ethnographies in the Age of IoT. In Proc. of the SIGCHI
CHI Conference. ACM, 604.
Silvia Tamayo-Moreno and Diana Pérez-Marín. 2016. Adapting the design and the use methodology of a Pedagogical Conversational Agent of Secondary
Education to Childhood Education. In Computers in Education (SIIE), 2016 International Symposium on. IEEE, 1–6.
Stergios Tegos, Stavros Demetriadis, and Thrasyvoulos Tsiatsos. 2016. An Investigation of Conversational Agent Interventions Supporting Historical
Reasoning in Primary Education. In Int Conf on ITS. Springer, 260–266.
David Tennenhouse. 2000. Proactive computing. Commun. ACM 43, 5 (2000), 43–50.
Indrani Medhi Thies, Nandita Menon, Sneha Magapu, Manisha Subramony, and Jacki O’neill. 2017. How do you want your chatbot? An exploratory
Wizard-of-Oz study with young, urban Indians. In IFIP Conf. on Hum. Comput. Interact. Springer, 441–459.
Laura Thomson. 2005. A standard framework for web personalization. In 1st Int. Workshop on Innovations in Web Infrastructure - 14th WWW Conf. Japan.
Camilo Thorne. 2017. Chatbots for troubleshooting: A survey. Language and Linguistics Compass (2017).
James A Thorson and FC Powell. 1993. Sense of humor and dimensions of personality. Journal of clinical Psychology 49, 6 (1993), 799–809.
Carlos Toxtli, Justin Cranshaw, et al. 2018. Understanding Chatbot-mediated Task Management. In Proc. of the SIGCHI CHI Conference. ACM, 58.
Chih-Hsiung Tu and Marina McIsaac. 2002. The relationship of social presence and interaction in online classes. The American journal of distance
education 16, 3 (2002), 131–150.
Alan M Turing. 1950. Computing machinery and intelligence. Mind 59, 236 (1950), 433–460.
Francisco AM Valério, Tatiane G Guimarães, Raquel O Prates, and Heloisa Candello. 2017. Here’s What I Can Do: Chatbots’ Strategies to Convey Their
Features to Users. In Proceedings of the XVI Brazilian Symposium on Human Factors in Computing Systems. ACM, 28.
MS Walgama and B Hettige. 2017. Chatbots: The next generation in computer interfacing–A Review. KDU International Research Conference (2017).
Marilyn A Walker. 2009. Endowing Virtual Characters with Expressive Conversational Skills. In Int. Workshop on Intelligent Virtual Agents. Springer, 1–2.
Richard S Wallace. 2009. The anatomy of A.L.I.C.E. In Parsing the Turing Test. Springer, 181–210.
Peter Wallis and Emma Norling. 2005. The Trouble with Chatbots: social skills in a social world. Virtual Social Agents 29 (2005).
Joseph B Walther. 1992. Interpersonal effects in computer-mediated interaction: A relational perspective. Commun. Res. 19, 1 (1992), 52–90.
Manuscript submitted to ACM
36 Chaves and Gerosa

Joseph B Walther. 1996. Computer-mediated communication: Impersonal, interpersonal, and hyperpersonal interaction. Commun. Res. 23, 1 (1996), 3–43.
Joseph B Walther. 2007. Selective self-presentation in computer-mediated communication: Hyperpersonal dimensions of technology, language, and
cognition. Comput. Hum. Behav. 23, 5 (2007), 2538–2557.
Joseph B Walther. 2011. Theories of computer-mediated communication and interpersonal relations (4 ed.). Sage, Thousand Oaks, CA, Chapter 4, 443–479.
Richard J Watts. 2003. Politeness. Cambridge University Press.
Joseph Weizenbaum. 1966. ELIZA-a computer program for the study of natural language communication between man and machine. Commun. ACM 9, 1
(1966), 36–45.
Rainer Winkler and Matthias Söllner. 2018. Unleashing the Potential of Chatbots in Education: A State-Of-The-Art Analysis. (2018).
Jennifer Zamora. 2017. I’m Sorry, Dave, I’m Afraid I Can’t Do That: Chatbot Perception and Expectations. In Proc. of the Int. Conf. on HAI. ACM, 253–260.
Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, and Jason Weston. 2018a. Personalizing Dialogue Agents: I have a dog, do you
have pets too? arXiv preprint arXiv:1801.07243 (2018).
Wei-Nan Zhang, Qingfu Zhu, Yifa Wang, Yanyan Zhao, and Ting Liu. 2018b. Neural personalized response generation as domain adaptation. World Wide
Web (2018), 20.
Marvin Zuckerman, D Michael Kuhlman, Jeffrey Joireman, Paul Teta, and Michael Kraft. 1993. A comparison of three structural models for personality:
The Big Three, the Big Five, and the Alternative Five. J. Pers. Soc. Psychol. 65, 4 (1993), 757.

A SUPPLEMENTAL MATERIAL
This supplementary material include a list of tables that summarize the outcomes presented in the paper. Additionally,
we include insights on five constructs that can be used to assess whether social characteristics are reaching the intended
design goals and leading to the expected benefits.

A.1 Overview of the surveyed literature

Table 2 shows the which interaction type is evaluated in each paper while Table 3 describe the list of topics the chatbots
could handle along with the number of papers is presented in Table 3.
Interaction type Counting (%) Surveyed studies
Task-oriented 34 (59%) [Araujo 2018] [Avula et al. 2018] [Ayedoun et al. 2017] [Brandtzaeg and Følstad 2018] [Candello et al. 2017] [Chaves and
Gerosa 2018] [Ciechanowski et al. 2018] [Coniam 2008] [Dohsaka et al. 2014] [Duijst 2017] [Duijvelshoff 2017] [Dyke et al.
2013] [Fitzpatrick et al. 2017] [Gnewuch et al. 2017] [Hayashi 2015] [Jain et al. 2018a] [Jenkins et al. 2007] [Kumar et al. 2010]
[Lasek and Jessa 2013] [Lee and Choi 2017] [Liao et al. 2016] [Mäurer and Weihe 2015] [Morris 2002] [Narita and Kitamura
2010] [Schuetzler et al. 2018] [Silvervarg and Jönsson 2013] [Sjödén et al. 2011] [Tallyn et al. 2018] [Tamayo-Moreno and
Pérez-Marín 2016] [Tegos et al. 2016] [Toxtli et al. 2018] [Valério et al. 2017] [Wallis and Norling 2005] [Zamora 2017]
General purpose chat 19 (33%) [Brahnam and De Angeli 2012] [Corti and Gillespie 2016] [Curry and Rieser 2018] [De Angeli 2005] [De Angeli and Brahnam
2006] [De Angeli et al. 2001a] [Hill et al. 2015] [Ho et al. 2018] [Kirakowski et al. 2009] [Liao et al. 2018] [Mairesse and Walker
2009] [Marino 2014] [Miner et al. 2016] [Morrissey and Kirakowski 2013] [Portela and Granell-Canut 2017] [Ptaszynski et al.
2010] [Schlesinger et al. 2018] [Shum et al. 2018] [Thies et al. 2017]
Both or not defined 5 (8%) [Banks 2018] [Brandtzaeg and Følstad 2017] [Jain et al. 2018b] [Meany and Clark 2010] [Neururer et al. 2018]
Table 2. Interaction type

# of papers Topics handled by the chatbots

16 Open domain (unrestricted topics)
9 Education
5 Customer services
2 E-commerce, financial services, game, health, information search, race, task management, virtual assistants
1 Business, Credibility assessment interviews, Decision-making coach, Ethnography, Human resources, Humor, Movie recommendation, News, Tourism
Table 3. Conversational topics for chatbots in the surveyed studies

A.2 Chatbots Social Characteristics

In this section, we present some tables to summarize the outcomes of the conceptual model of social characteristics. For
the categories, Tables 4, 8, and 15 depicts an overview of the included studies. For the social characteristics, the tables
shows the studies that address each listed benefit, challenge, and strategies.
Manuscript submitted to ACM
How should my chatbot interact? 37

A.2.1 Conversational Intelligence.

Study Main investigation Interaction Analyzed data Methods Reported social

characteristics
[Liao et al. 2016] Social-agent orientation; Real chatbot Log of conversations; Quantitative; Proactivity
Proactivity Questionnaires; Inter- Qualitative
views
[Avula et al. 2018] Intervention mode WoZ Log of conversations; Quantitative; Proactivity
Questionnaires Qualitative
[Schuetzler et al. 2018] Intervention mode; Users Real chatbot Questionnaires Quantitative Proactivity; Consci-
deceptive behavior entiousness
[Chaves and Gerosa Sequential coherence WoZ Log of conversations; Quantitative; Proactivity
2018] Think aloud; Interviews Qualitative
[Portela and Granell- Emotional engagement Real chatbot; Log of conversations; Quantitative; Proactivity
Canut 2017] WoZ Questionnaires; Inter- Qualitative
views
[Shum et al. 2018] Emotional engagement Real chatbot Log of conversations Qualitative Proactivity
[Jain et al. 2018b] First-time users experience Real chatbot Log of conversations; Quantitative; Proactivity; Con-
Questionnaires; Inter- Qualitative scientiousness;
views Communicability
[Duijvelshoff 2017] Privacy and ethics WoZ Workshop outcomes; Inter- Qualitative Proactivity
views
[Thies et al. 2017] Personality traits WoZ Log of conversations; Fo- Qualitative Proactivity
cus group discussion; Inter-
views;
[Morrissey and Ki- Naturalness Real chatbot Log of conversations; Inter- Quantitative; Proactivity; Consci-
rakowski 2013] views; Questionnaires Qualitative entiousness
[Silvervarg and Jöns- Iterative prototyping Real chatbot Log of conversations Quantitative; Proactivity
son 2013] Qualitative
[Mäurer and Weihe Conversational decision- Real chatbot Log of conversations; Quantitative; Proactivity
2015] making Questionnaires Qualitative
[Dyke et al. 2013] Intervention mode (APT WoZ Log of conversations Quantitative Proactivity; Consci-
moves) entiousness
[Tallyn et al. 2018] Ethnographic data collec- Real chatbot Log of conversations; Inter- Qualitative Proactivity; Consci-
tion views entiousness
[Toxtli et al. 2018] Task management chatbot Real chatbot Log of conversations; Quantitative; Proactivity
design Questionnaires Qualitative
[Fitzpatrick et al. 2017] Conversational mental Real chatbot Questionnaires Quantitative Proactivity
health care
[Hayashi 2015] Intervention mode (APT Real chatbot Log of conversations Quantitative Proactivity
moves)
[Tegos et al. 2016] Intervention mode (APT Real chatbot Log of conversations; Quantitative; Proactivity
moves) Questionnaires Qualitative
[Jain et al. 2018a] Context management Real chatbot Log of conversations; Quantitative; Conscientiousness
Questionnaires; Subjective Qualitative
feedback
[Coniam 2008] Language capabilities Real chatbot Log of conversations Qualitative Conscientiousness
[Ayedoun et al. 2017] Communication strategies Real chatbot Questionnaires Quantitative Conscientiousness
and affective backchannels
[Gnewuch et al. 2017] Chatbots design principles None Literature review Qualitative Conscientiousness;
Communicability
[Brandtzaeg and Users’ motivations None Questionnaires; Subjective Quantitative; Conscientiousness
Følstad 2017] feedback Qualitative
[Duijst 2017] Personalization Real chatbot Questionnaires; Think Quantitative; Conscientiousness;
aloud; Interviews Qualitative Communicability
[Valério et al. 2017] Communicability Real chatbot Semiotic Inspection Qualitative Communicability
[Liao et al. 2018] Playfulness Real chatbot Log of conversations; Quantitative; Communicability
Questionnaires Qualitative
[Lasek and Jessa 2013] Patterns of use of hotel Real chatbot Log of conversations Quantitative; Communicability
chatbots Qualitative
Table 4. Description of the studies that report conversational intelligence

Manuscript submitted to ACM

38 Chaves and Gerosa

[B1] to provide additional information [Morrissey and Kirakowski 2013] [Thies et al. 2017] [Avula et al. 2018]
[B2] to inspire users and to keep the conversation alive [Avula et al. 2018] [Chaves and Gerosa 2018] [Silvervarg and Jönsson 2013] [Tallyn
Benefits et al. 2018] [Schuetzler et al. 2018]
Proactivity [B3] to recover from a failure [Portela and Granell-Canut 2017] [Silvervarg and Jönsson 2013]
[B4] to improve conversation productivity [Avula et al. 2018] [Jain et al. 2018b]
[B5] to guide and engage users [Mäurer and Weihe 2015] [Tallyn et al. 2018] [Dyke et al. 2013] [Hayashi 2015] [Fitz-
patrick et al. 2017] [Toxtli et al. 2018] [Tegos et al. 2016]
[C1] timing and relevance [Portela and Granell-Canut 2017] [Chaves and Gerosa 2018] [Liao et al. 2016] [Silver-
Challenges varg and Jönsson 2013]
[C2] privacy [Duijvelshoff 2017]
[C3] users’ perception of being controlled [Tallyn et al. 2018] [Toxtli et al. 2018]
[S1] leveraging conversational context [Avula et al. 2018] [Chaves and Gerosa 2018] [Shum et al. 2018] [Duijvelshoff 2017]
Strategies
[S2] select a topic randomly [Portela and Granell-Canut 2017]
Table 5. Proactivity social characteristic

[B1] to keep the conversation on track [Brandtzaeg and Følstad 2017] [Duijst 2017] [Jain et al. 2018b] [Ayedoun et al. 2017]
Conscientiousness

Benefits [B2] to demonstrate understanding [Dyke et al. 2013] [Duijst 2017] [Jain et al. 2018b] [Ayedoun et al. 2017] [Gnewuch et al.
2017] [Schuetzler et al. 2018]
[B3] to hold a continuous conversation [Jain et al. 2018b] [Gnewuch et al. 2017] [Coniam 2008] [Morrissey and Kirakowski 2013]
[C1] to handle task complexity [Duijst 2017] [Dyke et al. 2013] [Gnewuch et al. 2017]
Challenges [C2] to harden the conversation [Duijst 2017] [Jain et al. 2018b] [Tallyn et al. 2018]
[C3] to keep the user aware of the chatbot’s context [Jain et al. 2018a] [Jain et al. 2018b] [Gnewuch et al. 2017]
[S1] conversational flow [Duijst 2017] [Ayedoun et al. 2017] [Gnewuch et al. 2017]
Strategies [S2] visual elements [Duijst 2017] [Jain et al. 2018b] [Tallyn et al. 2018]
[S3] confirmation messages [Jain et al. 2018a] [Ayedoun et al. 2017] [Gnewuch et al. 2017] [Duijst 2017]
Table 6. Conscientiousness social characteristic

[B1] to unveil functionalities [Valério et al. 2017] [Jain et al. 2018b] [Liao et al. 2018] [Lasek and Jessa 2013]
Communicability

Benefits
[B2] to manage the users’ expectations [Valério et al. 2017] [Duijst 2017] [Jain et al. 2018b] [Liao et al. 2018]
[C1] to provide business integration [Jain et al. 2018b] [Gnewuch et al. 2017]
Challenges
[C2] to keep visual elements consistent with textual inputs [Valério et al. 2017]
[S1] to clarify the purpose of the chatbot [Valério et al. 2017] [Jain et al. 2018b] [Gnewuch et al. 2017]
[S2] to advertise the functionality and suggest the next step [Valério et al. 2017] [Jain et al. 2018b]
Strategies
[S3] to provide a help functionality [Jain et al. 2018b] [Liao et al. 2018] [Valério et al. 2017]
Table 7. Communicability social characteristic

Manuscript submitted to ACM

How should my chatbot interact? 39

A.2.2 Social Intelligence.

Study Main investigation Interaction Analyzed data Methods Reported social characteris-
tics
[Jain et al. 2018b] First-time users experience Real chatbot Log of conversations; Question- Quantitative; Damage control; Manners; Per-
naires; Interviews Qualitative sonalization
[De Angeli et al. 2001a] Anthropomorphism Real chatbot Log of conversations Qualitative Damage control
[Silvervarg and Jönsson 2013] Iterative prototyping Real chatbot Log of conversations Quantitative; Damage control
Qualitative
[Mäurer and Weihe 2015] Conversational decision- Real chatbot Log of conversations; Question- Quantitative; Damage control; Manners
making naires Qualitative
[Toxtli et al. 2018] Task management chatbot de- Real chatbot Log of conversations; Question- Quantitative; Damage control; Manners; Per-
sign naires Qualitative sonalization
[Gnewuch et al. 2017] Chatbots design principles None Literature review Qualitative Damage control; Thoroughness
[Duijst 2017] Personalization Real chatbot Questionnaires; Think aloud; In- Quantitative; Damage control; Thoroughness;
terviews Qualitative Personalization
[Liao et al. 2018] Playfulness Real chatbot Log of conversations; Question- Quantitative; Damage control; Manners
naires Qualitative
[Lasek and Jessa 2013] Patterns of use of hotel chatbots Real chatbot Log of conversations Quantitative; Damage control; Personalization
Qualitative
[Wallis and Norling 2005] Social intelligence WoZ Log of conversations Qualitative Damage control; Manners; Emo-
tional intelligence
[Jenkins et al. 2007] Users’ expectations and experi- Real chatbot; Log of conversations; Question- Quantitative; Damage control; Thoroughness;
ence WoZ naires; Subjective feedback Qualitative Manners; Emotional intelligence
[Curry and Rieser 2018] Sexual verbal abuse Real chatbot Log of conversations; Quantitative Damage Control
[Morrissey and Kirakowski Naturalness Real chatbot Log of conversations; Interviews; Quantitative; Thoroughness; Manners
2013] Questionnaires Qualitative
[Hill et al. 2015] Communication changes with Real chatbot Log of conversations; Quantitative Thoroughness
human or chatbot partners
[Coniam 2008] Language capabilities Real chatbot Log of conversations Qualitative Thoroughness
[Kirakowski et al. 2009] Naturalness Real chatbot Log of conversations; Interviews; Quantitative; Thoroughness; Manners
Questionnaires Qualitative
[Thies et al. 2017] Personality traits WoZ Log of conversations; Focus Qualitative Thoroughness; Emotional intelli-
group discussion; Interviews; gence; Personalization
[Mairesse and Walker 2009] Expressing personality through None Automatically generated utter- Quantitative Thoroughness
language ances; Questionnaires
[Chaves and Gerosa 2018] Sequential coherence WoZ Log of conversations; Think Quantitative; Thoroughness; Manners
aloud; Interviews Qualitative
[Morris 2002] Believability None Not evaluated Not evalu- Thoroughness; Emotional intelli-
ated gence
[Shum et al. 2018] Emotional engagement Real chatbot Log of conversations Qualitative Moral agency; Emotional intelli-
gence; Personalization
[Marino 2014] Racial stereotypes Real chatbot Log of conversations Qualitative Moral agency
[De Angeli and Brahnam Gender affordances Real chatbot Log of conversations Qualitative Moral agency
2006]
[Banks 2018] Perceived moral agency Video chat- Questionnaires Quantitative Moral gency
bot
[Brahnam and De Angeli Gender affordances Real chatbot Log of conversations; Question- Quantitative; Moral agency
2012] naires Qualitative
[Schlesinger et al. 2018] Race-talk None Literature Qualitative Moral agency
[Kumar et al. 2010] Socially capable chatbot Real chatbot Log of conversations; Question- Quantitative Emotional intelligence; Manners
naires
[Dohsaka et al. 2014] Thought-evoking dialogues Real chatbot Log of conversations; Question- Quantitative Emotional intelligence
naires
[Fitzpatrick et al. 2017] Conversational mental health Real chatbot Questionnaires Quantitative Emotional intelligence
care
[Ayedoun et al. 2017] Communication strategies and Real chatbot Questionnaires Quantitative Emotional intelligence
affective backchannels
[Miner et al. 2016] Mental health care Real chatbot Log of conversations Quantitative Emotional intelligence
[Ho et al. 2018] Self-disclosure WoZ Log of conversations; Question- Quantitative; Emotional intelligence
naires Qualitative
[Portela and Granell-Canut Emotional engagement Real chatbot; Log of conversations; Question- Quantitative; Emotional intelligence; Personal-
2017] WoZ naires; Interviews Qualitative ization
[Tallyn et al. 2018] Ethnographic data collection Real chatbot Log of conversations; Interviews Qualitative Thoroughness; Personalization
[Liao et al. 2016] Social-agent orientation; Proac- Real chatbot Log of conversations; Question- Quantitative; Personalization
tivity naires; Interviews Qualitative
[Duijvelshoff 2017] Privacy and ethics WoZ Workshop outcomes; Interviews; Qualitative Personalization
[Neururer et al. 2018] Authenticity None Interviews Quantitative; Personalization
Qualitative
[Zamora 2017] Users’ expectations and experi- Real chatbot Subjective feedback Qualitative Thoroughness; Emotional intelli-
ences gence; Personalization
Table 8. Description of the studies that report social intelligence

Manuscript submitted to ACM

40 Chaves and Gerosa

[B1] to appropriately respond to harassment [Lasek and Jessa 2013] [Curry and Rieser 2018]
Benefits [B2] to deal with testing [Wallis and Norling 2005] [Silvervarg and Jönsson 2013] [Liao et al. 2018] [Jain et al. 2018b]
[B3] to deal with lack of knowledge [Wallis and Norling 2005] [Jain et al. 2018b] [Toxtli et al. 2018] [Silvervarg and Jönsson

Damage control
2013] [Gnewuch et al. 2017] [Mäurer and Weihe 2015]
[C1] to deal with unfriendly users [Silvervarg and Jönsson 2013] [Mäurer and Weihe 2015] [De Angeli et al. 2001a]
[C2] to identify abusive utterances [Curry and Rieser 2018]
Challenges
[C3] to balance emotional reactions [Wallis and Norling 2005] [Curry and Rieser 2018]
[S1] emotional reactions [Wallis and Norling 2005] [Curry and Rieser 2018] [Silvervarg and Jönsson 2013] [De Angeli
et al. 2001a]
Strategies
[S2] authoritative reactions [Wallis and Norling 2005] [Jenkins et al. 2007] [Toxtli et al. 2018] [Silvervarg and Jönsson
2013]
[S3] to ignore the user’s utterance and [Wallis and Norling 2005] [Silvervarg and Jönsson 2013]
change the topic
[S4] conscientiousness and communicability [Silvervarg and Jönsson 2013] [Wallis and Norling 2005] [Duijst 2017] [Jain et al. 2018b]
[Gnewuch et al. 2017]
[S5] to predict users’ satisfaction [Liao et al. 2018]
Table 9. Damage control social characteristic

[B1] to adapt the language dynam- [Mairesse and Walker 2009] [Duijst 2017] [Thies et al. 2017] [Jenkins et al. 2007] [Gnewuch et al.
Thoroughness

Benefits
ically 2017] [Hill et al. 2015] [Morrissey and Kirakowski 2013]
[B2] to exhibit believable behavior [Jenkins et al. 2007] [Mairesse and Walker 2009] [Morrissey and Kirakowski 2013] [Coniam 2008]
[Morris 2002] [Tallyn et al. 2018]
[C1] to decide on how much to talk [Jenkins et al. 2007] [Zamora 2017] [Gnewuch et al. 2017] [Chaves and Gerosa 2018] [Duijst 2017]
Challenges
[C2] to be consistent [Duijst 2017] [Kirakowski et al. 2009]
Table 10. Thoroughness social characteristic

Benefits [B1] to increase human-likeness [Jenkins et al. 2007] [Morrissey and Kirakowski 2013] [Kirakowski et al. 2009] [Toxtli et al. 2018]
Manners

[C1] to deal with face-threatening acts [Wallis and Norling 2005] [Mäurer and Weihe 2015]
Challenges
[C2] to end a conversation gracefully [Jain et al. 2018b] [Chaves and Gerosa 2018]
[S1] to engage in small talk [Liao et al. 2018] [Jain et al. 2018b] [Kumar et al. 2010]
Strategies
[S2] to adhere turn-taking protocols [Toxtli et al. 2018]
Table 11. Manners social characteristic

[B1] to avoid stereotyping [Marino 2014] [Schlesinger et al. 2018] [Brahnam and De Angeli
Moral agency

Benefits
2012] [De Angeli and Brahnam 2006]
[B2] to enrich interpersonal relationships [Banks 2018] [Shum et al. 2018]
[C1] to avoid alienation [De Angeli and Brahnam 2006] [Schlesinger et al. 2018]
Challenges
[C2] to build unbiased training data and algorithms [Schlesinger et al. 2018] [Shum et al. 2018]
Table 12. Moral agency social characteristic

[B1] to enrich interpersonal relation- [Kumar et al. 2010] [Wallis and Norling 2005] [Dohsaka et al. 2014] [Lee and Choi 2017] [Ho
Benefits ships et al. 2018] [Ayedoun et al. 2017] [Fitzpatrick et al. 2017] [Zamora 2017] [Miner et al. 2016]
Intelligence

[B2] to increase engagement [Dohsaka et al. 2014] [Shum et al. 2018] [Portela and Granell-Canut 2017]
Emotional

[B3] to increase believability [Morris 2002]

Challenges [C1] to regulate affective reactions [Kumar et al. 2010] [Jenkins et al. 2007] [Thies et al. 2017] [Ho et al. 2018]
[S1] to use social-emotional utterances [Kumar et al. 2010] [Ayedoun et al. 2017]
Strategies [S2] to manifest conscientiousness [Shum et al. 2018] [Portela and Granell-Canut 2017]
[S3] reciprocity and self-disclosure [Dohsaka et al. 2014] [Lee and Choi 2017]
Table 13. Emotional intelligence social characteristic

[B1] to enrich interpersonal relation- [Duijvelshoff 2017] [Duijst 2017] [Neururer et al. 2018] [Shum et al. 2018]
Benefits ships [Portela and Granell-Canut 2017]
[B2] to provide unique services [Duijst 2017] [Tallyn et al. 2018] [Toxtli et al. 2018] [Liao et al. 2016] [Thies
Personalization et al. 2017] [Jain et al. 2018b]
[B3] to reduce interactional breakdowns [Lasek and Jessa 2013] [Duijst 2017] [Jenkins et al. 2007]
Challenges [C1] privacy [Duijvelshoff 2017] [Zamora 2017] [Neururer et al. 2018] [Thies et al. 2017]
[S1] to learn from and about the user [Neururer et al. 2018] [Thies et al. 2017] [Shum et al. 2018] [Zamora 2017]
Strategies [S2] to provide customizable agents [Liao et al. 2016] [Duijvelshoff 2017] [Thies et al. 2017]
[S3] visual elements [Tallyn et al. 2018]
Table 14. Personalization social characteristic

Manuscript submitted to ACM

How should my chatbot interact? 41

A.2.3 Personification.

Study Main investigation Interaction Analyzed data Methods Reported social

characteristics
[Jain et al. 2018b] First-time users experience Real chatbot Log of conversations; Question- Quantitative; Personality
naires; Interviews Qualitative
[Silvervarg and Jönsson 2013] Iterative prototyping Real chatbot Log of conversations Quantitative; Identity
Qualitative
[Toxtli et al. 2018] Task management chatbot Real chatbot Log of conversations; Question- Quantitative; Identity
design naires Qualitative
[Gnewuch et al. 2017] Chatbots design principles None Literature review Qualitative Identity
[Liao et al. 2018] Playfulness Real chatbot Log of conversations; Question- Quantitative; Identity
naires Qualitative
[Jenkins et al. 2007] Users’ expectations and ex- Real chat- Log of conversations; Question- Quantitative; Identity
perience bot; WoZ naires; Subjective feedback Qualitative
[Thies et al. 2017] Personality traits WoZ Log of conversations; Focus group Qualitative Personality
discussion; Interviews;
[Mairesse and Walker 2009] Expressing personality None Automatically generated utter- Quantitative Personality
thorough language ances; Questionnaires
[Morris 2002] Believability None Not evaluated Not evaluated Personality
[Shum et al. 2018] Emotional engagement Real chatbot Log of conversations Qualitative Personality
[Marino 2014] Racial stereotypes Real chatbot Log of conversations Qualitative Identity
[De Angeli and Brahnam 2006] Gender affordances Real chatbot Log of conversations Qualitative Identity
[Schlesinger et al. 2018] Race-talk None Literature Qualitative Identity
[Kumar et al. 2010] Socially capable chatbot Real chatbot Log of conversations; Question- Quantitative Personality
naires
[Ayedoun et al. 2017] Communication strategies Real chatbot Questionnaires Quantitative Personality
and affective backchannels
[Portela and Granell-Canut Emotional engagement Real chat- Log of conversations; Question- Quantitative; Personality
2017] bot; WoZ naires; Interviews Qualitative
[Ciechanowski et al. 2018] Uncanny valley Real chatbot Psychophysiological measures; Quantitative Identity
Questionnaires
[Araujo 2018] Anthropomorphic clues Real chatbot Questionnaires Quantitative Identity
and agency framing
[De Angeli 2005] Social perception Real chatbot Log of conversations Qualitative Identity
[De Angeli et al. 2001a] Anthropomorphism Real chatbot Log of conversations Qualitative Identity
[Corti and Gillespie 2016] Anthropomorphism and Real chatbot Log of conversations Quantitative; Identity
Initiation repairs Qualitative
[Brandtzaeg and Følstad 2018] User needs and motivations None None Qualitative Identity
[Candello et al. 2017] Humanness and typefaces None Questionnaires; Think aloud Quantitative; Identity
Qualitative
[Neururer et al. 2018] Authenticity None Interviews Quantitative; Identity
Qualitative
[Ptaszynski et al. 2010] Sense of humor Real chatbot Questionnaires Quantitative Personality
[Meany and Clark 2010] Sense of humor None Literature Qualitative Personality
[Brandtzaeg and Følstad 2017] Users’ motivations None Questionnaires; Subjective feed- Quantitative; Personality
back Qualitative
[Sjödén et al. 2011] Personality preferences WoZ Log of conversations; Focus group Quantitative; Personality
discussion; Questionnaires Qualitative
Table 15. Description of the studies that report personification

[B1] to increase engagement [Araujo 2018] [Silvervarg and Jönsson 2013] [Liao et al. 2018]
Benefits
[B2] to increase human-likeness [Candello et al. 2017] [Araujo 2018]
Identity [C1] to avoid negative stereotypes [De Angeli 2005] [Schlesinger et al. 2018] [Brahnam and De Angeli 2012]
Challenges
[Marino 2014] [De Angeli and Brahnam 2006] [Brahnam and De Angeli 2012]
[De Angeli et al. 2001a] [Jenkins et al. 2007]
[C2] to balance the identity and the technical [Corti and Gillespie 2016] [Ciechanowski et al. 2018] [Gnewuch et al. 2017]
capabilities [Brandtzaeg and Følstad 2018] [De Angeli 2005]
Strategies [S1] to design and elaborate on a persona [Liao et al. 2018] [Neururer et al. 2018] [Toxtli et al. 2018] [Thies et al. 2017]
[Silvervarg and Jönsson 2013] [De Angeli 2005] [De Angeli et al. 2001a]
Table 16. Identity social characteristic

Manuscript submitted to ACM

42 Chaves and Gerosa

[B1] to exhibit believable behavior [Morris 2002] [Mairesse and Walker 2009] [Ptaszynski et al. 2010] [Portela and
Benefits
Granell-Canut 2017]
[B2] to enrich interpersonal relationships [Brandtzaeg and Følstad 2017] [Jain et al. 2018b] [Thies et al. 2017] [Sjödén et al.
Personality
2011] [Shum et al. 2018] [Kumar et al. 2010] [Ayedoun et al. 2017]
[C1] to adapt humor to the users’ culture [Ptaszynski et al. 2010]
Challenges
[C2] to balance the personality traits [Thies et al. 2017] [Mairesse and Walker 2009] [Sjödén et al. 2011]
[S1] to use appropriate language [Shum et al. 2018] [Morris 2002] [Mairesse and Walker 2009] [Jain et al. 2018b]
Strategies
[S2] to have sense of humor [Meany and Clark 2010] [Ptaszynski et al. 2010] [Thies et al. 2017] [Brandtzaeg
and Følstad 2017] [Jain et al. 2018b]
Table 17. Personality social characteristic

A.3 Measurements of social characteristics

The surveyed literature revealed a number of constructs that are used to measure whether the interaction with the
chatbot reaches the intended social goals. In general terms, task-oriented interactions focus on completing the task,
while the general purpose chatbots aim to engage users in general conversations. In both cases, engagement performs
an important role, and therefore is a commonly used metric. However, we also found that social characteristics can be
measured looking at additional constructs, which include interpersonal relationship, social presence, social influence,
and anthropomorphism. In this section, we discuss each of these constructs and the characteristics that can influence
the measurements.
Engagement relates to attracting and holding the user’s attention and interest [OB́rien and Toms 2008]. In the chatbot
domain, engagement can be measured by the number of exchanges per session [Dohsaka et al. 2014; Shum et al. 2018],
although other attributes can manifest users’ engagement, such as emotional connection, attention, the perception of
time, and self- and external awareness [OB́rien and Toms 2008]. In this survey, we found social characteristics in all the
three categories measured in terms of their impact on engagement. Conversational intelligence teaches users how
to interact (communicability), demonstrate attention to the users’ intentions and needs (conscientiousness) [Schuetzler
et al. 2018], and encourage users to continue the conversation [Tegos et al. 2016], even after periods of inactivity
(proactivity) [Fitzpatrick et al. 2017]. Personification makes the interaction more fun and enjoyable (personality) [Jain
et al. 2018b; Ptaszynski et al. 2010; Sjödén et al. 2011], while social intelligence provides emotional connection and
support (emotional intelligence) [Ayedoun et al. 2017; Portela and Granell-Canut 2017; Shum et al. 2018]. In line with the
engagement with technology framework [OB́rien and Toms 2008], usability also came up as influencing engagement,
particularly the ease of use [Jain et al. 2018a,b; Jenkins et al. 2007; Tallyn et al. 2018; Toxtli et al. 2018] and accessibility
[Duijst 2017; Tamayo-Moreno and Pérez-Marín 2016], which can be improved with personalization [Jain et al. 2018b].
For example, [Tamayo-Moreno and Pérez-Marín 2016] show how adapting the visual interface (color scheme, input
mode, background images, and amount of textual information) changes the user’s experience when the interlocutors
are children in early childhood education.
Interpersonal relationship relates to building a social connection with the chatbot that relies on trust, intimacy,
common ground, and reciprocal enjoyment [Lee and Choi 2017]. We found that social intelligent and appropriately
personified chatbots are more likely to build an interpersonal relationship to the user. In [Lee and Choi 2017], [Ho
et al. 2018], and [Dohsaka et al. 2014], the authors showed that emotional intelligence potentially helps with building
trust, intimacy, and enjoyment, which influences the willingness to engage. [Duijvelshoff 2017] and [Duijst 2017]
argue that reducing privacy and security concerns also increases trust. Hence, personalizing the information stored by
the chatbot and transparency result in higher interpersonal relationship and consequently willingness to engage. In
addition, [Banks 2018] showed that perceived moral agency also correlates with higher trustworthiness and goodwill.
Regarding personification, [De Angeli et al. 2001a] argue that improper personality and identity representations may
Manuscript submitted to ACM
How should my chatbot interact? 43

lead to confusing, disempowering, and distracting the users, ultimately raising interpersonal conflicts.
Interpersonal relationship is a consequence of social presence [Gunawardena and Zittle 1997; Short et al. 1976]. In
CMC fields, social presence describes the degree of salience of an interlocutor [Short et al. 1976], in this case, the chatbot,
and how it can project itself as an individual. As a determinant of interpersonal relationship, social presence is also
influenced by intimacy and trust; however, social presence is also assessed as how much the chatbot was considered to
be a “real” person [Ciechanowski et al. 2018], where humanness and believability are influencing factors. In this sense,
personification may drive the creation of social presence, since it increases the perception of anthropomorphic clues
[Ciechanowski et al. 2018; De Angeli 2005]. However, anthropomorphic clues by themselves do not imply social presence.
For example, [Araujo 2018] did not find a main effect on social presence of anthropomorphic clues, such as having a
human name (identity) and language style (thoroughness). On the other hand, they found that framing the chatbot as
“intelligent” slightly increased social presence, and higher social presence resulted in higher emotional connection with
the company represented by the chatbot. Hence, social and conversational intelligence are also required to increase
social presence, most likely due to the potential elevation of the chatbot’s social positioning [Wallis and Norling 2005].
For instance, in [Tallyn et al. 2018], participants who complained about the chatbot’s handcrafted responses expressed
the desire for spontaneous (thoroughness) and somewhat emotional reactions (emotional intelligence) to their inputs,
so the chatbot would be “more like a person.” Participants in both [Jain et al. 2018b] and [Portela and Granell-Canut
2017] related human-likeness to the ability to hold meaningful conversations, which include context preservation
(conscientiousness) and timing (proactivity). In addition, [Schuetzler et al. 2018] showed that increasing the relevance
of the chatbot’s utterance (conscientiousness) increases social presence and perceived humanness. [Morrissey and
Kirakowski 2013] list a number of characteristics that increases chatbot’s believability, including manners, proactivity,
damage control, conscientiousness, and personality. These align with the dimensions of social presence theory in CMC
[Tu and McIsaac 2002].
Anthropomorphism, in its turn, is a process of attributing of human traits to a non-human entity, even when this
attribution is known to be inappropriate [Nass et al. 1993]; for example, referring a chatbot with a personal pronoun
(he/she) rather than “it.” Anthropomorphism can be induced by personification [Araujo 2018; De Angeli 2005; Nass
et al. 1993] since the human traits are explicitly attributed by the designer. Characteristics as manners [Tallyn et al.
2018] and emotional intelligence [Portela and Granell-Canut 2017] were also shown to trigger anthropomorphism [Liao
et al. 2018], although it may depend on the user’s tendency to anthropomorphize [Liao et al. 2016].
Finally, a chatbot’s social influence refers to its capacity to promote changes in the user’s cognition, attitude, or
behavior [Raven 1964], which is sometimes called persuasiveness [Narita and Kitamura 2010]. Although we did not
find studies that focus on formally measuring the social influence of chatbots, the surveyed literature revealed a few
instances of chatbots changing users’ behaviors in particular domains. For example, in health, [Fitzpatrick et al. 2017]
showed that a chatbot with proactivity and emotional intelligence can motivate users to engage in a self-help program for
students who self-identify as experiencing symptoms of anxiety and depression. In education, tutor chatbots proactive
interventions (APT moves) that helped students to increase participation in group discussions [Dyke et al. 2013; Hayashi
2015; Tegos et al. 2016]. In the customer services field, [Araujo 2018] evaluated whether anthropomorphic clues and
framing changes the users’ attitude toward the company being represented by the chatbot; however, they did not find a
significant effect. Although social influence has shown to increase with higher social presence levels in CMC fields (e.g.,
see [Postmes et al. 2001]), the impact of enriching chatbots with social characteristics is still under-investigated.

Manuscript submitted to ACM

44 Chaves and Gerosa

A.4 Related Surveys

Table 18 shows the social characteristics covered by each related survey, where the content in each cell represents how
the paper refers to the social characteristic.

This survey [Pereira et al. 2016] [Ferman 2018] [Radziwill and Benton 2017]
Proactivity - social intelligence, users’ control -
Conscientiousness guiding the users chatbot’s conversational flows, chatbot’s mem- maintain the theme and respond specific questions
through the topics ory, making changes on the fly, conversational
and situational knowledge
Communicability - chatbot’s help, documentation -
Damage control - - damage control
Thoroughness chatbot’s language; user’s recognition and recall appropriate linguistic register/accuracy
Manners handling small talk - -
Moral agency - - respect, inclusion, and preservation of dignity,
ethics and cultural knowledge of users
Emotional intelligence - social intelligence provide emotional information, be warm, adapt to
the human’s mood
Personalization - social intelligence; ethics regarding privacy (data meets neurodiverse needs
retention and transparency)
Identity - - transparent to inspection and discloses its identity
Personality personality personality personality, fun, humor
Table 18. Social characteristics from related surveys

Manuscript submitted to ACM

How Should My Chatbot Interact A Survey On Social Characteristics in Human Chatbot Interaction Design
No ratings yet
How Should My Chatbot Interact A Survey On Social Characteristics in Human Chatbot Interaction Design
31 pages
Preprint Ijofhci Survey
No ratings yet
Preprint Ijofhci Survey
53 pages
Towards User Centric Guidelines For Chatbot Conversational 2gx13g3w
No ratings yet
Towards User Centric Guidelines For Chatbot Conversational 2gx13g3w
26 pages
Elsevier Enhanced Reader
No ratings yet
Elsevier Enhanced Reader
18 pages
Følstad & Brandtzaeg (2020)
No ratings yet
Følstad & Brandtzaeg (2020)
14 pages
I Am in Your Computer While We Talk To Each Other A Content Analysis On The Use of Language-Based Strategies by Humans and A Social Chatbot in Initi
No ratings yet
I Am in Your Computer While We Talk To Each Other A Content Analysis On The Use of Language-Based Strategies by Humans and A Social Chatbot in Initi
20 pages
1 s2.0 S2666827020300062 Main1
No ratings yet
1 s2.0 S2666827020300062 Main1
19 pages
An Overview of Chatbot Technology
No ratings yet
An Overview of Chatbot Technology
11 pages
Seminar Chapter One
No ratings yet
Seminar Chapter One
23 pages
1 s2.0 S2666827020300062 Main
No ratings yet
1 s2.0 S2666827020300062 Main
18 pages
Deciding Whether and How To Deploy Chatbots
No ratings yet
Deciding Whether and How To Deploy Chatbots
16 pages
Understanding The User Experience of Customer Service Chatbots
No ratings yet
Understanding The User Experience of Customer Service Chatbots
16 pages
How To Leverage Anthropomorphism For Chatbot Service Interface-Communication Style
No ratings yet
How To Leverage Anthropomorphism For Chatbot Service Interface-Communication Style
17 pages
Whypeopleusechatbots Preprint
No ratings yet
Whypeopleusechatbots Preprint
19 pages
CHI18 SIG Chatbots+for+social+good Authors+version
No ratings yet
CHI18 SIG Chatbots+for+social+good Authors+version
5 pages
Adamopoulou-Moussiades2020 Chapter AnOverviewOfChatbotTechnology
No ratings yet
Adamopoulou-Moussiades2020 Chapter AnOverviewOfChatbotTechnology
12 pages
Chatbot in A Campus Environment: Design of Lisa, A Virtual Assistant To Help Students in Their University Life
No ratings yet
Chatbot in A Campus Environment: Design of Lisa, A Virtual Assistant To Help Students in Their University Life
14 pages
Chatbots: Changing User Needs and Motivations: Petter Bae Brandtzaeg and Asbjørn Følstad, SINTEF, Norway
No ratings yet
Chatbots: Changing User Needs and Motivations: Petter Bae Brandtzaeg and Asbjørn Følstad, SINTEF, Norway
6 pages
Chatbots: Enhancing User Interaction
No ratings yet
Chatbots: Enhancing User Interaction
5 pages
Chatbots
No ratings yet
Chatbots
18 pages
Chatbot Research and Design
No ratings yet
Chatbot Research and Design
217 pages
Chatbot Survey: Adolescent Stress Relief
No ratings yet
Chatbot Survey: Adolescent Stress Relief
4 pages
New Project Chatbot Wordpad-2
No ratings yet
New Project Chatbot Wordpad-2
41 pages
Intelligent Chatbots: Survey & Future Directions
No ratings yet
Intelligent Chatbots: Survey & Future Directions
11 pages
Master Hu Yuchen 2019
No ratings yet
Master Hu Yuchen 2019
76 pages
Emotionally Intelligent Chatbots A Systematic Lite
No ratings yet
Emotionally Intelligent Chatbots A Systematic Lite
23 pages
Chatbots, Humbots, and The Quest For Artificial General Intelligence
No ratings yet
Chatbots, Humbots, and The Quest For Artificial General Intelligence
11 pages
Evaluating Quality of Chatbots and Intelligent Con
No ratings yet
Evaluating Quality of Chatbots and Intelligent Con
22 pages
Chatbot
No ratings yet
Chatbot
11 pages
Chatbot AI Framework Review
No ratings yet
Chatbot AI Framework Review
6 pages
Can We Be Friends With
No ratings yet
Can We Be Friends With
22 pages
Social Chatbot
No ratings yet
Social Chatbot
21 pages
SurveyonIntelligentChatbots - State of The ArtandFutureResearchDirections
No ratings yet
SurveyonIntelligentChatbots - State of The ArtandFutureResearchDirections
11 pages
Humanizing Chatbots The Effects of Visual, Identity and Conversational Cues
No ratings yet
Humanizing Chatbots The Effects of Visual, Identity and Conversational Cues
13 pages
Measuring User Experience in Chatbots An
No ratings yet
Measuring User Experience in Chatbots An
8 pages
Chatbot For Children Assistance
No ratings yet
Chatbot For Children Assistance
6 pages
WIREs Data Min Knowl - 2021 - Luo - A Critical Review of State of The Art Chatbot Designs and Applications
No ratings yet
WIREs Data Min Knowl - 2021 - Luo - A Critical Review of State of The Art Chatbot Designs and Applications
26 pages
Elis A Virtual Chatbot
No ratings yet
Elis A Virtual Chatbot
4 pages
Establishing Credibility in AI Chatbots The Importance of Customization, Communication Competency and User Satisfaction
No ratings yet
Establishing Credibility in AI Chatbots The Importance of Customization, Communication Competency and User Satisfaction
18 pages
A2IOT2020 Paper 5
No ratings yet
A2IOT2020 Paper 5
6 pages
Chatbot Insights for Developers
No ratings yet
Chatbot Insights for Developers
1 page
From Eliza To XiaoIce Challenges and Opportunities With Social Chatbots
No ratings yet
From Eliza To XiaoIce Challenges and Opportunities With Social Chatbots
20 pages
NLP Chatbots for Industry Experts
No ratings yet
NLP Chatbots for Industry Experts
8 pages
Application of Chatbot at A Higher Education Institution in Republic of Serbia
No ratings yet
Application of Chatbot at A Higher Education Institution in Republic of Serbia
12 pages
Chat Bots Review 2019
No ratings yet
Chat Bots Review 2019
11 pages
Does Chatbot Language Formality Affect Users' Self-Disclosure?
No ratings yet
Does Chatbot Language Formality Affect Users' Self-Disclosure?
13 pages
Report On Chatbot
No ratings yet
Report On Chatbot
16 pages
Chatbot Technology PDF
No ratings yet
Chatbot Technology PDF
18 pages
1 s2.0 S0747563223000250 Main
No ratings yet
1 s2.0 S0747563223000250 Main
11 pages
1 s2.0 S0736585323001351 Main
No ratings yet
1 s2.0 S0736585323001351 Main
12 pages
Triggered by Socialbots Communicative Anthropomorp
No ratings yet
Triggered by Socialbots Communicative Anthropomorp
20 pages
Conversational AI Chatbots
No ratings yet
Conversational AI Chatbots
6 pages
Essay On Chatbots
No ratings yet
Essay On Chatbots
3 pages
Sustainability 15 04012 v2
No ratings yet
Sustainability 15 04012 v2
13 pages
Chatbot Research Annotated Bibliography
No ratings yet
Chatbot Research Annotated Bibliography
28 pages
The User Experience of Chatgpt: Findings From A Questionnaire Study of Early Users
No ratings yet
The User Experience of Chatgpt: Findings From A Questionnaire Study of Early Users
10 pages
"I Am Here To Assist You Today" The Role of Entity
No ratings yet
"I Am Here To Assist You Today" The Role of Entity
26 pages
Chatbot Advances
No ratings yet
Chatbot Advances
24 pages
Service Chatbots Final
No ratings yet
Service Chatbots Final
33 pages
Visioe Rror MSG
No ratings yet
Visioe Rror MSG
24 pages
Equip Sim User Manual
No ratings yet
Equip Sim User Manual
131 pages
Forms Design and Control
100% (1)
Forms Design and Control
50 pages
Mekelle University Ethiopian Institute of Technology-Mekelle Mechanical Engineering Department
No ratings yet
Mekelle University Ethiopian Institute of Technology-Mekelle Mechanical Engineering Department
3 pages
MegaKernel Blog
No ratings yet
MegaKernel Blog
11 pages
Why Web3 Matters - Cdixon
No ratings yet
Why Web3 Matters - Cdixon
5 pages
Afzal Resume
No ratings yet
Afzal Resume
1 page
Biodata Etrio Widodo
No ratings yet
Biodata Etrio Widodo
3 pages
Wizolayer: Whitepaper
No ratings yet
Wizolayer: Whitepaper
16 pages
2010-08-18 Zernik, J: Data Mining of Online Judicial Records of The Networked US Federal Courts, International Journal On Social Media: Monitoring, Measurement, Mining, 1:69-83 (2010)
No ratings yet
2010-08-18 Zernik, J: Data Mining of Online Judicial Records of The Networked US Federal Courts, International Journal On Social Media: Monitoring, Measurement, Mining, 1:69-83 (2010)
13 pages
HGS-HSM-SL-21-001 - Improvement of Safety Function For DF Engine
No ratings yet
HGS-HSM-SL-21-001 - Improvement of Safety Function For DF Engine
6 pages
VMware Setup for Pexip Infinity
No ratings yet
VMware Setup for Pexip Infinity
23 pages
Week 3
No ratings yet
Week 3
3 pages
Interactive Cyber Security Career Roadmap
100% (1)
Interactive Cyber Security Career Roadmap
22 pages
Campus Recruitment System Is A Web Application Software
No ratings yet
Campus Recruitment System Is A Web Application Software
57 pages
MCQ Test On Ms Word
No ratings yet
MCQ Test On Ms Word
2 pages
PNF Visunet HMI Monitor PDF
No ratings yet
PNF Visunet HMI Monitor PDF
12 pages
C1 Editable End-Of-Year Test
No ratings yet
C1 Editable End-Of-Year Test
9 pages
Computer History Timeline PPTX 1
100% (1)
Computer History Timeline PPTX 1
11 pages
Soal Dan Jawaban Studi Kasus
No ratings yet
Soal Dan Jawaban Studi Kasus
4 pages
F-Secure Admin Guide
No ratings yet
F-Secure Admin Guide
136 pages
Sabella Radostitz Resume
No ratings yet
Sabella Radostitz Resume
2 pages
BYOD On Rise in Asia
No ratings yet
BYOD On Rise in Asia
3 pages
Internship Report
No ratings yet
Internship Report
20 pages
Introduction To Using C# For Graphics and Guis: Learning Objectives
No ratings yet
Introduction To Using C# For Graphics and Guis: Learning Objectives
13 pages
TDD
No ratings yet
TDD
3 pages
Real-Time Systems Course Plan
No ratings yet
Real-Time Systems Course Plan
14 pages
Portainer Documentation: Release 1.22.1
No ratings yet
Portainer Documentation: Release 1.22.1
55 pages
SRIHARI V RESUME Rev
No ratings yet
SRIHARI V RESUME Rev
3 pages
Gamification in Education
No ratings yet
Gamification in Education
8 pages

How Should My Chatbot Interact

Uploaded by

How Should My Chatbot Interact

Uploaded by

How should my chatbot interact?

A survey on human-chatbot interaction design

© 2018 Association for Computing Machinery.

Manuscript submitted to ACM 1

2 OVERVIEW OF THE SURVEYED LITERATURE

3 CHATBOTS SOCIAL CHARACTERISTICS

3.1 Conversational Intelligence

Social Characteristics Benefits Challenges Strategies

[S4] conscientiousness and communicability

Table 1. Conceptual model of chatbots social characteristics

3.1.2 Conscientiousness. Conscientiousness is a chatbot’s capacity to demonstrate attentiveness to the conversation

conscientiousness and personality is highlighted in Section 4.

Manuscript submitted to ACM

provided information, privacy, interactional flexibility, and consistency.

3.2 Social Intelligence

Manuscript submitted to ACM

Manuscript submitted to ACM

4 INTERRELATIONSHIPS AMONG THE CHARACTERISTICS

Fig. 1. Interrelationship among social characteristics

Manuscript submitted to ACM

interactions 7, 1 (2000), 31–38.

A.1 Overview of the surveyed literature

# of papers Topics handled by the chatbots

A.2 Chatbots Social Characteristics

A.2.1 Conversational Intelligence.

Study Main investigation Interaction Analyzed data Methods Reported social

Manuscript submitted to ACM

Manuscript submitted to ACM

A.2.2 Social Intelligence.

Manuscript submitted to ACM

[B3] to increase believability [Morris 2002]

Manuscript submitted to ACM

Study Main investigation Interaction Analyzed data Methods Reported social

Manuscript submitted to ACM

A.3 Measurements of social characteristics

Manuscript submitted to ACM

A.4 Related Surveys

Manuscript submitted to ACM

You might also like