ASR (Automatic Speech Recognition)

Hritwiza Gupta, Vandana Dubey, Gargi Paliwal
Computer Science and Engineering
Mody University of Science and Technology
Laxmangarh, Sikar, Rajasthan
[email protected] [email protected] [email protected]

Abstract— Automatic Speech Recognition (ASR) technology has become an essential component in enabling seamless human-computer interaction across various applications, including voice-activated systems, virtual assistants, and transcription services. This ASR tool aims to accurately convert spoken language into text, providing a high degree of accuracy and adaptability to diverse accents, languages, and environmental conditions. Leveraging advanced machine learning models and deep neural networks, the tool can learn and improve over time, ensuring robust performance across different audio qualities and linguistic nuances. This ASR system not only supports natural language understanding (NLU) but also integrates well with text-to-speech (TTS) modules to enable bidirectional conversational AI. Its modular design allows easy integration with mobile and web applications, making it suitable for applications in customer service, accessibility, and real-time communications. By optimizing both speed and accuracy, the ASR tool represents a significant advancement in enhancing user experiences through voice-based interactions. The authors therefore hope that this work will be a contribution to the area of speech recognition. The objective of this review paper is to summarize and compare some of the well-known methods used in the various stages of a speech recognition system and to identify research topics and applications at the forefront of this exciting and challenging field.

Keywords— Speech Recognition, Automatic Speech Recognition, ASR Systematic Review, ASR Challenges

I. INTRODUCTION

The main goal of the speech recognition area is to develop techniques and systems for speech input to machines. Speech is the primary means of communication between humans. For reasons ranging from technological curiosity about the mechanisms for the mechanical realization of human speech capabilities, to the desire to automate simple tasks that necessitate human-machine interaction, research in automatic speech recognition by machines has attracted a great deal of attention for sixty years [76]. Based on major advances in the statistical modeling of speech, automatic speech recognition systems today find widespread application in tasks that require a human-machine interface, such as automatic call processing in telephone networks and query-based information systems that provide updated travel information, stock price quotations, and weather reports, as well as data entry, voice dictation, access to information (travel, banking), commands, avionics, automobile portals, speech transcription, aids for handicapped (e.g., blind) people, supermarkets, railway reservations, etc. Speech recognition technology has increasingly been used within telephone networks to automate as well as to enhance operator services. This paper reviews major highlights of the last six decades of research and development in automatic speech recognition, so as to provide a technological perspective. Although much technological progress has been made, there still remain many research issues that need to be tackled.
1.1 Definition of speech recognition:
Speech recognition (also known as Automatic Speech Recognition (ASR) or computer speech recognition) is the process of converting a speech signal into a sequence of words by means of an algorithm implemented as a computer program.
1.2 Basic Model of Speech Recognition:
Research in speech processing and communication was, for the most part, motivated by people's desire to build mechanical models to emulate human verbal communication capabilities. Speech is the most natural form of human communication, and speech processing has been one of the most exciting areas of signal processing. Speech recognition technology has made it possible for computers to follow human voice commands and understand human languages. Fig. 1 shows a mathematical representation of a speech recognition system in simple equations; it contains a front-end unit, a model unit, a language model unit, and a search unit. The recognition process is shown below (Fig. 1).

Fig. 1: Basic model of speech recognition
The standard approach to large vocabulary continuous speech recognition is to assume a simple probabilistic model of speech production whereby a specified word sequence, W, produces an acoustic observation sequence Y, with probability P(W, Y). The goal is then to decode the word string, based on the acoustic observation sequence, so that the decoded string has the maximum a posteriori (MAP) probability.
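The equations this passage refers to as (1)–(3) do not survive in the extracted text; a standard reconstruction, consistent with the surrounding discussion (writing A for the acoustic observation sequence and W for the word string), is:

\hat{W} = \arg\max_{W} P(W \mid A) \quad (1)

P(W \mid A) = \frac{P(A \mid W)\, P(W)}{P(A)} \quad (2)

\hat{W} = \arg\max_{W} P(A \mid W)\, P(W) \quad (3)

Since P(A) does not depend on W, maximizing the posterior in (2) is equivalent to maximizing its numerator, which gives (3).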
The first term in equation (3), P(A|W), is generally called the acoustic model, as it estimates the probability of a sequence of acoustic observations conditioned on the word string; hence P(A|W) must be computed. For large vocabulary speech recognition systems, it is necessary to build statistical models for sub-word speech units, build up word models from these sub-word unit models (using a lexicon to describe the composition of words), and then postulate word sequences and evaluate the acoustic model probabilities via standard concatenation methods. The second term in equation (3), P(W), is called the language model. It describes the probability associated with a postulated sequence of words. Such language models can incorporate both the syntactic and semantic constraints of the language and the recognition task.
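To make this decomposition concrete, the following minimal Python sketch (the candidate strings, probabilities, and function name are invented for illustration, not taken from any cited system) performs the MAP decoding of equation (3) in log space over a toy candidate set:

```python
import math

# Toy log-probability tables; a real system derives these from
# trained acoustic models (e.g., HMMs) and an n-gram language model.
ACOUSTIC_LOGP = {            # log P(A | W) for a fixed observation A
    "recognize speech": math.log(0.020),
    "wreck a nice beach": math.log(0.030),
}
LANGUAGE_LOGP = {            # log P(W)
    "recognize speech": math.log(0.0100),
    "wreck a nice beach": math.log(0.0001),
}

def map_decode(candidates):
    """Return the word string W maximizing log P(A|W) + log P(W),
    i.e., equation (3) evaluated in log space."""
    return max(candidates,
               key=lambda w: ACOUSTIC_LOGP[w] + LANGUAGE_LOGP[w])

print(map_decode(ACOUSTIC_LOGP.keys()))  # -> "recognize speech"
```

Here the language model overrides an acoustically tempting but linguistically implausible hypothesis, which is exactly the role P(W) plays in equation (3).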
1.3 Types of Speech Recognition
Speech recognition systems can be separated into several different classes according to the types of utterances they are able to recognize. These classes are the following:
Isolated Words:
Isolated word recognizers usually require each utterance to have quiet (a lack of audio signal) on both sides of the sample window. They accept a single word or single utterance at a time. These systems have "Listen/Not-Listen" states, where they require the speaker to wait between utterances (usually doing processing during the pauses). "Isolated utterance" might be a better name for this class.

Connected Words:
Connected word systems (or, more correctly, connected utterances) are similar to isolated-word systems, but they allow separate utterances to be run together with a minimal pause between them.
Continuous Speech:
Continuous speech recognizers allow users to speak almost naturally, while the computer determines the content (basically, it is computer dictation). Recognizers with continuous speech capabilities are among the most difficult to create, because they must use special methods to determine utterance boundaries.
Spontaneous Speech:
At a basic level, spontaneous speech can be thought of as speech that is natural-sounding and not rehearsed. An ASR system with spontaneous speech ability should be able to handle a variety of natural speech features, such as words being run together, "ums" and "ahs", and even slight stutters.
1.4 Automatic Speech Recognition system classification:
The following tree structure emphasizes the speech processing applications. Depending on the chosen criterion, Automatic Speech Recognition systems can be classified as shown in figure 2.

1.5 Relevant issues of ASR design:
The main issues on which recognition accuracy depends are presented in Table 1.

Table 1: Relevant issues of ASR design

2. Approaches to speech recognition:
Basically, there exist three approaches to speech recognition: the acoustic phonetic approach, the pattern recognition approach, and the artificial intelligence approach.

2.1 Acoustic phonetic approach:
The earliest approaches to speech recognition were based on finding speech sounds and providing appropriate labels to these sounds. This is the basis of the acoustic phonetic approach (Hemdal and Hughes, 1967), which postulates that there exist finite, distinctive phonetic units (phonemes) in spoken language and that these units are broadly characterized by a set of acoustic properties that are manifested in the speech signal over time. Even though the acoustic properties of phonetic units are highly variable, both across speakers and with neighboring sounds (the so-called co-articulation effect), the acoustic-phonetic approach assumes that the rules governing this variability are straightforward and can be readily learned by a machine. The first step in the acoustic phonetic approach is a spectral analysis of the speech, combined with feature detection that converts the spectral measurements into a set of features describing the broad acoustic properties of the different phonetic units. The next step is a segmentation and labeling phase, in which the speech signal is segmented into stable acoustic regions and one or more phonetic labels are attached to each segmented region, resulting in a phoneme-lattice characterization of the speech. The last step attempts to determine a valid word (or string of words) from the phonetic label sequences produced by the segmentation and labeling. In the validation process, linguistic constraints on the task (i.e., the vocabulary, the syntax, and other semantic rules) are invoked in order to access the lexicon for word decoding based on the phoneme lattice. The acoustic phonetic approach has not been widely used in most commercial applications ([76], refer to Fig. 2.32, p. 81). Table 2 below broadly lists the different speech recognition techniques.

Table 2: Speech Recognition Techniques
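As a concrete illustration of the final lexical-decoding step just described, the short Python sketch below checks words from a toy lexicon against a phoneme lattice, one set of candidate labels per segmented region; the lattice, lexicon, and matching rule are invented for this example rather than drawn from any cited system:

```python
# Toy phoneme lattice: one set of candidate phonetic labels per
# segmented acoustic region (as produced by segmentation/labeling).
lattice = [{"k", "g"}, {"ae", "eh"}, {"t", "d"}]

# Toy lexicon: word -> phoneme sequence.
lexicon = {
    "cat": ["k", "ae", "t"],
    "cad": ["k", "ae", "d"],
    "get": ["g", "eh", "t"],
    "dog": ["d", "ao", "g"],
}

def matches(lattice, phonemes):
    """A word is a valid decoding if it has one phoneme per region
    and each phoneme appears among that region's candidate labels."""
    return (len(phonemes) == len(lattice) and
            all(p in region for p, region in zip(phonemes, lattice)))

valid = [w for w, ph in lexicon.items() if matches(lattice, ph)]
print(valid)  # -> ['cat', 'cad', 'get'] ("dog" fails the lattice)
```

In a real recognizer, the surviving candidates would then be ranked or pruned using the syntactic and semantic constraints mentioned above.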
3. Additional features

3.1 Conversational AI

A chatbot is a piece of software that supports the natural development of a conversation with a user. AI has become increasingly sophisticated as information technology and communication have progressed. Artificial intelligence systems mirror human behaviors such as making decisions on the spot, executing routine jobs, responding to users quickly, and answering questions; e-business, entertainment, virtual assistance, and other electronic domains abound. In this generation, everything is becoming more and more connected to the internet, and chatbots are a very good way to manage and benefit from everything that is just outside your door. At runtime, chatbots have a very limited knowledge base and no way of keeping track of all conversations. Chatbots employ machine learning to help the AI understand user queries and provide an appropriate response to the user. For conversing or engaging with the user, they are created using the Artificial Intelligence Markup Language (AIML). "Answering engines" is another name for chatbots. Because the knowledge has already been programmed in advance, such an application works in a very straightforward manner. Pattern matching, natural language processing, and data mining are some of the methodologies used in these applications: the chatbot compares the client's input sentence to existing patterns in its knowledge base (a minimal sketch of this pattern-matching behavior follows Section 3.2 below). Customer satisfaction with an organization's services is often considered the key to an organization's success and long-term competitiveness; however, customers are often unaware of the details surrounding what their cover includes, whether the cover includes family or traveling companions, how the cover is activated, and whom to call when they need it. Insurance professionals also face cumbersome processes: getting all of the data they require is a difficult task, and to uncover an answer, insurance staff had to go through a mountain of paperwork [6]. As a result, the only way to get immediate assistance was to call underwriting or sales support, even for simple "how-to" questions or answers to FAQs [8]. This system is overburdened: call centers have long wait times, and as a result customers are disappointed and unsatisfied with their interactions, lowering throughput and business performance significantly. According to research, approximately 75 percent of clients have had a terrible customer service experience [2]–[4].

Fig. 3: Classification of chatbot models

One of the first recognised chatbots was ELIZA, a computer program built at the MIT AI Laboratory in 1964. To comprehend the intricacies of human language, ELIZA uses a technology known as natural language processing. It recognised key tags (essential phrases) and was able to answer some basic decision-tree problems [1]. The creation of ELIZA signalled the start of the first generation of conversation bots. Organizations like MSN and AOL started using this technology in automated phone systems that used extremely crude decision trees in the late twentieth century. ELIZA was quickly followed by PARRY, a far more sophisticated bot. Kenneth Colby, a psychiatrist, came up with PARRY, which was able to mimic the actions of a person suffering from paranoid schizophrenia. By the late 1970s, a growing number of bots had been developed to replace the previous generation of bots.

3.2 Applications

• A conversational enquiry chatbot helps customers get to the right source of information.

• Not only our chatbot but any chatbot can provide them with an instant as well as accurate response.

• An AI-based chatbot system can be used by colleges and businesses.
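The following minimal Python sketch illustrates the pattern-matching behavior described in Section 3.1, comparing a user's sentence against stored patterns in a small knowledge base; the patterns and canned responses are invented placeholders, not drawn from any chatbot cited above:

```python
import re

# Toy knowledge base: regex pattern -> canned response.
KNOWLEDGE_BASE = [
    (re.compile(r"\b(hello|hi|hey)\b", re.I),
     "Hello! How can I help you today?"),
    (re.compile(r"\bopening hours\b", re.I),
     "We are open 9am-5pm, Monday to Friday."),
    (re.compile(r"\b(insurance|cover|policy)\b", re.I),
     "Your policy documents list what your cover includes."),
]

FALLBACK = "Sorry, I don't understand. Could you rephrase that?"

def reply(user_sentence: str) -> str:
    """Return the response of the first matching pattern, mirroring
    the compare-against-existing-patterns step described above."""
    for pattern, response in KNOWLEDGE_BASE:
        if pattern.search(user_sentence):
            return response
    return FALLBACK

print(reply("Hi there!"))                      # greeting pattern
print(reply("Does my cover include family?"))  # insurance pattern
```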
3.3 Architecture of Chatbot
The architecture describes the working of a chatbot, starting from the user's request and ending with the bot's response (Fig. 4). The chatbot's background process begins with the user's request, for example "What is PTSD?", sent to the bot deployed on a messenger app such as Facebook, Telegram, WhatsApp, a website, or Slack, or to a device that takes speech as input, such as Google Assistant, Amazon Alexa, or the Amazon Echo Dot. After receiving the user's request, the Natural Language Understanding (NLU) component analyzes it, maps it to the user's intention, and consequently gathers further related information (intent: "translate", entities: [word: "PTSD"]).
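Concretely, the NLU component's output for such a request is a structured record of intent, entities, and (typically) a confidence score. The sketch below shows one plausible shape for the "What is PTSD?" example; the field names and confidence value are illustrative, not the schema of any particular NLU engine:

```python
# Illustrative NLU result for the request "What is PTSD?"
nlu_result = {
    "text": "What is PTSD?",
    "intent": {"name": "translate", "confidence": 0.92},
    "entities": [
        {"entity": "word", "value": "PTSD", "start": 8, "end": 12},
    ],
}

# Downstream fulfillment logic branches on the recognized intent.
if nlu_result["intent"]["name"] == "translate":
    word = nlu_result["entities"][0]["value"]
    print(f"Looking up the meaning of '{word}'...")
```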
Fig. 4: Types of chatbot models
Fig. 5: Flowchart of query solvers
Fig. 6: Designing a chatbot
Fig. 7: Architecture
Table 3: Techniques and models used in some chatbots
3.4 Common Terminologies Used in Chatbot

Since Dialogflow Essentials, IBM Watson, Amazon Lex, ManyChat, etc. provide training of ML algorithms, this section addresses how user intents, entities, and fulfillment are used to build and train a bot. Figure 6 shows the common terminologies used in a chatbot. Intents are potential user statements that can trigger the user's purpose [12]. When end users connect with a bot, they intend to use the bot to obtain the information they want. Suppose an end user asks the bot to "Book a movie ticket"; if this conversation happens at a theatre, we can understand that the customer wants to book a movie ticket. For the bot to understand the same, the designer uses an intent to identify what the user is requesting. As a result, "Book a movie ticket" could be named the "book_movie" intent. Intents are the aims, purposes, goals, and motives of the users interacting with the bot application or web service. A user's intention is categorized into two parts [13]: the first is seeking something, for instance a patron's purpose of finding information about train tickets, or seeking the weather conditions in Toronto for next week [13]; the second is taking action, such as booking a table at a restaurant or booking movie tickets. Entities are modifiers to intents, which are used to add knowledge or information to an intent. The bot finds exact matches between the training phrases and the user input [12]. Suppose two phrases of user input, "Book a movie ticket" or "Book a flight ticket": here the intent is "book", while "movie" or "flight" acts as a modifier and hence as an entity. Designing chatbot entities is equally crucial for populating the database for each intent [13]; if this is not done, the bot fails miserably, because it cannot give the required information after identifying the user's intention. A minimal sketch of intent and entity matching follows below.
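The sketch below (referenced at the end of Section 3.4) illustrates the intent-plus-entity idea with the "book_movie"/"book_flight" example; the keyword rules are invented for illustration and are far simpler than the trained matchers in Dialogflow, Watson, or Lex:

```python
# Toy intent/entity extraction for "book" requests.
ENTITY_KEYWORDS = {"movie": "book_movie", "flight": "book_flight"}

def classify(utterance: str):
    """Return (intent, entity) for a booking request, or a fallback.
    The shared verb "book" triggers the intent family; the entity
    ("movie" or "flight") selects the specific intent."""
    words = utterance.lower().split()
    if "book" in words:
        for entity, intent in ENTITY_KEYWORDS.items():
            if entity in words:
                return intent, entity
        return "book_unknown", None
    return "fallback", None

print(classify("Book a movie ticket"))   # ('book_movie', 'movie')
print(classify("Book a flight ticket"))  # ('book_flight', 'flight')
```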
I. System Overview

1. ASR Tool:
   o Converts spoken input into text using advanced AI and Machine Learning algorithms.
   o Acts as the entry point for voice commands, translating human speech into machine-readable language.

2. Chatbot:
   o Processes the text generated by the ASR tool using Natural Language Processing (NLP).
   o Understands the intent of the query or command and formulates an appropriate response.

3. Output Speaker:
   o Converts the chatbot's textual responses back into speech using Text-to-Speech (TTS) technology.
   o Provides auditory feedback to the user, completing the conversational loop.

An Automatic Speech Recognition (ASR) tool integrated with a chatbot and output speaker creates an advanced voice-based interactive system.

• The ASR tool converts spoken language into text, enabling the chatbot to process and understand user inputs using Natural Language Processing (NLP).

• The chatbot generates contextually relevant responses, which are then converted into speech using Text-to-Speech (TTS) technology and delivered via the output speaker.

This system enables natural, hands-free, and real-time conversations, making technology more accessible and intuitive. It has diverse applications in smart assistants, education, healthcare, customer support, and even specialized fields like CAD systems, simplifying complex tasks through conversational AI. A minimal end-to-end sketch of this pipeline is given below.
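As one possible concrete realization of the ASR → chatbot → speaker loop referenced above, the sketch below wires together the open-source SpeechRecognition and pyttsx3 Python packages; the libraries are real, but the `respond` function is an invented placeholder for a proper NLU/dialog engine, and microphone capture additionally requires the PyAudio package:

```python
import speech_recognition as sr   # pip install SpeechRecognition
import pyttsx3                    # pip install pyttsx3

def respond(text: str) -> str:
    """Toy chatbot stage: placeholder for a real NLU/dialog engine."""
    if "hello" in text.lower():
        return "Hello! How can I help you?"
    return f"You said: {text}"

def main():
    recognizer = sr.Recognizer()
    tts = pyttsx3.init()
    with sr.Microphone() as source:          # 1. capture audio
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:                                     # 2. ASR: speech -> text
        text = recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        text = ""
    reply = respond(text)                    # 3. chatbot: text -> text
    tts.say(reply)                           # 4. TTS: text -> speech
    tts.runAndWait()

if __name__ == "__main__":
    main()
```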
II. Platform to Build

No-programming platforms are platforms designed so that a developer can build a bot without any programming language, machine learning algorithm, or natural language processing and understanding skills. These platforms are ideal for small-scale projects and simple bots, and bots on these platforms are easy to develop without programming skills or ML, NLP, and NLU expertise. Widespread examples of non-coding platforms are Chatfuel, ManyChat, and Motion.ai [14]. Next come the platforms built by the tech giants for chatbots, since they are recognized as a symbol of the standard. These platforms are robust in nature; they require significant memory, and the learning curve is also significantly steeper. They are commonly used to build complex bots, which involves designing a conversation flow (a flowchart), and the designer has to ensure that the bot never, or only rarely, misunderstands user requests. Commonly used tech-giant platforms are Dialogflow Essentials and Dialogflow CX developed by Google, Wit.ai created by Facebook, LUIS developed by Microsoft, Lex developed by Amazon, and Watson developed by IBM; these are easy to deploy [15] to an application, a website, Telegram, etc.

GOOGLE DIALOGFLOW (Figure 8) allows users to take a new approach to engaging with their product by building a chatbot involving text, speech, or voice conversation in its interfaces; for example, voice recognition technology is deployed in chatbots such as the Amazon Echo Dot. GOOGLE DIALOGFLOW allows its users to connect or deploy bots on an organization's website, mobile applications, Google Assistant, Amazon Alexa, Facebook Messenger, and other popular platforms. Examples of its use include a Medical Consultant System, MedBot [17]; Jamura, a conversational smart home assistant [18]; the development of the chatbot Einstein application as a virtual teacher of physical learning [19]; and the development of a chatbot speech-to-text interface through the Google API [20].

IBM WATSON (Figure 8) has a service, IBM Assistant, that lets designers develop, train, test, and deploy on a web server, application, or device. Chatbots are built to mimic human interactions, such that conversations between the bot and a customer should feel like conversations between two humans. Watson Assistant can search for an answer in a knowledge base, ask for clarification of the question requested, and direct users to a human if the bot cannot solve the user's queries. Chatbots built using IBM Watson include, for instance, a voice-interactive multilingual student assistance system based on IBM Watson [21], an implementation of a chatbot for an ITSM application based on IBM Watson [22], and a smart assistant supporting students and staff living on a campus [23].

RASA NLU (Figure 8) is an open-source NLP library for identifying intents and extracting entities in chatbots. It helps the designer create and write customized NLP for chatbots. In Rasa's conversational framework, the designer deals with two components: Rasa NLU and Rasa Core. Rasa NLU is, so to speak, the ear, taking inputs from the requesting user, and Rasa Core is the brain, making decisions and giving responses to user input [24]. Rasa NLU is not the only library with a set of algorithms to achieve what designers want, but RASA can develop almost all kinds of chatbots that designers imagine and users require from an organization. Chatbots built using RASA NLU include, for example, FLOSS FAQ chatbot project reuse [25] and a self-learning chatbot from user interactions and preferences [26].

MANYCHAT (Figure 8) is a web service that allows the designer to make chatbots, especially for Facebook Messenger. The designer can use this platform for various purposes, like marketing a product and customer care. The key point of this platform is its simplicity of use: ManyChat claims that customers can use the platform to set up a chatbot in about two minutes, free of coding and without having to be an expert in any programming language. It also enables the designer to make even more targeted bot broadcasts by deploying onto the Facebook Messenger system. Chatbots built using ManyChat include, for example: Improve the Security of Social Media Accounts [27] and a chatbot for institutional purposes [28].
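To ground the Dialogflow discussion in code, the sketch below follows the documented quickstart pattern of Google's official google-cloud-dialogflow Python client for sending one text turn to an agent; treat the exact call shapes as indicative, and note that the project ID, session ID, and credentials are placeholders you must supply:

```python
# pip install google-cloud-dialogflow; requires GCP credentials.
from google.cloud import dialogflow

def detect_intent(project_id: str, session_id: str, text: str):
    """Send one text turn to a Dialogflow agent and return its reply."""
    client = dialogflow.SessionsClient()
    session = client.session_path(project_id, session_id)
    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=text, language_code="en-US")
    )
    response = client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    result = response.query_result
    return result.intent.display_name, result.fulfillment_text

# Example (placeholder project id):
# print(detect_intent("my-gcp-project", "session-1", "Book a movie ticket"))
```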
III. Applications

• Smart Assistance:
   o Personal assistants for smart homes and devices (e.g., controlling appliances, setting reminders).

• Education:
   o Voice-based tutors that explain concepts, answer questions, and provide personalized feedback.

• Healthcare:
   o Virtual assistants to schedule appointments, provide medical advice, or guide patients through procedures.

• Customer Support:
   o Automated agents for handling queries with a natural, human-like interaction.

• Engineering and CAD:
   o Guides users through design processes and troubleshooting with step-by-step voice instructions.

IV. Advantages

• Accessibility: Makes technology user-friendly for people with physical disabilities or low technical expertise.

• Real-Time Interaction: Immediate processing and response facilitate seamless communication.

• Hands-Free Operation: Enables users to perform tasks while interacting, particularly useful in professional and industrial settings.

• Multilingual Support: Supports diverse languages and accents, expanding its reach.

4. Future Scope

The future scope of an Automatic Speech Recognition (ASR) tool integrated with a chatbot and an output speaker is vast and can cater to numerous industries and user experiences. Here's an analysis of its potential:

1. Enhancing Human-Computer Interaction

• Seamless Communication: Voice-based interaction makes technology more accessible and natural for users, especially those less familiar with typing or digital interfaces.

• Real-Time Response: Integrating ASR with a chatbot allows for immediate responses, creating a fluid conversation that feels human-like.

• Multilingual Support: Advanced ASR systems can handle multiple languages and accents, expanding the tool's usability globally.
2. Accessibility Solutions

• Inclusivity for the Differently Abled: Voice interfaces can empower individuals with physical disabilities, enabling hands-free interaction.

• Speech-to-Speech Translations: Combining ASR with Natural Language Processing (NLP) can facilitate real-time translations, breaking language barriers.

3. Personal Assistants

• Smart Devices: Integration into smart homes for controlling appliances through conversational commands.

• Wearables: Enhanced functionality in devices like smartwatches or earbuds with conversational AI capabilities.

4. Customer Support Automation

• Call Centers: Automating call responses with human-like conversations reduces dependency on human agents.

• 24/7 Availability: Always-on service enhances customer experience and operational efficiency.

5. Healthcare Applications

• Telemedicine: Real-time conversations between patients and medical chatbots.

• Medical Data Entry: Automating the transcription of doctors' notes or patient conversations.

6. Education and Learning

• Personal Tutors: Chatbots with ASR can serve as voice-interactive tutors.

• Language Learning: Real-time pronunciation feedback and conversational practice for students.
7. Automotive Integration

• Voice-Driven Commands: ASR-powered chatbots in vehicles for navigation, music selection, and vehicle status updates.

• Safety Features: Hands-free interaction reduces distractions while driving.

8. Industry-Specific Applications

• Retail: Personalized shopping assistants that recognize and respond to voice commands.

• Finance: AI assistants to guide users through banking transactions or investment options using voice.

9. Emotional Intelligence in AI

• Voice Tone Analysis: Future ASR tools can integrate emotion detection to modify chatbot responses based on the user's mood, enhancing empathy and engagement.

10. Voice Biometrics and Security

• Authentication: Voice-based login for secure access.

• Fraud Detection: Identifying discrepancies in speech patterns for fraud prevention.

II. Technical Advancements in Future ASR Tools

1. Noise Robustness: Improved accuracy in noisy environments.

2. Low-Latency Processing: Real-time interactions with minimal delay.

3. Energy Efficiency: Optimized for low-power devices.

4. Personalized Speech Models: Adapting to individual user habits, preferences, and accents.

III. Challenges and Opportunities

• Challenges: Privacy concerns, data security, handling of regional accents, and latency in real-time systems.

• Opportunities: Rising demand for voice-first applications, advancements in edge computing, and innovations in deep learning models for speech processing.
5. Result

The integration of an Automatic Speech Recognition (ASR) tool, chatbot, and output speaker yields the following results:

1. Enhanced Accessibility:
Facilitates voice-based interaction, enabling hands-free use.
Increases inclusivity for individuals with physical disabilities or non-technical backgrounds.

2. Improved User Experience:
Provides real-time, context-aware responses through auditory feedback.
Simplifies complex workflows in domains like CAD by allowing natural language commands and conversational assistance.

3. Increased Efficiency:
Reduces reliance on procedural inputs by providing intuitive and alternative solutions to problems.
Streamlines user engagement, minimizing the learning curve for new tools or applications.

4. Broadened Applications:
Effective in industries like engineering, education, healthcare, customer support, and smart devices.
Supports complex systems (e.g., CAD) by addressing knowledge representation challenges and offering innovative, AI-driven solutions.

5. Real-Time Assistance:
Delivers step-by-step guidance and alternative approaches through speech input and output.
Enhances productivity by bridging gaps between technical and non-technical users.

In conclusion, the ASR tool combined with a chatbot and output speaker transforms traditional interaction models into intuitive, voice-enabled systems, significantly improving usability, accessibility, and efficiency across a wide range of applications.
6. Conclusion

The integration of Automatic Speech Recognition (ASR) tools with chatbots and output speakers represents a transformative advancement in human-computer interaction. By enabling voice-based conversational interfaces, this system leverages Artificial Intelligence (AI) and Machine Learning (ML) to enhance accessibility, usability, and efficiency across various domains. Incorporating ASR into chatbots, coupled with an output speaker, builds on the principles outlined for chatbot applications. This combination can serve as a natural extension for Computer-Aided Design (CAD) systems, overcoming the challenges associated with procedural-based knowledge methods. The voice-enabled interaction simplifies user engagement, making complex systems like CAD more accessible to non-technical users.
Through speech recognition and contextual understanding, ASR-enabled chatbots can:

1. Understand Natural Language Commands: Simplifying input for users by reducing dependency on complex commands or technical know-how.

2. Provide Real-Time Assistance: Offering step-by-step guidance or alternative solutions to design challenges, much like a virtual assistant.

3. Enhance Productivity: Minimizing the learning curve for new users by offering intuitive, voice-guided solutions.

4. Expand Accessibility: Making CAD tools more inclusive for users with physical disabilities or limited technical expertise.

When combined with an output speaker, these systems provide auditory feedback, enabling a fully immersive and hands-free interactive experience. This can be particularly beneficial in fields like engineering design, education, healthcare, and customer support, where hands-free, real-time solutions are critical.

In summary, the ASR tool integrated with a chatbot and output speaker exemplifies how voice-interactive systems can revolutionize traditional workflows. It not only enhances efficiency and usability but also expands the scope of chatbot applications into complex domains such as CAD software, addressing challenges in knowledge representation and offering innovative problem-solving approaches.

7. Acknowledgement

We would like to express our heartfelt gratitude to everyone who contributed to the successful completion of this work.

First and foremost, we extend our sincere thanks to our mentors and advisors for their invaluable guidance, insightful suggestions, and constant support throughout this journey. Their expertise and encouragement have been instrumental in shaping our understanding of Automatic Speech Recognition (ASR) tools and their integration with chatbots and output speakers.

We also acknowledge the contributions of researchers, developers, and pioneers in the fields of Artificial Intelligence (AI), Machine Learning (ML), and Natural Language Processing (NLP) whose work laid the foundation for this study.

Finally, we are grateful to our colleagues, friends, and families for their unwavering support and encouragement, which motivated us to complete this project with dedication and enthusiasm.

Thank you all for your contributions and inspiration in advancing this innovative application of technology.

8. REFERENCES

[1] W. S. Elliott, "Computer-Aided Mechanical Engineering: 1958 to 1988," Computer-Aided Design, vol. 21, no. 5, 1989, pp. 275–288.

[2] M. F. Daud et al., "Assessing Mechanical Engineering Undergraduates' Conceptual Knowledge in Three Dimensional Computer-Aided Design (3D CAD)," Procedia - Social and Behavioral Sciences, vol. 56, 2012, pp. 1–11.

[3] R. Shevchuk and Y. Pastukh, "Improve the Security of Social Media Accounts," 2019 9th International Conference on Advanced Computer Information Technologies (ACIT), IEEE, 2019, pp. 439–442.

[4] B. Sonawane, A. Ombase, P. Rajmane, and D. Kamble, "Chatbot for Institutional Purpose," no. 07, 2020, pp. 585–601.

[5] C. Toxtli, A. Monroy-Hernández, and J. Cranshaw, "Understanding Chatbot-Mediated Task Management," Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018.