TuVox Confidential - 1 - 2005 TuVox Incorporated
DEMYSTIFYING SPEECH APPLICATIONS
12 Key Best Practices for Implementing Speech
333 Distel Circle
Los Altos, CA 94022
650 623-0210
LESSONS ABOUT SPEECH YOU ALREADY LEARNED FROM WEB STRATEGY
TABLE OF CONTENTS
Abstract
Introduction
Lesson One: Speech is like the Web
Lesson Two: It's Your Public Face
Lesson Three: New Forms of Self-Service
Lesson Four: Empower a Strategic Shift in Your Organization
Lesson Five: Don't Proliferate Speech without A Strategy
Lesson Six: Don't Be Paralyzed By What You Don't Know
Lesson Seven: If You Build It, You Must Maintain It
Lesson Eight: You Need Enterprise Software to Build and Maintain Speech Applications
Lesson Nine: Choose Platform Independence over Proprietary Solutions
Lesson Ten: Beware the Limits of Pre-packaged Applications
Lesson Eleven: Understand the Risks of Outsourcing Speech Development
Lesson Twelve: Leverage Corporate Web Content and Infrastructure
Conclusion
Checklist for advancing your speech strategy
ABSTRACT
Speech technology has moved
beyond its early adoption phase
into the mainstream as a
powerful and cost-effective way
of providing customer support.
This white paper provides
business line managers and
executives with an understanding
of the critical success factors in
developing a speech application
strategy. Speech can be viewed as
a specialized form of Web
application. Many of the lessons
learned from the corporate Web
presence can be adapted directly
to speech application
development and management.
Companies that capitalize on
their Web experience to provide
speech-enabled self-service are
leapfrogging their
competitors.
INTRODUCTION
Across a growing number of
industries, customers are greeted
over the phone by a new
technology with a friendly
human voice. Called speech
applications, this new
technology uses natural language
to automate caller conversations,
enabling customers to speak into
the phone instead of pressing
numbers on their phone keypad.
Whether callers need
information about airline or
train schedules, want to reserve
a rental car, troubleshoot a
technology purchase or check
their account balance, these
speech applications quickly route
callers to support and
information in a friendly,
cheerful voice.
This new breed of self-service
application has a powerful effect
both on customer experience and
on business strategy overall. In
addition to giving businesses a
more human face and a
memorable brand, these
applications empower customers
to more quickly access
information and in some cases to
complete tasks that outdated
touch-tone applications simply
could not readily perform. For
example, it is much easier for a
traveler to ask for a specific city
by name ("Cleveland") than to
press a series of numbers on the
phone and navigate through a
complicated maze of menus.
Speech also offers broader
capabilities, enabling businesses
to discover new ways of
improving customer self-service.
Speech applications today are:
    Automating new types of self-service
    Driving down call center and support costs
    Helping to generate more revenue
    Increasing customer satisfaction by providing a superior caller experience
As speech applications move
beyond the early adoption phase
into mainstream business use,
companies are approaching the
development and management of
speech differently. While
previously companies only used
speech in simple or superficial
ways, more now view speech as
a critical business capability for
which they must develop a
well-thought-out strategy. Without
such a strategy, these companies
realize they may lose their
competitive edge. What follows
are twelve important lessons that
the most advanced users of
speech have learned in preparing
their corporate speech strategy.
LESSON ONE: SPEECH IS LIKE THE WEB
Speech applications are a lot like
corporate Web sites. Companies
that capitalize on this insight
leverage the value of speech
much faster than their
competitors. After all, most
organizations have developed
deep knowledge about the Web
and have developed clear
corporate strategies around Web
site management. Many of these
best practices are transferable
directly to speech.
The similarities between speech
applications and Web sites are
profound. At the business level,
both speech and the Web serve
very similar business functions
with comparable impacts. Like
the Web, speech interacts with
customers, helps provide
information and conduct
transactions, and can have a
huge impact on developing a
memorable corporate brand.
Speech applications also enable
organizations to reduce costs by
quickly routing callers to new
forms of self-service.
Beyond the similarity in business
functions, speech applications
and Web sites have almost
identical technical architectures.
Speech applications use an
application server for business
logic and a Web server to serve
specialized XML pages (called
VoiceXML, or VXML), and they
connect to backend systems to
access customer data and
business logic. In many
ways, a speech application is
really a specialized form of Web
application. When a customer
dials a phone and reaches a
speech application, he or she
reaches a Voice browser and, by
speaking into the phone,
navigates through pages of
speech to get information or
conduct transactions.
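
To make the architecture concrete, the sketch below shows the kind of page a Web server might hand to a Voice browser. It is a minimal illustration only, not a production application: the function name, prompt wording and submit URL are hypothetical, and a real page would need to be checked against the Voice browser in use.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_city_page(cities, submit_url):
    """Return a small VoiceXML page that asks the caller for a city.

    Hypothetical helper for illustration; names and wording are assumptions.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "choose_city"})
    field = ET.SubElement(form, "field", {"name": "city"})

    # What the caller hears.
    ET.SubElement(field, "prompt").text = "Which city are you traveling to?"

    # What the caller is allowed to say: an inline grammar listing the cities.
    grammar = ET.SubElement(field, "grammar", {
        "type": "application/srgs+xml",
        "version": "1.0",
        "mode": "voice",
        "root": "city",
    })
    rule = ET.SubElement(grammar, "rule", {"id": "city"})
    one_of = ET.SubElement(rule, "one-of")
    for city in cities:
        ET.SubElement(one_of, "item").text = city

    # Once a city is recognized, send it to the application server.
    filled = ET.SubElement(field, "filled")
    ET.SubElement(filled, "submit", {"next": submit_url, "namelist": "city"})

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_city_page(["Cleveland", "Boston"], "http://example.com/schedule"))
```

The page plays the same role an HTML form plays on a Web site: the prompt is the label, the grammar is the list of accepted values, and the submit step posts the answer back to the same kind of server a Web application would use.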
The processes of developing and
managing speech applications
are also very similar to the
processes of creating and
maintaining a corporate Web
site. Both have similar lifecycles
that involve keeping customers
up to date on the latest
information. And before a new
version can be launched, it must
be reviewed and approved and
finally run through a QA
process.
LESSON TWO: IT'S YOUR PUBLIC FACE
Speech and the Web are the
public faces of your organization
to your customer, and are often
the first point of interaction.
Every time a customer comes to
your Web site or dials your 800
number, you reinforce a
particular customer experience
that is shaped by numerous
intangibles. This customer
experience creates a critical
impression and, if positive, a
lasting relationship with your
customer base that gives your
products additional cachet in the
market. When the cost to acquire
a new customer far exceeds the
cost to retain an existing
customer, investments in brand
and customer experience are
investments in revenue.
Nothing is more frustrating to a
customer than having to navigate
through a complicated hierarchy
of menus (press 1, then press 2,
then press 8) and then fail to get
to the information he or she
needs. From a usability
perspective, callers universally
dislike touch-tone navigation
and will only patiently navigate
three to four levels down a menu
hierarchy. As a result, many
callers discontinue the call or
simply press '0' to speak to a live
agent, negating the system's
intended value.
Improving customer experience
is one of the powerful reasons
that companies are migrating
from touch tones to speech.
Speech solves this caller
frustration problem by enabling
a customer to navigate using
natural conversation and to
speak to the application as
though speaking with a live
agent. In this way, callers
typically can reach information
more rapidly and with greater
success. Companies can also
make more information available
through speech because speech
applications are not limited to a
complicated maze of menus or
the limited options of keypads.
Finally, customers find the
experience more enjoyable since
the speech application is more
human-like in its behavior. All
of these factors translate into a
better customer experience, more
automation and ultimately
higher retention rates and
stronger brand.
LESSON THREE: NEW FORMS OF SELF-SERVICE
The Internet became powerful
because it provided a new form
of self-service that customers
quickly grew to rely on. The
Web enabled them to access
information on demand. At the
same time, the Web dramatically
reduced the cost of some types of
customer interaction.
Speech has similar capabilities.
With speech, callers can use
language to point and click for
information. Because they don't
have to navigate menus of touch
tones, they can more rapidly
access information and conduct
transactions without necessarily
speaking to a customer service
representative. Customers are
rapidly becoming power users of
speech, learning to surf through
a speech application the way
they learned to move through a
Web site, discovering short cuts
to information and taking full
advantage of this new tool. The
result is that this growing base of
callers no longer tolerates the old
touch tone system. They've seen
the future and they like it.
Consequently, companies can
now significantly increase their
rate of self-service while giving
customers a better experience. In
many cases, companies are
saving several million dollars for
every percentage point of
self-service increase they achieve
with speech. Many are
increasing their self-service rates
by 8 to 10 percent or more over
current automation and reaching
automation rates between 75 and
90 percent, depending on the
type of application.
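
The arithmetic behind savings of this kind can be sketched as follows. Every input figure here is hypothetical and chosen only to show the calculation; none is taken from a company discussed in this paper, and actual call volumes and per-call costs vary widely.

```python
def annual_savings_dollars(calls_per_year, points_gained,
                           agent_cost_cents, automated_cost_cents):
    """Estimate yearly savings from moving calls from agents to self-service.

    points_gained is the increase in the self-service rate, in whole
    percentage points. Costs are per call, in cents, to keep the math exact.
    All parameter names are illustrative assumptions.
    """
    shifted_calls = calls_per_year * points_gained // 100
    saved_cents = shifted_calls * (agent_cost_cents - automated_cost_cents)
    return saved_cents / 100


if __name__ == "__main__":
    # Hypothetical inputs: 10 million calls a year, $5.00 per agent-handled
    # call, $0.50 per automated call.
    for points in (1, 8):
        savings = annual_savings_dollars(10_000_000, points, 500, 50)
        print(f"{points} point(s) gained: ${savings:,.0f} saved per year")
```

With these assumed inputs, one percentage point shifts 100,000 calls a year and saves $450,000; eight points saves $3,600,000. Larger call volumes or a wider gap between agent and automated costs scale the result proportionally.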
LESSON FOUR: EMPOWER A STRATEGIC SHIFT IN YOUR ORGANIZATION
The shift from touch tone self-
service to speech ultimately
requires a shift in the
philosophical framework of the
enterprise that you can help
enable. Touch-tone applications
belong principally to the world
of telephony. They use
proprietary scripting languages
and are built with proprietary
tools by specialists who learned
those languages. They are
typically maintained by teams
who have little or no knowledge
of the Web.
By contrast, speech belongs to
the world of the Web. It uses
open standards. Knowledge that
resides in marketing and
technical Web teams can be
readily applied to speech.
Corporations that are moving
the fastest with Speech are
allowing this boundary between
telephony and Web to blur. They
build new teams that leverage
knowledge from both
organizations and create
powerful cross fertilization in
their organization. In this way,
Web knowledge rapidly transfers
into the speech team while at the
same time the telephony team
enters the world of the Web.
When the Internet took off in the
mid-nineties, many organizations
failed to develop strategies that
took into account the entire
enterprise. They had fledgling
Web sites, but they had not
anticipated the importance of the
Web to all aspects of corporate
business. As a result, the
corporations that were first to
develop a comprehensive Web
strategy capitalized much more
rapidly than their competitors on
cost savings, revenue
opportunities and the branding
impact of the Web.
As speech solutions become
mainstream, organizations that
are proactive rather than
reactive develop a corporate
speech strategy to maximize the
benefits that these applications
offer. This involves defining a
six-month, one-year and
two-year plan that answers:
    How speech can and should be used to reach specific business goals
    How speech will complement other channels
    How speech will be used with other strategies such as outsourcing
    What ROI metrics will be used to validate the cost savings, revenue generation and customer satisfaction goals that speech will provide
    What process and domain knowledge will enable the organization to reach those goals
    Whether the organization will take an outsourcing or in-house approach to speech
LESSON FIVE: DON'T PROLIFERATE SPEECH WITHOUT A STRATEGY
Reactive organizations were
unprepared to take advantage of
Internet growth because they did
not move quickly enough to
develop their own internal
expertise for Web site
management. When they realized
their mistake, they
overcompensated by
proliferating Web sites across the
global organization with no
carefully defined plan. The result
was a sprawling, uncontrollable
number of Web sites that could
not be managed. This
proliferation created redundant
and costly infrastructure and
processes that cost hundreds of
thousands of dollars to
eventually clean up.
Speech applications pose a
similar problem if not managed
well. Organizations succeed best
with speech when they establish
a clearly defined corporate
strategy and create an
organization that owns speech
best practices and initiatives
corporate-wide. Rather than
proliferate speech without
control, the best approach is to
identify a set of stakeholders
from the telephony, Web and
customer support worlds who
can collaboratively drive
corporate best practices and
change, and define the
corporation's vision for speech.
LESSON SIX: DON'T BE PARALYZED BY WHAT YOU DON'T KNOW
At one end of the spectrum are
organizations that develop
speech applications without any
strategy at all. At the other end
of the spectrum are
organizations that won't touch
speech until everyone
understands every last detail.
The desire for total knowledge
can paralyze these organizations
and they often react too slowly
to changing market conditions.
While strategy is important,
learning occurs through practice.
Organizations that make the
most progress with speech have
an "adopt and go" mentality.
They set measurable objectives
and then make progress with
their speech initiative. They
strike a balance between strategy
and pragmatic action, realizing
that learning only occurs in an
organization through concrete
doing. These organizations set
concrete milestones with speech
every six months to ensure that
they continue to make progress
towards speech's adoption in
their business. By deploying
speech in limited ways and then
expanding its use, they gain
domain knowledge and make
progress at the same time.
LESSON SEVEN: IF YOU BUILD IT, YOU MUST MAINTAIN IT
Like the corporate Web site,
speech applications are living,
growing entities that need to be
managed over time.
Organizations that quickly threw
together Web sites in the early
days found that, without a way
to update and continually change
them, their Web sites quickly
became stale and lost much of
their original business value.
Speech is similar. As speech
applications replace touch tones,
they become one of the most
important customer-facing tools
in the organization. Speech
applications shape customer
experience, reinforce and create
brand, help with customer
retention, and provide customers
with the latest information
available. Because speech
becomes so mission critical in
customer interactions, speech
applications have to be dynamic
and responsive, constantly
updated with new information.
In developing a corporate
strategy, organizations must
approach speech with the
understanding that it is a
medium that constantly
undergoes change. Without the
infrastructure, people, process
and budget to manage that
change, speech applications are
like Web sites that are not
updated in a timely fashion.
Companies that outsource
speech application development
have often made the mistake of
focusing solely on the first
speech application that is
developed and deployed. They've
been unpleasantly surprised later
when they realize they no longer
have sufficient budget to pay the
outsourcing provider to keep the
application updated. A speech
strategy therefore must include a
clear path for maintaining and
updating a corporation's speech
applications.
LESSON EIGHT: YOU NEED ENTERPRISE SOFTWARE TO BUILD AND MAINTAIN SPEECH APPLICATIONS
In the early days of the Internet,
organizations created and
maintained their corporate Web
sites with simple desktop tools
like HTML editors or even
Notepad. But these simple tools
failed to scale as Web
sites became more complex,
changing entities. Enterprises
needed enterprise software to
create and manage their
corporate Web sites. These
enterprise content management
and publishing applications
enabled organizations to manage
all aspects of the Web site
lifecycle: creating, reviewing and
approving content, managing
images, checking hyperlinks,
labeling and reusing content,
storing Web site versions,
applying business rules,
managing reports and analytics,
and a host of other activities that
turned Web sites into a powerful
business enabler.
Enterprise software for speech is
having the same dramatic effects
on productivity. This software
simplifies hundreds of tasks
related to creating, managing
and running speech applications.
With built-in knowledge about
speech and language, this
software empowers non-
technical users to write effective
and powerful speech
applications.
By leveraging enterprise software
for speech, companies find that
they can dramatically cut the
cost of building the speech
application by 60 percent or
more. In addition, they now
have a sustainable process and
infrastructure to keep the
applications fresh and updated.
LESSON NINE: CHOOSE PLATFORM INDEPENDENCE OVER PROPRIETARY SOLUTIONS
As in the early days of the
Internet, speech is now a rapidly
developing market in which
technology evolution is occurring
at a tremendous pace. In rapidly
evolving markets, buyers need to
be careful and avoid proprietary
solutions that lock them into a
single platform from which they
cannot escape later. Today, more
than a half dozen companies
offer recognition engines
including Nuance, ScanSoft, IBM
and most recently Microsoft.
Similarly, there are a number of
voice browsers or gateways
including Genesys GVP,
VoiceGenie, and Verascape
among others. Further
complicating matters, there are two
competing standards for speech:
VXML (an open W3C standard) and
SALT (Microsoft-backed). As the
market matures, competition is
likely to drive down costs further
and to commoditize the
underlying speech recognition
and Voice gateway platforms. It
may also impact the evolution of
standards in ways that are not
foreseeable now.
Companies need to develop an
approach to speech that is
flexible enough to capitalize on
this price curve while remaining
able to switch platforms at will,
taking advantage of market
consolidations and changes. The
only way to do this is to adopt a
platform-independent approach
to developing speech
applications. Speech applications
need to be written so that they
can be moved from one speech
recognition engine and Voice
platform to another at the click
of a button. Whether you
develop speech applications
internally or outsource them to
providers, it's important to
protect your investment by
taking an agnostic approach to
the underlying platform.
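
One way to picture a platform-agnostic design is to have the application talk only to a neutral interface, with each vendor's engine wrapped behind it. The sketch below is an illustration under stated assumptions: the interface, both engine classes and the dialog function are hypothetical stand-ins, not any vendor's actual SDK, and a text string stands in for the caller's audio.

```python
from typing import List, Optional, Protocol


class RecognitionEngine(Protocol):
    """Hypothetical neutral interface: all the application depends on."""

    def recognize(self, utterance: str, choices: List[str]) -> Optional[str]:
        ...


class EngineA:
    """Stand-in for one vendor's engine (not a real product API)."""

    def recognize(self, utterance: str, choices: List[str]) -> Optional[str]:
        for choice in choices:
            if choice.lower() == utterance.strip().lower():
                return choice
        return None


class EngineB:
    """Stand-in for a second vendor's engine with different internals."""

    def recognize(self, utterance: str, choices: List[str]) -> Optional[str]:
        lookup = {choice.casefold(): choice for choice in choices}
        return lookup.get(utterance.strip().casefold())


def ask_for_city(engine: RecognitionEngine, utterance: str) -> str:
    """Dialog logic written once, with no knowledge of which engine runs it."""
    cities = ["Cleveland", "Boston", "Denver"]
    city = engine.recognize(utterance, cities)
    if city is None:
        return "Sorry, I did not catch that city."
    return f"Looking up flights to {city}."


if __name__ == "__main__":
    for engine in (EngineA(), EngineB()):
        print(type(engine).__name__, "->", ask_for_city(engine, "cleveland"))
```

Because the dialog logic never names a vendor, switching engines means supplying a different wrapper rather than rewriting the application.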
LESSON TEN: BEWARE THE LIMITS OF PRE-PACKAGED APPLICATIONS
Would you build your Web site
with a pre-packaged application?
Most organizations would not.
They understand that the Web is
a mission-critical part of the
business over which they need
complete control. And a pre-
packaged application, while
easier to deploy initially, is
ultimately limiting as a business
evolves and changes.
The same is true with speech
applications. A pre-packaged
application might initially appear
to be a good short cut to
building speech applications. But
pre-packaged applications, by
definition, lack the flexibility to
grow with a business. Because
they are pre-built, they are not
architected to be readily changed
or adapted. At best they can be
configured. But often when a
corporation wants to change its
speech application it finds that a
pre-built application is not very
flexible. The best practice is to
avoid pre-built applications and
choose a flexible approach to
application development that
leverages pre-built components
(e.g. account number
identification) but builds out the
application to fit business
requirements. In this way, you
get the best of both worlds: pre-
built components but future
flexibility to change the
application framework.
LESSON ELEVEN: UNDERSTAND THE RISKS OF OUTSOURCING SPEECH DEVELOPMENT
Organizations have different
philosophies about outsourcing.
Most organizations outsource
only what they consider to be
non-critical business functions.
With speech, as with the Web,
organizations need to determine
how critical speech is to their
future business strategy. If
speech is critical, they need to
decide whether outsourcing will
limit their ability to reach these
objectives. To be sure, some
outsourcing operations (such as
IT) can often get economies of
scale that corporations
themselves cannot achieve. But
by outsourcing speech
development, organizations
become dependent on other
organizations and fail to develop
their own internal expertise or
domain knowledge.
Speech is one area where
corporations need to be
exceedingly careful before
choosing an outsourcing
strategy. Since speech involves
ongoing change like a Web site,
any outsourcing strategy needs
to anticipate recurring expenses
for managing updates.
Unfortunately, organizations
that outsource speech to
professional service
organizations often find that
they cannot afford the costs of
maintaining the speech
application once it is developed.
The professional service
organizations that develop
speech applications are
motivated by their services
margins. Unlike your
corporation, they aren't
motivated to keep down
recurring costs of speech
application development. Nor do
they have enterprise software
that makes them efficient in
developing speech applications.
An outsourcing strategy,
therefore, can sometimes be
much costlier in the long run
than it first appears. The cost of
the initial build may be only a
small percentage of the total cost
of ownership. For this reason,
many corporations now look to
develop speech expertise in-
house and only outsource in
early phases as they develop
knowledge internally. Those that
want to outsource are looking
for consulting organizations that
will partner with companies that
offer enterprise software for
managing speech. In this way,
they still get the efficiencies of
software and have the ability to
bring speech application
development in-house at a later
date.
LESSON TWELVE: LEVERAGE CORPORATE WEB CONTENT AND INFRASTRUCTURE
There is still a myth in the speech
industry that organizations need
the skills of PhDs in linguistics
to write speech applications. But
that is simply a myth that
perpetuates the need for
professional services
organizations. The truth is that
speech application development
is a learnable skill, just as Web
site development is. Your
organization has most of what it
needs already to develop speech
applications. By combining the
knowledge of your telephony
organization, call center, Web
and marketing teams, you have
most of what it takes to start the
process of building speech
applications.
Your organization already
invested a huge amount of time,
effort and dollars in developing
your Web presence. Why not
leverage that same existing
content and infrastructure for
another channel? This is one of
the primary advantages speech
has over touch tones. Touch
tones cannot readily leverage
either your Web content or your
Web systems, but speech can.
Many companies, for example,
are carrying their Frequently
Asked Questions (FAQs) and
technical support content from
their Web sites directly into their
speech applications. Software for speech
applications imports Web
content into speech and tags it
on the fly so that it can be
repurposed for speech. In this
way, enterprise Web content can
be leveraged in your speech
channel. In addition, speech
applications can readily leverage
Web backend systems, hooking
into Web services APIs that have
already been developed. By
leveraging content and APIs that
already exist in the Web,
companies can reduce the
overhead of maintaining two
entirely separate systems and
processes, as they do today with
touch tone.
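
As a rough illustration of repurposing Web content, the sketch below turns a short list of FAQ entries into a VoiceXML menu with one spoken answer per topic. It is a simplified example under stated assumptions: the function name, prompt wording and sample FAQ text are hypothetical, and real enterprise software would also handle tagging, phrasing and grammar tuning.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def faq_to_vxml(faqs):
    """Build a VoiceXML menu from (topic, answer) pairs taken from a Web FAQ.

    Hypothetical helper for illustration only.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})

    # The menu comes first in the page, so it is the first thing callers hear.
    menu = ET.SubElement(vxml, "menu", {"id": "faq_menu"})
    ET.SubElement(menu, "prompt").text = "What can I help you with?"

    for number, (topic, answer) in enumerate(faqs, start=1):
        form_id = f"faq_{number}"
        # Saying the topic jumps to the matching answer.
        ET.SubElement(menu, "choice", {"next": f"#{form_id}"}).text = topic
        # The answer text is read back to the caller.
        form = ET.SubElement(vxml, "form", {"id": form_id})
        ET.SubElement(form, "block").text = answer

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    sample_faqs = [
        ("reset my password", "To reset your password, choose Forgot Password on the sign-in page."),
        ("store hours", "Our stores are open nine to five, Monday through Friday."),
    ]
    print(faq_to_vxml(sample_faqs))
```

The FAQ text is written once for the Web and reused here unchanged, which is the maintenance saving this lesson describes.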
CONCLUSION
Speech is an innovative form of
customer self-service that will
rapidly transform the landscape
of corporate customer support
over the next five years. Like the
Web before it, speech has
tremendous value for its
cost-cutting impact, its ability to
improve customer access to
information and satisfaction, and
its role in reinforcing
the corporate brand. Enterprises
that have rapidly adopted and
deployed speech have realized
that speech is a form of Web
application, and that what they
have learned from their own
Web development can be
transferred to the practice of
speech. These organizations are
well along the path of achieving
revenue, customer satisfaction
and business goals in their one
year and two year plans.
CHECKLIST FOR ADVANCING YOUR SPEECH STRATEGY
Checklist for Speech Readiness
Understand how speech can support strategic business initiatives
Develop business case and ROI
Identify executive sponsor
Identify speech team from telephony, Web, marketing and contact centers
Conduct speech readiness assessment
Identify initial project
Develop gap analysis
Develop and gain approval for six month, one year and two year plans
Decide whether to build knowledge internally or outsource
Identify enterprise software for managing speech
Develop metrics for change
Initiate the process
333 Distel Circle
Los Altos, CA 94022
650.623.0210
www.tuvox.com