Monday, October 28, 2013
Get me out of here
Journeys and maps.
These two have gone hand in hand for centuries, if not millennia. Pre-journey planning was essential for choosing the best route and saving time on the way.
All this changed in recent decades with GPS, developed by the U.S. Department of Defense, and the route calculation built on top of it.¹
Generation Y is therefore the first generation able to travel without spending a significant amount of time on planning before setting off.
- From now on, it's more convenient to travel -
Since 1975, the pioneering Dragon speech recognition software has continuously improved at understanding the human voice and its meaning.²
Integrating voice recognition with GPS services took modern trip planning to the next level.
- From now on, it’s safer to travel -
Voice Command Systems in cars are the realization of many childhood fantasies. They let the driver control not only trip planning but also the vehicle's other multimedia and infotainment functions with spoken commands. Examples include Ford Sync, Lexus Voice Command, Chrysler UConnect and GM IntelliLink, to name a few.
A major risk to avoid is the distraction drivers face when they use their hands to operate the car’s dashboard. Because drivers can tell the system where they want to go, whom they want to call and what they want to listen to, the need for manual control decreases drastically.
The ongoing development of speech recognition software, such as that from Nuance, will enable drivers to make natural-language requests in their vehicles. Whereas older systems supported 50 to 60 voice commands, modern systems support up to 10,000. These include tasks such as booking a table at your favorite restaurant.³
The continuous development of these integrated voice control functions extends the car beyond being simply a vehicle. Slowly but surely, the fiction of the popular 1980s TV series "Knight Rider", in which spoken interaction between driver and car turned the two into a team, is becoming reality.
However, the growing scope of speech recognition in cars is raising expectations and requirements. What started as voice-controlled navigation and route planning is evolving into onboard personal assistance.
For this, support for 10,000 voice commands does not seem to be sufficient, and the artificial intelligence of Voice Command Systems needs to be adapted to various languages, pronunciations and cultures.³
It is fair to say that Voice Command Systems in cars have not only arrived but have already revolutionized the interaction between driver and vehicle. Still, there are endless possibilities ahead that will give a vehicle not only strong functionality but also a character.
- From now on, it will be more than just riding a car -
___________________________________________________________________________
¹ National Research Council (U.S.). Committee on the Future of the Global Positioning System
² http://eandt.theiet.org/magazine/2013/07/inner-voice.cfm
³ http://reviews.cnet.com/8301-13746_7-57321094-48/siri-like-voice-recognition-coming-to-cars/
Friday, October 25, 2013
Thursday, October 24, 2013
The Application of Speech Recognition in Education
Although speech recognition (also known as "Automatic Speech Recognition" or "ASR") was born more than 60 years ago, the technology has boomed only in recent years. Its applications can now be seen in almost every industry, and its application in education is one worth mentioning.
Both Apple and Microsoft have integrated speech recognition technology into their operating systems (as is shown below), so users can control the computer and its applications by voice[1]. That is, they can speak certain phrases, or “spoken commands,” to make the computer take different actions, such as opening documents or switching applications[2]. This built-in technology can therefore be used by teachers, students and staff in an educational institution.
Mac OS X
Windows
Apart from these two OS providers, companies such as Nuance (www.nuance.com) in the USA and iflytek (www.iflytek.com) in China also offer speech recognition solutions for education. Their target customers fall into two groups: the educational institution itself (including teachers and staff) and students.
For educational institutions, one important yet tedious task is conducting oral tests. There are large numbers of examinees, and traditionally examiners have had to either test them one-to-one or record their answers for later analysis. Both approaches are time-consuming and costly and may involve subjective judgment. Evaluation systems based on speech recognition technology can now automatically analyze test-takers’ speech against a combined linguistic and statistical standard and output test reports. Iflytek’s Multilingual Intelligent Speech Evaluation System is one case in point: it can automatically grade examinees, pinpoint the errors and flaws in their speech and suggest improvements[3]. In this way, educational institutions can improve efficiency and objectivity while lowering examination costs.
Another challenge for educational institutions is improving the efficiency and fluency of communication while maintaining or even reducing operating costs. Communication here has two dimensions: communication among staff and communication with people outside. Much of it takes place by telephone, which is more direct than email or fax. This raises a question: since each staff member or department has its own telephone number, how does a caller reach the right one? A telephone directory, online or on paper, might be a solution, but would it not be better to have telephone attendants who specialize in connecting the two parties? New posts mean new costs, however, and a qualified attendant can be expensive and cannot be in the office 24 hours a day. Some institutions have therefore turned to an “auto attendant” that answers the phone automatically, but this sacrifices caller satisfaction because it forces callers to listen to menus and make touch-tone selections.[4] Nuance Employee Productivity Suite (EPS) is a much better choice here: it is a voice-driven auto attendant and directory services solution that offers callers fast and efficient voice-command access to other people, places, and information resources from any telephone device at any time using simple, natural voice commands.[5]
More and more students use computers to do homework, much of which involves entering and editing text. They may not know, however, that they can “type” text in a much more efficient and creative way: using speech recognition. Software such as Nuance Dragon 12 meets this need by allowing students to edit and format documents by voice.[6]
From the above discussion, the overall benefits of such technology are evident: improved efficiency, objectivity and satisfaction, and reduced cost and time. However, some issues still need to be considered. Investment may be the first problem people encounter when trying to apply these solutions in practice. What’s more, the accuracy of speech recognition remains one of customers' biggest concerns. Even though accuracy is claimed to be more than 99%, customers still encounter errors caused by factors such as background noise, accent and the quality of the input device (microphone).
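An accuracy figure such as "more than 99%" is easier to judge once it is clear how it could be measured. A common yardstick is the word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below is only an illustration of that idea; the function name and the sample sentences are made up for this post, and vendors may measure accuracy differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word was dropped (deletion)
                curr[j - 1] + 1,     # extra word was recognized (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    spoken = "open the history homework file"      # hypothetical dictation
    recognized = "open the mystery homework file"  # one word misheard
    print(f"WER: {word_error_rate(spoken, recognized):.0%}")
```

One misheard word in a five-word sentence already gives a 20% error rate, which shows why background noise or an accent can make a nominally accurate system feel unreliable.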
To sum up, although speech recognition technology is growing in the educational field, operating system and software manufacturers still have a long way to go. They need to meet diverse customers’ needs by developing software and applications that are more affordable and more accurate in “listening”, “understanding” and “reacting”.
Works Cited:
[1][2] "Getting Started: Apple Technology for Diverse Learners." Apple Inc., n.d. Web. 24 Oct. 2013. <http://www.apple.com/education/docs/L360989C-US_L360989C_DiverseLearners_ff_acc.pdf>.
[3] "Multilingual Intelligent Speech Evaluation System." Anhui USTC IFLYTEK Co., Ltd. N.p., n.d. Web. 24 Oct. 2013. <http://www.iflytek.com/en/content/details_10_1680.html>.
[4][5] "Experience the Difference Speech Makes…Speech-Enabled Employee Directories." Nuance Communications, Inc., 2007. Web. 24 Oct. 2013. <http://www.nuance.com/ucmprod/groups/corporate/@web/documents/collateral/nd_004545.pdf>.
[6] "Dragon for Education." Dragon Education Solutions. N.p., n.d. Web. 24 Oct. 2013. <http://www.nuance.com/for-business/by-industry/education/dragon-education-solutions/index.htm>.
Monday, October 21, 2013
Flying Solo
Direct voice input on aircraft
Speech recognition is becoming an essential tool in the life of every individual. Voice innovation is redefining the ways we interact with machines. Gone are the days when you needed codes to access an information system. Today we say the command “Open” to open a door and, as if by “abracadabra”, it’s open.
Direct Voice Input (DVI) was developed to automate in-flight communication systems so that pilots can focus on more important information. DVI was implemented in the early part of the 21st century.
Direct voice input is a speech recognition system employed in military aircraft such as the Eurofighter Typhoon. It is a style of human–machine interaction in which the user gives voice commands to issue instructions to the aircraft. The growth in aircraft capabilities and functionality has dramatically increased pilot workload.
The goal of this interaction is to make operations more efficient and to put control of the machine in the user's hands. Feedback from the machine helps the operator make operational decisions. Examples of this broad concept of user interfaces include the interactive aspects of computer operating systems, hand tools, heavy machines and aircraft.
As for its history, DVI has been in service on the Eurofighter Typhoon since 2005. Beyond enhancing the aircraft’s capabilities, DVI has further potential for growth.
The technical process consists of a real-time comparison between the incoming audio signal (the pilot's voice) and stored speech models. DVI is a great example of a speech recognition application. Nonetheless, there are challenges, including acoustic variation in the signal, such as the pilot’s speech style or accent, and background cockpit noise from the engines.
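To make "comparing the incoming signal with stored speech models" more concrete, here is a minimal sketch of one classic way to do it: template matching with dynamic time warping (DTW), which tolerates a command being spoken faster or slower than the stored example. This is an assumption chosen for illustration, not a description of how DVI on the Typhoon actually works. The command names and the one-dimensional "feature" values are invented; a real system would use richer acoustic features.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of numbers."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i items of a
    # with the first j items of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its frame
                cost[i][j - 1],      # b advances, a repeats its frame
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


def recognize(signal, templates):
    """Return the name of the stored template closest to the signal."""
    return min(templates, key=lambda name: dtw_distance(signal, templates[name]))


# Hypothetical stored models: one feature value per audio frame.
TEMPLATES = {
    "gear up": [1.0, 3.0, 4.0, 3.0, 1.0],
    "gear down": [4.0, 3.0, 1.0, 1.0, 3.0],
}

if __name__ == "__main__":
    # The same shape as "gear up", spoken more slowly and with slight noise.
    incoming = [1.0, 1.1, 3.1, 4.0, 4.0, 2.9, 1.0]
    print(recognize(incoming, TEMPLATES))
```

The challenges named above show up directly in such a scheme: an unfamiliar accent changes the shape of the incoming sequence, and engine noise shifts every value, so both increase the distance to the correct template.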
Direct Voice Input does not compromise flight safety but actually enhances it. The pilot's work environment is becoming more efficient and manageable thanks to the optimization of DVI. DVI has opened a new frontier in automated flight that pays off most when the number of tasks increases and maneuvering the aircraft requires both of the pilot’s hands.
Thursday, October 17, 2013
Nuance Communications
Nuance Recognizer for Contact Centers
Voice-activated technology is improving thanks to powerful processors, advances in natural language processing, and better voice recognition algorithms. Nuance Communications is an innovative US-based company that provides customer support services; its motto is “Ease of use through Speech Recognition.”
Nuance's software is of the highest grade, with the best recognition accuracy, and encourages natural, human-like conversations. Natural Language Processing (NLP) draws on various Hidden Markov Models (HMMs) to sift through the data hidden in speech analysis graphs. The HMM results are then processed by mathematical models that yield a specific command.
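For readers curious what "sifting through hidden data" with an HMM looks like, the sketch below runs the Viterbi algorithm on a toy model: given a sequence of observations, it finds the most probable sequence of hidden states. Everything in it is assumed for illustration. The two states, the two observation symbols and all probabilities are invented, and nothing here reflects Nuance's actual models, which are far larger and work on real acoustic features.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               the state that path came from)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        row = {}
        for s in states:
            row[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(row)

    # Walk backwards from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    path.reverse()
    return path


# Toy model (all values hypothetical): is each audio frame silence or speech?
STATES = ("silence", "speech")
START_P = {"silence": 0.8, "speech": 0.2}
TRANS_P = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.3, "speech": 0.7},
}
EMIT_P = {
    "silence": {"quiet": 0.9, "loud": 0.1},
    "speech": {"quiet": 0.2, "loud": 0.8},
}

if __name__ == "__main__":
    frames = ["quiet", "loud", "loud", "quiet"]
    print(viterbi(frames, STATES, START_P, TRANS_P, EMIT_P))
```

Because the result is the most probable path rather than a certain one, this also illustrates why HMM-based recognition gives approximations and can still get things wrong.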
Nuance specializes in the following products:
- 1. Automatic Speech Recognition (ASR) for contact center automation. ASR can analyze 79 different languages and encourages natural human interaction. Benefits of using ASR include, but are not limited to, cost savings, funneling calls that promote business, filtering unwanted calls and selective processing of information, all of which strengthen the overall business proposition.
- 2. Dragon Speech Recognition for Personal Use – Dragon Speech Recognition enables daily processes and tasks to be automated. These include speech-to-text dictation in word-processing documents and accessibility functions that help disabled users interact with their computers.
- 3. Clintegrity – Hospitals have extensive Customer Relationship Management (CRM) systems that organize and update data for many tasks. Many of these tasks require manual entry, which is time-consuming. Clintegrity provides mechanisms by which the software analyzes speech in order to enter data or retrieve records for doctors to examine. This in turn helps the hospital focus on more important things: serving the patient in a timely and orderly manner.
Nuance has set benchmarks for the speech recognition industry, yet new methods are still prone to analysis errors. The HMM works on probability and statistics, which provide approximations and at times cannot decipher the way humans interact. There are still areas of voice recognition software that can be adapted to the industry as a whole and customized to specific company requirements.
Sunday, October 13, 2013
CALL CENTERS - "HOW MAY I HELP YOU?"
In 1952, Bell Labs researchers presented a basic system that recognized numbers spoken over a telephone: “Speech Recognition” was born. Today, after significant technological progress, these systems can deal with a wide range of accents and languages.[1]
A “Speech recognition” system recognizes words in natural speech and converts them into a machine-readable format. Call centers use “Speech recognition” software to handle incoming customer calls. It is essential, however, to distinguish between "speech recognition" and "voice recognition": "voice recognition" identifies a particular person's voice.[2]
What mainly attracted companies was cost efficiency: deploying these machines reduced spending on staffing and employee maintenance. Outsourcing to a BPO company already directly lowers the business expenses related to employee maintenance, but an automated answering system can reduce operational costs even further. In fact, this technology helps corporations not only reduce costs but also automate the handling of a high percentage of incoming customer calls. Some businesses combine speech recognition with Interactive Voice Response (IVR) to improve service quality. Because automated systems are available even when call center agents are not, effectiveness and productivity can be improved, particularly for sales and collections companies. For instance, the company One Telecom uses speech recognition technology to improve customer service in a mixed live-agent and self-service setting. The system strengthened the company's database of frequently asked questions (FAQs) and now routes callers to one of 35 self-service units depending on the needs the speech recognition software identifies during the call.[2]
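Routing a caller to a self-service unit based on what the recognizer heard can be pictured with a very small sketch. The version below simply counts keyword matches in the recognized text and falls back to a live agent when nothing matches. The unit names, keywords and function name are all hypothetical; the source does not describe how One Telecom's system decides, and production systems use far more sophisticated language understanding.

```python
import re

# Hypothetical self-service units and the keywords that point to them.
ROUTES = {
    "billing": {"bill", "invoice", "payment", "charge"},
    "technical_support": {"internet", "connection", "outage", "modem"},
    "sales": {"upgrade", "plan", "offer"},
}


def route_call(transcript: str, fallback: str = "live_agent") -> str:
    """Pick the unit whose keywords best match the recognized words."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    scores = {unit: len(words & keywords) for unit, keywords in ROUTES.items()}
    best_unit = max(scores, key=scores.get)
    # No keyword matched at all: hand the caller to a person.
    return best_unit if scores[best_unit] > 0 else fallback


if __name__ == "__main__":
    for heard in (
        "I have a question about my last bill",
        "My internet connection keeps dropping",
        "It's complicated, I need to explain my situation",
    ):
        print(f"{heard!r} -> {route_call(heard)}")
```

The fallback branch reflects the mixed setting described above: whatever the automated side cannot place still ends up with a live agent.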
However, “Speech recognition” also has drawbacks. Background noise and the systems' limited ability to identify accents and slang reduce its effectiveness, although advances are still being made, for example with the Dragon software. Even though speech recognition requires more effort, the potential gains are enormous if it is implemented effectively, according to Steve Rutledge, vice president of product marketing at Genesys Telecommunications Laboratories[3]. Still, the main shortcoming of automated answering systems is their inability to handle precise and complicated inquiries or problems. Many problems can only be handled by live agents, because end consumers or prospective clients need sophisticated and specific explanations. Under such conditions, speech recognition software by itself would not be sufficient.[4]
To conclude, businesses may lower operating costs drastically by reducing staff and maintenance. Nevertheless, delivering ineffective customer service by failing to respond to specific customer needs could damage a corporation’s image significantly. Speech recognition is thus a valuable tool, but it still needs significant progress to be as effective as firms need it to be.
[1] Borzo, J. (2007). Now You're Talking. Available: Money CNN, http://money.cnn.com/magazines/business2/business2_archive/2007/02/01/8398978/. Last accessed 10th Oct 2013.
[2] Search CRM. (2009). Leveraging speech recognition technology in call centers. Available: Search CRM, http://searchcrm.techtarget.com/report/Leveraging-speech-recognition-technology-in-call-centers. Last accessed 9th Oct 2013.
[3] Bailor, C. (2005). Avoiding the Speech Rec. Wreck. Available: destinationCRM.com, http://www.destinationcrm.com/Articles/Editorial/Magazine-Features/Avoiding-the-Speech-Rec.-Wreck-42727.aspx. Last accessed 9th Oct 2013.
[4] Johnson, A. (2012). Pros And Cons Of Automated Answering Service By BPO Manila. Available: Fusion Blog, http://www.fusionbposervices.com/blog/answering-service-by-bpo-manila.html. Last accessed 10th Oct 2013.
Monday, October 7, 2013
The Voice in the Machine
By Ahmed Ashraf
My friend's the best,
with him around I feel so blessed.
He will do as I ask,
Helping me complete my task.
He helps with my chores and work,
he's as precise as a legal clerk.
He can write down all I say,
He even helps plan my day.
Sometimes he makes me scream,
Making my ears burst with steam.
He makes these silly mistakes,
Which later on cause terrible headaches.
Many times we’ve said enough,
But without him it was simply too tough.
The world is fast evolving,
And having my friend helps me with problem solving,
Sometimes we're lost in translation,
But once that is overcome it's a great sensation.
My friend has a distinct voice,
He can appear on any gadget of choice,
He's half human, half machine,
Have you figured who I mean?