Synthetic Voice Shock Reverberates Across the Divides!

Synthetic Voice Shock — oh, those awful voices!


As I communicate with other persons with progressive vision loss, I often sense a quite negative reaction to synthetic, or so-called ‘robotic’, voices that enable reading digital materials and interfacing with computers. Indeed, that’s how I felt a few years ago. Let’s call this reaction "synthetic voice shock" as in:

  • I cannot understand that voice!!!
  • The voice is so inhuman, inexpressive, robotic, unpleasant!
  • How could I possibly benefit from using anything that hard to listen to?
  • If that’s how the blind read, I am definitely not ready to take that step.

Conversely, those long experienced with screen readers and reading appliances may be surprised at these adverse reactions to the text-to-speech technology they listen to many hours a day. They know the clear benefits of such voices, rarely have trouble understanding them, exploit voice regularity and adjustability, and innovate better ways of "living big" in the sighted world, to quote the LevelStar motto.

The ‘Synthetic Speech’ divide


Synthetic voice reactions appear to criss-cross many so-called divides: digital, generational, disability, and developer. The free WebAnywhere is the latest example with a robotic voice that must be overcome in order to gain the possible benefits of its wide dissemination. Other examples are talking ATM centers and accessible audio for voting machines. The NVDA installation and default voice can repel even sighted individuals who could benefit from a free screen reader as a web page accessibility checker or a way to learn about the audio assistive mode. Bookshare illustrates book reading potential by a robotic, rather than natural, voice. Developers of these tools see the synthetic voice as a means to gain the benefits of their tools, while users not accustomed to speech-enabled hardware and software run the other way at the unfriendliness and additional stress of learning an auditory rather than visual sensory practice.


This is especially unfortunate when people losing vision may turn to magnifiers that can only improve spot reading, when extra hours and energy are spent twiddling fonts then working line by line through displayed text, when mobile devices are not explored, when pleasures of book reading and quality of information from news are reduced.

Addressing Synthetic Voice Shock


I would like to turn this posting into messages directed at developers, Vision Losers, caretakers, and rehab personnel.

To Vision Losers who could benefit sooner or later

Please be patient and separate voice quality from reading opportunities when you evaluate potential assistive technology.


The robotic voice you encounter with screen readers is used because it is fast and flexible and widely accepted by the blind community. But there do exist better natural voices that can be used for reading books, news, and much more. Although these voices may seem offensive at first, synthetic voices are actually one of the great wonders of technology: they open the audio world to the blind and are gradually becoming common in telephony and help desks.


As one with Myopic Macular Degeneration forced to break away from visual dependency and embrace audio information, I testify it takes a little patience and self-training and then you hear past these voices and your brain naturally absorbs the underlying content. Of course, desperation from print disability is a great motivator! Once you overcome the resistance to synthetic voices, a whole new world of spoken content becomes available using innovative devices sold primarily to younger generations of educated blind persons. Freed of the struggle to read and write using defective eyesight, you gain enormous power to absorb an unbelievable amount of high quality materials. As a technologist myself, I made this passage quickly and really enjoyed the learning challenge, which has made me into an evangelist for the audio world of assistive technology.


If you have low vision training available, ask about learning to listen through synthetic speech. For the rest of our networked lives, synthetic voices may be as important as eccentric viewing and using contrast to manage objects.


So, when you encounter one of these voices, maybe think of it as another rite of passage to remain fully engaged with the world. Also, please consider how we can help others with partial sight. With innovations from WebAnywhere and free screen readers, like NVDA, there could be many more low cost speaking devices available worldwide.

To Those developing reading tools with Text-to-Speech



Do not expect that all users of your technology will be converts from within the visually impaired communities familiar with TTS. Provide a voice tuned in pitch and speed and simplicity for starters to achieve the necessary intelligibility and sufficient pleasantness. Suggest that better voices are also available and show how to achieve their use.
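
As one possible illustration of a starter voice, the sketch below assumes the eSpeak synthesizer and builds a command line that slows the speech down and selects a female variant. The function name, the file name welcome.txt, and the chosen speed and pitch values are assumptions made for illustration only. eSpeak's -s option sets the speed in words per minute (its default is 175) and -p sets the pitch on a scale of 0 to 99 (default 50).

```python
def starter_voice_command(text_file, voice="en+f3", words_per_minute=140, pitch=50):
    """Build an eSpeak command line for a slower first-time listening voice.

    The defaults here are illustrative assumptions, not tested recommendations.
    """
    # -v selects the voice and variant, -s the speed in words per minute,
    # -p the pitch (0 to 99), and -f the text file to read aloud.
    return [
        "espeak",
        "-v", voice,
        "-s", str(words_per_minute),
        "-p", str(pitch),
        "-f", text_file,
    ]


# Print the command so it can be pasted into a command prompt.
print(" ".join(starter_voice_command("welcome.txt")))
```

A tool could offer two or three such presets at first launch and let the listener pick, rather than dropping a newcomer straight into the fastest default.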


It’s tough to spend development effort on such a mundane matter as the voice, but technology adoption lessons show that it only takes a small bit of discouragement to ruin a user’s experience and send a tool they could really use straight into their recycle bin. Demos and warnings could be added to specifically address Synthetic Voice Shock and show off the awesome benefits to be gained. The choice of a freely available voice is a perfectly rational design decision but may indicate a lack of sensitivity to the needs of those newly losing vision, who are forced to learn not only the mechanics of a tool but also how to listen to this foreign speech.

To Sighted persons helping Vision Losers

You should be tech savvy enough to separate out the voice interface from the core of the tool you might be evaluating for a family member or demonstration. Remember the recipient of the installed software will be facing both synthetic voice shock and possibly dependency on the tool as well as a long learning curve. Somehow, you need to make the argument that the voice is a help, not a hindrance. Of course, you need to be able to understand the voice yourself, perhaps translate its idiosyncrasies, and tune its pitch and speed. A synthetic voice is a killer software parameter.


You may need to seek out better speech options, even outlay a few bucks to upgrade to premium voices or a low cost tool. Amortizing $100 for voice interface over the lifetime hours of listening to valuable materials, maintaining an independent life style, and expanding communication makes voices such a great bargain.
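
To make the amortization concrete, here is a small worked calculation. The $100 figure comes from the paragraph above; the listening habits (two hours a day for five years) and the function name are assumptions chosen only to show the arithmetic.

```python
def cost_per_listening_hour(price_dollars, hours_per_day, years):
    """Spread a one-time voice purchase over the hours spent listening."""
    total_hours = hours_per_day * 365 * years
    return price_dollars / total_hours


# Assumed habits: 2 hours of listening a day for 5 years (3,650 hours).
cents = 100 * cost_per_listening_hour(100, hours_per_day=2, years=5)
print(f"about {cents:.1f} cents per listening hour")
```

Under those assumptions the voice costs under three cents per hour of reading, and the figure only drops with heavier use.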


And, who knows, many of the voice-enabled apps may help your own time shifting, multi-tasking, mobile life styles.

To Rehab Trainers

Judging from the meager amount of rehab available to me, the issue of Synthetic Voice Shock is not addressed at all. Eccentric viewing, the principles of contrast for managing objects, a host of useful independent living gadgets, font choices, etc. are traditional modules in standard rehab programs. Perhaps it would be good to have a simple lesson listening to pleasant natural voices combined with rougher menu readers just to show it can be done. Listening to synthetic voices should not be treated like torture but rather like a rite of passage to gain the benefits brought by assistive technology vendors and already widely accepted in the visually impaired communities. Indeed, inability to conquer Synthetic Voice Shock might be considered a disability in itself.


As I have personally experienced, it must be especially difficult to handle Vision Losers with constantly changing eyesight and a mixed bag of residual abilities. It could be very difficult to tell Vision Losers they might fare better reading like a totally blind person. But when it comes to computer technology, that step into the audio world can reduce the stress of struggling to see poorly in a world geared toward hyperactive visually oriented youngsters, especially when print disability opens the flow of quality reading materials, often ahead of the technology curve for sighted people.


The most useful training I can imagine is a session reading an article from AARP or Sports Illustrated or a New York Times editorial copied into a version of TextAloud, or similar application, with premium voices. Close those eyes and just relax and listen and imagine doing that anywhere, in any bodily position, with a daily routine of desirable reading materials. To demonstrate the screen reader aspect, the much maligned Microsoft Sam in Narrator can quickly show how menus, windows, and file lists can be traversed by reading and key strokes. The takeaway of such a session should be that there are other, perhaps eventually better, ways of reading print materials and interacting with computers than struggling with deteriorating vision, assuming hearing is sufficient.

So, let us pay attention to Voice Shock


In summary, more attention should be paid to the pattern of adverse reactions of Vision Losers unfamiliar with the benefits of the synthetic speech interaction that enables so many assistive tools and interfaces.

References on Synthetic Voice Shock

  1. Wikipedia on Synthetic Speech. Technical and historical, back to the 1939 World's Fair.
  2. Wired for Speech, research and book by Clifford Nass. Experiments with effects of gender, ethnicity, personality in perception of synthetic speech.
  3. Audio demonstrations using synthetic speech
  4. NosillaCast podcaster Allison Sheridan interviewing her mother, who has macular degeneration, about her new reading device. Everyzing is a general search engine for audio, as in podcasts.
  5. Example of a blog with natural synthetic speech reading. Warning: Political!
  6. Google for ‘synthetic voice online demo’ for examples across the synthetic voice marketplace. Most will download as WAV files.
  7. The following products illustrate Synthetic Voice Shock.
  8. Podcast Interview with ‘As Your World Changes’ blog author covering many issues of audio assistive technology
  9. Audio reading of this posting in male and female voices

10 Responses to “Synthetic Voice Shock Reverberates Across the Divides!”

  1. slger Says:

    Jon Udell comments in Overcoming Synthetic Voice Shock. He suggests caretakers become more familiar with the capabilities, limitations, and range of synthetic voices.

  2. samanthaga Says:

    My father suffered an ear injury which limits the range of frequencies he is able to hear. Due to complications from diabetes, he is also losing his eyesight. We researched a lot of text to speech assistive devices, and found that many had really limited voice synthesizers. We had to find one that would have higher tones so that he could hear. I have heard that Microsoft is developing some impressive technology based on REAL human speech. This would make a world of difference.

  3. slger Says:

    You point out that adjusting to synthetic speech is even harder with certain hearing defects. This makes it tricky to find and sort out the choices for where to cross into the audio reading world.

    There’s a paradox here regarding human-sounding speech. First, it is technically difficult to reach human quality in applications as different as reading email or news as well as the limited vocabulary of menus on a browser. Even so-called natural voices fall just enough short that people consider them defective rather than supreme accomplishments from decades of research. Second, for those acclimated to text-to-speech, the so-called robotic voices are often preferred for their flexibility, speed, and predictability. Third, whichever voice we choose or get stuck with, our ability to understand it improves with listening time. Fourth, if we really want to read what that voice offers, our motivation propels us across the synthetic voice issues.

    So, I wonder, without knowing much about hearing loss, whether the adjustability of robotic voices may better overcome certain audio limits. I’ll pose this question on some mailing lists.

  4. slger Says:

    Anybody want to experiment with synthetic voices? Here’s an open source kit and my trials.

    eSpeak is the speech synthesizer used in NVDA for installation and an initial voice, as of July 2008. It is a wonderful technical choice, being free, international, and lightweight, but rather repellent to people not familiar with synthetic voices. On the NVDA community mailing list, users have been experimenting with alternatives to eSpeak and with variants of its voices to suit the international target users of NVDA. A really soothing voice using eSpeak would help transfer this fine software as well as other text-to-speech tools.

    Here’s the experiment, with links below.

    Download and install eSpeak. This also brings along some tools for analyzing speech and creating voices in different languages and styles.

    For simple testing, copy a one-paragraph text file into the espeak\command_line directory. You can then cd in the command prompt to that directory and type

    espeak -f test.txt
    or
    espeak "Hello, world"

    to hear the default voice. Be prepared for Synthetic Voice Shock.

    Play with some of the voices in the espeak_data directory, with subdirectory names corresponding to languages. Also available are variants that change the inflection, pitch, and other voice characteristics. Variants are listed in espeak_data/voices/!v. For example, to get a female voice in English type

    espeak -v en+f1 -f test.txt

    Add some more variants developed by the NVDA community as described in the message referenced below. Download the zip file, unzip into the !v directory. Now, try

    espeak -v en+annie -f test.txt
    or
    espeak -v en+max -f test.txt

    Do you like any of these voices? Which qualities of clarity and pleasantness work best for you? Be patient and listen to a full paragraph a few times to play fair in this synthetic voice beauty contest.
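
The listening comparison above could also be scripted. The following is a minimal sketch, assuming espeak is on the PATH and test.txt sits in the current directory; the function names and the variant list are illustrative assumptions, and the annie and max variants only exist after unzipping the NVDA community archive into the !v directory.

```python
import shutil
import subprocess

# Voices to compare: the default, two stock female variants, and the
# two community variants (assumed to be installed in the !v directory).
VARIANTS = ["en", "en+f1", "en+f3", "en+annie", "en+max"]


def espeak_command(variant, text_file="test.txt"):
    """Return the eSpeak command line that reads text_file in one variant."""
    return ["espeak", "-v", variant, "-f", text_file]


def audition(variants=VARIANTS, text_file="test.txt"):
    """Announce and play each variant in turn, pausing for the listener."""
    for variant in variants:
        command = espeak_command(variant, text_file)
        print("Now playing:", " ".join(command))
        # Only try to speak when the espeak program can actually be found.
        if shutil.which("espeak"):
            subprocess.run(command, check=False)
        input("Press Enter for the next voice...")
```

Calling audition() steps through the voices one at a time, so the same paragraph can be heard in each before judging.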

    Let me know if you could live with any of these voices. What does it take to get over the hurdle of listening to synthetic voices if you are not familiar with this type of interface?

    Design your own variant for American English that would appeal to Vision Losers being introduced to a screen reader, perhaps hearing this rough type of synthetic voice for the first time.

    Personally, I use NeoSpeech Kate with NVDA for both screen navigation and text reading. However, I realize I must train myself to understand an eSpeak variant when I travel using a portable version of NVDA for other Windows PCs. So far, I prefer variants f3 and Linda.

    Links to eSpeak software:

    eSpeak project at Sourceforge
    NVDA and other mailing lists, query for “NVDA freelists eSpeak”

    Instructions and voice variant archive from NVDA mailing list.

    Quick start for changing voices for NVDA screen reader

  5. LIVE.VEEDEE.BE » Overcoming synthetic voice shock Says:

    [...] menu choices and text selections aloud. As always, I experienced the reaction that Susan, in her latest post, calls synthetic voice [...]

  6. Speech-Technology » SpeechTEK Will Showcase the Many Uses of Speech Technology Says:

    [...] Synthetic Voice Shock Reverberates Across the Divides!Conversely, those long experienced with screen readers and reading appliances may be surprised at these adverse reactions to the text-to-speech technology they listen to many hours a day. They know the clear benefits of such voices, … [...]

  7. The ‘Talking ATM’ Is My Invisible Dream Machine. « As Your World Changes Says:

    [...] coined the term in Synthetic Voice Shock Reverberates Across the Divides to explain responses I heard about voices offered in assistive technologies to overcome vision [...]

  8. slger Says:

    Clever rap mash up of people and TTS at

    Also a good intro to accessibility following WCAG 2.0.

  9. slger Says:

    Great video: Introduction to Screen Readers

    http://www.jeffreybigham.com/accessibility/2009/04/introduction-to-screen-readers-by-victor-tsaran-yahoo.html

    Gentle voice exploring Windows desktop and web pages.

  10. Living Visually Impaired in Prescott Arizona — The 2013 Story | As Your World Changes Says:

    […] Overcoming Synthetic Voice Shock Accessible voting, yes, it works […]
