Posts Tagged ‘text to speech’

Could TTS news reading beat Kindle and smart phones?

January 27, 2010

This post responds to concerns in the ComputingEd post ‘Kindles versus Smart phones: Age matters, testing matters’. A UGA study and commentary focus on news reading as screen-dependent and vision-only. I suggest considering the print-disabled, TTS-dependent ecosystem to expand understanding of human reading and assistive device capabilities.

Reading experiments might be broadened to include pure TTS, i.e. no screens. But first, which criteria matter: reading rate, absorption level, device comfort, simulated print experience, distribution costs, or convenience?


For the record, I just read this article by RSS, then switched to my Newsstand and downloaded the NYTimes and other papers from Bookshare.org, which cooperates with NFB Newsline and news companies I gratefully thank. Papers are delivered wirelessly in XML-based DAISY format, retrieved and read on a Linux-powered mobile device (Levelstar Icon), and spoken in an old-style “robotic voice”. For delivery efficiency and cost, this cannot be beat, and I think I absorb selective news reading better than ever. But how is the experience of print-disabled news readers factored into comparisons like this article?


This will soon be relevant if TTS reading on Kindle, iPod/iTouch, etc. is fully enabled and adopted by some readers from proprietary delivery systems, like Amazon. For proper evaluation, it will be necessary to compare eReading by TTS on mainstream devices to that provided by evolved readers like the APH Book Port, Humanware Victor Reader Stream, PlexTalk Pocket, Levelstar Icon, and (my favorite) GW Micro BookSense. Also important is the media format, with DAISY currently favored on these devices. Finally, there is the provision of media, currently limited legally to print-disabled readers, as by NFB (National Federation of the Blind) and non-profit Bookshare.org. In other words, there’s another ecosystem of reading, open only to the print-disabled, that might benefit those attracted to eReading.


Oh, my, here’s the “universal design” mantra again. ‘Reading news by screen’ is, of course, more limited than ‘reading by print or audio’. It’s possible that for some reading criteria the screen-free mode, or the open XML-based format with its reading devices and experienced reader population, may beat mainstream strategies!


Could these experiments be performed? Certainly, most universities have students who currently, or could, offer their experience with equipment provided through Disability Services. Fact quizzes and comprehension tests might raise questions about how our reading brains work and how well our reading devices and formats help or hinder. What research is in progress? Is there a CS agenda for this social and economic ecosystem? Why do people think reading is a vision-only activity? Ok, comics, photos, and crosswords are a bit challenging, but plain old print is so well handled by TTS. Let’s open our eyes and ears and fingers to a fuller range of capabilities. I would love to be a test subject for eReading experiments.

The Pleasures of Audio Reading

May 22, 2009

This post expands my response to an interesting Reading in the Dark Survey.
Sighted readers will learn from the survey how established services provide reading materials to be used with assistive technology. Vision Losers may find new tools and encouragement to maintain and expand their reading lives.

Survey Requesting feedback: thoughts on audio formats and personal reading styles?

Kestrell says:

… hoping to write an article on audio books and multiple literacies but, as far as I can find, there are no available sources discussing the topic of audio formats and literacy, let alone how such literacy may reflect a wide spectrum of reading preferences and personal styles.

Thus, I am hoping some of my friends who read audio format books will be willing to leave some comments here about their own reading of audio format books/podcasts. Feel free to post this in other places.

Some general questions:
Do you read audio format books?
Do you prefer special libraries or do you read more free or commercially-available audiobooks and podcasts?
What is your favorite device or devices for reading?
Do elements such as DRM and other security measures which dictate what device you can read on influence your choices?
Do you agree with David Rose–one of the few people who has written academic writings about audio formats and reading–that reading through listening is slower than reading visually?
How many audiobooks do you read in a week (this can include podcasts, etc.)?
Do you ever get the feeling from others that audiobooks and audio formats are still considered to be not quote real unquote books, or that reading audiobooks requires less literacy skills (in other words, do you feel there is a cultural prejudice toward reading audiobooks)?
Anything else you want to say about reading through listening?

This Vision Loser’s Response

Audio formats and services


I read almost exclusively using TTS on mobile readers from DAISY format books and newspapers. I find synthetic speech more flexible and faster than narrated content. For me, human narrators are more distracting than listening “through” the voice into the author’s words. I also liberally bookmark points I can re-read by sentence, paragraph, or page.


Bookshare is my primary source of books and newspapers downloaded onto the Levelstar Icon PDA. I usually transfer books to the APH BookPort and PlexTalk Pocket for reading in bed and on the go, respectively. My news streams are expanded with dozens of RSS feeds of blogs, articles, and podcasts from news, magazines, organizations, and individuals. Recently, Twitter has supplied a steady stream of links to worthy and interesting articles, followed on either the Icon or in a browser through Accessible Twitter.

I never seem to follow through with NLS or Audible or other services with DRM and setups. I find the Bookshare DRM just right and respect it fully but could not imagine paying for an electronic book I could not pass on to others. I’m about to try Overdrive at my local library. I’ve been lax about signing up for NLS now that Icon provides download. No excuses, I should diversify my services.


I try to repay authors of shared scanned books with referrals to book clubs and friends, e.g. I’ve several now hooked on Winspear’s “Maisie Dobbs” series.

Reading quality and quantity

I belong to two book clubs that meet monthly and also take lifelong learning classes at the community college. Book club members know that my ready book supply is limited and take this into consideration when selecting books. My compact with myself is that I buy selected books not on Bookshare and scan and submit them. I hope to catch up soon on submitting books I have already scanned. Conversely, I can often preview a book before selection and make recommendations on topics that interest book club members, e.g. Jill B. Taylor’s “My Stroke of Insight”. I often annoy an avid reader friend by finishing a book while she is #40 on the local library waiting list. This happens with NYTimes best sellers and Diane Rehm show reader reviews. No, I don’t feel askance looks from other readers but rather the normal responses to an aging female geek.


At any one time, I usually have a dozen books “open” on the Bookport and PlexTalk as I switch among club and course selections, fiction favorites, and heavy nonfiction. However, I usually finish 2 or 3 books a week, reading at night, with another 120 RSS feeds bringing in dozens of articles daily. I believe my reading productivity is higher than before vision loss due to expedient technology delivery of content and my natural habits of skimming and reading nonlinearly. Indeed, reading by listening forces focus and concentration in a good sense and, even better, can be performed in just about any physical setting, posture, or other ambient conditions.
Overall, I am exquisitely satisfied with my reading by listening mode. I have more content, better affordable devices, and breadth of stimulating interests to forge a suitable reading life.

Reading wishes and wants


I do have several frustrations. (1) Books with tables of data lose me as a jumble of numbers unless the text describes the data profile. (2) While I have great access through Bookshare and NFB NewsLine to national newspapers and magazines, my state and local papers use content management systems difficult to read either online or by RSS feed. (3) Google Book Search refuses to equalize my research with others by displaying only images of pages.


For demographics, I’m 66 years old, lost my last sliver of reading vision three years ago from myopic degeneration, and struggled for only a few months before settling into Bookshare. As a technologist first exposed to DECTalk in the 1980s, I appreciate TTS as a fantastically under-rated technology. However, others of my generation often respond with what I’ve dubbed “Synthetic voice shock” that scares them away from my reading devices and sources. I’d like to see more gentle introductions from AT vendors and the few rehab services available to retired vision losers. Finally, it would be great to totally obliterate the line between assistive and mainstream technology to expand the market and also enable sighted people to read as well as some of us.

References and Notes on Audio Reading

  1. Relevant previous posts from ‘As Your World Changes’

  2. Audio reading technology
    • LevelStar Icon Mobile Manager and Docking Station is my day-long companion for mail, RSS, twitter, and news. The link to Bookshare Newsstand and book collection sold me on the device. Bookshare can be searched by title, author, or recent additions, and I even hit my 100 limit last month. Newspapers download rapidly and are easy to read — get them before the industry collapses. The book shelf manager and reader are adequate but I prefer to upload in batches to the PC then download to Bookport. The Icon is my main RSS client for over 100 feeds of news, blogs, and podcasts.
    • Sadly, the American Printing House for the Blind is no longer able to maintain or distribute the Bookport due to manufacturing problems. However, some units are still around at blindness used equipment sites. The voice is snappy and it’s easy to browse through pages and leave simple bookmarks. Here is where I have probably dozens of DAISY files started, like a huge pile of books opened and waiting for my return. My biggest problem with this little black box is that my pet dog snags the ear buds as his toy. No other reader comes close to the comfort and joy of the Bookport, which awaits a successor at APH.
    • The PlexTalk Pocket (see demo) provides a TTS reader in a very small and comfortable package. However, this new product breaks on some books and is awkward at managing files. The recording capabilities are awesome, providing great recording directly from a computer and voice memos. With a large SD card, this is also a good accessible MP3 player for podcasts.
  3. Article supporting Writers’ Guild in Kindle dispute illustrates the issues of copyright and author compensation. I personally would favor a micro payment system rather than my personal referral activism. However, in a society where a visually impaired person can be denied health insurance, where 70% unemployment is common, where web site accessibility is routinely ignored, it’s wonderful that readers have opportunities for both pleasure and keeping up with fellow book worshipers.
  4. Setting up podcast, blog, and news feeds is sometimes tricky and tedious. Here are my OPML feeds for importing into other RSS readers or editing in NotePad.

  5. Here’s another technology question. Could the DAISY standard format, well supported in our assistive reading devices, become a format suitable for distributing the promised data from recovery.gov?
    Here is an interview with DAISY founder George Kerscher on XML progress.

  6. Another physiological question is what’s going on in my brain as I switch primarily to audio mode? Are there exercises that can make that switch over more comfortable and accelerated than just picking up devices and training oneself? I’m delving into Blogs on ‘brain plasticity’
  7. (WARNING PDF) Listening to the Literacy Events of a Blind Reader – an essay by Mark Willis asks whether audio reading can cope with the critical thinking required in a complex and sometimes self-contradictory doctrine like Thomas Kuhn’s “Scientific Revolutions”. This would be a great experiment for psychology or self. Let’s also not forget the resources of Book Club Reading Lists to help determine what we missed in a reading or may have gained through audio mental processing.
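
For anyone curious about note 4 above: an OPML file is plain XML in which each feed is an `outline` element carrying an `xmlUrl` attribute. Here is a minimal sketch of pulling the feed list out of such a file, assuming only Python's standard library. The function name `list_feeds` and the sample feeds at example.com are hypothetical, made up for illustration, and are not the feeds in the OPML file mentioned above.

```python
import xml.etree.ElementTree as ET


def list_feeds(opml_text):
    """Return (title, feed URL) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    feeds = []
    # Outlines can be nested inside folder outlines, so walk the whole tree.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            title = outline.get("title") or outline.get("text") or url
            feeds.append((title, url))
    return feeds


# Hypothetical sample: one feed inside a folder, one at the top level.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>Sample feeds (hypothetical)</title></head>
  <body>
    <outline text="News">
      <outline type="rss" text="Example News" xmlUrl="https://example.com/news.rss"/>
    </outline>
    <outline type="rss" text="Example Podcast" xmlUrl="https://example.com/podcast.xml"/>
  </body>
</opml>"""

for title, url in list_feeds(SAMPLE_OPML):
    print(f"{title}: {url}")
```

Folder outlines without an `xmlUrl` are skipped, so the result is a flat list that could be read aloud by TTS or pasted into another RSS reader.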

Audio reading of this blog post

Synthetic Voice Shock Reverberates Across the Divides!

July 30, 2008

Synthetic Voice Shock — oh, those awful voices!


As I communicate with other persons with progressive vision loss, I often sense a quite negative reaction to synthetic, or so-called ‘robotic’, voices that enable reading digital materials and interfacing with computers. Indeed, that’s how I felt a few years ago. Let’s call this reaction "synthetic voice shock" as in:

  • I cannot understand that voice!!!
  • The voice is so inhuman, inexpressive, robotic, unpleasant!
  • How could I possibly benefit from using anything that hard to listen to?
  • If that’s how the blind read, I am definitely not ready to take that step.

Conversely, those long experienced with screen readers and reading appliances may be surprised at these adverse reactions to the text-to-speech technology they listen to many hours a day. They know the clear benefits of such voices, rarely experience difficult understandability, exploit voice regularity and adjustability, and innovate better ways of "living big" in the sighted world, to quote the LevelStar motto.

The ‘Synthetic Speech’ divide


Synthetic voice reactions appear to criss-cross many so-called divides: digital, generational, disability, and developer. The free WebAnywhere is the latest example with a robotic voice that must be overcome in order to gain the possible benefits of its wide dissemination. Other examples are talking ATM centers and accessible audio for voting machines. The NVDA installation and default voice can repel even sighted individuals who could benefit from a free screen reader as a web page accessibility checker or a way to learn about the audio assistive mode. Bookshare illustrates book reading potential by a robotic, rather than natural, voice. Developers of these tools see the synthetic voice as a means to gain the benefits of their tools, while users not accustomed to speech-enabled hardware and software run the other way at the unfriendliness and additional stress of learning an auditory rather than visual sensory practice.


This is especially unfortunate when people losing vision may turn to magnifiers that can only improve spot reading, when extra hours and energy are spent twiddling fonts then working line by line through displayed text, when mobile devices are not explored, when pleasures of book reading and quality of information from news are reduced.

Addressing Synthetic Voice Shock


I would like to turn this posting into messages directed at developers, Vision Losers, caretakers, and rehab personnel.

To Vision Losers who could benefit sooner or later

Please be patient and separate voice quality from reading opportunities when you evaluate potential assistive technology.


The robotic voice you encounter with screen readers is used because it is fast and flexible and widely accepted by the blind community. But there do exist better natural voices that can be used for reading books, news, and much more. While these voices seem initially offensive, synthetic voices are actually one of the great wonders of technology by opening the audio world to the blind and gradually becoming common in telephony and help desks.


As one with Myopic Macular Degeneration forced to break away from visual dependency and embrace audio information, I testify it takes a little patience and self-training and then you hear past these voices and your brain naturally absorbs the underlying content. Of course, desperation from print disability is a great motivator! Once you overcome the resistance to synthetic voices, a whole new world of spoken content becomes available using innovative devices sold primarily to younger generations of educated blind persons. Freed of the struggle to read and write using defective eyesight, you gain enormous power to absorb an unbelievable amount of high quality materials. As a technologist myself, I made this passage quickly and really enjoyed the learning challenge, which has made me into an evangelist for the audio world of assistive technology.


If you have low vision training available, ask about learning to listen through synthetic speech. For the rest of our networked lives, synthetic voices may be as important as eccentric viewing and using contrast to manage objects.


So, when you encounter one of these voices, maybe think of them as another rite of passage to remain fully engaged with the world. Also, please consider how we can help others with partial sight. With innovations from WebAnywhere and free screen readers, like NVDA, there could be many more low cost speaking devices available worldwide.

To Those developing reading tools with Text-to-Speech


Do not expect that all users of your technology will be converts from within the visually impaired communities familiar with TTS. Provide a voice tuned in pitch and speed and simplicity for starters to achieve the necessary intelligibility and sufficient pleasantness. Suggest that better voices are also available and show how to achieve their use.


It’s tough to spend development effort on such a mundane matter as the voice, but technology adoption lessons show that it only takes a small bit of discouragement to ruin a user’s experience and send a tool they could really use straight into their recycle bin. Demos and warnings could be added to specifically address Synthetic Voice Shock and show off the awesome benefits to be gained. The choice of a freely available voice is a perfectly rational design decision but may indicate a lack of sensitivity to the needs of those newly losing vision, who are forced to learn not only the mechanics of a tool but also how to listen to this foreign speech.

To Sighted persons helping Vision Losers

You should be tech savvy enough to separate out the voice interface from the core of the tool you might be evaluating for a family member or demonstration. Remember the recipient of the installed software will be facing both synthetic voice shock and possibly dependency on the tool as well as a long learning curve. Somehow, you need to make the argument that the voice is a help not a hindrance. Of course, you need to be able to understand the voice yourself, perhaps translate its idiosyncrasies, and tune its pitch and speed. A synthetic voice is a killer software parameter.


You may need to seek out better speech options, even outlay a few bucks to upgrade to premium voices or a low cost tool. Amortizing $100 for voice interface over the lifetime hours of listening to valuable materials, maintaining an independent life style, and expanding communication makes voices such a great bargain.
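
To put a rough number on that bargain, here is a small sketch of the arithmetic. The listening figures (2 hours a day for 10 years) are assumptions chosen purely for illustration, not measurements, and the function name `cost_per_hour` is hypothetical.

```python
def cost_per_hour(price_dollars, hours_per_day, years):
    """Spread a one-time purchase price over all the hours of listening."""
    total_hours = hours_per_day * 365 * years
    return price_dollars / total_hours


# Assumed usage: a $100 voice, listened to 2 hours a day for 10 years.
rate = cost_per_hour(100, 2, 10)
print(f"${rate:.3f} per listening hour")  # about 1.4 cents an hour
```

Even if the assumed listening time were cut to a tenth of that, the cost would still be under 14 cents an hour.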


And, who knows, many of the voice-enabled apps may help your own time shifting, multi-tasking, mobile life styles.

To Rehab Trainers

From the meager amount of rehab available to me, the issue of Synthetic Voice Shock is not addressed at all. Eccentric viewing, the principles of contrast for managing objects, a host of useful independent living gadgets, font choices, etc. are traditional modules in standard rehab programs. Perhaps it would be good to have a simple lesson listening to pleasant natural voices combined with more rough menu readers just to show it can be done. Listening to synthetic voices should not be treated like torture but rather like a rite of passage to gain the benefits brought by assistive technology vendors and already widely accepted in the visually impaired communities. Indeed, inability to conquer Synthetic Voice Shock might be considered a disability in itself.


As I have personally experienced, it must be especially difficult to handle Vision Losers with constantly changing eyesight and a mixed bag of residual abilities. It could be very difficult to tell Vision Losers they might fare better reading like a totally blind person. But when it comes to computer technology, that step into the audio world can reduce the stress of struggling to see poorly in a world geared toward hyperactive, visually oriented youngsters, especially when print disability opens the flow of quality reading materials, often ahead of the technology curve for sighted people.


The most useful training I can imagine is a session reading an article from AARP or Sports Illustrated or a New York Times editorial copied into a version of TextAloud, or similar application, with premium voices. Close those eyes and just relax and listen and imagine doing that anywhere, in any bodily position, with a daily routine of desirable reading materials. To demonstrate the screen reader aspect, the much maligned Microsoft Sam in Narrator can quickly show how menus, windows, and file lists can be traversed by reading and key strokes. The takeaway of such a session should be that there are other, perhaps eventually better, ways of reading print materials and interacting with computers than struggling with deteriorating vision, assuming hearing is sufficient.

So, let us pay attention to Voice Shock


In summary, more attention should be paid to the pattern of adverse reactions of Vision Losers unfamiliar with the benefits of the synthetic speech interaction that enables so many assistive tools and interfaces.

References on Synthetic Voice Shock

  1. Wikipedia on Synthetic Speech. Technical and historical, back to the 1939 World’s Fair.
  2. Wired for Speech, research and book by Clifford Nass. Experiments with effects of gender, ethnicity, personality in perception of synthetic speech.
  3. Audio demonstrations using synthetic speech
  4. NosillaCast podcaster Allison Sheridan interviewing her mother, who has macular degeneration, on her new reading device. Everyzing is a general search engine for audio, as in podcasts.
  5. Example of a blog with natural synthetic speech reading. Warning: Political!
  6. Google for ‘synthetic voice online demo’ for examples across the synthetic voice marketplace. Most will download as WAV files.
  7. The following products illustrate Synthetic Voice Shock.
  8. Podcast Interview with ‘As Your World Changes’ blog author covering many issues of audio assistive technology
  9. Audio reading of this posting in male and female voices