Posts Tagged ‘synthetic voice’

Voting Without Viewing? Yes, but It’s so Slow!

August 20, 2008

Taking advantage of accessible voting


I decided that since the Help America Vote Act had encumbered quite a few million $$$ for fancy electronic equipment with accessible extensions, I would take my chances to vote as independently as possible this round. Here’s the story of early voting in an Arizona primary. Vision Losers might use this experience to evaluate their own voting options. Other citizens and technologists will learn how electronic voting works for one tech savvy Vision Loser.

Against a background of the sorry state of American voting processes


First, let me say that, as an informed computer scientist, I do not for one nanosecond believe the odds are very high that my voting precinct actually got a correct tally of votes, including mine. I voted on a setup from the infamous Diebold, now renamed Premier Election Solutions. There’s just no way any independent assurance organization can reasonably test a black box version of software and hardware, let alone all the combinations of diverse local ballot designs, multiple configurations of the setup, and inevitable versions of evolving software. And that’s not worrying about human error by voting board personnel, malicious people, or silly policies like Ohio’s sleep-over procedures. Business ideology has trumped common sense democracy for Americans, unlike Australia and other countries that adopt an open approach.

Here is how I voted in September 2008

A preview and trial at my local voting board


Nevertheless, I wanted my independence and to force myself through the best possible preparation. A few months ago, I paid a visit to the Yavapai County Recorder’s Office for a personal trial on a mock ballot so I would be familiar with the equipment. I was reasonably impressed with the audio system, very enthusiastic about the personnel who welcomed the opportunity to try out their audio setup, and comfortable about working the equipment rather than asking someone to read and mark my ballot. I knew the actual voting would be slow and that I needed to do my homework on candidates and races so I could concentrate on the voting act itself.

Getting from sample to real ballot


I was pleased to find a nice little primary coming up in September with early voting several weeks ahead. One primary race is especially important in Arizona District No. 1, to replace Rep. Rick Renzi, who was indicted on 35 counts of fraud and other bad stuff. With a senator as presumptive Presidential candidate and a 40% voting record, poor representation of this region for months especially annoys me, as economic and social policies have consequences I had not foreseen as I grapple with my own rehabilitation and my family’s future. Both major parties had a good slate of 4 or 5 candidates with experience relative to a highly diverse region of Indian reservations, small cities, and lots of open space.


I made my choice of party and candidate for Congress and began to look for the other races of interest. There were few contests so I assumed the ballot would be a piece of cake. Actually, I had some trouble figuring out the full set of races. I used VoteSmart, the AZ Clean Elections site, the county listing of candidates, Arizona Republic and Daily Courier candidate blurbs, even Wikipedia. A sample ballot arrived just before my trip to the polling place, but my reader and I were confused about a long list of write-in lines.

The nitty-gritty mechanics of voting


So, as much prepared as I could be, I entered the county office lobby and asked to vote using the audio system. I think I was the first to request this as a flurry of calls upstairs quickly produced an access card to a screen protected by side blinders and the headset and keypad I had used in my previous experiment. Oh, and most important was a chair.


To summarize the audio voting process, you click the appropriate numbered buttons to advance through races, making and confirming choices while hearing the race titles, constraints and candidate names through headphones. There is nothing visual happening. I listened to the instructions and tried to adjust the volume to match both a synthetic voice announcement of races and human recorded reading of candidate names, using female voices. Occasionally, other customers and voters in a noisy lobby overcame the headset ear pads. The input device was a simple phone keypad with larger sized keys, comfortably held in my lap.

Uh, oh, am I in a loop?


I moved quickly through my choices for the congressional and legislative races. Then things became unfamiliar, with more races for county offices and state Supreme Court seats, all with only a write-in option. Not having any choice, I kept hitting 6 for next race, then 9 to confirm my under-voting and continue to the next race. At one point, my attention drifted and I seemed to be in a loop of hitting next without actually having races announced, maybe between district, county, and state races.


After a while I got bored and tried an actual write-in. “Gump” sounded good at the moment and was easy to type, although tedious to spell and confirm. Then I got serious and canceled out of write-in. In successive races for Supreme Court seats, the synthetic voice seemed to be getting faster, and very high pitched. Now, I can listen to really fast voices on my reading appliances. But by the end of what seemed like 50 races, I couldn’t understand the voice. Nor could I remember how to get the main menu or adjust voices. I was stuck, hoping the end would come before I fell asleep at the keypad. Finally, the printer attached to the side clattered and the voice trailed off into oblivion. My nearly trance state lifted and I called for the attendant to complete the session.


Had I actually accomplished my voting goals? I think so, as the early races that mattered seemed to be OK, but since I lost control in the middle and was pretty confused toward the end, I can only hope nothing invalidated those early race clicks. This whole process took about 30 minutes, long enough that I had to wake up my driver to leave. I reported my troubles to the poll assistants but left unsure we understood the cause of my loop and voice speed-up. My guess is that the speed-up started when I hit the relevant key during my write-in fumbling and the modes got confused as I skipped through further write-in choices.

Yes, I will vote this way again, but can others?


I had hoped this experience could be recommended to others, but, alas, I fear those less adept at computer interactions might not find the humor in the loop and could freak out with babbling voices. I will vote again this way in November but next time pay lots more attention to the exit, speed, and volume options. Everybody has a limit to attention and energy to put into this voting exercise. Half an hour for a handful of races and an enormous number of later vacuous choices is a dubious way of getting the job done.

Further concerns about time commitments, voice shocks, and practice


Another lesson for next time is to seriously invest more effort into learning about picking candidates. I hope to find more help from the SunSounds state audio assistance radio system or locate better candidate description materials. For example, the AZ Clean Elections brochure that arrived in the mail was organized by race, then district, then party, then candidate, which was beyond my patience to scan or anybody else’s willingness to read to me only the District No. 1 choices on pages 4, 39, and so on. Perhaps voting early beats preparation of more candidate comparisons and recommendations from organizations like the League of Women Voters. Perhaps my “domain knowledge” of elections and state offices made my Google and Dogpile searches susceptible to “Donate Now” organizations. Certainly, I have not yet found a good source of advice directed to people like me voting blind for the first time. What I really want is a web page duplicating the ballot, divided into levels of government, with attached very short bios and links to longer histories, position statements, and reputable sources of candidate comparisons. The HTML and hypertext structuring are important as PDF is hard to use by audio and often loses the content structure when converted to a text stream. It might also be nice to have a candidate-a-day RSS feed to make the information more digestible in smaller chunks.
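A minimal sketch of how such a ballot page could be generated, assuming the ballot is available as simple structured data. Every name, office, bio, and URL below is a hypothetical placeholder, and `render_ballot` is an illustrative helper, not part of any election site. The point is only that each level of government and each race gets its own heading, so a screen reader user can jump by heading instead of listening to the whole page.

```python
from html import escape

# Hypothetical ballot data: all names, offices, and URLs are placeholders.
BALLOT = {
    "Federal": [
        {
            "race": "U.S. Representative, District 1",
            "candidates": [
                {"name": "Candidate A", "bio": "One-sentence bio.",
                 "url": "https://example.org/a"},
                {"name": "Candidate B", "bio": "One-sentence bio.",
                 "url": "https://example.org/b"},
            ],
        },
    ],
    "County": [
        {"race": "County Recorder", "candidates": []},  # write-in only
    ],
}


def render_ballot(ballot):
    """Return an HTML fragment with one heading per level and per race."""
    parts = ["<h1>Sample ballot</h1>"]
    for level, races in ballot.items():
        parts.append("<h2>" + escape(level) + "</h2>")
        for race in races:
            parts.append("<h3>" + escape(race["race"]) + "</h3>")
            if not race["candidates"]:
                parts.append("<p>Write-in only.</p>")
                continue
            parts.append("<ul>")
            for cand in race["candidates"]:
                link = '<a href="{}">{}</a>'.format(
                    escape(cand["url"]), escape(cand["name"])
                )
                parts.append("<li>" + link + ": " + escape(cand["bio"]) + "</li>")
            parts.append("</ul>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_ballot(BALLOT))
```

The same data could feed a candidate-a-day RSS feed by emitting one item per candidate instead of one list entry.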


I would recommend to others considering using an audio or visually assisted voting workstation to request a trial. Yes, that means taking up time from election board workers, but I found them helpful, friendly, and interested in feedback. Anybody who can handle a bank ATM via audio should be ready to try out the system. However, someone with hearing problems might not be able to adjust the equipment to their needs in a noisy environment. The long-time blind who readily adapt to new devices should appreciate the new-found independence. However, new Vision Losers are faced with a lot of work to master both the information gathering and the audio assisted voting process.


My biggest warning is the time commitment to survive the rigors of a long ballot. Had I wanted to actually write in a lot of names, I would have been there until closing time. With so few voters like me, there seems little data to accumulate experience for a warning label, but this is a practical constraint. Voters need to know how much time to ask of their drivers. With more voters using the assistive workstation, there would be a long wait just to get your chance. I suppose I could have asked for assistance during my loops and voice accelerations, but I just wanted to get out of write-in hell. Far more instructional time could be required for first time users of the audio assistance, especially if the equipment balks at start up or printing. And, what happens if a voter gives up during a voting session or nearly goes into a trance, as happened to me? Of course, there are other disabilities more complex than vision, such as strength and mobility, for using different input devices.


Getting a bit more technical, in my earlier visit for a trial, we discussed the need for a simulator for voter training using the audible equipment. I’d appreciate knowing if this exists anywhere. Since the user interaction is by phone keypad, a simulator with a mock ballot, as in my trial, could serve people far and wide if they knew the voting system designated for them. This could be done by phone or be a downloaded or web 2.0 app, something even I could write if I knew the rules. I could have called up and learned the instructions in the quiet of my home, memorized my way out when I hit a snag, and also reported problems back to the ballot designers and equipment vendors. Had I known about the write-in race survivor test, I’m not sure I would have followed through with an actual vote. Those suffering from synthetic voice shock could at least determine whether they wanted to try to and were able to interpret the race announcements and instructions.
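A rough sketch of what such a keypad simulator might look like, under loud assumptions. Only two key meanings come from the experience described in this post: 6 for next race and 9 to confirm an under-vote. The other keys (2 for next candidate, 5 to select, 0 for help) are invented for illustration and are not the real machine’s rules, and prompts are returned as text where a real trainer would speak them. Each race announcement includes its position in the ballot, so a voter can tell whether pressing next is actually getting anywhere.

```python
class BallotSimulator:
    """Toy audio-ballot trainer; prompts are returned as text, not spoken."""

    END = "End of ballot. Printing."

    def __init__(self, races):
        self.races = races               # list of (race title, [candidate names])
        self.index = 0                   # position of the current race
        self.cursor = 0                  # highlighted candidate in the current race
        self.votes = {}                  # race title -> chosen candidate
        self.awaiting_undervote = False  # True after 6 is pressed with no choice made

    def _announce(self):
        if self.index >= len(self.races):
            return self.END
        title, names = self.races[self.index]
        options = ", ".join(names) if names else "write-in only"
        return "Race {} of {}: {}. {}.".format(
            self.index + 1, len(self.races), title, options
        )

    def _advance(self):
        self.index += 1
        self.cursor = 0
        return self._announce()

    def press(self, key):
        """Handle one key press and return the prompt the voter would hear."""
        if self.index >= len(self.races):
            return self.END
        title, names = self.races[self.index]
        if key == "0":                   # assumed help key
            return "Help: 2 next candidate, 5 select, 6 next race, 9 confirm."
        if self.awaiting_undervote:
            self.awaiting_undervote = False
            if key == "9":               # from the post: 9 confirms the under-vote
                return self._advance()
            return "Skip cancelled. " + self._announce()
        if key == "2" and names:         # assumed: cycle through candidates
            self.cursor = (self.cursor + 1) % len(names)
            return "Candidate: {}.".format(names[self.cursor])
        if key == "5" and names:         # assumed: select highlighted candidate
            self.votes[title] = names[self.cursor]
            return "Selected {}.".format(names[self.cursor])
        if key == "6":                   # from the post: 6 moves to the next race
            if title in self.votes:
                return self._advance()
            self.awaiting_undervote = True
            return "No choice made in this race. Press 9 to confirm and continue."
        return "Key not recognized. Press 0 for help."
```

With a mock ballot such as `BallotSimulator([("U.S. Representative", ["A", "B"]), ("County Recorder", [])])`, a voter could rehearse the select, next, and confirm rhythm at home, including the run of write-in-only races.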


While the overall interaction of voting with only audio is really pretty easy, clearly the keypad needs a separate HELP key and RESTORE DEFAULTS action. Maybe these were available, but I was so deep into figuring out how to reach the end of the ballot, I was not interested in finding the escape button. More seriously, as a software testing expert and veteran system breaker, I really would like to replicate my experiences with the next-race loop and accelerating voice problems. It would be too irreverent and silly for a 65-year-old lady to whiz around a county office building crowing that I’d broken the system, lookee, the computer is in a really bad state. No, I really appreciated the professionalism and help of the voting staff, but, well, I think I did break something and wish it could be reported and corrected.
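To make the RESTORE DEFAULTS idea concrete, here is a small sketch of voice settings that cannot run away and can always be reset with one action. The units, limits, and names (`VoiceSettings`, `DEFAULT_RATE`, and so on) are assumptions for illustration; nothing here reflects how the actual voting equipment stores or changes its settings.

```python
from dataclasses import dataclass

DEFAULT_RATE = 1.0   # assumed unit: multiples of normal speaking speed
DEFAULT_VOLUME = 5   # assumed scale: 0 (silent) to 10 (loudest)


@dataclass
class VoiceSettings:
    rate: float = DEFAULT_RATE
    volume: int = DEFAULT_VOLUME

    def faster(self, step=0.25, max_rate=2.0):
        # Clamp so repeated key presses can never push the voice past max_rate.
        self.rate = min(self.rate + step, max_rate)

    def slower(self, step=0.25, min_rate=0.5):
        # Clamp in the other direction as well.
        self.rate = max(self.rate - step, min_rate)

    def restore_defaults(self):
        # One action that always returns to a known, intelligible state.
        self.rate = DEFAULT_RATE
        self.volume = DEFAULT_VOLUME
```

A voter who fumbles the speed key twenty times would then hear a voice at most twice normal speed, and a single RESTORE DEFAULTS press would bring it back.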


So, why don’t I, a formerly reputable software professional, try to do more? Well, first, with only two years of legal blindness I am still a learner in the assistive technology world. But more seriously, getting on my high horse, this whole system is an affront to U.S. citizenry. In my previous post, I equated electronic voting with two mixed metaphors, a “moon shot for democracy” and “extreme voting”, like a sporting challenge.

A rant on eVoting as a ‘bungled moon shot’


Just as Sputnik shocked the U.S. into action for education in science, just as a catastrophe on the moon in 1969 would have undermined U.S. self-confidence, just as the later space shuttle failures signaled a decline in space travel prowess, a definitive failure in our voting system undermines our feeling of living in a democracy. Yet, there is every sign that our voting system continues to be bungled, in the names of fancier technology and free enterprise. In my mind, the quest for a technological solution is a doable, long term project but only if committed to the technologists with expertise and freedom to question the safety of every step in the process, test each component down to its core against its specifications, simulate to exhaustion, and finally rely on combined community acceptance of safety to launch. In many ways, a rocket system is easier to design because it works with and against the continuous laws of physics, whereas a voting system works on discrete math and with and against the laws of human capabilities and differences. The security quality of human interactions with the system is another dimension of complexity, but the bottom line is that voting systems cannot be black box. Discrete systems must be subjected to inductive reasoning applied to the code, hardware, and user scenarios, with a huge dose of version control. Experimental software engineering has established the efficacy of software inspection, especially performed early and often using multiple viewpoints from varieties of expertise. Asking a weak testing regime to accept the assurance of vendors of proprietary systems, even against clear signs of fallibility, is like delivering a rocket to the pad, asking the astronauts to jump on, and not telling mission control how the rocket will behave.


My other metaphor of extreme voting is based on both user and developer experience. It is a lot to ask voting equipment vendors to produce extensions to service all ranges of human differences, including those considered disabilities. I was amazed the keypad and audio system worked as well as it did. Indeed, I might ask why spend all that money on fancy visual interfaces when audio will do, except for hearing impaired people. Users like me are forced into extreme and unknown conditions like long ballots read by unfamiliar voices marked by never before touched keypads. Please accept my invitation to use a bank ATM by audio to get a feeling for this experience. My current ATM transaction time is about a minute by knowing the exact sequence of key clicks, but at first I had little idea of the menu structures or the confirmation, cancellation, and selection instructions held in mind. Voting by audio is a similar experience.


To sum up, even though I had prepared myself well, I fell into a mess of write-in races which caused me to either mishandle the keypad input or to find an actual flaw in the system. In either case, the unpredictability of the long ballot and time required to work through it present, not insurmountable, but discomfiting conditions of voting independently. But I survived, and will continue to vote this way in the big election in November. I will also work hard in perhaps better information conditions to identify the races and candidates where I really care about my vote. I certainly do not want to leave wondering if I have voted for the right guy.

References for Voting without Vision

  1. Previous post on Extreme Voting and a Moon Shot for Democracy
  2. California Secretary of State appraisal of voting system security and accessibility
  3. Concerns of computer scientists about electronic voting systems
  4. Audio version of this post

Synthetic Voice Shock Reverberates Across the Divides!

July 30, 2008

Synthetic Voice Shock — oh, those awful voices!


As I communicate with other persons with progressive vision loss, I often sense a quite negative reaction to synthetic, or so-called ‘robotic’, voices that enable reading digital materials and interfacing with computers. Indeed, that’s how I felt a few years ago. Let’s call this reaction "synthetic voice shock" as in:

  • I cannot understand that voice!!!
  • The voice is so inhuman, inexpressive, robotic, unpleasant!
  • How could I possibly benefit from using anything that hard to listen to?
  • If that’s how the blind read, I am definitely not ready to take that step.

Conversely, those long experienced with screen readers and reading appliances may be surprised at these adverse reactions to the text-to-speech technology they listen to many hours a day. They know the clear benefits of such voices, rarely experience difficult understandability, exploit voice regularity and adjustability, and innovate better ways of "living big" in the sighted world, to quote the LevelStar motto.

The ‘Synthetic Speech’ divide


Synthetic voice reactions appear to criss-cross many so-called divides: digital, generational, disability, and developer. The free WebAnywhere is the latest example with a robotic voice that must be overcome in order to gain the possible benefits of its wide dissemination. Other examples are talking ATM centers and accessible audio for voting machines. The NVDA installation and default voice can repel even sighted individuals who could benefit from a free screen reader as a web page accessibility checker or a way to learn about the audio assistive mode. Bookshare illustrates book reading potential by a robotic, rather than natural, voice. Developers of these tools see the synthetic voice as a means to gain the benefits of their tools, while users not accustomed to speech-enabled hardware and software run the other way at the unfriendliness and additional stress of learning an auditory rather than visual sensory practice.


This is especially unfortunate when people losing vision may turn to magnifiers that can only improve spot reading, when extra hours and energy are spent twiddling fonts then working line by line through displayed text, when mobile devices are not explored, when pleasures of book reading and quality of information from news are reduced.

Addressing Synthetic Voice Shock


I would like to turn this posting into messages directed at developers, Vision Losers, caretakers, and rehab personnel.

To Vision Losers who could benefit sooner or later

Please be patient and separate voice quality from reading opportunities when you evaluate potential assistive technology.


The robotic voice you encounter with screen readers is used because it is fast and flexible and widely accepted by the blind community. But there do exist better natural voices that can be used for reading books, news, and much more. While these voices seem initially offensive, synthetic voices are actually one of the great wonders of technology by opening the audio world to the blind and gradually becoming common in telephony and help desks.


As one with Myopic Macular Degeneration forced to break away from visual dependency and embrace audio information, I testify it takes a little patience and self-training and then you hear past these voices and your brain naturally absorbs the underlying content. Of course, desperation from print disability is a great motivator! Once you overcome the resistance to synthetic voices, a whole new world of spoken content becomes available using innovative devices sold primarily to younger generations of educated blind persons. Freed of the struggle to read and write using defective eyesight, you gain enormous power to absorb an unbelievable amount of high quality materials. As a technologist myself, I made this passage quickly and really enjoyed the learning challenge, which has made me into an evangelist for the audio world of assistive technology.


If you have low vision training available, ask about learning to listen through synthetic speech. For the rest of our networked lives, synthetic voices may be as important as eccentric viewing and using contrast to manage objects.


So, when you encounter one of these voices, maybe think of them as another rite of passage to remain fully engaged with the world. Also, please consider how we can help others with partial sight. With innovations from WebAnywhere and free screen readers, like NVDA, there could be many more low cost speaking devices available worldwide.

To Those developing reading tools with Text-to-Speech


Do not expect that all users of your technology will be converts from within the visually impaired communities familiar with TTS. Provide a voice tuned in pitch and speed and simplicity for starters to achieve the necessary intelligibility and sufficient pleasantness. Suggest that better voices are also available and show how to achieve their use.


It’s tough to spend development effort on such a mundane matter as the voice, but technology adoption lessons show that it only takes a small bit of discouragement to ruin a user’s experience and send a tool they could really use straight into their recycle bin. Demos and warnings could be added to specifically address Synthetic Voice Shock and show off the awesome benefits to be gained. The choice of a freely available voice is a perfectly rational design decision but may indicate a lack of sensitivity to the needs of those newly losing vision, forced to learn not only the mechanics of a tool but also how to listen to this foreign speech.

To Sighted persons helping Vision Losers

You should be tech savvy enough to separate out the voice interface from the core of the tool you might be evaluating for a family member or demonstration. Remember the recipient of the installed software will be facing both synthetic voice shock and possibly dependency on the tool as well as a long learning curve. Somehow, you need to make the argument that the voice is a help not a hindrance. Of course, you need to be able to understand the voice yourself, perhaps translate its idiosyncrasies, and tune its pitch and speed. A synthetic voice is a killer software parameter.


You may need to seek out better speech options, even outlay a few bucks to upgrade to premium voices or a low cost tool. Amortizing $100 for voice interface over the lifetime hours of listening to valuable materials, maintaining an independent life style, and expanding communication makes voices such a great bargain.


And, who knows, many of the voice-enabled apps may help your own time shifting, multi-tasking, mobile life styles.

To Rehab Trainers

From the meager amount of rehab available to me, the issue of Synthetic Voice Shock is not addressed at all. Eccentric viewing, the principles of contrast for managing objects, a host of useful independent living gadgets, font choices, etc. are traditional modules in standard rehab programs. Perhaps it would be good to have a simple lesson listening to pleasant natural voices combined with more rough menu readers just to show it can be done. Listening to synthetic voices should not be treated like torture but rather like a rite of passage to gain the benefits brought by assistive technology vendors and already widely accepted in the visually impaired communities. Indeed, inability to conquer Synthetic Voice Shock might be considered a disability in itself.


As I have personally experienced, it must be especially difficult to handle Vision Losers with constantly changing eyesight and a mixed bag of residual abilities. It could be very difficult to tell Vision Losers they might fare better reading like a totally blind person. But when it comes to computer technology, that step into the audio world can reduce the stress of struggling to see poorly in a world geared toward hyperactive visually oriented youngsters, especially when print disability opens the flow of quality reading materials, often ahead of the technology curve for sighted people.


The most useful training I can imagine is a session reading an article from AARP or Sports Illustrated or a New York Times editorial copied into a version of TextAloud, or similar application, with premium voices. Close those eyes and just relax and listen and imagine doing that anywhere, in any bodily position, with a daily routine of desirable reading materials. To demonstrate the screen reader aspect, the much maligned Microsoft Sam in Narrator can quickly show how menus, windows, and file lists can be traversed by reading and key strokes. The takeaway of such a session should be that there are other, perhaps eventually better, ways of reading print materials and interacting with computers than struggling with deteriorating vision, assuming hearing is sufficient.

So, let us pay attention to Voice Shock


In summary, more attention should be paid to the pattern of adverse reactions of Vision Losers unfamiliar with the benefits of the synthetic speech interaction that enables so many assistive tools and interfaces.

References on Synthetic Voice Shock

  1. Wikipedia on Synthetic Speech. Technical and historical, back to the 1939 World’s Fair.
  2. Wired for Speech, research and book by Clifford Nass. Experiments with effects of gender, ethnicity, personality in perception of synthetic speech.
  3. Audio demonstrations using synthetic speech
  4. NosillaCast podcaster Allison Sheridan interviewing her macular degenerate mother on her new reading device. Everyzing is a general search engine for audio, as in podcasts.
  5. Example of a blog with natural synthetic speech reading. Warning: Political!
  6. Google for ‘synthetic voice online demo’ for examples across the synthetic voice marketplace. Most will download as WAV files.
  7. The following products illustrate Synthetic Voice Shock.
  8. Podcast Interview with ‘As Your World Changes’ blog author covering many issues of audio assistive technology
  9. Audio reading of this posting in male and female voices