Seeing Through Google Book Search

Google’s blog recently announced the availability of Google Book Search with direct links in search results to out-of-copyright books for download as PDF. This action opens the portion of scanned books in the Google Library to print-disabled readers with traditional text-to-speech tools. However, this sub-collection is, by virtue of its vintage years, of value to only a few scholars and occasional readers. The remaining scanned books remain inaccessible in both their stored content and page images displayed in the book search results.

I decided to experiment to learn (1) what’s in the Google Library that relates to my professional and personal interests and (2) what I could actually get to expand my library in a form suitable for my print-disabled status. The BLUF (Bottom Line Up Front): (1) I can access less than fully sighted professional colleagues and (2) this experiment didn’t yield any additions to my library.

Here’s the experiment: query Google Book Search on a topic I know something about and assess the value of the resulting books. The topic picked was one where, well, being immodest, I had myself published several articles in the 1980s and 1990s, using terms “software testing”, “formal specification”, “formal methods”. Working from memory, the book list showed books I’d previously owned, some I’d forgotten about, and a few I’d never known of. Several books were actually government publications, e.g. from NIST, and several were primarily reprints, many available from other electronic sources, such as the IEEE Digital Library. None appeared to be available in downloadable form, but I wasn’t sure what annotation would tell me that. I was at first confused about “full view” which did NOT mean downloadable but rather available for display as images in search results. The book lists were fairly long, between 50 and 100 books, indicating a comprehensive scanning or publisher contribution on my topics of interest.

So, what did I actually get to “see”? I was running Windows XP in its standard accessibility mode, with docked magnifier and Narrator screen reader plus a simple zoomed magnifier associated with the Microsoft Laser Mouse; High Contrast Black Windows theme; Mozilla Firefox with images off, using the free FireVox screen reader and TextAloud for reading page text. Of course, with images off I got a snippet of page text, a big empty block of missing image, and various book metadata, including where to buy or borrow. So, I turned images ON in the browser and, ouch, was it bright! I could recognize a page, almost read the bright text in inverted magnifier at size 4, but could not really glean much. Probably a more effective, and more costly, zoom could get more words into clarity, but there was no substitute for having the text read to me to gain context of the search result. This is the major point: there’s nothing in, around, or any way out of the image into screen-readable mode. The image might as well have been a lake, a building, or porn for all the information I could glean from it. I wondered why the omnipotent Google toolbar, gathering data about my searches and offering me various extra search information, could not also be the reader.

Staring at the empty image was really disconcerting, even demoralizing. Were I still in the grant-grubbing, publication-hungry mode of an academic researcher, I would be disadvantaged in getting paragraph-sized chunks of information to quote or cite (without ever handling the book itself). And these book references were to my own work, either citations or reprints of articles I’d written. Google Book Search does provide an excellent overview and snapshot of an era of research – who, what, and why – but not much more than a reminder for me. Of course, I could buy or borrow the book but then I’d need to scan it to get anything “readable”. Alternatively, running Google Scholar would lead me to many of the same resources in the Digital Libraries where I could buy or use a subscription to get the articles. Or, perhaps, a local employer or public library could get the article through inter-library loan. It does appear that my search was biased toward retrieving as many reprint collections as books with original content, perhaps a side effect of computing literature publishing practices.

What about the promised downloadable content? I looked up “Wuthering Heights”, downloaded (link in upper corner) the PDF, and went through the usual screens of Adobe Reader updating itself. Then, damn, ouch, another bright window of PDF I could not read. I remembered Adobe had nicely provided an accessibility wizard buried down on the Help menu and Read Out Loud on the View menu. After switching to more restful yellow on black for the PDF, I was able to hear the very interesting page of Google warnings about use of the book. Were I to actually read the book, I’d want it converted to text for downloading to BookPort for reading in synthetic voice (good old “Precise Pete”) or to mp3 for an audio player. The PDF was, for me, just a format to get out of the way, although I could have read the book via Adobe’s Read Out Loud while staying tethered to the PC. It would have been preferable to have, as an option, the book in the DAISY format, which many text-to-speech tools accept directly, i.e. more standard than PDF.

Well, does Google Book Search do anything for this print-disabled person? I don’t think so.

These issues are discussed at more length in a 2005 white paper by Benetech/Bookshare founder Jim Fruchterman. He points out various ways that images might be annotated to stay within the conventions of web pages, but the main problem is that publishers, Google, and some intermediaries need to cooperate to live up to the spirit of the legal rights of print-disabled people to access book content on a par with fully sighted individuals.

My wish is that Google would extend its toolbar to provide an audio form of the page for those who hold a certificate of print-disability, similar to Bookshare’s policy. This would provide as much Entitlement as seems feasible for the print-disabled, preserve rights to images, for the deaf, and slightly raise Empowerment for the print-disabled who can listen and make notes or do something else.

Another concern I’ve mentioned, and may have just missed in the page links, is the range of other options for some of the content in the books sampled in this experiment. Many reprints are available by Google Scholar, by the former NEC CiteSeer, from Digital Libraries of professional societies, and from the “database” collections of traditional library services. Is there a disconnect from Google Book Search to these alternative services?

Bottom Line: Getting a list of books discussing my topics is a good thing, but displaying ONLY page images was better for sighted users. And, I doubt my reading profile would identify benefits from downloadable full text of out-of-copyright books.

Well, this article has a definite “what’s in it for me?” tone, but I’d like to point out that a tiny slice of the content here is words I wrote myself, receiving no compensation from publishers, only from employers or research contracts. It’s ironic that I cannot enjoy going back to read for myself what others have written about my work, or continue the work with the same ease as sighted colleagues, or get those empty images out of my mind. In the theme of this blog, Google Book Search is not classy use of technology, except in the “digital divide” sense of establishing different classes of users depending on their sight capabilities. I am not anti-Google, just disappointed.

“Comments on Accessibility of Google Print”, white paper by Jim Fruchterman


