This blind software developer's display is a 450-word-a-minute speech synthesizer

by Mark Frauenfelder

Tuukka Ojala is a blind software developer in Finland. When he works, he keeps his laptop closed (it has an external keyboard attached to it). In this fascinating interview on Vincit, he explains how he works:

How do you use the computer?

The computer I use is a perfectly normal laptop running Windows 10. It's in the software where the "magic happens". I use a program called a screen reader to access the computer. A screen reader intercepts what's happening on the screen and presents that information via braille (through a separate braille display) or synthetic speech. And it's not the kind of synthetic speech you hear in today's smart assistants. I use a robotic-sounding voice which speaks at around 450 words per minute. For comparison, English is commonly spoken at around 120-150 words per minute. There's one additional quirk in my setup: Since I need to read both Finnish and English regularly I'm reading English with a Finnish speech synthesizer. Back in the old days screen readers weren't smart enough to switch between languages automatically, so this was what I got used to. Here's a sample of this paragraph being read as I would read it:

[audio mp3="https://media.boingboing.net/wp-content/uploads/2017/08/mpsample.mp3"][/audio]

And here's the same text spoken by an English speech synthesizer:

[audio mp3="https://media.boingboing.net/wp-content/uploads/2017/08/essample.mp3"][/audio]

A mouse is naturally not very useful to me so I work exclusively at the keyboard. The commands I use should be familiar to anyone reading this post: Arrow keys and the tab key move you around inside a window, alt+tab changes between windows etc. Screen readers also have a whole lot of shortcuts of their own, such as reading various parts of the active window or turning some of their features on or off.

It's when reading web pages and other formatted documents that things get a little interesting. You see, a screen reader presents its information in chunks. That chunk is most often a line but it may also be a word, a character or any other arbitrary piece of text. For example, if I press the down arrow key on a web page I hear the next line of the page. This type of reading means that I can't just scan the contents of my screen the same way a sighted person would do with their eyes. Instead, I have to read through everything chunk by chunk, or skip over those chunks I don't care about.
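
The chunk-by-chunk reading described here can be sketched in a few lines of Python. This is only an illustration, not how any particular screen reader is implemented: the names `split_chunks`, `ChunkCursor`, `next_chunk` and `skip_until`, and the sample page text, are all hypothetical.

```python
def split_chunks(text, granularity="line"):
    """Split text into the units a reader steps through."""
    if granularity == "line":
        return [line.strip() for line in text.splitlines() if line.strip()]
    if granularity == "word":
        return text.split()
    if granularity == "character":
        return [ch for ch in text if not ch.isspace()]
    raise ValueError(f"unknown granularity: {granularity}")


class ChunkCursor:
    """Steps through chunks one at a time, like pressing the down arrow."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.position = -1  # nothing has been read yet

    def next_chunk(self):
        """Move forward one chunk and return it, or None at the end."""
        if self.position + 1 >= len(self.chunks):
            return None
        self.position += 1
        return self.chunks[self.position]

    def skip_until(self, wanted):
        """Skip chunks until one satisfies `wanted`; return it or None."""
        while (chunk := self.next_chunk()) is not None:
            if wanted(chunk):
                return chunk
        return None


# Made-up page text: three navigation lines, then the actual content.
PAGE_TEXT = """Home
About
Contact
Latest news
Screen readers explained"""

cursor = ChunkCursor(split_chunks(PAGE_TEXT))
first = cursor.next_chunk()                                   # read one line
found = cursor.skip_until(lambda c: c.startswith("Latest"))   # skip the rest of the menu
after = cursor.next_chunk()                                   # keep reading from there
```

The cursor only ever exposes one chunk at a time, which is why skipping is an explicit action rather than a glance across the screen.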

Speech or braille alone can't paint an accurate representation of how a window is laid out visually. All the information is presented to me in a linear fashion. If you copy a web page and paste it into notepad you get a rough idea of how web pages look to me. It's just a bunch of lines stacked on top of another with most of the formatting stripped out. However, a screen reader can pick up on the semantics used in the HTML of the web page, so that links, headings, form fields etc. are announced to me correctly. That's right: I don't know that a check box is a check box if it's only styled to look like one. However, more on that later; I'll be devoting an entire post to this subject. Just remember that the example I just gave is a crime against humanity.
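
To make the check box point concrete, here is a minimal Python sketch that flattens a small, made-up HTML snippet into a linear list of lines and announces a role only where the markup carries one. It uses the standard library's `html.parser`. The role wording, the `LinearReader` class, the `read_page` helper and the sample page are assumptions for illustration, and real screen readers work from the browser's accessibility tree rather than from raw HTML.

```python
from html.parser import HTMLParser

# Hypothetical role names; real screen readers use their own wording.
ROLES = {"a": "link", "h1": "heading level 1", "h2": "heading level 2", "button": "button"}
VOID_TAGS = {"input", "br", "img", "hr"}  # elements without a closing tag


class LinearReader(HTMLParser):
    """Flattens HTML into a list of lines, announcing semantic roles."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.open_roles = []  # role (or None) of each currently open element

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("type") == "checkbox":
            state = "checked" if "checked" in attributes else "not checked"
            self.lines.append(f"check box, {state}")
        if tag not in VOID_TAGS:
            self.open_roles.append(ROLES.get(tag))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.open_roles:
            self.open_roles.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Use the innermost enclosing element that has a known role.
        role = next((r for r in reversed(self.open_roles) if r), None)
        self.lines.append(f"{role}: {text}" if role else text)


def read_page(html):
    """Return the page as the linear sequence of lines a reader would hear."""
    reader = LinearReader()
    reader.feed(html)
    reader.close()
    return reader.lines


PAGE = """
<h1>Sign up</h1>
<p>Read the <a href="/terms">terms</a> first.</p>
<label><input type="checkbox" checked> Real check box</label>
<div class="fake-checkbox checked"></div><span>Styled check box</span>
"""

for line in read_page(PAGE):
    print(line)
```

In this sketch the real `<input type="checkbox">` produces a "check box, checked" line, while the `<div>` that is only styled to look like one produces nothing: its label comes through as plain text with no hint that a control exists.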

I spend a good deal of my time working at the command line. In fact I rarely use any other graphical applications than a web browser and an editor. I've found that it's often much quicker to do the task at hand on the command line than to use an interface which was primarily designed with mouse users in mind.
