Here’s a historical post. The key part is the video below, which was created in 1972.
The video was promised to accompany my chapter in the Festschrift for Ursula Bellugi and Ed Klima, published in 2000. In that chapter I analyzed Lou Fant’s live interpretation of a spoken reminiscence. I was the (English) speaker; Lou used American Sign Language (ASL). My story was about trying to ride a Flexiflyer down a steep U-shaped driveway and back up.
Spoiler alert: I crashed.
Key findings: Fant got the description of the relationships among the house, the driveway, and the sled’s riders correct (if mirror-imaged relative to the actual space) without having seen the place, without any gestures from me (the speaker), and with a pretty sketchy English input message. And the cool part? He continued interpreting for at least a full minute before committing himself to those spatial relationships.
It takes a village…
Support from the National Institutes of Health and the National Science Foundation over the past 40 years fueled the research of Klima and Bellugi and many of their students and colleagues. I was privileged to be a member of the laboratory research staff from Fall 1970 through Spring 1973, and an irregular visitor thereafter. This video was created in 1972 at the Salk Institute. It was preserved during the early 1980s (on VHS cassette) and digitized in 2000 with help from Stanford’s Academic Technology Laboratory (a support facility for faculty) and from Treehouse Video. I’m thrilled to report that the .mov (QuickTime format) video still plays and has now been uploaded to YouTube, presented for your viewing pleasure with gratitude to all those who helped along the way.
Pack rat unveiled
I saved that bit of video from Fant’s first (or perhaps second) visit to the Salk Institute on one of my irregular visits back to La Jolla. When copying from helical-scan 1/2″ videotape to VHS cassette, I remembered that I had consciously chosen to tell about an event no one in the room had heard before, one from my childhood, so that it would be a genuine listening and viewing experience for both the hearing and deaf people present (not a retelling of a familiar story). Of course the interpreter hadn’t heard the story before, and he didn’t have much context about me either. I thought I had been quite clear about the physical space – I could picture it even many years after the event – how the house was situated; where the driveway started, turned, and ended at the street again; and what it was like to ride the wheeled sled. On listening again, I realize that the physical space is difficult to imagine if you depend on the spoken message alone.
Interpreter stays vague for a full minute
Our visitor, the exemplary interpreter and sign language educator Lou Fant, agreed to contrast “transliteration” and “interpreting.” I’ll offer brief definitions of these terms, knowing full well that other experts can elaborate in greater depth. Transliteration is a more English-influenced rendering into signs; interpreting provides simultaneous translation into ASL, a different language, with English influence kept to a minimum. The excerpt shown here was the first part of the illustration of “interpreting.” The key surprise for me in reviewing the video was that Fant managed to keep the message vague while he worked out how all the parts of the described space fit together. The ability to be vague had never before been catalogued as a characteristic of the competent interpreter. When I told him I was planning to look at this bit of video at long last and asked whether he’d like to see what I was finding and writing about him, he gave his blessing to my work without reviewing it. I’m delighted to be able to present his spontaneous interpretation now, almost 40 years after it was first produced.
And thanks to an interpreting instructor who uses the chapter from the Festschrift for asking where that video is. Rachel, it’s here now.
Tuesday, March 9, we got the next update on YouTube’s automated captioning efforts. I heard it on NPR’s “All Things Considered” afternoon program, in which Robert Siegel interviewed Ken Harrenstien of Google with a (female) interpreter providing voice for the Google engineer.
Harrenstien acknowledges that automated captioning today stumbles on proper names, including trademarks and product names: “YouTube” comes out as “You, too!” Automated captioning also has difficulty with videos that have music or other sounds in the background. But he characterizes himself as a technology optimist, anticipating that in 10 years things will be much improved.
Benefits of captioning
Like “curb cuts,” which have become the symbol that solutions for disabled people (here, those in wheelchairs) also resolve needs for others (strollers, roll-aboard luggage, shopping carts), captions have benefits that extend beyond hearing impairment.
Deaf and hearing-impaired people can enjoy the huge inventory of videos on YouTube. (The still frame that opens this post is from an announcement by President Obama in response to the Chilean earthquake.) Making emergency and other time-sensitive news available to those who cannot hear meets the requirements of laws and regulations in the US. More importantly, it meets the moral and ethical standards we expect from a civilized society, where we include everyone in the polity.
If you’re in a noisy environment, or close to others who would be bothered by the audio, you can figure out what the video is saying even without headphones.
Small companies can afford to provide captions on their webcasts, often the heart of learning about new products.
Non-native speakers of English have a much better chance of understanding speech at ordinary (rapid) rates with the assistance of captions.
Captions provide input to machine translation services, so there will soon be captions in languages besides English; as automated speech-to-text technology improves, we’ll see other input languages as well.
Captions provide much better input to (current) search technology than speech does, so there’s hope of finding segments of videos that would not otherwise appear in written form.
Professional captioners need not despair
I read the YouTube blog post of March 4 and the comments following it, and recalled the announcement of the limited trial with selected partners last November. In his comment on the recent YouTube announcement, James expresses concern that people like him, who earn their living as captioners for post-production houses, will lose their jobs as a result of automated captioning. My response seconds HowCheap’s comment that professional captioners will continue to find work, both as editors of the automated speech-to-text output and for organizations that prefer doing their own captioning. Organizations that produce professional-quality video typically start from a written script, adjust for the few changes that happen in the spoken version, and then set the timing of the text against the video.
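That timing step amounts to pairing each line of script text with start and end times; caption formats such as SubRip (.srt) express this as numbered cues. As a minimal sketch in Python (the helper names here are my own for illustration, not from any particular captioning tool):

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)   # hours
    m, rem = divmod(rem, 60_000)           # minutes
    s, ms = divmod(rem, 1_000)             # seconds, milliseconds
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Render one numbered SubRip cue: index, time range, caption text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

# A script line timed to appear from 12.5 s to 15.0 s in the video:
print(srt_cue(1, 12.5, 15.0, "Every minute 20 hours of video is uploaded."))
```

Concatenating such cues (separated by blank lines) yields a complete .srt file; the editor’s craft lies in choosing the time boundaries so the text keeps pace with the speech.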
The huge number of videos on YouTube are uploaded by individuals or small organizations who may not be aware of the benefits of captioning and likely don’t know about the tools available. According to YouTube’s fact sheet, “Every minute 20 hours of video is uploaded to YouTube.” That volume is beyond the capacity of professional captioners and the organizations that employ them.
A proposal for improving the quality of captions
How shall we improve the quality of automatically produced captions?
I’d like to see interpreter training programs (ITPs) make editing automated captions a course assignment, a program requirement, or a component of an internship. Engagement with spoken language that is not one’s own is a challenge. People phrase things in ways you don’t; they use unfamiliar vocabulary and proper names (streets, towns, people, products) that you need to look up. Both ITPs for training sign language interpreters and those for people learning to interpret between two spoken languages may admit students whose skills in listening, writing, or spelling are lacking. How many caption-editing assignments are enough? Shall we also coordinate quality checks by others in the same or a different program? Such assignments will guide students toward greater appreciation of the challenges of speech in online settings, through a task that provides an authentic service.
VRS and VRI
In the case of ITPs for sign language interpreters, improved listening to online speech is great preparation for work settings such as VRS and VRI. Video Relay Service (VRS) in the US is regulated by the FCC: deaf signers who cannot use the telephone (because their speech is not intelligible and they cannot hear well enough to understand speech over the phone) make use of intermediaries (interpreters) to communicate with hearing non-signers. (Think of simple tasks such as calling the school to say that your child will be absent, scheduling a haircut, or ordering a pizza for delivery, not to mention more complex transactions involving prescriptions, real estate contract negotiations, or billing disputes.) Video Remote Interpreting (VRI), where the deaf and hearing parties are physically together and the interpreter is remote from them, is a service with similar requirements for the interpreter: listening to speech over a phone or data line and rendering accurate translations in real time.
Broad multi-disciplinary open source content quality
Programs training instructors in English as a Second Language (ESL) could also participate. Students in speech therapy and audiology would benefit both from direct engagement with spoken language “in the wild” and from working with future colleagues in other disciplines. There are advantages to engaging a variety of people who are studying for professions that emphasize expertise in spoken and written English.
Looks like an open source content development effort to me. Yes, it will require a bit of coordination, but not excessive overhead. How about it, ITP program directors?
My spoken French is more or less limited to menu items and courtesy phrases. I’m better at comprehension, but unable to express myself to my own satisfaction in a business setting. One of the challenges of managing the tutorial at WIF 2008 was that the session was held in Limoges, France.
As a tutorial leader, I might have been out of luck, but I was ably assisted by two professional interpreters. In a room of about 35 people, I estimate that there were 6-8 French speakers; the remainder were willing to work in English. Among them were native speakers of perhaps 4-5 additional languages, but English was the lingua franca for most.
After a brief introduction about design games in the product development process, and Innovation Games® in particular, we broke into groups to create “Product Boxes.” I said, but perhaps not forcefully enough, that people could work alone or in small groups. When Product Box starts, people dig into the materials and start making their sketches and notes. They probably weren’t paying close attention to me (nor, in this case, the interpreter).
In the midst of the game play, one participant approached me through the interpreter, identifying himself as a game designer. He announced that “teams are the enemy of freedom.” Either this was a very deep philosophical statement, or there was a simpler interpretation I was overlooking. On asking for clarification, I realized that he didn’t want to work with the people he happened to share a table and language community with: he wanted to create on his own.
I encouraged him to go ahead and work solo. Although this exchange prevented him from completing the full design he had envisioned, he did realize at least one good idea, as shown in this photo.
[Photo: ideas filling or spilling from a head]
The ideas fly freely (into?) out of the head!
Are teams the enemy of freedom? I might agree that teams constrain individual freedom, but I’m also a subscriber to the aphorism “many hands make light work” (there must be French for this one!). There’s more to say about games as focused toward individuals, groups or teams, but I’ll save that for another occasion.