Wednesday, July 4, 2007
Language as technology: voice versus text
Normally I try to avoid advertising for companies, but the Lindens and other creator-residents of Second Life have contributed SO MUCH for free, I think they deserve an exception - and besides, I'm not advertising for their headsets so much as referencing their announcement to begin a conversation about what bringing voice to SL might mean.
We don't know for sure when humans first started using spoken language; some estimates range between 40,000 and 250,000 years ago, but there is no way to be certain without written records! Of course, written records don't always lead to definitive answers either, but at least they provide an artifact to examine and debate. The origins of language may be obscure, but language, and particularly the difference between spoken and written expression, is more relevant than ever in the digital age.
Second Life began as an image- and text-based format where participants type messages to one another's avatars. The conversation appears on the screen and can be logged for future reference. When an avatar is communicating, its hands come up as if it were typing, providing a visual cue, and the familiar sound of a clacking keyboard adds a sonic cue for participants. Conversational coordination, just as in verbal exchange, is often tricky, and related responses rarely follow immediately.
Just as in email and text messaging, some SL residents' textual communications are abbreviated or accentuated with emoticons or other semiotic adaptations. And, just as in email and text messaging, we never really know for sure who is on the other keyboard - it is *always* an act of faith. Anyone with sufficient technical skill can represent themselves as anyone else, and only a close reading by an intimate friend would be likely to detect such deception.
Oral communication, on the other hand, is more difficult to fake, particularly since human hearing is quite adept at detecting subtleties of sound. Voice recognition has been one of the advantages of spoken communication over electronic media, and could provide certain identification, since the pattern of our vocal expression is as singular as our fingerprint. Though fingerprinting, or dactyloscopy, has been around since the 19th century, the science of biometrics is booming in our timid, terrified post-9/11 world. From entry gates at the local gym to forensic analysis, biometric technologies are being widely deployed for identity detection.
Though I'm not certain whether the voice option in SL will relay our spoken expression accurately enough to yield an identical voice print under spectrographic analysis, simply introducing the option of spoken communication will be a fascinating change, offering many opportunities for intriguing intellectual discussion.
VUI (voice user interface) design and ASR (automatic speech recognition) may combine to provide an accurate and unique transmission or rendition of our spoken communications, such that we will no longer have to make that "leap of faith" that has become our default response when communicating via typed text. Further, if SL includes the option of an audio log, as it does for text communications, a heightened level of security or certainty may be possible.
While this is appealing, and perhaps an answer to the prayers of the most frightened, like all policing technologies it may have an ugly underside. Since the human brain has had the longest "imprinting" with spoken language, and most of us learned how to speak before we learned how to write, our oral composition skills are so automatic that we hardly notice them. Because of this, most of us speak freely and spontaneously, sometimes to our great regret. What would a world that recorded every word we ever spoke be like?
I trust the Lindens more than I'd trust the government, so I doubt that Second Life will devolve into such an invasive policing technology, but the introduction of voice communications will certainly revolutionize this exciting new realm.