It's a powerful tool but, as Dunayev says, there's a class of user that needs to go further. Some need a highly accurate transcription of the spoken words. That's where his Auckland-based TranscribeMe enters the picture.
Livescribe users can send files to TranscribeMe for an accurate transcription of the audio material. TranscribeMe takes one or two days to process most files. It costs $1 per minute of audio to transcribe material when there is a single speaker and $2 per minute for multiple speakers.
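To make the pricing concrete, here is a minimal sketch of the per-minute arithmetic. The function name is hypothetical, and it assumes billing is strictly proportional to audio length; the article doesn't say whether TranscribeMe rounds to whole minutes or applies a minimum charge.

```python
# Per-minute rates as quoted in the article.
SINGLE_SPEAKER_RATE = 1.0  # dollars per minute, one speaker
MULTI_SPEAKER_RATE = 2.0   # dollars per minute, two or more speakers


def transcription_cost(minutes: float, speakers: int) -> float:
    """Estimate the cost of a job (hypothetical helper, not a TranscribeMe API).

    Assumes cost scales linearly with audio length, with no rounding
    or minimum fee.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    if speakers < 1:
        raise ValueError("a recording needs at least one speaker")
    rate = SINGLE_SPEAKER_RATE if speakers == 1 else MULTI_SPEAKER_RATE
    return minutes * rate


# A 30-minute dictation by one person versus a 45-minute, three-person meeting.
print(transcription_cost(30, 1))
print(transcription_cost(45, 3))
```

On those assumptions an hour-long focus group with several voices would come to $120, which gives a sense of why Dunayev pitches the service at professionals.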
Dunayev readily admits this isn't cheap, but says the service is mainly used by those who see transcription as "mission critical". He says his customers include medical and legal professionals along with market researchers and law enforcement agencies. For those groups, he says, the service is cost-effective and beats alternatives.
TranscribeMe operates at the high end of a market which features services like Apple's Siri and Dragon Dictate at the low end. The difference is a matter of quality. Dunayev says his business "lives or dies by the quality of its service and that's something customers demand."
TranscribeMe’s secret sauce
The company claims a 98 percent accuracy rate with its transcriptions – that's considerably higher than the accuracy achieved by speech recognition software. The secret sauce is that TranscribeMe uses a combination of speech recognition software and outsourced human transcribers to turn sounds into words.
Humans are the expensive part of the equation, but not that expensive. Dunayev says crowdsourcing services mean he can buy the highest quality labour at the lowest cost.
We're not talking about hordes of poorly paid workers slaving in third-world battery farms here. He says over half the transcribers are in North America. That might mean stay-at-home mothers leaping online for a few minutes of transcribing between nappies and feeds.
This becomes possible and practical because of the way TranscribeMe breaks work into small blocks and then schedules jobs piecemeal to an army of remote typists. Atomising work this way has a second advantage – it means no-one gets to hear a whole recording and that makes the service more secure for customers concerned about privacy.
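The idea can be sketched in a few lines of code. This is an illustration only, not TranscribeMe's actual scheduler: the 10-second block length, the round-robin assignment and all of the names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    index: int
    start_s: float  # offset into the recording, in seconds
    end_s: float


def split_recording(duration_s: float, block_s: float = 10.0) -> list[Block]:
    """Cut a recording into consecutive blocks of at most block_s seconds.

    The 10-second default is an assumed value, not one given in the article.
    """
    if duration_s <= 0 or block_s <= 0:
        raise ValueError("duration and block length must be positive")
    blocks = []
    start = 0.0
    while start < duration_s:
        end = min(start + block_s, duration_s)
        blocks.append(Block(len(blocks), start, end))
        start = end
    return blocks


def assign_blocks(blocks: list[Block], workers: list[str]) -> dict[int, str]:
    """Hand out blocks round-robin (an assumed policy).

    With two or more workers and two or more blocks, no single worker
    receives every block of the recording.
    """
    if len(workers) < 2:
        raise ValueError("need at least two workers to split up a recording")
    return {b.index: workers[b.index % len(workers)] for b in blocks}


blocks = split_recording(65.0)
assignment = assign_blocks(blocks, ["typist_a", "typist_b", "typist_c"])
for block in blocks:
    print(block.index, block.start_s, block.end_s, assignment[block.index])
```

A real scheduler would also have to weigh worker availability and skill, but even this simple version shows how atomising a file keeps any one typist from hearing all of it.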
Dunayev says crowdsourcing and smart scheduling give his business a structural advantage in the professional services space. Using native English speakers, rather than turning to the third world, helps with quality.
He says his company has a wealth of data on transcribers, so jobs are sent to those best suited to work with the source material – for example, TranscribeMe tracks which regional accents its workers are best at dealing with.
Will computers replace people? Not for years
Will computers ever be able to completely replace humans for this work? Perhaps in the distant future, but even Nuance, the company behind Dragon Dictate and Siri, now says it will take another 10 years for computers to match human transcription quality.
Dunayev sees computers playing a bigger role, but says even when computer transcription gets to 99 percent accuracy there will still be a need for humans to proofread and check the machine's work. He doesn't see full automation for at least a decade.
To date TranscribeMe has only scratched the surface. Dunayev says the total market for transcription is around $10 billion a year – although that isn't all addressable by his business.
Still, there are opportunities. Dunayev points to the intelligent voice recognition software integrated into large-scale systems such as the one used by United Airlines for taking phone bookings.
He says the voice component will have cost the airline around $150 million, for a system which may be in use for just seven years. He says using a service like TranscribeMe would cost a fraction of the amount spent on the system.