Deep Speech 2: Mandarin and English Recognized
End-to-end deep learning presents the opportunity to improve speech recognition systems continually with increases in data and computation. Indeed, this paper reports that the same approach can vastly improve transcription performance for very different languages.
We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech--two vastly different languages. Because it replaces entire pipelines of hand-engineered components with neural networks, end-to-end learning allows us to handle a diverse variety of speech including noisy environments, accents and different languages. Key to our approach is our application of HPC techniques, resulting in a 7x speedup over our previous system. Because of this efficiency, experiments that previously took weeks now run in days. This enables us to iterate more quickly to identify superior architectures and algorithms. As a result, in several cases, our system is competitive with the transcription of human workers when benchmarked on standard datasets. Finally, using a technique called Batch Dispatch with GPUs in the data center, we show that our system can be inexpensively deployed in an online setting, delivering low latency when serving users at scale.
As far as I know, the first sf reference to the idea of real, automated language translation is the translatophone, from a story of the same name by Frank Stockton, in 1901!
One of the most successful of these various contrivances, and the one, indeed, in which I was most deeply interested, was a small machine very much resembling in appearance the tube, with a mouth-piece at one end and an ear-piece at the other, frequently used by deaf persons, but very different in its construction and action. In the ordinary instrument the words spoken into the mouth-piece are carried through the tube to the ear, and are then heard exactly as they are spoken. When I used my instrument the person spoke into the mouth-piece exactly as if it were an ordinary tube, but the result was very different, for the great feature of my invention was that, no matter what language was spoken by the person at the mouth-piece, be it Greek, Choctaw, or Chinese, the words came to the ear in perfect English...
Via Deep Speech 2: End-to-End Speech Recognition in English and Mandarin.
(Story submitted 12/20/2015)
Related News Stories -
BloxVox Mutes Cellphone Convos
'had he not been talking into a hush-a-phone which he had plugged into the telephone jack...' - Robert Heinlein, 1940.
Sonitus Audio Interface Positioned Beyond The Noise
'... an instrument having relatively small bit pieces adapted to be gripped between the teeth.' - Hugo Gernsback, 1923.
FlexPai Foldable Phone By Royole
'...A paper thin polycarbon screen unfurled.' - William Gibson, 1986.
BrainNet Social Network Of Brains
'I used my implant to tell MILLIE what we wanted and she took care of it' - Pournelle and Niven, 1981.