Humans have officially given their voice to machines. A research paper published by Google this month—which has not been peer reviewed—details a text-to-speech system called Tacotron 2, which claims near-human accuracy at imitating audio of a person speaking from text. The system is Google’s second official generation of the technology, which consists of two deep neural networks. The first network translates the text into a spectrogram (pdf), a visual way to represent audio frequencies over time.
Continue reading...
No comments:
Post a Comment