Amazon scientists' work from Interspeech 2022

Amazon Science

The latest news and research from Amazon’s science community. #AmazonScience

Published Sep 27, 2022

Learn more about the work Amazon researchers presented at Interspeech 2022—the world's largest and most comprehensive conference on the science and technology of spoken-language processing.

Amazon’s 40-plus papers at Interspeech 2022

Amazon researchers had more than 40 papers accepted, on topics ranging from automatic speech recognition and text-to-speech to acoustic watermarking and automatic dubbing.

The training behavior of the algorithm proposed in "Sub-8-bit quantization aware training for 8-bit neural network accelerator with on-device speech recognition", in which weights are optimized to lower quantization loss.
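To make "quantization loss" concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes a plain uniform symmetric quantizer and a hypothetical update step (`pull_toward_grid`) that nudges each weight toward its nearest quantization level. All function names and the sample weights are illustrative.

```python
def quantize(weights, num_bits=8):
    """Snap each weight to the nearest level of a uniform symmetric grid."""
    levels = 2 ** (num_bits - 1) - 1              # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(w / scale) * scale for w in weights]


def quantization_loss(weights, num_bits=8):
    """Mean squared distance between the weights and their quantized values."""
    quantized = quantize(weights, num_bits)
    return sum((w - q) ** 2 for w, q in zip(weights, quantized)) / len(weights)


def pull_toward_grid(weights, num_bits=8, lr=0.25):
    """One gradient step on the loss above, treating the quantized values as constants.

    Hypothetical illustration only: a real training loop would combine this
    with the task loss rather than optimizing it in isolation.
    """
    quantized = quantize(weights, num_bits)
    return [w - lr * 2.0 * (w - q) for w, q in zip(weights, quantized)]


# Illustrative weights; 4 bits makes the rounding error easy to see.
before = [0.9, -0.42, 0.13, 0.57, -0.76]
after = pull_toward_grid(before, num_bits=4)
loss_before = quantization_loss(before, num_bits=4)
loss_after = quantization_loss(after, num_bits=4)
```

In this sketch, `loss_after` is smaller than `loss_before` because every weight has moved halfway toward its nearest grid point.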

The growth of interdisciplinary research

Senior applied scientist Penny Karanasou was an area and session chair for Interspeech 2022. Across her career, she has worked on speech recognition, language understanding, and text-to-speech. Find out why cross-pollination of speech-related fields intrigues her and how the conference program reflected that.

Alexa speech science developments

Illustration of the arbitrator and Transformer backbone of each block. The lightweight arbitrator toggles whether to evaluate subcomponents during the forward pass.

Alexa AI senior principal scientist Andreas Stolcke highlighted some speech-related papers, focusing on end-to-end models and fairness. He also wrote about the techniques Amazon scientists are using, like toggling neural blocks on and off, adding multiple CNN front ends to RNN-T models, and adversarial reweighting.
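The idea of toggling blocks on and off can be sketched in a few lines. This is an illustration under stated assumptions, not the published model: the arbitrator here is a hand-written threshold rule rather than a learned network, the "subcomponent" is a stand-in function, and every name (`arbitrator`, `expensive_subcomponent`, `gated_block`) is hypothetical.

```python
import math


def expensive_subcomponent(x):
    """Stand-in for a costly sub-layer such as attention or a feed-forward net."""
    return [math.tanh(v) for v in x]


def arbitrator(x, threshold=0.5):
    """Cheap decision rule (assumed): evaluate only if the mean magnitude is large."""
    return sum(abs(v) for v in x) / len(x) > threshold


def gated_block(x, log):
    """Apply the subcomponent with a residual connection, or skip it entirely."""
    if arbitrator(x):
        log.append("evaluated")
        return [v + d for v, d in zip(x, expensive_subcomponent(x))]
    log.append("skipped")
    return list(x)  # input passes through unchanged, saving the computation


log = []
quiet_out = gated_block([0.1, -0.2], log)   # low-magnitude input
loud_out = gated_block([1.0, -2.0], log)    # high-magnitude input
```

The low-magnitude input bypasses the subcomponent and is returned unchanged, while the high-magnitude input is processed. The computational saving comes from the skipped call.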

Alexa’s spoken-language-understanding research

Alexa AI senior principal scientist Gokhan Tur selected papers that covered a wide range of topics in spoken-language understanding—like learning from noisy data, using phonetic embeddings to improve entity resolution, and quantization-aware training.

The architecture of the weighted-sum model.

Alexa’s text-to-speech research

A new approach to building expressive text-to-speech voices can make do with only an hour of expressive speech from the target speaker.

Senior applied scientist Antonio Bonafonte wrote about work being done on transference—of prosody, accent, and speaker identity—in text-to-speech, and the new ways scientists have used tools like normalizing flows and variational autoencoders.

Sign up for our newsletter to get a monthly digest of the latest news, research papers, conferences, and career opportunities at Amazon.

Matthew Hepburn

Principal Product Marketing Manager, Amazon Science
