Give a voice to your vision for conversational AI
Amazon Alexa is leading the way in making spoken language the next user interface. Alexa is the voice service that powers Amazon’s family of Echo products, Amazon Fire TV and other third-party products. Echo is a device that you can talk to from across the room to play music, get the news, set timers, make hands-free calls, manage to-do and shopping lists, control lights, your thermostat and so much more.
Technologies We Focus On
The Alexa AI team contributes to the magic that is Alexa. Our goal is to make voice interfaces ubiquitous and as natural as speaking to a human. We have a relentless focus on the customer experience and customer feedback. We use many real-world data sources including customer interactions and a variety of cutting-edge techniques, such as highly scalable deep learning, to train our speech models. Learning on this massive scale requires new research and development. The team is responsible for cutting-edge research and development in virtually all fields of human language technology. This WIRED article, which includes interviews with several Alexa scientists, provides good insight into our customer-centric approach to research and development, as does this interview with Rohit Prasad, Vice President and Head Scientist, Amazon Alexa.
Alexa scientists and developers have a significant impact on customers’ lives and are leading the industry in its shift toward conversational AI. Alexa scientists and engineers also invent new tools and APIs to accelerate the development of voice services by empowering developers through the Alexa Skills Kit and the Alexa Voice Service. For example, developers can now create a new voice experience by simply providing a few sample sentences.
Our research is primarily customer focused. Your discoveries in speech recognition, natural language understanding, deep learning and other disciplines of machine learning can fuel new ideas and applications that have a direct impact on people’s lives. We also firmly believe that our team must engage deeply with the academic community and be part of scientific discourse. There are many opportunities for presentations at internal machine learning conferences, which can be a springboard for publication at premier industry and academic conferences. We also partner with universities through the Alexa Prize.
We encourage the publication of research that will contribute to a future of more natural and engaging computing experiences. Research recently published by the Alexa science team is listed below.
- Fast and Scalable Expansion of Natural Language Understanding Functionality for Intelligent Agents, Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas, NAACL, 2018.
- Selecting Machine-Translated Data for Quick Bootstrapping of a Natural Language Understanding System, Judith Gaspers, Penny Karanasou, Rajen Chatterjee, NAACL, 2018.
- The Alexa Meaning Representation Language, Thomas Kollar, Danielle Berry, Lauren Stuart, Karolina Owczarzak, Tagyoung Chung, Lambert Mathias, Michael Kayser, Bradford Snow, Spyros Matsoukas, NAACL, 2018.
- Efficient Large-Scale Domain Classification With Personalized Attention, Young-Bum Kim, Dongchan Kim, Anjishnu Kumar, Ruhi Sarikaya, ACL, 2018.
- A Scalable Neural Shortlisting-Reranking Approach for Large-Scale Domain Classification in Natural Language Understanding, Young-Bum Kim, Dongchan Kim, Joo-Kyung Kim, Ruhi Sarikaya, NAACL, 2018.
- Monophone-Based Background Modeling for Two-Stage On-Device Wake Word Detection, Minhua Wu, Sankaran Panchapagesan, Ming Sun, Jiacheng Gu, Ian Thomas, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Arindam Mandal, ICASSP, 2018.
- Time-Delayed Bottleneck Highway Networks Using A DFT Feature for Keyword Spotting, Jinxi Guo, Kenichi Kumatani, Ming Sun, Minhua Wu, Anirudh Raju, Nikko Strom, Arindam Mandal, ICASSP, 2018.
- Multilayer Adaptation-Based Complex Echo Cancellation and Voice Enhancement, Jun Yang, ICASSP, 2018.
- Combining Acoustic Embeddings and Decoding Features for End-of-Utterance Detection in Real-Time Far-Field Speech Recognition Systems, Roland Maas, Ariya Rastrow, Chengyuan Ma, Guitang Lan, Kyle Goehner, Gautam Tiwari, Shaun Joseph, Bjorn Hoffmeister, ICASSP, 2018.
- Context Aware Conversational Understanding for Intelligent Agents with a Screen, Vishal Ishwar Naik, Angeliki Metallinou, Rahul Goel, AAAI, 2018.
- Multi-Task Learning For Parsing The Alexa Meaning Representation Language, Vittorio Perera, Tagyoung Chung, Thomas Kollar, Emma Strubell, AAAI, 2018.
- Conversational AI: The Science Behind the Alexa Prize, Ashwin Ram, Rohit Prasad, Chandra Khatri, Anu Venkatesh, Raefer Gabriel, Qing Liu, Jeff Nunn, Behnam Hedayatnia, Ming Cheng, Ashish Nagar, Eric King, Kate Bland, Amanda Wartick, Yi Pan, Han Song, Sk Jayadevan, Gene Hwang, Art Pettigrue, 2018.
- Direct Modeling of Raw Audio with DNNs for Wake Word Detection, Kenichi Kumatani, Sankaran Panchapagesan, Minhua Wu, Minjae Kim, Nikko Strom, Gautam Tiwari, Arindam Mandal, ASRU, 2017.
- Just ASK: Building an Architecture for Extensible Self-Service Spoken Language Understanding, Anjishnu Kumar, Arpit Gupta, Julian Chan, Sam Tucker, Bjorn Hoffmeister, and Markus Dreyer, NIPS, 2017.
- On Evaluating and Comparing Conversational Agents, Anu Venkatesh, Chandra Khatri, Ashwin Ram, Fenfei Guo, Raefer Gabriel, Ashish Nagar, Rohit Prasad, Ming Cheng, Behnam Hedayatnia, Angeliki Metallinou, Rahul Goel, Shaohua Yang, and Anirudh Raju, NIPS, 2017.
- Topic-based Evaluation for Conversational Bots, Fenfei Guo, Angeliki Metallinou, Chandra Khatri, Anirudh Raju, Anu Venkatesh, and Ashwin Ram, NIPS, 2017.
- Learning Robust Dialog Policies in Noisy Environments, Maryam Fazel-Zarandi, Shang-Wen Li, Jin Cao, Jared Casale, David Whitney, and Alborz Geramifard, NIPS, 2017.
- Domain-Specific Utterance End-Point Detection for Speech Recognition, Roland Maas, Ariya Rastrow, Kyle Goehner, Gautam Tiwari, Shaun Joseph, Bjorn Hoffmeister, Interspeech, 2017.
- Zero-Shot Learning across Heterogeneous Overlapping Domains, Anjishnu Kumar, Pavankumar Muddireddy, Markus Dreyer, Bjorn Hoffmeister, Interspeech, 2017.
- Robust Speech Recognition Via Anchor Word Representations, Brian King, I-Fan Chen, Yonatan Vaizman, Yuzong Liu, Roland Maas, SHK (Hari) Parthasarathi, Bjorn Hoffmeister, Interspeech, 2017.
- Robust online i-vectors for unsupervised adaptation of DNN acoustic models: A study in the context of digital voice assistants, Harish Arsikere, Sri Garimella, Interspeech, 2017.
- Compressed time delay neural network for small-footprint keyword spotting, Ming Sun, David Snyder, Yixin Gao, Varun Nagaraja, Mike Rodehorst, Sankaran Panchapagesan, Nikko Strom, Spyros Matsoukas, Shiv Vitaladevuni, Interspeech, 2017.
- Transfer Learning for Neural Semantic Parsing, Xing Fan, Emilio Monti, Lambert Mathias, and Markus Dreyer, ACL 2017 Workshop on Representation Learning for NLP.
- Deep Learning Based Automatic Volume Control And Limiter System, Jun Yang, Philip Hilmes, Brian Adair, David W. Krueger, ICASSP, 2017.
- Anchored Speech Detection, Roland Maas, Sree Hari Krishnan Parthasarathi, Brian King, Ruitong Huang, Bjorn Hoffmeister, Interspeech, 2016.
- Multi-task learning and Weighted Cross-entropy for DNN-based Keyword Spotting, Sankaran Panchapagesan, Ming Sun, Aparna Khare, Spyros Matsoukas, Arindam Mandal, Bjorn Hoffmeister, Shiv Vitaladevuni, Interspeech, 2016.
- LatticeRNN: Recurrent Neural Networks over Lattices, Faisal Ladhak, Ankur Gandhe, Markus Dreyer, Lambert Mathias, Ariya Rastrow, Bjorn Hoffmeister, Interspeech, 2016.
- Optimizing Speech Recognition Evaluation Using Stratified Sampling, Janne Pylkkonen, Thomas Drugman, Max Bisani, Interspeech, 2016.
- Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models, Thomas Drugman, Janne Pylkkonen, Reinhard Kneser, Interspeech, 2016.
- Model Compression applied to small-footprint keyword spotting, George Tucker, Minhua Wu, Ming Sun, Sankaran Panchapagesan, Gengshen Fu, Shiv Vitaladevuni, Interspeech, 2016.
- Search-based Evaluation from Truth Transcripts for Voice Search Applications, Francois Mairesse, Paul Raccuglia, Shiv Vitaladevuni, SIGIR, 2016.
- Robust i-vector based Adaptation of DNN Acoustic Model for Speech Recognition, Sri Garimella, Arindam Mandal, Nikko Strom, Bjorn Hoffmeister, Spyros Matsoukas, Sree Hari Krishnan Parthasarathi, Interspeech, 2015.
- Scalable Distributed DNN Training Using Commodity GPU Cloud Computing, Nikko Strom, Interspeech, 2015.
- fMLLR based feature-space speaker adaptation of DNN acoustic models, Sree Hari Krishnan Parthasarathi, Bjorn Hoffmeister, Spyros Matsoukas, Arindam Mandal, Nikko Ström, Sri Garimella, Interspeech, 2015.
- Accurate Endpointing with Expected Pause Duration, Baiyang Liu, Bjorn Hoffmeister, Ariya Rastrow, Interspeech, 2015.
Do you want to give your vision for conversational AI a voice? If so, here are some hints on how you can join our team. Please check out our open positions below, ranging from speech and machine learning scientist to language data specialist and technical programme manager. We have hundreds of opportunities available in the following global locations:
Meet Amazonians working in Alexa AI
Meet the Alexa AI Team
Meet the science team that created Alexa.
What does customer-obsessed science look like at Alexa AI?
We asked Alexa AI scientists what customer-obsessed science means at Alexa AI.
Chart Your Own Path
Learn from former interns how they charted their career path as Scientists on the Alexa teams.
The people and teams behind Alexa AI
Learn more about the people and teams that are innovating on new capabilities for Alexa.
How Alexa AI is engaging with the broader science community
Learn how Alexa AI scientists are encouraged to publish research that will contribute to a future of more natural and engaging computing experiences.
Advice for scientists considering a role in Alexa AI
Five Alexa AI scientists share their tips for any scientists interested in joining us to create a better experience for Alexa’s customers.