Megatron-LM by NVIDIA
Research on training transformer language models at scale, including BERT:
Personal VAD by Google
https://arxiv.org/abs/1908.04284
Personal VAD: Speaker-Conditioned Voice Activity Detection
Shaojin Ding, Quan Wang, Shuo-yiin Chang, Li Wan, Ignacio Lopez Moreno
Comments: To be submitted to ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Machine Learning (stat.ML)
Affordable AI
Concern: https://arxiv.org/abs/1906.02243
Good examples:
- $40 to train ImageNet: https://www.fast.ai/2018/08/10/fastai-diu-imagenet
- Fastest end-to-end ASR: https://github.com/facebookresearch/wav2letter
- Neural architecture search in 1.5 days on a single GPU: https://arxiv.org/abs/1806.09055