Cascaded multilingual audio-visual learning from videos. Andrew Rouditchenko, Angie Boggust, et al. 2021. INTERSPEECH 2021. Conference paper.
Grounding spoken words in unlabeled video. Angie Boggust, Kartik Audhkhasi, et al. 2019. CVPRW 2019. Conference paper.