Episode 12
Why Multi-Modality is the Future of Machine Learning w/ Letitia Parcalabescu (University of Heidelberg, AI Coffee Break)
Letitia Parcalabescu is a PhD candidate at the University of Heidelberg focused on multi-modal machine learning, specifically combining vision and language.
Learn more about Letitia:
https://www.cl.uni-heidelberg.de/~parcalabescu/
https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA
Every Thursday I send out the most useful things I’ve learned, curated specifically for the busy machine learning engineer. Sign up here: http://bitly.com/mle-newsletter
Follow Charlie on Twitter: https://twitter.com/CharlieYouAI
Take the Giving What We Can Pledge: https://www.givingwhatwecan.org/
Subscribe to ML Engineered: https://mlengineered.com/listen
Comments? Questions? Submit them here: http://bitly.com/mle-survey
Timestamps:
01:30 Follow Charlie on Twitter (https://twitter.com/CharlieYouAI)
02:40 Letitia Parcalabescu
03:55 How she got started in CS and ML
07:20 What is multi-modal machine learning? (https://www.youtube.com/playlist?list=PLpZBeKTZRGPNKxoNaeMD9GViU_aH_HJab)
16:55 Most exciting use-cases for ML
20:45 The 5 stages of machine understanding (https://www.youtube.com/watch?v=-niprVHNrgI)
23:15 The future of multi-modal ML (GPT-50?)
27:00 The importance of communicating AI breakthroughs to the general public
37:40 Positive applications of the future “GPT-50”
43:35 Letitia’s CVPR paper on phrase grounding (https://openaccess.thecvf.com/content_CVPRW_2020/papers/w56/Parcalabescu_Exploring_Phrase_Grounding_Without_Training_Contextualisation_and_Extension_to_Text-Based_CVPRW_2020_paper.pdf)
53:15 ViLBERT: is attention all you need in multi-modal ML? (https://arxiv.org/abs/1908.02265)
57:00 Preventing “modality dominance”
01:03:25 How she keeps up in such a fast-moving field
01:10:50 Why she started her AI Coffee Break YouTube Channel (https://www.youtube.com/c/AICoffeeBreakwithLetitiaParcalabescu/)
01:18:10 Rapid fire questions
Links:
AI Coffee Break YouTube Channel: https://www.youtube.com/c/AICoffeeBreakwithLetitiaParcalabescu/
Exploring Phrase Grounding without Training: https://openaccess.thecvf.com/content_CVPRW_2020/papers/w56/Parcalabescu_Exploring_Phrase_Grounding_Without_Training_Contextualisation_and_Extension_to_Text-Based_CVPRW_2020_paper.pdf
AI Coffee Break series on multi-modal learning: https://www.youtube.com/playlist?list=PLpZBeKTZRGPNKxoNaeMD9GViU_aH_HJab
What does it take for an AI to understand language?: https://www.youtube.com/watch?v=-niprVHNrgI
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations: https://arxiv.org/abs/1908.02265