Aleksa Gordić, ex-Google DeepMind/Microsoft ML engineer, discusses pioneering AI models for regional languages, focusing on the development of YugoGPT. They explore unique language dynamics in the Balkans, business opportunities for multilingual models, and challenges in deploying large language models. The conversation delves into Aleksa's experience with vision and image models, collaborations with tech players, and use of advanced technologies. They also discuss open sourcing models, the lack of language support, and advantages/limitations of working for a company.
Podcast summary created with Snipd AI
Quick takeaways
The need for creative approaches to adapt existing language models to different languages while preserving reasoning capabilities.
Opportunities in the multilingual space, where training and adapting large-scale language models opens up different markets.
The importance of creating language models for underrepresented languages to support cultural diversity and promote inclusivity.
Deep dives
Building non-English large language models
The speaker used to work as a research engineer at DeepMind and is now building non-English large language models and starting a company around it.
Challenges in training language models for underrepresented languages
One challenge is the lack of training data (tokens) for large language models in many underrepresented languages. The speaker discusses the need for creative approaches to adapt existing models to different languages while preserving their reasoning capabilities.
Opportunities in the multilingual space and the importance of language adaptation
The speaker highlights the opportunities in the multilingual space and the potential for tackling different markets. They emphasize the need for technical expertise in training large-scale language models and adapting them to various languages.
Creating Language Models for Underrepresented Languages
The episode discusses the importance of creating large language models for underrepresented languages. The guest, who is working on a language model for HBS languages, explains that although the focus has been predominantly on English, many languages around the world remain underserved. They highlight the need for models that support these languages in both written and spoken form, and emphasize that building them helps preserve and promote cultural diversity.
Choosing Startups Over Big Tech Corporations
The episode delves into the decision to leave big tech corporations like DeepMind and pursue startup opportunities. The guest shares their entrepreneurial mindset and the desire for high agency roles that allow them to leverage their skills and build what they want. They express that even though big tech offers lucrative compensation packages, the expected value of pursuing a startup is much higher. The guest emphasizes the importance of long-term vision and the potential for greater financial gain by taking risks and building one's own company.
Aleksa Gordić is an ex-Google DeepMind / Microsoft ML engineer currently working on non-English LLMs at OrtusAI, open-sourcing Meta's NLLB (No Language Left Behind) project and YugoGPT.
MLOps podcast #203 with Aleksa Gordić, Founder of OrtusAI, Pioneering AI Models for Regional Languages.
// Abstract
Dive deep into Aleksa's work on YugoGPT, a language model serving Serbian, Croatian, Bosnian, and Montenegrin dialects, and the need for multilingual AI development that it highlights.
Explore the unique language dynamics in the Balkans and Eastern Europe, the potential business opportunities around multilingual models, and the challenges in deploying large language models. Aleksa shares his experience with vision and image models, his collaborations with key tech players, and his use of advanced technologies. Hear about Aleksa Gordić's journey of being active and visible in the tech community and his insights into the world of machine learning and AI. Prepare to have your thinking challenged and horizons widened as we converse about the intriguing and complex world of MLOps.
// Bio
Working on non-English LLMs at OrtusAI, open-sourcing Meta's NLLB (no language left behind) project. Worked at DeepMind on the Flamingo project as a research engineer. Worked at Microsoft on the HoloLens 2 project & next-gen mixed reality glasses.
// MLOps Jobs board
https://mlops.pallet.xyz/jobs
// MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
Website: https://gordicaleksa.com/
https://github.com/gordicaleksa - I build stuff :)
https://discord.com/invite/peBrCpheKE - active AI Discord server (~6000 members) where I bring the best AI researchers in the world to give talks (James Betker, DALL-E 3 author; Tri Dao, Flash Attention; etc.)
https://gordicaleksa.medium.com/how-i-got-a-job-at-deepmind-as-a-research-engineer-without-a-machine-learning-degree-1a45f2a781de - how I landed a job at DeepMind (and a couple more potentially interesting writings)
Aleksa Gordić's The AI Epiphany YouTube channel: https://www.youtube.com/channel/UCj8shE7aIn4Yawwbo2FceCQ/videos
W&B AI Academy: http://wandb.me/mlops_com_llm_course
--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Aleksa on LinkedIn: https://www.linkedin.com/in/aleksagordic/
Timestamps:
[00:00] Aleksa's preferred coffee
[00:17] Takeaways
[02:51] Humming the GPUs
[06:23] Built Chrome extension for communicating with videos
[08:04] Rig Doubles Throughput Time
[09:32] Vector databases advice
[10:38] Learning from experts, connecting, and gathering insights.
[13:47] Zero to Hero for MLOps
[15:37] Serendipitous moments
[17:52] Depth Over Breaking News
[19:50] Trust in GPT Content
[22:22] Exam Challenges and AI
[26:53] YugoGPT
[31:41] WandB Ad
[33:33] Linguistic Mysteries
[34:52] No Language Left Behind project (NLLB project)
[36:53] YugoGPT Development Overview
[37:49] NLLB vs YugoGPT
[39:35] YugoGPT parameters
[41:16] Opportunities for unsupported languages
[43:08] Diffusion model
[44:39] Generative AI with image generation models
[47:45] AI Challenges and Excitement
[50:32] Challenges in different alphabet characters
[52:10] Need a co-founder
[56:05] Career transition and entrepreneurial mindset
[1:00:20] Big Tech salary misconceptions
[1:03:02] Inspiring wrap up