Friend of the show, Matt Williams, explains how to run local ChatGPT and GitHub Copilot clones using Ollama and Docker's GenAI Stack. Topics include setting up LLM stacks, deploying models, utilizing RAG for customized responses, and integrating Docker for GPU utilization.
Running open source generative AI models locally with Ollama and Docker enhances efficiency and data privacy.
Retrieval Augmented Generation (RAG) optimizes model responses by retrieving relevant data for more accurate outcomes.
Deep dives
Overview of LLMs and Ollama in Tech Development
Working with locally run large language models (LLMs) through tools like Ollama offers insight into developing with open source AI models and using Docker environments effectively. This episode digs into the value of running models locally for efficient development, highlighting how simple Ollama makes it to create and run models and the benefits of local setups for tools like ChatGPT clones.
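To make that concrete, here is a minimal sketch of talking to a locally running Ollama server from Python. It assumes Ollama is installed with its default port (11434) and that a model, llama2 in this example, has already been pulled; the model name and prompt are illustrative only.

```python
import requests

# Ask a locally running Ollama server for a completion.
# Assumes Ollama is installed and a model has already been
# pulled, e.g. with `ollama pull llama2`.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2",
        "prompt": "Explain what a Dockerfile is in one sentence.",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```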
Evolution of Ollama and Its Purpose
Ollama originated from the need to run AI models locally for better usability and privacy. Initially prototyped as a Python framework, Ollama was rebuilt in Go for simpler installation. Its design makes it easy to pull and run models locally, promoting faster responses than cloud-based counterparts while addressing concerns about data privacy and efficiency.
Understanding Retrieval Augmented Generation (RAG)
Retrieval Augmented Generation (RAG) is a strategy for improving model responses by retrieving relevant data from sources like Stack Overflow. RAG breaks text inputs into manageable chunks and converts them into numerical embeddings that can be compared computationally. Retrieving the chunks most relevant to a query and supplying them to the model guides it toward more accurate and better-informed answers.
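As a rough illustration of the idea, the sketch below chunks a couple of documents, embeds the chunks and the question with Ollama's embeddings endpoint, picks the most similar chunk by cosine similarity, and prepends it to the prompt. The model names (nomic-embed-text, llama2) and the sample chunks are assumptions for the example, not something prescribed in the episode.

```python
import requests

OLLAMA = "http://localhost:11434"

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    # Turn a chunk of text into a numerical embedding via Ollama's
    # embeddings endpoint (assumes the embedding model has been pulled).
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": model, "prompt": text}, timeout=60)
    r.raise_for_status()
    return r.json()["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    # Similarity between two embeddings.
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

# 1. Break source documents into manageable chunks and embed each one.
chunks = [
    "Ollama exposes a REST API on port 11434.",
    "Docker's GenAI Stack bundles Ollama with a sample RAG application.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Embed the user's question and retrieve the most similar chunk.
question = "What port does Ollama listen on?"
q_vec = embed(question)
best_chunk, _ = max(index, key=lambda item: cosine(q_vec, item[1]))

# 3. Feed the retrieved context plus the question to the chat model.
prompt = f"Answer using this context:\n{best_chunk}\n\nQuestion: {question}"
r = requests.post(f"{OLLAMA}/api/generate",
                  json={"model": "llama2", "prompt": prompt, "stream": False},
                  timeout=120)
print(r.json()["response"])
```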
Utilizing Ollama for Custom API Deployments
Once comfortable with an LLM, Ollama makes it simple to expose it as an API endpoint for broader access within applications. Users can interact with models efficiently by sending formatted requests to localhost. Ollama's compatibility with the OpenAI API standard offers versatility in deployment, supporting diverse programming environments and integrations.
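Because Ollama also exposes an OpenAI-compatible endpoint, existing OpenAI client code can be pointed at the local server. Below is a hedged sketch assuming the official openai Python package and a locally pulled llama2 model; the placeholder API key is required by the client but ignored by Ollama.

```python
from openai import OpenAI

# Point the standard OpenAI client at the local Ollama server instead of
# api.openai.com. Ollama's OpenAI-compatible endpoint lives under /v1.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

completion = client.chat.completions.create(
    model="llama2",  # any model already pulled with `ollama pull`
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Write a Dockerfile for a Flask app."},
    ],
)
print(completion.choices[0].message.content)
```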
Bret and Nirmal are joined by friend of the show, Matt Williams, to learn how to run your own local ChatGPT clone and GitHub Copilot clone with Ollama and Docker's "GenAI Stack" to build apps on top of open source LLMs.
We've designed this conversation for tech people like myself, who are no strangers to using LLMs in web products like ChatGPT, but are curious about running open source generative AI models locally and how they might set up their Docker environment to develop things on top of these open source LLMs.
Matt Williams walks us through all the parts of this solution and, with detailed explanations, shows us how Ollama makes it easier to set up LLM stacks on Mac, Windows, and Linux.