
161: Leveraging Generative AI Models with Hagay Lupesko

Programming Throwdown


The Cost of Running Complex Workloads

With regular deep learning models, like predicting the probability of an event, you would want to serve on the CPU, because you don't have a batch at inference the way you do in training. So there has been a lot of interesting work by the community allowing you to run these models on commodity hardware. There's something called llama.cpp, I think, that someone hacked together, where it's a super efficient implementation of inference for LLaMA on a commodity CPU.
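The batching point above can be sketched numerically: at serving time a request usually arrives alone (batch size 1), so each layer reduces to a matrix-vector product that a CPU handles reasonably well, whereas training processes a batch at once as a matrix-matrix product, which is where GPUs pay off. This is an illustrative sketch with made-up shapes, not code from the episode:

```python
import numpy as np

# One weight matrix of a hypothetical model layer (shape is an assumption).
hidden = np.random.rand(1024, 1024).astype(np.float32)

# Serving: a single request, batch size 1 -> matrix-vector product.
x_serve = np.random.rand(1024).astype(np.float32)
y_serve = hidden @ x_serve          # cheap enough to run on a commodity CPU
assert y_serve.shape == (1024,)

# Training: a whole batch of inputs -> matrix-matrix product,
# which is the regime where GPU throughput matters.
X_train = np.random.rand(256, 1024).astype(np.float32)
Y_train = X_train @ hidden
assert Y_train.shape == (256, 1024)
```

Projects like llama.cpp push this further with quantization and CPU-specific kernels, but the basic reason CPU serving is viable at all is this batch-size-1 shape of the workload.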

