
LLMs Are the Key to Unlocking the Next Generation of Search
The Data Exchange with Ben Lorica
The Future of Machine Learning in the Cloud
Amin Ahmad: I think it's not that far off. Asked whether these models will be able to run inference on existing hardware like CPUs or will need a new class of hardware, he says Vectara does a lot of model quantization, both at training time and at inference time. For instance, going from float32 to bfloat16 and similar reductions has been commonplace for years now.
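To make the float32-to-bfloat16 step concrete, here is a minimal sketch in plain Python. It is an illustration only, not Vectara's method: the function names are hypothetical, and it assumes round-to-nearest-even, the usual convention. bfloat16 keeps float32's sign bit and 8 exponent bits but only the top 7 mantissa bits, so the conversion amounts to rounding away the low 16 bits.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (hypothetical helper).

    Assumes round-to-nearest-even; NaN inputs are not handled specially.
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Bias so that discarding the low 16 bits rounds to nearest, ties to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(b: int) -> float:
    """Expand a bfloat16 bit pattern back to a Python float."""
    # Pad the low 16 bits with zeros to get a valid float32 encoding.
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


if __name__ == "__main__":
    for value in (1.0, 3.14159, -0.001, 12345.678):
        approx = bfloat16_bits_to_float(float32_to_bfloat16_bits(value))
        rel_err = abs(approx - value) / abs(value)
        print(f"{value!r} -> {approx!r} (relative error {rel_err:.2e})")
```

The storage cost per weight drops from 4 bytes to 2, while the exponent range stays the same as float32; the price is that each value carries only about two to three significant decimal digits.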