
Llama 3: Scaling open LLMs to AGI
Interconnects
Exploring Model Training and Fine-Tuning Strategies for Long Context Behavior
This segment covers training models on token sequences so that self-attention does not cross document boundaries, the importance of long-context behavior, extending context length through training, variation in data usage, the impact of fine-tuning, and the influence of curated data quality.
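As a rough illustration of what keeping self-attention from crossing document boundaries can look like, here is a minimal NumPy sketch. It assumes several documents are concatenated into one training sequence and separated by an end-of-sequence token. The function names and the `eos_id` value are hypothetical and chosen for illustration; the episode summary does not specify an implementation.

```python
import numpy as np

def doc_ids_from_tokens(tokens, eos_id):
    """Assign a document index to every token in a concatenated sequence.

    Assumption: each document ends with `eos_id`, and the EOS token
    belongs to the document it closes.
    """
    tokens = np.asarray(tokens)
    is_eos = tokens == eos_id
    # The document index increases on the token *after* each EOS.
    return np.concatenate([[0], np.cumsum(is_eos)[:-1]])

def document_causal_mask(doc_ids):
    """Boolean mask of shape (T, T); True means query i may attend to key j.

    A position may attend only to earlier-or-equal positions (causal)
    that lie in the same document.
    """
    doc_ids = np.asarray(doc_ids)
    n = len(doc_ids)
    same_doc = doc_ids[:, None] == doc_ids[None, :]
    causal = np.tril(np.ones((n, n), dtype=bool))
    return same_doc & causal

# Three documents in one sequence, with 0 as the assumed EOS id.
tokens = [5, 6, 0, 7, 8, 9, 0, 4]
doc_ids = doc_ids_from_tokens(tokens, eos_id=0)
mask = document_causal_mask(doc_ids)
```

In an attention layer, positions where the mask is `False` would typically have their scores set to a large negative value before the softmax, so each token only sees earlier tokens from its own document.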