Interconnects

Arcee AI goes all-in on open models built in the U.S.

Jan 27, 2026
Lucas Atkins, CTO and pretraining/architecture lead at Arcee AI, talks about building Trinity Large and the six-month sprint to train a mixture-of-experts (MoE) model with 400B total and 13B active parameters. He covers model scaling, the choice of the Muon optimizer, training stability and expert balancing, dataset and compute planning on B300 hardware, and the deployment, licensing, and business tradeoffs of open U.S.-built models.
INSIGHT

Trinity Large: A Fast, Competitive U.S. MoE

  • Arcee trained Trinity Large (400B total, 13B active parameters) as a U.S.-built open MoE after a six-month sprint up from smaller models; the total-vs-active distinction is illustrated in the sketch after this list.
  • The run prioritized being a true base model and achieved competitive benchmark scores before any additional RL tuning.
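To ground the total-vs-active distinction, here is a minimal, hypothetical sketch of top-k MoE routing in Python. The expert count, top-k value, and layer sizes are toy assumptions for illustration, not Trinity Large's actual configuration; the point is that only the experts a token is routed to run, so the parameters touched per token ("active") are a small fraction of the layer's total.

```python
# Toy sketch of top-k mixture-of-experts routing (illustrative assumptions only;
# sizes below are not Trinity Large's actual configuration).
import numpy as np

rng = np.random.default_rng(0)

d_model = 8      # hidden size (toy value)
n_experts = 16   # total experts in the layer
top_k = 2        # experts activated per token

# Each expert is a small feed-forward block; only top_k of them run per token.
experts = [
    {"w_in": rng.standard_normal((d_model, 4 * d_model)),
     "w_out": rng.standard_normal((4 * d_model, d_model))}
    for _ in range(n_experts)
]
router = rng.standard_normal((d_model, n_experts))  # token -> expert logits

def moe_forward(x):
    """Route one token through its top_k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]                          # chosen experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen
    out = np.zeros(d_model)
    for w, idx in zip(weights, top):
        e = experts[idx]
        out += w * (np.maximum(x @ e["w_in"], 0.0) @ e["w_out"])  # ReLU FFN
    return out

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)  # (8,)

# Parameter accounting: total vs. per-token "active" expert parameters.
params_per_expert = sum(w.size for w in experts[0].values())
print("total expert params :", n_experts * params_per_expert)
print("active expert params:", top_k * params_per_expert)
```

The same accounting, scaled up, is why a 400B-total-parameter MoE can run with only about 13B parameters active per token.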
ANECDOTE

Small Models As Trial Runs

  • Arcee used Mini (26B total, 3B active) and Nano (6B total, 1B active) to de-risk large-scale MoE training and to dig into small-model research.
  • Mini and Nano surfaced practical issues in efficient MoE training before the 400B run.
ADVICE

Validate Large Bets With Small-Scale Evidence

  • When planning a large pretraining run, pick research ideas with precedent and validate them at small scale before committing.
  • Use trusted papers and community examples to make educated bets on optimizers and architectures.