

TGC's AI Christian Benchmark, with Michael Graham
Oct 8, 2025
Michael Graham, Executive Director of the Keller Center, discusses his work on the TGC AI Christian Benchmark, which examines how major AI models respond to theological questions. Surprisingly, DeepSeek, a Chinese model, scored highest, which prompts a discussion of why results vary so much among the models. The conversation explores AI's potential in pastoral care, the ethical implications for Christians, and the importance of human oversight. Graham also emphasizes the need for discernment and training when using AI for church-related tasks.
AI Snips
Benchmarking AIs On Basic Christian Queries
- The Keller Center tested seven top AI models on basic Christian questions to see how they answer by default, without prompt engineering.
- They focused on missiological impact for everyday users like Michael Graham's 70-year-old mother.
Testing With 'My 70-Year-Old Mom' In Mind
- Michael used his 70-year-old mother as an example of an everyday user who won't prompt-engineer and needs reliable default answers.
- The benchmark simulated such blank-slate users asking common Google-style questions.
Human Alignment Explains Model Differences
- Alignment choices by humans drive wide variance in theological reliability across models despite similar underlying LLM tech.
- Graham found 36 alignment protocols with 32 human-centric steps shaping outputs and citations.