Data Engineering Podcast

Semantic Operators Meet Dataframes: Building Context for Agents with FENIC

Jan 12, 2026
Kostas Pardalis, a data infrastructure engineer and founder, discusses Fenic, a DataFrame engine designed for LLM-powered data workflows. He explains the limitations of traditional data infrastructure and introduces semantic operators that turn unstructured data into structured columns that conform to a schema. Kostas covers Fenic's architecture, lazy DataFrame plans, and optimizer design, and how they support context management for agents. He also shares practical use cases and discusses integrating Fenic with other frameworks for scalable, reliable data solutions.
INSIGHT

Treat Inference As First-Class Compute

  • Fenic is a PySpark-inspired dataframe engine that treats LLM inference as first-class compute in the logical plan.
  • This lets the optimizer reason about inference, reorder operations, and improve efficiency and fault tolerance.
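
To make the reordering idea concrete, here is a minimal sketch in plain Python. It is not the engine's API: `Filter`, `KeywordModel`, `optimize`, and `execute` are hypothetical names, and the "model" is a keyword check standing in for a real LLM call. The only point is that once a plan node is marked as inference, an optimizer can push cheap deterministic filters ahead of it and cut the number of model calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Filter:
    """One filter node in a toy logical plan (hypothetical, for illustration)."""
    name: str
    predicate: Callable[[dict], bool]
    is_inference: bool  # True when the predicate calls a model


class KeywordModel:
    """Stand-in for an LLM: counts calls and answers with a keyword check."""

    def __init__(self):
        self.calls = 0

    def is_complaint(self, row: dict) -> bool:
        self.calls += 1
        return "refund" in row["text"].lower()


def optimize(plan):
    """Move deterministic filters ahead of inference filters (stable sort)."""
    return sorted(plan, key=lambda node: node.is_inference)


def execute(plan, rows):
    """Apply each filter node in order."""
    for node in plan:
        rows = [r for r in rows if node.predicate(r)]
    return rows


rows = [
    {"lang": "en", "text": "I want a refund"},
    {"lang": "de", "text": "Refund bitte"},
    {"lang": "en", "text": "Great product"},
    {"lang": "fr", "text": "Merci"},
]

# Plan as the user wrote it: the expensive semantic filter comes first.
naive_model = KeywordModel()
naive_plan = [
    Filter("semantic: is complaint", naive_model.is_complaint, True),
    Filter("lang == 'en'", lambda r: r["lang"] == "en", False),
]
naive_result = execute(naive_plan, rows)

# Same plan, reordered by the optimizer before execution.
opt_model = KeywordModel()
opt_plan = optimize([
    Filter("semantic: is complaint", opt_model.is_complaint, True),
    Filter("lang == 'en'", lambda r: r["lang"] == "en", False),
])
opt_result = execute(opt_plan, rows)

print(naive_model.calls, opt_model.calls)
```

Both plans return the same rows, but the reordered plan only sends rows that survived the language filter to the model. A real optimizer would also need to check that the reordering is safe, which this sketch takes for granted because both nodes are independent filters.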
INSIGHT

Make Implicit Structure Explicit

  • Unstructured data usually contains implicit structure that LLMs can make explicit as schema columns.
  • Turning implicit structure into explicit schema lets you mix LLMs with deterministic processing for scale and reliability.
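
A rough sketch of what "implicit structure into explicit schema" can look like, in plain Python. Everything here is an assumption made for illustration: the `Ticket` schema, the `extract_ticket` helper, and the sample texts are invented, and a regex stands in for the schema-guided LLM extraction so the example runs offline. Once the fields exist as typed columns, the remaining steps are ordinary deterministic processing.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ticket:
    """Hypothetical target schema: the columns we want made explicit."""
    product: Optional[str]
    severity: Optional[str]


def extract_ticket(text: str) -> Ticket:
    """Stand-in for an LLM extraction call that fills the schema from free text."""
    product = re.search(r"\b(billing|login|export)\b", text, re.IGNORECASE)
    severity = re.search(r"\b(urgent|minor)\b", text, re.IGNORECASE)
    return Ticket(
        product=product.group(1).lower() if product else None,
        severity=severity.group(1).lower() if severity else None,
    )


raw = [
    "URGENT: login page returns 500 since this morning",
    "Minor typo on the billing screen",
    "Love the new export feature!",
]

# Non-deterministic step (here simulated): unstructured text -> typed rows.
table = [extract_ticket(text) for text in raw]

# Deterministic steps: plain filtering and aggregation over explicit columns.
urgent_products = [t.product for t in table if t.severity == "urgent"]
severity_counts = Counter(t.severity for t in table)

print(urgent_products, dict(severity_counts))
```

The design choice worth noting is the boundary: the model is asked for a fixed schema once per row, and everything downstream can be tested, cached, and scaled like any other tabular pipeline.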
INSIGHT

Laziness Enables LLM-Aware Optimization

  • Fenic composes lazy dataframe expressions into logical plans, optimizes them, and executes via Polars or DuckDB.
  • Semantic operators (filter, extract, join) are exposed so the optimizer can apply LLM-aware optimizations.
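
The laziness described above can be sketched with a toy frame that records operations instead of running them. This is not the engine's API: `LazyFrame`, `semantic_filter`, `explain`, `collect`, and `KeywordModel` are hypothetical names, and the model is a keyword check. The one LLM-aware optimization shown (judging each distinct text only once) is chosen purely for illustration and is not claimed to be what the engine does.

```python
class KeywordModel:
    """Stand-in for an LLM: counts calls and answers with a keyword check."""

    def __init__(self):
        self.calls = 0

    def __call__(self, instruction, text):
        self.calls += 1
        return "refund" in text.lower()


class LazyFrame:
    """Toy lazy frame: methods record plan steps; nothing runs until collect()."""

    def __init__(self, rows, plan=()):
        self._rows = rows
        self._plan = tuple(plan)

    def filter(self, predicate):
        # Record a deterministic filter; do not evaluate it yet.
        return type(self)(self._rows, self._plan + (("filter", predicate),))

    def semantic_filter(self, instruction, model):
        # Record a model-backed filter as an explicit, inspectable plan step.
        step = ("semantic_filter", instruction, model)
        return type(self)(self._rows, self._plan + (step,))

    def explain(self):
        """Return the operator names in plan order."""
        return [step[0] for step in self._plan]

    def collect(self):
        """Execute the recorded plan."""
        rows = list(self._rows)
        for step in self._plan:
            if step[0] == "filter":
                rows = [r for r in rows if step[1](r)]
            else:
                _, instruction, model = step
                # Judge each distinct text once, then reuse the verdict.
                distinct = {r["text"] for r in rows}
                verdicts = {t: model(instruction, t) for t in distinct}
                rows = [r for r in rows if verdicts[r["text"]]]
        return rows


model = KeywordModel()
rows = [
    {"id": 1, "text": "Refund please"},
    {"id": 2, "text": "Refund please"},
    {"id": 3, "text": "Thanks!"},
    {"id": 4, "text": "short"},
]

lazy = (
    LazyFrame(rows)
    .filter(lambda r: len(r["text"]) > 5)
    .semantic_filter("Is this a refund request?", model)
)

calls_before_collect = model.calls  # building the plan calls no model
result = lazy.collect()
print(lazy.explain(), [r["id"] for r in result], model.calls)
```

Because the semantic step is a named node in the plan rather than an opaque function call, the executor knows where inference happens and can apply this kind of optimization without the user changing their code.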