

Streamlining Data Pipelines with MCP Servers and Vector Engines
Jul 15, 2025
Kacper Łukawski, a Senior Developer Advocate at Qdrant, specializes in vector databases for large language models. He dives into transforming unstructured data into valuable insights through semantic search and retrieval-augmented generation. Kacper explains the integration of MCP servers for optimizing data pipelines and discusses the challenges of managing embeddings. He also highlights innovative applications in coding practices and the complexities of vector search, offering practical advice on fine-tuning models and reducing costs for enhanced search quality.
Kacper's Big Data Beginnings
- Kacper Łukawski started working with big data pipelines around 2014-15 in the automotive industry.
- He used Apache tools like Spark, Kafka, and Oozie, expanding into data ingestion and visualization projects.
LLMs Are Not Magic Fixes
- Large language models (LLMs) do not automatically fix poor data quality or solve all data problems.
- Teams often struggle with scalability and deployment, especially in industries avoiding proprietary SaaS tools.
Ontology via Vector and Graph
- One user built a system combining vector embeddings and graphs to derive ontologies in law and medicine.
- This dual modeling approach captures both semantics and relationships, and is popular in graph-RAG scenarios; a rough sketch follows below.
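
The episode doesn't spell out an implementation, but a minimal sketch of pairing vector search with a relationship graph might look like the following. It assumes a local Qdrant instance via qdrant-client, a sentence-transformers embedding model, and networkx for the graph; the collection name, documents, and entities are illustrative only.

```python
# Minimal graph-RAG-style sketch: semantic retrieval from Qdrant plus
# relationship expansion over a co-mention graph of entities.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
from sentence_transformers import SentenceTransformer
import networkx as nx

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
client = QdrantClient(":memory:")                # in-memory instance for the sketch

client.create_collection(
    collection_name="concepts",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

# Toy domain snippets, each tagged with the entities it mentions (illustrative).
docs = [
    (1, "A contract requires offer, acceptance, and consideration.", ["contract", "consideration"]),
    (2, "Consideration is the value exchanged between contracting parties.", ["consideration", "parties"]),
    (3, "Hypertension is a major risk factor for stroke.", ["hypertension", "stroke"]),
]

graph = nx.Graph()
points = []
for doc_id, text, entities in docs:
    points.append(
        PointStruct(
            id=doc_id,
            vector=model.encode(text).tolist(),
            payload={"text": text, "entities": entities},
        )
    )
    # Link co-mentioned entities so relationships live outside the vector space.
    for a in entities:
        for b in entities:
            if a != b:
                graph.add_edge(a, b)

client.upsert(collection_name="concepts", points=points)

def query(question: str, top_k: int = 2) -> None:
    hits = client.query_points(
        collection_name="concepts",
        query=model.encode(question).tolist(),
        limit=top_k,
        with_payload=True,
    ).points
    # Expand each semantic hit with graph neighbours of its entities.
    for hit in hits:
        related = {n for e in hit.payload["entities"] for n in graph.neighbors(e)}
        print(hit.payload["text"], "| related concepts:", sorted(related))

query("What makes an agreement legally binding?")
```

The point of the split is that the embeddings answer "what is semantically similar?" while the graph answers "what is explicitly connected?", which is the dual view the ontology work described above relies on.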