In this video I speak with Felix GV, a Principal Staff Engineer at LinkedIn who has made major contributions to LinkedIn's data infrastructure, including VeniceDB.
This episode explains why a new database was needed to store and serve "derived data" with low latency and high performance, which is especially important for machine learning workloads.
Chapters:
00:00 Introduction
01:42 The Evolution of LinkedIn's Databases
03:15 Challenges with Voldemort and the Birth of VeniceDB
08:42 Understanding Derived Data
13:33 Planet-Scale Applications and Multi-Region Support
17:40 Writing Data into VeniceDB
22:53 Merging Data in VeniceDB
40:31 Understanding the Architecture
40:47 Components of the Write Path
41:56 Leader and Follower Architecture
43:58 Partitioning and DaVinci Client
47:57 Read Patterns and Client Options
54:25 Fault Tolerance and Recommender Systems
01:01:19 Kafka Integration and Deployment
01:06:56 Roadmap and Future Improvements
Important links:
VeniceDB blog: https://www.linkedin.com/blog/engineering/open-source/open-sourcing-venice-linkedin-s-derived-data-platform
VeniceDB docs: https://venicedb.org/
QCon talk: https://youtu.be/pJeg4V3JgYo?si=vblGUxp5fNdKPHoC
Follow me on LinkedIn and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#kafka #linkedin #venicedb #Rocksdb