Semih Salihoğlu, CEO of Kuzu and professor at the University of Waterloo, shares insights into the fascinating history of graph databases. He traces their evolution from early systems like IDS to modern architectures such as Neo4j and Kùzu. The conversation illuminates the impact of the World Wide Web on document stores like MongoDB and explores RDF as a flexible reasoning system tied to the semantic web. Semih also discusses the future of graph databases, particularly their integration with AI, emphasizing the importance of explainability and advancements in querying capabilities.
The evolution of graph databases originated with the Integrated Data Store (IDS), establishing core concepts that still underpin modern database management systems.
Modern property graph databases, exemplified by Neo4j, emphasize user experience and support analytics that exploit the interconnectedness of data more naturally than traditional SQL.
Deep dives
Origins of Graph Databases
The history of graph databases traces back to the Integrated Data Store (IDS), the first database management system, developed between 1961 and 1964 at General Electric. IDS used a network model in which records could be linked together, much as nodes in a graph are connected. Instead of searching sequentially through files, users interacted with data through a logical model that abstracted away physical storage details. By formally defining how records could be linked, IDS laid the groundwork for modern database management, and that principle remains foundational in current graph database systems.
Legacy and Impact of IDS
IDS established several core concepts that continue to influence database technology today, including the notion of a logical data model, primary keys for unique identification of records, and basic query languages for data manipulation. The model allowed users to define records without needing to know physical storage details, further simplifying data retrieval. Additionally, the practice of explicitly linking records has a parallel in modern databases, where explicit edges between nodes enhance performance and facilitate more complex queries. This historical context highlights how foundational ideas from the early days of database management still resonate within current technologies.
Transition to Commercial Systems
The transition from early database systems like IDS to commercial offerings is marked by IBM's Information Management System (IMS), which introduced a hierarchical model as a restriction of IDS's more flexible network model. IMS was highly successful and generated significant revenue, showcasing the viability of graph-based concepts in commercial environments. However, while it maintained some graph-like qualities, IMS limited the linking of records to hierarchical relationships, contrasting with the more dynamic linking capabilities of IDS. This development emphasized the importance of both flexibility and structure within database management systems during the evolution of graph technologies.
Emergence of Property Graphs and Modern Trends
Neo4j popularized the property graph model, which maintained simplicity while introducing a graph-based logical structure that appeals to users accustomed to relational databases. This model allows for complex path traversals and analysis, enabling queries that exploit the interconnectedness of data in a more natural manner than traditional SQL. While legacy systems laid the groundwork, modern property graph implementations emphasize user experience and advanced analytics, moving well beyond the constraints of earlier navigational systems. As graph databases evolve, the adoption of AI and advancements in explainable AI point toward a promising future where graph technologies play an integral role in intelligent data analysis.
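To make the idea of a path traversal over a property graph concrete, here is a minimal sketch in plain Python. It is not the API of Neo4j, Kùzu, or any other system: the node and edge data, the `KNOWS` and `WORKS_AT` edge types, and the `reachable` helper are all hypothetical, chosen only to show nodes and edges carrying properties and a bounded multi-hop query following explicit links.

```python
from collections import deque

# Hypothetical toy data: both nodes and edges carry properties.
nodes = {
    "alice": {"label": "Person", "name": "Alice"},
    "bob": {"label": "Person", "name": "Bob"},
    "carol": {"label": "Person", "name": "Carol"},
    "acme": {"label": "Company", "name": "Acme"},
}
edges = [
    ("alice", "bob", {"type": "KNOWS", "since": 2019}),
    ("bob", "carol", {"type": "KNOWS", "since": 2021}),
    ("carol", "acme", {"type": "WORKS_AT", "since": 2020}),
]


def build_adjacency(edge_list):
    """Index edges by source node so a traversal can follow links directly."""
    adjacency = {}
    for src, dst, props in edge_list:
        adjacency.setdefault(src, []).append((dst, props))
    return adjacency


def reachable(start, edge_type, max_hops, adjacency):
    """Breadth-first search: map each node reachable from `start` over
    `edge_type` edges within `max_hops` hops to its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor, props in adjacency.get(current, []):
            if props["type"] == edge_type and neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    del hops[start]  # the start node is not part of the result
    return hops


adjacency = build_adjacency(edges)
friends_within_two = reachable("alice", "KNOWS", 2, adjacency)
print([nodes[n]["name"] for n in friends_within_two])
```

In a graph database, a declarative query language would typically express this kind of variable-length pattern in a single clause, whereas in SQL it usually requires a chain of self-joins or a recursive query.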
Listen to Amy Hodler interview Semih Salihoğlu, CEO of Kuzu and professor at the University of Waterloo, to learn about the fascinating history of graphs through the lens of database management systems. In this podcast, Semih walks us through the evolution of systems: from the first database system, IDS, to modern property graph databases, such as Neo4j and Kùzu.
You’ll learn about the connections between the birth of the World Wide Web and document-based datasets and document stores, such as MongoDB. Amy and Semih also discuss the flexibility of RDF as a reasoning system and its ties to the semantic web.
In this quick history of graph databases, you’ll discover the roots of features that we see in modern graph database management systems and gain an appreciation for the collective innovations.