VictoriaMetrics internals - Making monitoring simple and reliable at massive scale
Jan 20, 2024
Join creators Alex and Roman for an insightful discussion of VictoriaMetrics, a highly scalable monitoring solution and time series database. Explore its origins, evolution, architecture, data ingestion, and integration. Learn about the role of object storage and the importance of indexing, follow the data ingestion and select processes, and hear about future plans for VictoriaMetrics.
The architecture of VictoriaMetrics consists of ingestion nodes, storage nodes, and select nodes, which work together to handle data ingestion and query processing.
Data can be ingested into VictoriaMetrics using a pull or a push approach, with support for various ingestion protocols and a metrics collector called vmagent.
Select nodes in VictoriaMetrics fan out user queries to all storage nodes, retrieve and process data blocks, perform aggregations and filtering, and return a JSON response to the user.
Deep dives
VictoriaMetrics architecture and scalability
The cluster architecture of VictoriaMetrics consists of ingestion nodes (vminsert), storage nodes (vmstorage), and select nodes (vmselect). Ingestion nodes use consistent hashing to decide which storage node receives the data. Storage nodes buffer incoming data for one second before writing it to disk, and build indexes to speed up data retrieval. Select nodes receive user queries and fan each query out to all storage nodes, then process the returned data into a JSON response. VictoriaMetrics scales horizontally by adding more storage and select nodes. The system is designed to be robust and can handle failures of individual nodes without disrupting data ingestion or query processing.
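The routing step described above can be sketched in Python. This is a minimal illustration of consistent hashing in general, not the algorithm VictoriaMetrics itself uses; the `HashRing` class, the `series_key` helper, the node names, and the choice of 64 virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring point at or after the key's hash."""
        idx = bisect.bisect_left(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


def series_key(name, labels):
    """Canonical key: metric name plus sorted labels, so label order never matters."""
    return name + "{" + ",".join(f'{k}="{v}"' for k, v in sorted(labels.items())) + "}"
```

For example, `HashRing(["storage-1", "storage-2"]).node_for(series_key("cpu_usage", {"host": "web-1"}))` returns the same node on every call. When a node is added to the ring, only the keys that now fall to the new node are reassigned, which is why this kind of scheme suits horizontal scaling.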
Integration and ingestion of data in VictoriaMetrics
Data can be ingested into VictoriaMetrics using a pull or a push approach. For the pull approach, VictoriaMetrics is compatible with the Prometheus ecosystem and can scrape metrics from targets. For the push approach, it supports various ingestion protocols, such as the InfluxDB line protocol, JSON, Graphite, and more. VictoriaMetrics also provides a metrics collector called vmagent, which can be deployed close to the services to buffer data before sending it. This allows for offline operation and evaluation of alerting rules. The ingestion process uses consistent hashing to determine which storage node to write the data to, and data is flushed to disk and compressed for efficient storage.
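As a rough sketch of the push approach with local buffering, the Python below formats samples in the InfluxDB line protocol and holds them in memory until a send succeeds. It is an illustration only and not how vmagent is implemented: `to_line`, `escape_tag`, and `BufferedSender` are hypothetical helpers, all field values are written as floats for simplicity, and the actual network call is left to a caller-supplied `send` function.

```python
def escape_tag(value: str) -> str:
    """Escape the separators that line protocol reserves in tag and field keys/values."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement,tag=value field=value [timestamp]."""
    head = measurement.replace(",", "\\,").replace(" ", "\\ ")
    for key, value in sorted(tags.items()):
        head += f",{escape_tag(key)}={escape_tag(value)}"
    body = ",".join(f"{escape_tag(k)}={float(v)}" for k, v in fields.items())
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


class BufferedSender:
    """Keeps lines in memory until `send` succeeds (hypothetical helper)."""

    def __init__(self, send):
        self._send = send      # callable taking one newline-joined payload
        self._pending = []

    @property
    def pending(self) -> int:
        return len(self._pending)

    def add(self, line: str) -> None:
        self._pending.append(line)

    def flush(self) -> bool:
        """Try to send everything; keep the batch if the send fails."""
        if not self._pending:
            return True
        try:
            self._send("\n".join(self._pending))
        except OSError:
            return False       # e.g. the remote storage is unreachable; retry later
        self._pending.clear()
        return True
```

In a real deployment `send` would be an HTTP POST of the payload to the ingestion endpoint. The VictoriaMetrics documentation describes an InfluxDB-compatible `/write` endpoint for this purpose, but the exact URL and port depend on the setup and should be checked against the docs.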
Querying data in VictoriaMetrics
When a user query is received, the select nodes fan it out to all storage nodes. Each storage node checks whether it has the requested data and responds accordingly. Data blocks are read from disk and transferred back to the select node, which performs aggregations, transformations, and filtering to form the JSON response returned to the user. Streaming aggregation and parallel processing across multiple CPU cores are used to optimize query performance. VictoriaMetrics supports both instant queries and range queries, and has a rollup cache for caching query results.
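The fan-out and merge flow described above might look roughly like the following Python sketch. It is a toy model under stated assumptions: `StorageNode` is an in-memory stand-in rather than a real storage node, `select_sum` and `handle_query` are hypothetical names, the only aggregation shown is a sum per timestamp, and the JSON shape is invented for the example rather than taken from the real query API.

```python
import json
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class StorageNode:
    """In-memory stand-in for a storage node: series key -> [(timestamp, value)]."""

    def __init__(self, series):
        self._series = series

    def search(self, metric, start, end):
        """Return matching samples in [start, end]; an empty dict if nothing matches."""
        result = {}
        for key, samples in self._series.items():
            if key.startswith(metric + "{"):
                hits = [(t, v) for t, v in samples if start <= t <= end]
                if hits:
                    result[key] = hits
        return result


def select_sum(nodes, metric, start, end):
    """Fan out to every node in parallel, then sum values per timestamp."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        partials = list(pool.map(lambda n: n.search(metric, start, end), nodes))
    totals = defaultdict(float)
    for partial in partials:
        for samples in partial.values():
            for t, v in samples:
                totals[t] += v
    return [[t, totals[t]] for t in sorted(totals)]


def handle_query(nodes, metric, start, end):
    """Wrap the aggregated result in a JSON response (made-up shape)."""
    return json.dumps({"status": "success", "data": select_sum(nodes, metric, start, end)})
```

Every node is asked, including nodes that hold no matching series, which mirrors the "each storage node checks if it has the requested data" step. A range query corresponds to a wide `[start, end]` window, while an instant query can be modeled as a window that covers a single evaluation point.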
Use Cases for VictoriaMetrics
VictoriaMetrics is commonly used for monitoring systems in Kubernetes environments, where it collects and stores infrastructure and application metrics. However, its use cases extend well beyond monitoring: it can be used for IoT data collection, logging, automotive applications, and more. Any data that has a timestamp and can be represented as a time series can be stored and queried using VictoriaMetrics.
Future Developments for VictoriaMetrics
The VictoriaMetrics team is actively developing VictoriaLogs, a log management solution that integrates with popular log shippers and supports full-text search. The team plans to release a cluster version of VictoriaLogs and to add more features for transformation and aggregation during queries. Additionally, the team is expanding the cloud version of VictoriaMetrics on AWS and continuously improving the open-source project based on user feedback and requirements.
Deep Dive into VictoriaMetrics with Alex and Roman
Join the insightful discussion with VictoriaMetrics creators Alex and Roman on The GeekNarrator podcast, hosted by Kaivalya Apte. This episode explores the internals of VictoriaMetrics, a highly scalable monitoring solution and time series database. Discover the origins of VictoriaMetrics, understand how it evolved, and learn about its unique architecture and functionality. From the concept of time series and the use of consistent hashing in data distribution to real-world applications, it's all packed into this engaging conversation.
00:00 Introduction
01:52 The Genesis of VictoriaMetrics
02:18 The Journey from Postgres to Clickhouse
03:19 The Transition from Prometheus to VictoriaMetrics
05:08 The Birth and Evolution of VictoriaMetrics
13:01 The Architecture of VictoriaMetrics
20:10 Data Ingestion and Integration in VictoriaMetrics
29:15 Understanding the VictoriaMetrics Architecture
30:30 Comparing Shared Storage and Object Store
31:00 Designing the VictoriaMetrics Architecture
32:01 The Role of Object Storage
36:15 The Importance of Indexing
43:19 Understanding the Ingestion Process
45:46 Exploring the Select Process
55:55 Future Plans for VictoriaMetrics
Important Links:
1. Architecture Overview: https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#architecture-overview
2. How ClickHouse Inspired Us to Build a High Performance Time Series Database: https://altinity.com/wp-content/uploads/2021/11/How-ClickHouse-Inspired-Us-to-Build-a-High-Performance-Time-Series-Database.pdf
3. Frequently asked questions: https://docs.victoriametrics.com/FAQ.html
===============================================================================
For a discount on the courses below:
Appsync: https://appsyncmasterclass.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Testing serverless: https://testserverlessapps.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Production-Ready Serverless: https://productionreadyserverless.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Click the "Add Discount" button and enter the discount code "geeknarrator" to get a 20% discount.
===============================================================================
Follow me on LinkedIn and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network.
Also, please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!