Ephemeral databases are designed for efficient, flexible test data infrastructure, in contrast to redundant, always-on databases such as RDS. They start up and shut down quickly, which makes them well suited to automated, API-driven environments in CI/CD pipelines and to developers' ad hoc testing needs. They also shut down automatically after a period of inactivity, minimizing unnecessary costs. The underlying technology is fast, especially when combined with data subsetting. The databases come pre-populated from data snapshots, which streamlines recreating a database.
All robust technology platforms require testing to ensure that features work as intended. In many cases, tests require data, but getting access to valid, high-quality test data is a common challenge, especially when the technology runs on sensitive data. Realistically mimicking data that would normally contain sensitive financial or personal information is not easy.
Tonic.ai was started in 2018 to provide developer tools to transform production data into safe testing data. Andrew Colombi is the CTO and Adam Kamor is the Head of Engineering at Tonic. They join the show to talk about creating realistic synthetic data, data de-identification, validating LLM RAG output, Tonic’s subsetting engine, and much more.
Full Disclosure: This episode is sponsored by Tonic.
Gregor Vand is a security-focused technologist and the founder and CTO of Mailpass. Previously, Gregor was a CTO across cybersecurity, cyber insurance, and general software engineering companies. He has been based in Asia Pacific for almost a decade and can be found via his profile at vand.hk.
The post Tonic and Synthetic Data with Andrew Colombi and Adam Kamor appeared first on Software Engineering Daily.