Mercado Libre built Fury Data Apps, an extension of its Fury platform for machine-learning solutions that already supports about 500 users. They discuss platform features, the supporting technology, and why Carlos de la Torre never accepted a LinkedIn request. The podcast covers challenges in MLOps, expansion within Mercado Libre, and the evolution of machine learning practices in Latin America.
Podcast summary created with Snipd AI
Quick takeaways
Fury Data Apps extends the Fury platform to support machine-learning solutions for about 500 users.
Plans to integrate with external teams for innovative feature development and open-source collaborations.
Automation drives efficient monitoring processes while enabling user-defined flexibility and risk management checks.
The scalability focus shifts from microservices to an infrastructure-driven model using cloud services like AWS Kinesis.
Team emphasizes collaboration, feedback, and open-source contributions to drive innovation and platform enhancements.
Deep dives
Building a Scalable Monitoring Solution for Machine Learning Models
Automation and metrics tracking are crucial components of the monitoring solution being developed for machine learning models. The team is focused on automating ETL processes, training, and batch processing of model inferences. The current MVP collects data for statistical and business checks on model inputs and outputs, ensuring performance evaluations. Plans include scaling the monitoring solution to handle large traffic spikes and integrating with existing cloud services for data processing.
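As a rough illustration of what "statistical and business checks on model inputs and outputs" could look like, here is a minimal sketch using only the Python standard library. The function names, the three-sigma threshold, and the sample data are assumptions made for this example; the episode does not describe the actual checks used in Fury Data Apps.

```python
from statistics import mean, pstdev


def mean_shift_check(reference, current, max_sigmas=3.0):
    """Statistical check (hypothetical): pass when the current batch mean
    stays within max_sigmas reference standard deviations of the reference mean."""
    ref_mean = mean(reference)
    ref_std = pstdev(reference)
    if ref_std == 0:
        # Degenerate reference: only an identical mean passes.
        return mean(current) == ref_mean
    shift = abs(mean(current) - ref_mean) / ref_std
    return shift <= max_sigmas


def range_check(predictions, low=0.0, high=1.0):
    """Business check (hypothetical): every model output must lie in [low, high]."""
    return all(low <= p <= high for p in predictions)


# Illustrative data, not taken from the episode.
reference_inputs = [10.0, 12.0, 11.0, 9.0, 13.0]
current_inputs = [10.5, 11.5, 12.0, 10.0]
predictions = [0.1, 0.8, 0.4]

report = {
    "input_mean_shift_ok": mean_shift_check(reference_inputs, current_inputs),
    "output_range_ok": range_check(predictions),
}
print(report)
```

A real monitor would likely use richer drift statistics; the point here is only the split between a statistical check on inputs and a business rule on outputs.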
Exploring Integration with External Innovation Teams
The team aims to enable external innovation by integrating with specialized teams working on tools like the feature catalog. The platform is designed to facilitate connectivity to external services, ensuring seamless functionality. Collaboration efforts involve defining clear interfaces for integrating with external tools and data sources, allowing for versatile innovation possibilities. Plans include fostering open-source contributions and partnerships to enhance the overall effectiveness of the platform.
Addressing Automation and Standardizing Model Checks
Automation plays a key role in making model monitoring and error detection efficient. While automation streamlines ETL, training, and batch processing, certain tasks like model deployment checks are kept manual for risk management purposes. Future plans involve implementing generic monitors and standardizing model check processes to streamline monitoring tasks for users. The goal is to automate repetitive checks and alerts while maintaining user-defined flexibility in defining monitoring strategies.
Scaling Data Processing and Monitoring Infrastructure
As the monitoring solution evolves, scalability is a top priority, necessitating a shift from microservices to an infrastructure-driven model. Plans include leveraging cloud services like AWS Kinesis and enhancing the data processing capabilities of the platform. The team is working on transitioning from MVP to full-scale production by integrating with data processing services and ensuring efficient traffic handling. Focus areas include implementing advanced monitoring features and establishing a robust automated monitoring system.
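One common way to absorb traffic spikes before handing data to a streaming service is to buffer records and send them in bounded batches. The sketch below shows that idea in plain Python. It makes no AWS calls: `send_batch` is a hypothetical callable standing in for whatever client the platform uses, and the default batch size of 500 is an assumption chosen to match the per-request record limit of the Kinesis `PutRecords` operation.

```python
def batch_records(records, max_batch_size=500):
    """Yield lists of at most max_batch_size records, preserving order."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch


def flush(records, send_batch, max_batch_size=500):
    """Send every batch through send_batch (hypothetical sender) and
    return the number of records handed over."""
    sent = 0
    for batch in batch_records(records, max_batch_size):
        send_batch(batch)
        sent += len(batch)
    return sent


# Usage with an in-memory stand-in for the real sender.
collected = []
total_sent = flush(range(1200), collected.append)
print(total_sent, [len(b) for b in collected])
```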
Encouraging Collaboration and Feedback for Continuous Improvement
The team emphasizes a collaborative approach to innovation and invites feedback from stakeholders to enhance operational efficiency. Through open dialogue and sharing of best practices, the team aims to refine the monitoring solution's features and ensure alignment with user requirements. Continuous improvement is driven by a culture of openness to constructive criticism and a willingness to adapt based on user insights and industry feedback.
Acknowledging the Role of Open Source Contribution and Innovation
Embracing open-source principles, the team recognizes the value of contributing to existing projects while fostering innovation internally. Plans include investing in open-source initiatives, such as dedicating resources to improve tools like JupyterLab. By aligning with the ethos of sharing knowledge and resources, the team aims to foster collaborative efforts and drive advancements in the machine learning monitoring space.
Enhancing Experimentation and Model Monitoring Capabilities
Future development efforts involve expanding experimentation features for online model testing and verification. The team seeks to empower users to conduct A/B testing and statistical checks on model performance. By offering flexible integration options and customizable monitoring features, the platform aims to support diverse machine learning use cases and ensure robust performance evaluation processes.
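To make the A/B testing idea concrete, here is a minimal sketch of a two-proportion z-test comparing the success rates of two model variants. This is a standard textbook test chosen for illustration; the episode does not say which statistical checks the platform offers, and the function name and sample counts are assumptions.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference in success rates
    between variant A and variant B, using a pooled standard error."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative counts: model A converts 100/1000, model B converts 150/1000.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the difference between the two variants is unlikely to be due to chance alone, which is the kind of evidence an online experiment would look for before promoting a model.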
Optimizing Automation for Scalability and User Flexibility
Automation strategies within the platform focus on enhancing scalability and user flexibility while maintaining efficient monitoring processes. Plans include developing a base class structure for automated model checks and integrating user-defined monitoring scripts. By balancing automation with user control over monitoring parameters, the team aims to offer a comprehensive and adaptable monitoring solution for machine learning models.
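A base class for automated checks, with users supplying their own subclasses, might look roughly like the sketch below. All class and function names here are hypothetical; the episode mentions the idea of a base class structure but not its actual interface.

```python
from abc import ABC, abstractmethod


class ModelCheck(ABC):
    """Hypothetical base class: each check has a name and a run() method."""

    name = "unnamed-check"

    @abstractmethod
    def run(self, records):
        """Return True when the check passes for the given records."""


class NoMissingFeatures(ModelCheck):
    """Platform-provided check: every record contains the required features."""

    name = "no-missing-features"

    def __init__(self, required):
        self.required = set(required)

    def run(self, records):
        return all(self.required.issubset(record.keys()) for record in records)


class ScoreInRange(ModelCheck):
    """User-defined check: the model score stays inside [low, high]."""

    name = "score-in-range"

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def run(self, records):
        return all(self.low <= record["score"] <= self.high for record in records)


def run_checks(checks, records):
    """Run every check and map its name to a pass/fail result."""
    return {check.name: check.run(records) for check in checks}


records = [{"price": 10, "score": 0.3}, {"price": 5, "score": 0.9}]
results = run_checks([NoMissingFeatures(["price"]), ScoreInRange(0.0, 1.0)], records)
print(results)
```

The design choice is that the platform owns the runner and the common checks, while users only implement `run` for anything specific to their model.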
Building Secure and Scalable Monitoring Infrastructure
The team is dedicated to enhancing the security and scalability of the monitoring infrastructure to accommodate varying data processing needs. Transitioning from microservices to cloud-based data handling, the focus lies on adopting tools like AWS Kinesis for improved data processing capabilities. Plans include increasing the system's capacity to handle large data volumes and spikes in traffic, ensuring reliable monitoring and performance evaluation processes.
Promoting Transparency and Knowledge Sharing for Continuous Growth
Establishing a culture of transparency and knowledge sharing, the team invites constructive feedback and collaboration to drive continuous growth and improvement. Through open communication channels and collaborative efforts like code contributions and discussions, the platform aims to evolve based on user insights and industry best practices. The emphasis on openness and receptivity to feedback underscores a commitment to innovation and operational excellence.
MLOps community meetup #11 Machine Learning at scale in Mercado Libre with Carlos de la Torre
Mercado Libre hosts the largest online commerce and payments ecosystem in Latin America. The IT department built Fury: a PaaS framework for developing and deploying multi-cloud, multi-technology microservices. This platform supported the growth of the IT area, which now counts ~4000 people.
However, Fury lacked support for machine-learning-based solutions: an experimentation environment for data scientists, infrastructure and data-access support for ETL and model training tasks, etc. Therefore, for over a year now, they have been developing Fury Data Apps (FDA), an extension of Fury for the design, experimentation, development and deployment of machine-learning-based solutions. It already supports ~500 users and some high-performance production APIs.
In this meetup we talk about the main features of the platform, the supporting technology, and why Carlos never accepted my LinkedIn request.
Link to the article Carlos references: https://martinfowler.com/articles/cd4ml.html
Join our Slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://zoom.us/webinar/register/WN_a_nuYR1xT86TGIB2wp9B1g
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Carlos on LinkedIn: https://www.linkedin.com/in/carlosdelatorre/
Follow Carlos on Twitter: @py_litox