MLOps.community

Demetrios
Aug 30, 2022 • 46min

MLOps at DoorDash // Hien Luu and DoorDash Leads // Coffee Sessions #119

MLOps Coffee Sessions #119 with Hien Luu, Sr. Engineering Manager of DoorDash, MLOps at DoorDash: 3 Principles for Building an ML Platform That Will Sustain Hypergrowth co-hosted by Skylar Payne. // Abstract Machine Learning plays a big part at DoorDash in terms of what they do on a daily basis. It powers many of their core infrastructures.   When it comes to DoorDash's business, they have to be leveraging machine learning and it is such a huge piece of the business that it is critical. // Bio Hien Luu is an Engineering Manager at DoorDash, leading the Machine Learning platform team at DoorDash. He is particularly passionate about the intersection between Artificial Intelligence and Big Data. He is the author of the Beginning Apache Spark 3 book.  He has given presentations at various conferences like Data+AI Summit, MLOps World, Deep Learning Summit, and apply() conference. // MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links engineering.linkedin.com/hadoop/user-engagement-powered-apache-pig-and-hadoop * https://doordash.engineering/2020/07/20/enabling-efficient-machine-learning-model-serving/ * https://doordash.engineering/2020/11/19/building-a-gigascale-ml-feature-store-with-redis/ * https://doordash.engineering/2021/03/04/building-a-declarative-real-time-feature-engineering-framework/ * https://doordash.engineering/2021/05/20/monitor-machine-learning-model-drift/ * https://doordash.engineering/2021/01/26/computational-graph-machine-learning-ensemble-model-support/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Skylar on LinkedIn: 
https://www.linkedin.com/in/skylar-payne-766a1988/ Connect with Hien on LinkedIn: https://www.linkedin.com/in/hienluu/ Timestamps: [00:00] Introduction of DoorDash team [01:58] Overview of DoorDash [03:32] DoorDash's platform [13:23] Experimenting and testing new models [15:15] Experience transferring [17:16] Effective engagement with customers [24:15] Team sizes [25:37] Metrics [33:25] App for users [34:04] Using Databricks and Snowflake together [37:49] Supporting power users [40:17] Advice and experiences [43:53] Wrap up
Aug 26, 2022 • 53min

ML Platforms, Where to Start? // Olalekan Elesin // Coffee Sessions #118

MLOps Coffee Sessions #118 with Olalekan Elesin, Director of Data Platform & Data Architect at HRS Product Solutions GmbH, co-hosted by Vishnu Rachakonda. // Abstract You don't have infinite resources? Call out your main metrics! Focus on the most impactful things that you could do for your data scientists. Olalekan joined us to talk about his experience previously building a machine learning platform at Scout24.    From our standpoint, this is the best demonstration and explanation of the role of technical product management in ML that we have had on the podcast so far! // Bio Olalekan Elesin is a technologist with a successful track record of delivering data-driven technology solutions that leverage analytics, machine learning, and artificial intelligence. He combines experience working across 2 continents and 5 different market segments, ranging from telecommunications, e-commerce, and online marketplaces to, currently, business travel.    Olalekan built the AI Platform 1.0 at Scout24 and currently leads multiple data teams at HRS Group. He is an AWS Machine Learning Community Hero in his spare time. 
// MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links What Customers Want: Using Outcome-Driven Innovation to Create Breakthrough Products and Services book by Anthony Ulwick: https://www.amazon.com/What-Customers-Want-Outcome-Driven-Breakthrough/dp/0071408673 Empowered: Ordinary People, Extraordinary Products by Marty Cagan:   https://www.amazon.com/EMPOWERED-Ordinary-Extraordinary-Products-Silicon/dp/111969129X How to Avoid a Climate Disaster: The Solutions We Have and the Breakthroughs We Need by Bill Gates: https://www.amazon.com/How-Avoid-Climate-Disaster-Breakthroughs/dp/059321577X --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Olalekan on LinkedIn: https://www.linkedin.com/in/elesinolalekan/ Timestamps: [00:00] Introduction to Olalekan Elesin [00:42] Takeaways [02:52] Situation at Scout24 [07:53] Data landscape engineer and architect [11:27] Depiction of events [13:53] Platform approach investment [15:59] Exceptional need or opportunity to the most intense need [17:41] Long-tail pieces [22:01] Metrics [24:15] Nitty-gritty product works [26:00] Educating people metrics [30:02] Upskilling fundamentals of the product discipline [34:05] Investing in AWS [37:53] Best-of-breed tools [44:34] Continuous development for AutoML [47:26] Rapid fire questions [52:19] Wrap up
Aug 19, 2022 • 58min

Data Engineering for ML // Chad Sanderson // Coffee Sessions #117

MLOps Coffee Sessions #117 with Chad Sanderson, Head of Product, Data Platform at Convoy, Data Engineering for ML co-hosted by Josh Wills. // Abstract Data modeling is building relationships between core concepts within your data. The physical data model shows how the relationships manifest in your data environment; the semantic data model is the way that entity relationship design is abstracted away from any data-centric implementation.   Let's do the good old fun of talking about why data modeling is so important! // Bio Chad Sanderson is the Product Lead for Convoy's Data Platform team, which includes the data warehouse, streaming, BI & visualization, experimentation, machine learning, and data discovery. Chad has built everything from feature stores, experimentation platforms, metrics layers, streaming platforms, and analytics tools to data discovery systems and workflow development platforms. He’s implemented open source and SaaS products (early and late-stage) and has built cutting-edge technology from the ground up. Chad loves the data space, and if you're interested in chatting about it with him, don't hesitate to reach out. 
// MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links https://odsc.com/speakers/scaling-machine-learning-with-data-mesh/ https://docs.google.com/presentation/d/1rVtltHkRkP_JaGZdkAS3U_SXfr5Gg-RP980FKXh0YNU/edit?usp=sharing Josh Wills will be teaching a course on Data Engineering for Machine Learning in September here: https://www.getsphere.com/ml-engineering/data-engineering-for-machine-learning --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Josh on LinkedIn: https://www.linkedin.com/in/josh-wills-13882b/ Connect with Chad on LinkedIn: https://www.linkedin.com/in/chad-sanderson/ Timestamps: [00:00] Introduction of the new co-host Josh Wills   [00:54] Introduction to Chad Sanderson [01:46] Josh will lead a course for Machine Learning in mid-September [02:16] Data modeling blog post of Chad [06:10] Idea of Strategy [09:40] Modern cloud data warehouses   [17:01] Layering on contracts [20:38] Scaling at larger companies [25:30] Carrot-stick strategy [34:27] Second and third-order effects [39:53] Stockholm Syndrome [41:22] Quality checks at Slack [45:28] Success in two main ways according to Chad [47:35] Completely and utterly different universes [53:42] Product use case to push semantic events [56:00] Pattern of analysis of the sequence of events [57:23] Wrap up
Aug 17, 2022 • 54min

Scaling Machine Learning with Data Mesh // Shawn Kyzer // Coffee Sessions #116

MLOps Coffee Sessions #116 with Shawn Kyzer, Principal Data Engineer at Thoughtworks (Spain), Scaling Machine Learning with Data Mesh co-hosted by Adam Sroka. // Abstract You can't just get something done by using tools. You need to go much deeper than that and it is very clear that Data Mesh is the same thing. You have to educate the organization about the movement.   In this session, Shawn broke down the cultural piece of data mesh and how many parallels there are with the MLOps Movement when it comes to the cultural side of MLOps. // Bio Shawn is passionate about harnessing the power of data strategy, engineering, and analytics in order to help businesses uncover new opportunities. As an innovative technologist with over 13 years of experience, Shawn removes technology as a barrier and broadens the art of the possible for business and product leaders. His holistic view of technology and emphasis on developing and motivating strong engineering talent, with a focus on delivering outcomes whilst minimising outputs, is one of the characteristics which sets him apart from the crowd. Shawn’s deep technical knowledge includes distributed computing, cloud architecture, data science, machine learning, and engineering analytics platforms. He has years of experience working as a consultant practitioner for a variety of prestigious clients ranging from secret clearance level government organizations to Fortune 500 companies. 
// MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links https://odsc.com/speakers/scaling-machine-learning-with-data-mesh/ https://docs.google.com/presentation/d/1rVtltHkRkP_JaGZdkAS3U_SXfr5Gg-RP980FKXh0YNU/edit?usp=sharing --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Adam on LinkedIn: https://www.linkedin.com/in/aesroka/ Connect with Shawn on LinkedIn: https://www.linkedin.com/in/shawn-kyzer-msit-mba-b5b8a4b/ Timestamps: [00:00] Introduction to Shawn Kyzer [00:43] Takeaways [04:00] Data Mesh for ML projects [11:22] The signal for the exploratory part of a new modeling project [14:13] Ownership and centralization [16:20] Lack of technology and some implementations literature [17:10] Python stronghold from Microsoft blogs [23:09] Integration with self-serve data platform [25:31] Starting a platform team [30:04] Quick wins [32:09] Metrics monitoring [34:18] Metrics break up [38:32] Limit to capabilities and not worth doing [41:39] Culture and technology holds [44:03] Setting the foundation [46:53] Unforeseen benefits [52:19] Lightning question
Aug 14, 2022 • 42min

How Hera is an Enabler of MLOps Integrations // Flaviu Vadan // Coffee Sessions #115

MLOps Coffee Sessions #115 with Flaviu Vadan, Senior Software Engineer at Dyno Therapeutics, How Hera is an Enabler of MLOps Integrations co-hosted by Vishnu Rachakonda. // Abstract Flaviu talks about the internal ML platform at Dyno Therapeutics called Hera. His team uses Hera as an internal innovation engine to help discover new breakthroughs with machine learning in the biotech healthcare industry. // Bio Flaviu is a Senior Software Engineer at Dyno Therapeutics, the leading organization in the design of novel gene therapy vectors with transformative delivery properties for a vast landscape of human diseases. Flaviu comes from a background focused on Bioinformatics, which is a field that combines Computer Science, Mathematics, and Biology. He took stints in academia by working as a research assistant in Computer Science and Bioinformatics labs before joining Dyno Therapeutics to work on machine-guided design of adeno-associated viruses (AAVs).    At Dyno, Flaviu works on compute and core infrastructure, DevOps, MLOps, and approaches that combine AI/ML to design AAVs in silico. He is also the author and maintainer of Hera, a Python SDK that facilitates access to Argo Workflows by making workflow construction and submission easy and accessible. 
// MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Flaviu on LinkedIn: https://www.linkedin.com/in/flaviuvadan/ Timestamps: [00:00] Introduction to Flaviu Vadan [00:50] Takeaways [02:06] Share this episode with a friend! [03:20] What Dyno does [05:44] CRISPR and Gene Editing [06:21] Kidney transplants and using pig organs [07:31] Deciding what genes to put in the body   [07:48] Role of ML at Dyno [10:07] Higher dose [13:41] Process of Machine Learning Deployment and Productionizing at Dyno [16:22] Proliferation of models [17:31] Building the internal platform [19:37] Interaction with data, translation to compute layer, evaluation [24:21] Venn diagram for MLOps [27:06] Leveraging Argo Workflows [30:34] Hera [35:28] Open sourcing [38:44] Human power at Dyno [41:17] Wrap up
Aug 10, 2022 • 57min

Product Enrichment and Recommender Systems // Marc Lindner and Amr Mashlah // Coffee Sessions #114

MLOps Coffee Sessions #114 with Marc Lindner, Co-Founder COO and Amr Mashlah, Head of Data Science of eezylife Inc., Product Enrichment and Recommender Systems co-hosted by Skylar Payne. // Abstract The difficulties of making multi-modal recommender systems: how it can be easy to know something about a user but very hard to know the same thing about a product, and vice versa. For example, you can clearly know that a user wants an intellectual movie, but it is hard to accurately classify a movie as intellectual in a fully automated way. // Bio Marc Lindner Marc has a background in Knowledge Engineering. He's always extremely product-focused with anything to do with Machine Learning.    Marc built several products working together with companies such as Lithium Technologies and then co-founded eezy. Amr Mashlah Amr is the head of data science at eezy, where he leads the development of their recommender engine. Amr has a master's degree in AI and has been working with startups for 6 years now. // MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Children of Time book by Adrian Tchaikovsky:   https://www.amazon.com/Children-Time-Adrian-Tchaikovsky/dp/0316452505 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Skylar on LinkedIn: https://www.linkedin.com/in/skylar-payne-766a1988/ Connect with Marc on LinkedIn: https://www.linkedin.com/in/marc-lindner-883a0883/ Connect with Amr on LinkedIn: https://www.linkedin.com/in/mashlah/
Aug 6, 2022 • 1h 2min

Building Better Data Teams // Leanne Fitzpatrick // Coffee Sessions #113

MLOps Coffee Sessions #113 with Leanne Fitzpatrick, Director of Data Science of Financial Times, Building Better Data Teams co-hosted by Mihail Eric. // Abstract We spend a lot of time talking about data tooling, but perhaps not as much time talking about data organizations and how to run and organize data teams efficiently.    What about starting with limitations instead of aspirations? Right constraints instead of the north star? In this session, let's learn more about a realistic take on the state of data organizations of today. // Bio Leanne is Director of Data Science at the Financial Times and is a passionate data leader with experience building and developing empowered data science and analytics teams in a variety of businesses. Leanne is in her element when developing and implementing strategic, technical, and cultural solutions to getting machine learning and data science into the operational ecosystem. Leanne is an active part of the data and technology community, sharing innovation and insights to encourage best practices, from Manchester, UK to Austin, TX, and is an Advisory Panel Board Member. Outside of all things data you can ask Leanne about her golf swing (it’s not good - yet), her passion for American Football (specifically the Cincinnati Bengals), her latest sewing project, and her love for good music, food, and whisky. 
// MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Children of Time book by Adrian Tchaikovsky:   https://www.amazon.com/Children-Time-Adrian-Tchaikovsky/dp/0316452505 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/ Connect with Leanne on LinkedIn: https://www.linkedin.com/in/leanne-kim-fitzpatrick-29204341/ Timestamps: [00:00] Introduction to Leanne Fitzpatrick [04:23] Write us your suggestions! [05:43] Tri-pawed dog called Seaweed! [08:43] How to architect data teams [14:44] Organizational deficiencies [19:19] Tensions and conflicts for starters [24:07] Misunderstandings from marketing [25:59] The Middle Layer [28:48] Data science work at publications [31:11] Mystique of going to real-time [35:29] Third parties with fraud [37:40] Augmenting data practitioners with third-party tools [41:00] Principle of reinventing the wheel and avoiding undifferentiated heavy lifting   [46:29] Different Abstraction Layer recommendations [48:42] R in Production [51:56] Will Python eat R in Production away? [56:05] Julia as a dark horse [56:39] Future of R in Production [58:00] Rapid fire questions
Aug 3, 2022 • 50min

MLX: Opinionated ML Pipelines in MLflow // Xiangrui Meng // Coffee Sessions #112

MLOps Coffee Sessions #112 with Xiangrui Meng, Principal Software Engineer of Databricks, MLX: Opinionated ML Pipelines in MLflow co-hosted by Vishnu Rachakonda. // Abstract MLX aims to enable data scientists to stay mostly within their comfort zone, utilizing their expert knowledge while following the best practices in ML development and delivering production-ready ML projects, with little help from production engineers and DevOps. // Bio Xiangrui Meng is a Principal Software Engineer at Databricks and an Apache Spark PMC member. His main interests center around simplifying the end-to-end user experience of building machine learning applications, from algorithms to platforms and to operations. // MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Good Strategy Bad Strategy: The Difference and Why It Matters book by Richard Rumelt: https://www.amazon.com/Good-Strategy-Bad-Difference-Matters/dp/0307886239 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Xiangrui on LinkedIn: https://www.linkedin.com/in/mengxr/ Timestamps: [00:00] Introduction to Xiangrui Meng [00:39] Takeaways [02:09] Xiangrui's background [03:38] What kept Xiangrui in Databricks [07:33] What needs to be done to get there [09:20] Machine Learning passion of Xiangrui [11:52] Changes in building that keep you fresh for the future [14:35] Evolution core challenges to real-time and use cases in real-time [17:33] DevOps + DataOps + ModelOps = MLOps [19:21] MLflow Support [21:37] Notebooks to production debates   [25:42] Companies tackling Notebooks to production [27:40] MLOoops stories [31:03] Opinionated MLOps productionizing in a good way [40:23] Xiangrui's MLOps Vision [44:47] Lightning round [48:45] Wrap up
Jul 30, 2022 • 49min

More than a Cache: Turning Redis into a Composable, ML Data Platform // Samuel Partee // Coffee Sessions #111

MLOps Coffee Sessions #111 with Samuel Partee, Principal Applied AI Engineer of Redis, More than a Cache: Turning Redis into a Composable, ML Data Platform co-hosted by Mihail Eric. This episode is sponsored by Redis. // Abstract Pushing forward the Redis platform to be more than just the web-serving cache that we've known it as up to now. This seems like a natural progression for the platform: we see how it is evolving into an AI-focused, AI-native serving platform that handles vector similarity search, feature storage, and similar functionality. // Bio A Principal Applied AI Engineer at Redis, Sam helps guide the development and direction of Redis as an online feature store and vector database.    Sam's background is in high-performance computing including ML-related topics such as distributed training, hyperparameter optimization, and scalable inference. // MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links https://partee.io Redis VSS demo: https://github.com/Spartee/redis-vector-search Redis Stack: https://redis.io/docs/stack/ GitHub - https://github.com/Spartee   OSS org Sam co-founded at HPE/Cray - https://github.com/CrayLabs This paper last year was some of the best research and collaborations Sam has been a part of. The paper is published here: https://www.sciencedirect.com/science/article/pii/S1877750322001065?via%3Dihub Do you really need an extra database for vectors? 
https://databricks.com/dataaisummit/session/emerging-data-architectures-approaches-real-time-ai-using-redis Blink: The Power of Thinking Without Thinking by Malcolm Gladwell,  Barry Fox,  Irina Henegar (Translator): https://www.goodreads.com/book/show/40102.Blink --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/ Connect with Sam on LinkedIn: www.linkedin.com/in/sam-partee-b04a1710a Timestamps: [00:00] Introduction to Samuel Partee [00:24] Takeaways [02:46] Updates on the Community [05:17] Start of Redis [08:10] Vision for Vector Search [11:05] Changing the narrative going from the "Cache" for all servers and web endpoints [14:35] Clear value prop on demos [20:17] Vector Database [26:26] Features with benefits [28:41] AWS Spend [30:39] Vector Database upsell model and bureaucratic convenience   [32:08] Distributed training hyperparameter optimization and scalable inference [35:03] Core infrastructural advancement [36:55] Tools movement to help [39:00] Using Machine Learning at scale in numerical simulations with SmartSim: An application to ocean climate modeling (published paper) [42:52] Future applications of tech to get excited with [44:20] Lightning round [47:48] Wrap up
Jul 29, 2022 • 52min

Just Fetch the Data and then... // David Bayliss // Coffee Sessions #110

MLOps Coffee Sessions #110 with David Bayliss, Chief Data Scientist of LexisNexis Risk Solutions, Just Fetch the Data and then... co-hosted by Vishnu Rachakonda. // Abstract Composing data to extract features can be a significant problem. Key factors are the data size, compliance restrictions, and real-time data. Ethics (and law) can drive extremely complex audit requirements. In the cloud, you can do anything - at a price. // Bio One of the creators of the world's first big data platform (HPCC), David has been tackling big data problems for two decades. He is a mathematician, compiler writer, and data sponge with more than five dozen patents spanning platforms, linking, and search. Most inventors think outside the box; David can't even remember where the box is. He leads the team that creates the core Data Science methods used by hundreds of data scientists. // MLOps Jobs board   https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Interesting insight in this post. 
Would be cool to learn from David about his view on things https://www.google.com/url?q=https://www.linkedin.com/posts/david-bayliss-426556a_datascience-platform-portability-activity-6913448643303759872-2dqq?utm_source%3Dlinkedin_share%26utm_medium%3Dmember_desktop_web&sa=D&source=calendar&ust=1649078059106132&usg=AOvVaw26wAevExeEfW_AdZSA8UhF --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with David on LinkedIn: https://www.linkedin.com/in/david-bayliss-426556a/ Timestamps: [00:00] Introduction to David Bayliss [01:03] Takeaways [04:56] LexisNexis and David's role [07:15] Evolution of LexisNexis in 20 years with so many use cases [08:51] Role of David in structuring data for working with data change [14:32] Data management and data access [17:45] Unique challenges of scale, use case, and diversity at LexisNexis [24:47] Tardis Iron Box [30:05] Iron Box translation [32:56] JVM for data science [34:24] Iron Box meaning [36:52] Metadata with PII [39:08] Detrimental privacy / Hairy Kneecap Theory [40:57] Speeding things up and Anonymized linking [46:47] What kept David working at LexisNexis? [50:30] Wrap up
