

Open||Source||Data
Charna Parkey
What can we learn about AI-native development through stimulating conversations with developers, regulators, academics, and people like you who drive development forward, seek to understand its impact, and work to mitigate risk in this new world?
Join Charna Parkey and the community shaping the future of open source data, open source software, data in AI, and much more.
Episodes

Jun 4, 2024 • 58min
Regulation's Role in Driving Responsible AI with Asa Whillock
In this week’s episode, Charna welcomes Asa Whillock, VP & GM of Machine Learning and Artificial Intelligence at Alteryx. Asa shares a surprising perspective on AI regulation, explaining how it sets a baseline for responsible practices. Discover why he believes regulation is crucial in guiding the ethical development and deployment of AI, and learn about the importance of continuous learning and what the past can teach us about navigating the challenges and opportunities of AI today.
Episode timestamps
(01:47) Asa Whillock's career journey at market-leading companies and the role of open source in each (Adobe, Macromedia, Alteryx)
(04:56) Feature Labs acquisition by Alteryx and its open source roots in democratizing machine learning capabilities
(11:00) Survey findings on enterprise board members' perspectives on AI and the need to move beyond policy creation to implementation and governance
(27:00) Applying AI capabilities and decision-making related to AI
(30:00) The future of AI predominance, including cost reduction, open source model advancements, and the push for demonstrating business value
(43:33) Advice for navigating AI expertise and decision-making, including continuous learning, self-awareness of decision-making models, and acknowledging knowledge limits
Quotes
Asa Whillock: "I love regulation. I think it's great. And people are like, what? Why would you say that? And the reason why I say that is because I think it puts a floor underneath all of us of what do we think good looks like?"
Charna Parkey: "I think we need to, as a community, focus on meeting them where they are if we really want the democratization that is promised. Yeah, I don't know any other way to do it."

May 21, 2024 • 1h
Transforming Client Experience with AI with Robbi Armstrong
Join Charna Parkey as she interviews Robbi Armstrong, AI Products and Strategy Director at KeyBank. Discover how this $190 billion bank is navigating the rapidly evolving landscape of generative AI, balancing the need for innovation with the challenges of managing risk in a heavily regulated industry. Explore the impact of KeyBank's virtual assistant, MyKey, on client experience. With nearly 70% repeat usage, MyKey seamlessly transfers clients to contact center agents, providing a warm handoff that includes authentication and chat context.
Episode Timestamps
(02:11): Robbi Armstrong's role at KeyBank and intersection with open source and AI initiatives in the financial industry
(04:06): Compliance and regulatory trends in AI for banking
(12:10): Organizational Change Management with AI
(28:00): Responsible and Ethical AI
(37:00): Financial Literacy and AI
Quotes
Robbi Armstrong: “I truly believe that if you are an organization and you are sitting back and you're not organizing a team and you're not organizing a program and you're not learning, you're not looking at education, you're not looking at change management around Gen AI, I don't think you'll be here in two years. I really truly believe that. Because you won't be able to compete.”
Charna Parkey: “I think the democratization is real and I think it's incredibly important because that step in between the domain expert and the technology is very lossy. You know, oftentimes we say, well, if only I had the data to answer your question let me give you a different answer or let me answer it completely and now we can actually put it in the hands of the experts and say, well, oh, then let's go collect that data.”
Links
Connect with Robbi
Connect with Charna

May 7, 2024 • 40min
Navigating Open Source Talent, AI & Policy Challenges with Amanda Brock
Amanda Brock's path began with picking potatoes at 8 years old. Now, she's the CEO of OpenUK, advocating for open source across the UK. In this insightful interview, Brock shares her journey into open source law and policy. She dives into OpenUK's latest research on the state of open technology in Britain, talent challenges, and the economic impact of open source contributions. Brock also unpacks key discussions from State of OpenCon 2024 on open data, generative AI, and balanced regulation.
Episode timestamps
(05:06): State of open source in the UK
(07:22): Importance of open source community
(15:19): Balancing openness and regulation in AI
(21:19): Pace of technological development and regulation
(28:21): Reliability and discernment with AI outputs
(35:24): Universal advice
Quotes
Amanda Brock: “I think the governments that are going to win, the governments that are going to have the best regulation that promotes most innovation are going to be the ones which are able to make their regulatory environment flow in the same way as the technology evolution and innovation flows.”
Charna Parkey: “I think the expectation needs to change. Part of what has happened with, you know, literal text search or keyword search and just Google and things like that, is that the average person expects what comes back to be relatively factual. That it's been referenced and, you know, backlinked, etc. That's a deterministic system. These are not. These are based upon statistical likelihoods of what word should come next.”
Links
Connect with Charna
Connect with Amanda

Apr 23, 2024 • 48min
Using AI to Impact Performance Feedback Equity with Tacita Morway
Dive into the world of purposeful AI with Tacita Morway, CTO of Textio. Learn how Textio ensures their AI is built responsibly and ethically to transform the way teams communicate, hire, and measure their health. Discover their rigorous testing processes, the importance of having a diverse team to catch potential risks, and how that helps the company develop strategies for avoiding bias and maintaining data privacy.
Episode timestamps
(02:15): Tacita's unconventional career path to becoming a CTO
(07:00): Textio's practices for building AI responsibly and ethically
(14:00): The impact of Textio's AI on performance feedback
(17:00): The importance of purpose-built vs generic AI models
(28:00): Balancing open source and proprietary data/models
(42:00): Advice for the AI industry moving forward
Quotes
Tacita Morway: “When you've got a team with different backgrounds, educational, lived experiences, identity, careers, all of those things, we have those different perspectives in the room. And we're all working off of the same expectations. We can catch each other's gaps.”
Charna Parkey: “There's an interesting conversation happening, I think, in the community right now about these purpose-built LLMs. Are they as good as generic LLMs? Sure, certainly if you're not going to apply something purpose-built to something generic or outside of its domain, it is not as good. But I think some of this shows us that unless you have something purpose-built and unless you're leveraging the data in the right way, you may just be feeding noise back into the system.”
Links
Connect with Tacita
Connect with Charna

Apr 9, 2024 • 50min
The Ethical Path to High-Quality AI Data with Fabiana Clemente
How can we accelerate AI while protecting privacy? Fabiana Clemente discusses founding YData to enable high-quality synthetic data for machine learning. She covers open sourcing data profiling tools, the impact of generative AI on synthetic data, and maintaining work-life balance as an introverted leader.
Timestamps
(00:02:29) Fabiana's journey starting YData and becoming a public speaker
(00:20:19) Misconceptions and hype around generative AI and AGI
(00:32:46) Potential real-world impact and use cases of LLMs today
(00:34:55) The role of synthetic data in making AI models more robust and fair
(00:43:55) Advice for founders: value your time and learn to say no
(00:48:24) The importance of technical leaders being able to communicate well
Quotes
Charna Parkey: "It's a balance. I think that's also what led us to some of the demographic based data science. Essentially, folks were making like event data into pre-aggregated data. And then they were trying to obscure it so much that you couldn't get back to the person. And so you're like, okay, what's their age and what's their gender? And you're like, that's not actually the most useful part of data science that can't predict behavior or intent or any of that. It throws out time as a component of the entire process, seasonality, everything. And so there just, there has to be a better way."
Fabiana Clemente: "I have to say, that's a very beautiful way to put it. Hallucinations, I have to say. I never thought about that. And it makes a lot of sense. I do think, though, that in terms of LLMs, it's so language, it's so definitely, it sounds like we are getting very, very intelligent system, exactly, because language is very complex. And we know that was needed for the leap of humanity. I do think there are other, the sense of combining. Well, and here we enter in the multimodal kind of space. It's what's missing."
Links
Connect with Charna
Connect with Fabiana

Mar 26, 2024 • 51min
Disrupting Data Analysis with Avi Press
Join host Charna Parkey as she sits down with Scarf’s CEO and Founder Avi Press in a riveting exchange about his pioneering journey into the world of open source with Scarf. Learn how Avi challenges conventional data analytics and collection, aiming to reshape industry standards through the power of open source. A conversation that delves into altering analytics norms, innovative monetization strategies, and the exploration of alternative licenses like BSL. Avi’s insights offer a unique perspective on the transformative role of open source in driving data analytics forward, fostering community engagement, and encouraging transparent development.
Episode timestamps
(02:15): Challenges of collecting open source usage data
(22:06): Driving impact with open source usage data
(28:27): Avi's entrepreneurial journey
(39:42): Persistence and vision in startups
(44:03): Tracking outcomes to stay motivated
Quotes
Avi Press: “I mean, one thing is, for any project that you might be thinking about doing or any initiative that you want to work on or goal that you have, I think there's a lot of power in just trying the thing. You may not have all the details figured out, but just try it anyway and see where it takes you. And I think a lot of projects that I've ever worked on that led anywhere, I didn't know all these details, but I just start trying and seeing what works anyway and being very open to it not working out, but attempting it anyway. And then the other thing, which is I think admittedly fitting into our agenda at Scarf, but it is something that I really believe, which is that for any of these things you're doing, tracking the outcomes of that thing is very, very important and will both be tactically helpful, but also I think, like you said, give you these inspirational moments that keep you going, whether that's awe or inspiration or fulfillment or whatever that feeling is that helps you keep going. I think that tracking the outputs of your work such that you can understand the impact that you have is both very strategic and the most rewarding way to do anything, I think.”
Charna Parkey: “Given the venture-backed nature of a lot of these startups, there's going to have to be some sort of monetization at some point. You're not gonna have 1 million, 10 million, 40 million dollars dumped into just giving software away for free. So sort of these misaligned motivations are certainly what raised my hackles where I'm like, oh, you're claiming forever or you're claiming that you're like a values-driven organization, but you're venture-backed and you need to make money. And so show me how those motivations align or misalign. Tell me what your monetization strategy is gonna be. I know you need one. That way I'm not wondering, should I use this? Should I not?”
Links
Connect with Charna
Connect with Avi

Mar 12, 2024 • 54min
Tech, Trust, and Transformation with Paula Paul
On today’s episode of Open Source Data, Charna Parkey chats with tech veteran Paula Paul, exploring her remarkable 40-year journey in the technology sector. Starting at 16, Paula navigated through pivotal tech revolutions and embraced the essence of open source and community. Delve into Paula's world of coding on tape, the evolution of technology, and how communities foster growth, innovation, and trust. Discover the impact of open source in shaping technology and professional paths. Paula also sheds light on personal growth and community's pivotal role in professional mobility, and offers invaluable advice to aspiring tech professionals. A captivating look at the intersections of technology, community, and open source through the lens of an industry pioneer.
Timestamps
00:00 - Intro
05:10 - Paula’s Professional Journey
10:30 - What Inspired Paula to Go Through the Open Source Path
14:50 - What are some of the biggest challenges and impacts that Paula sees in companies trying to derive value?
23:30 - Is the Tech World a Meritocracy?
25:35 - A Shift Of What is a Tech Company?
27:30 - Kids Interacting with New Technologies
31:30 - What Does Open Source Data Mean to Paula?
42:50 - What is a Question that Paula has never been asked before?
47:00 - What Advice would you give to the audience?
51:50 - Backstage with Executive Producer Leo Godoy
Quotes:
Charna Parkey: “I think from my side, as the applications we build change, then some of those backing technologies have to. Where databases used to be used by expert-like database administrators and you needed to have like data architects to your data model and you had to do all of these very, very specific things. And now we have this Gen AI moment and all of a sudden all of these specialized vector databases, NoSQL databases, etc., need to be used by an average developer. So they just want an API and it has to work and it has to be fast. And so, over these different moments, different technologies came about or were evolved, but I think it might be the application that's actually driving the change instead of the technology itself opening.”
Paula Paul: “It still surprises people to hear that 90% of any given modern application is open source and then there's 10% custom code that, depending on your company, you own or not. And it just still amazes me that we have these open source projects like jQuery is a project of the OpenJS Foundation and it's in a tremendous amount of our ecommerce infrastructure. But it's a project that's maintained by a very small team of contributors. And, you know, if this were a commercial product, it would be like a $1,000,000,000 company. (...) The piece of work being done by the new foundation to help make sure that we have the healthy web and that it's secure is really important, because people, if I say Log4j, people that remember those days know how important it is to keep security vulnerabilities addressed. And that's a concern for me, that people don't pay more attention to this. I mean, if you had a commercial software product, you typically would pay 20% a year in maintenance fees. But as many of us know, sometimes you find a bug and you would just report the bug, but it might take years for that bug to get fixed in a commercial release. Whereas if it's open source, there are people out there who can jump on it. But it's really crazy that there's no funding for that or no public works through the government, given all the dependence and dependencies that we have on these open source assets.”
Links
LinkedIn - Connect with Charna
LinkedIn - Connect with Paula

Feb 27, 2024 • 46min
An Innovative Approach to AI & NLP with Milos Rusic
Starting the new season of Open Source Data, our new host Charna Parkey welcomes the CEO and Co-founder of deepset, Milos Rusic. With an impressive journey around NLP and AI, pioneering several areas in the Open Source field, Milos has revolutionized data search processes and brought about a new era of user-friendly and efficient enterprise search systems.
Charna also shares some common ground with Milos when talking about joining an NLP startup in 2015-16, predictive maintenance, and more.
Don’t miss it!

Dec 20, 2023 • 50min
New Beginnings: Open||Source||Data in Transition
This episode features an interview with Charna Parkey, Real-Time AI Product and Strategy Leader at DataStax. Charna has been developing AI and ML products over the last 17 years and has worked with 90 of the Fortune 100 in her various roles. She is also a co-author and inventor on several patents.
In this episode, Sam and Charna discuss handing over the role as host, Sam’s new startup journey, and how their thinking has evolved during the explosion of LLMs.
-------------------
“Now, it seems like we have this opportunity where the conversation and the place that society is at is different. Where we want to contribute to the right set of data when we talk open source data. We want to make sure that we have the right data to train this model in order to get the right outcome. We want to provide a lens of, ‘All right, you are this persona. How would you say this thing?’ I do think that from a lot of what the LLMs have today, the outcome of those words are still missing. And we need to solve that. Like, ‘Is this piece of writing actually going to achieve the outcome I want versus am I following legal's guidelines? Am I technically correct? Is my CEO going to like it?’ That doesn't mean you're achieving impact in the world. There's an aspect there where we've given feedback loops, it seems, to be like, ‘Did I like the answer or not?’ But not, ‘Did I take an action?’ As we get to autonomousness, we're going to have to have an outcome or multiple outcomes associated with the reward of the system.” – Charna Parkey
“I personally believe that all cognition is bias. My degree is in cognitive science. One of the things that we trained on is attention. And to pay attention, literally means to selectively choose what data is coming in from the world that you're going to pay attention to and what you're going to discard. Which is also, to me, the definition of bias. All cognition is bias, but what do we care about? Do you trust this thing? What does that mean? Well, do you trust it to do these particular actions to a level of consistency in this particular domain? It doesn't mean that you're going to trust it in all environments. There's a lot more nuance that hopefully will evolve in this strange age of nuanced destruction machines.” – Sam Ramji
-------------------
Episode Timestamps:
(01:04): Sam and Charna catch up
(06:05): Sam explains his new company, Sailplane
(14:21): How Charna’s thinking has evolved during the LLM explosion
(25:45): Sam’s thoughts after 5 seasons of Open||Source||Data
(38:52): What Charna is looking forward to in the next season of the podcast
(40:44): A question Sam wishes to be asked
(45:45): Backstage takeaways with executive producer, Audra Montenegro
-------------------
Links:
LinkedIn - Connect with Charna
LinkedIn - Connect with Sam
Learn more about Sailplane

Dec 13, 2023 • 56min
The Intersection of Open Source and AI with Stefano Maffulli & Stephen O’Grady
This episode features a panel discussion with Stefano Maffulli, Executive Director of the Open Source Initiative (OSI); and Stephen O’Grady, Co-founder of RedMonk. Stefano has decades of experience in open source advocacy. He co-founded the Italian chapter of Free Software Foundation Europe, built the developer community of the OpenStack Foundation, and led open source marketing teams at several international companies. Stephen has been an industry analyst for several decades and is author of the developer playbook, The New Kingmakers: How Developers Conquered the World.
In this episode, Sam, Stefano, and Stephen discuss the intersection of open source and AI, good data for everyone, and open data foundations.
-------------------
“Internet Archive, Wikipedia, they have that mission to accumulate data. The OpenStreetMap is another big one with a lot of interesting data. It's a fascinating space, though. There are so many facets of the word ‘data.’ One of the reasons why open data is so hard to manage and hasn't had that same impact of open source is because, like Stephen, the stories that he was telling about the startups having a hard time assembling the mixing and matching, or modifying of data has a different connotation. It's completely different from being able to do the same with software.” – Stefano Maffulli
“It's also not clear how said foundation would get buy-in. Because, as far as a lot of the model holders themselves, they've been able to do most of what they want already. What's the foundation really going to offer them? They've done what they wanted. Not having any inside information here, but just judging by the fact that they are willing to indemnify their users, they feel very confident legally in their stance. Therefore, it at least takes one of the major cards off the table for them.” – Stephen O’Grady
-------------------
Episode Timestamps:
(01:44): What open source in the context of AI means to each guest
(16:21): Stefano explains OSI’s opportunity to shine a light on models and teams
(21:22): The next step of open source AI according to Stephen
(25:38): Creating better definitions in order to modify software
(33:09): The case of funding an open data foundation
(42:31): The future of open source data
(51:54): Executive producer, Audra Montenegro's backstage takeaways
-------------------
Links:
LinkedIn - Connect with Stefano
Visit Open Source Initiative
LinkedIn - Connect with Stephen
Visit RedMonk