Ines Montani, co-founder and CEO of Explosion, discusses the evolution of the web and machine learning, the development of spaCy, NLP vs. NLU, misconceptions about starting a software company, the value of understanding business problems, labeling data, combining large models with specific models, the evolution of spaCy and its goals, T-shaped vs. tree-shaped skills in software engineering, creating holiday special emojis as a data scientist, and embracing an entrepreneurial spirit.
Explosion grew out of the open-source spaCy project into a software company, offering practical machine learning and NLP solutions for industry use.
Explosion embraces the tree-shaped skills model, allowing individuals to excel in different areas and meet diverse startup needs.
spaCy demonstrates versatility by being used in diverse fields like journalism, agriculture, and medical research.
Deep dives
spaCy: From Open Source Library to Software Company
Explosion, the company behind spaCy, started as an open source project and has evolved into a software company. They focus on providing a practical approach to machine learning and natural language processing (NLP) for industry use. The co-founders recognized the need for a library optimized for real-world usage and designed spaCy to meet that demand. The library combines different components and techniques, allowing users to customize and build NLP pipelines. spaCy has been used in various innovative ways, including detecting fake letters in the net neutrality debate and facilitating investigative journalism. The team follows a growth mindset and aims for sustainability, focusing on creating value and meeting the needs of their customers and users. They continue to develop and improve the library, with plans for more components and advancements in large language models.
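As a rough illustration of what "combining components into a pipeline" can look like, here is a minimal sketch using spaCy's v3 Python API. It assumes spaCy is installed and uses only rule-based components, so no trained model is downloaded. The example sentence and the entity patterns are made up for illustration and are not from the episode.

```python
import spacy

# Start from a blank English pipeline: tokenizer only, no trained components.
nlp = spacy.blank("en")

# Add a rule-based sentence splitter as the first component.
nlp.add_pipe("sentencizer")

# Add a rule-based entity matcher. These patterns are illustrative assumptions.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Explosion"},
    {"label": "PRODUCT", "pattern": [{"LOWER": "spacy"}]},
])

# Run the pipeline on a sample text (hypothetical input).
doc = nlp("Explosion develops spaCy. Journalists use it to explore large datasets.")

sentences = [sent.text for sent in doc.sents]
entities = [(ent.text, ent.label_) for ent in doc.ents]

print(nlp.pipe_names)  # component names, in the order they run
print(sentences)
print(entities)
```

Components run in the order they were added, and each one can be swapped for a trained or custom alternative without changing the rest of the pipeline.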
T-shaped vs. Tree-shaped Skills
The traditional T-shaped skills model, where individuals have a strong base with additional specialty skills, is viewed as static. The guest prefers the concept of tree-shaped skills, where individuals have a solid base and skills that branch out in various directions. This more dynamic picture better reflects the constantly evolving nature of skills. At Explosion, the founders and team members have embraced the tree-shaped skills mindset, allowing them to adapt and excel in different areas. This approach suits the diverse skill sets required in a startup environment.
Innovative and Unexpected Use Cases of spaCy
spaCy has been used in a wide range of innovative and unexpected ways. One such use case involved detecting fake letters during the net neutrality debate. The library's natural language processing capabilities helped uncover fraudulent letters and reveal hidden patterns. spaCy has also been applied in fields like journalism, allowing journalists to explore large datasets and extract valuable insights. Additionally, spaCy has found utility in domains such as agriculture and medical research, showcasing its versatility and impact across diverse industries.
Difference Between Natural Language Processing and Natural Language Understanding
Natural Language Processing (NLP) encompasses the range of techniques and technologies used to analyze and process text data. It includes tasks such as information extraction, text generation, and more. On the other hand, Natural Language Understanding (NLU) refers to the goal of building systems that can understand and interpret human language. NLU leverages the tools and techniques of NLP to achieve this objective. While there is some overlap between the two, NLU specifically focuses on enabling systems to comprehend and respond to human language accurately.
The Gap Between Hype and Reality in Machine Learning and AI
The podcast episode discusses the growing hype surrounding large language models, such as GPT-4, and their perceived ability to solve various problems. While there is excitement surrounding the accessibility and capabilities of these models, it is important to understand their limitations. The guest emphasizes that the technology, although impressive, is not a one-size-fits-all solution. Fundamental problems such as information extraction still need to be addressed, and established NLP systems have handled them reliably. The guest also stresses the need for a clear purpose and goals when using these models, since specific cases may require other techniques to achieve optimal results.
The Value of Logic, Reason, and Trusting Instincts
The podcast episode also explores the importance of logic, reason, and trusting one's instincts in the machine learning field and in entrepreneurial endeavors. The guest reflects on the journey of self-discovery and taking responsibility for one's own life. They highlight the significance of aligning passion with work and having control over one's model and decisions. They also stress the need to acknowledge individual circumstances and the privileges that come with working in the machine learning industry. The podcast concludes that life can be challenging and unfair, but it is essential to strive for creating valuable solutions and doing things that are not terrible.
This episode features co-founder and CEO of Explosion, Ines Montani. Listen in as we discuss the evolution of the web and machine learning, the development of spaCy, Natural Language Processing vs. Natural Language Understanding, the misconceptions of starting a software company, and so much more! Ines is a software developer working on Artificial Intelligence and Natural Language Processing technologies.
She's the co-founder and CEO of Explosion, the company behind spaCy, one of the leading open-source libraries for NLP in Python, and Prodigy, an annotation tool to help create training data for machine learning models. Ines has an academic background in Communication Science, Media Studies and Linguistics and has been coding and designing websites since she was 11. She's been the keynote speaker at Python and Data Science conferences around the world.
Learning from Machine Learning, a podcast that explores more than just algorithms and data: Life lessons from the experts.