
From Homework to Real Work: What Generative AI Means for Every Business - with Ori Goshen of AI21 Labs
The AI in Business Podcast
GPT Encoding Knowledge
The transformer architecture has been around since it was introduced in a 2017 paper by Vaswani and colleagues at Google. When this architecture is scaled with more compute power and data, it becomes capable of forming nuanced representations of words and concepts. By prompting the system with specific requests, such as asking for an essay about Napoleon or a poem in a certain style, it can generate unique and impressive outputs. This process of encoding knowledge goes beyond autocomplete-style completion of word sequences; the models also absorb a large amount of world knowledge and other useful information. On some tasks these models can surpass most human beings, although they still make errors. In a business context, it is important to train these models on a foundation of general knowledge, such as Wikipedia, and then fine-tune them for specific domains, like finance. Models specialized in certain domains are referred to as 'language blades' and provide both broad knowledge and domain-specific expertise.
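The autocomplete intuition mentioned above can be sketched with a toy next-word predictor. This is only an illustration of the idea of completing word sequences from observed statistics, not how a transformer actually works: real models learn dense representations over billions of parameters, while this sketch just counts word pairs (the corpus and function names here are invented for the example).

```python
from collections import defaultdict, Counter

def train_bigram_model(corpus):
    """Count, for each word, which words tend to follow it in the corpus."""
    model = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def autocomplete(model, prompt, max_words=5):
    """Greedily extend the prompt with the most frequent next word."""
    words = prompt.split()
    for _ in range(max_words):
        followers = model.get(words[-1])
        if not followers:
            break  # no observed continuation for the last word
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

# Tiny illustrative corpus; a real model trains on vast general text.
corpus = "the transformer model encodes knowledge the transformer model generates text"
model = train_bigram_model(corpus)
print(autocomplete(model, "the transformer", max_words=3))
```

A large language model does the same kind of "predict what comes next" at a vastly greater scale, which is why so much general and domain knowledge ends up encoded in it.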