In this video, we discuss fine-tuning transformer neural networks. We explain why fine-tuning is important and how it can be used to create bespoke models. We identify two specific problems that need to be solved: grabbing relevant text from the Internet and breaking it up into prompts and completions.
To solve the second problem, we build a 'splitter' class in Python. We demonstrate how to split text into individual sentences and use every "n'th" sentence as a prompt, with the intervening sentences as completions. We also emphasize that manually populating an Excel spreadsheet with prompts and completions is not scalable and should be considered only as a last resort.
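The splitting step described above can be sketched as a small Python class. This is a minimal illustration, not the exact class from the video; the class name, the naive regex-based sentence split, and the parameter `n` are assumptions.

```python
import re

class Splitter:
    """Hypothetical sketch: split text into sentences, then use every
    n'th sentence as a prompt and the sentences after it as the completion."""

    def __init__(self, n=3):
        self.n = n  # every n'th sentence becomes a prompt

    def split_sentences(self, text):
        # Naive split on '.', '!' or '?' followed by whitespace.
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def pairs(self, text):
        sentences = self.split_sentences(text)
        result = []
        for i in range(0, len(sentences), self.n):
            prompt = sentences[i]
            completion = " ".join(sentences[i + 1 : i + self.n])
            result.append((prompt, completion))
        return result

pairs = Splitter(n=3).pairs("One. Two. Three. Four. Five. Six.")
# → [('One.', 'Two. Three.'), ('Four.', 'Five. Six.')]
```

A real splitter would need to handle abbreviations, decimals, and quotations, which is why libraries such as NLTK offer dedicated sentence tokenizers.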
To solve the first problem (retrieving text from the Internet so that, rather than displaying it in a browser, we can use it in our own programs), we use Beautiful Soup, a popular Python library for extracting components from a web page's HTML. We show how to use Beautiful Soup to extract text from a website, such as the "Quotes to Scrape" website or the Wikipedia page for Beautiful Soup.
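The extraction step can be sketched as follows. In the video the page is fetched live (e.g. with `requests`); here we parse a small inline snippet, modelled on the markup of the "Quotes to Scrape" site, so the example is self-contained. The class names `quote` and `text` match that site's HTML, but treat them as assumptions for any other page.

```python
from bs4 import BeautifulSoup

# Inline stand-in for the HTML served by quotes.toscrape.com.
html = """
<div class="quote"><span class="text">Be yourself.</span></div>
<div class="quote"><span class="text">Stay curious.</span></div>
"""

soup = BeautifulSoup(html, "html.parser")
# CSS selector: every <span class="text"> inside a <div class="quote">.
quotes = [span.get_text() for span in soup.select("div.quote span.text")]
# → ['Be yourself.', 'Stay curious.']
```

The same pattern works on a live page: pass `requests.get(url).text` to `BeautifulSoup` instead of the inline string.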
Furthermore, we provide a Python class that breaks a string of text into its component sentences and groups them into prompts and completions. We use pandas, another useful Python library for manipulating tables of data in rows and columns called DataFrames. The DataFrame in our Python class contains two columns, one for prompts and one for completions, making it easy to feed the data into a language model for fine-tuning or other applications.
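The resulting table can be sketched like this. The column names `prompt` and `completion` are assumptions here, though they mirror the JSON Lines format that OpenAI's GPT-3-era fine-tuning tooling expected.

```python
import pandas as pd

# Hypothetical prompt/completion pairs, e.g. produced by the splitter.
pairs = [
    ("First sentence.", "Second sentence. Third sentence."),
    ("Fourth sentence.", "Fifth sentence. Sixth sentence."),
]
df = pd.DataFrame(pairs, columns=["prompt", "completion"])
# → a 2-row, 2-column DataFrame

# Fine-tuning tooling typically consumes JSON Lines, one record per row,
# which a DataFrame can emit directly:
jsonl = df.to_json(orient="records", lines=True)
```

Keeping the pairs in a DataFrame also makes it easy to inspect, deduplicate, or filter the training data before exporting it.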
We conclude by emphasizing the importance of fine-tuning AI models to achieve efficiency multipliers and scalability.
In the next episode, we will look at more sophisticated ways of building prompts and completions using newsfeed APIs, audio, and video.