Tools for Creating AI Assistants

OpenAI released a tool that enables relatively easy fine-tuning of a base large language model. The development illustrates the tremendous speed with which AI systems are being developed. As recently as a few weeks ago (at the time this article was written), no such fine-tuning tool existed and the process of creating an AI assistant required months of effort instead of days.

Figure 1: "An AI Assistant Fine-Tuning an AI Assistant" Created by the DALL-E Image Creation Assistant

The ChatGPT application from OpenAI can be used as-is to answer a broad range of general fact-based questions. However, there are many efforts underway to create specialized systems, often called AI assistants, that can work in specific problem domains. Examples include financial planning, sports wagering, medical vaccine creation and so on. The process of adding new information to an existing system to create an AI assistant is called fine-tuning the base large language model.

Brief Background
So, what does this mean to you? Some brief background is needed for you to understand the opportunities and challenges related to AI assistants.

A large language model (LLM) is a software system based on a neural network that uses the transformer architecture. Examples include BERT (Bidirectional Encoder Representations from Transformers) from Google and GPT (Generative Pre-trained Transformer) from OpenAI. These systems are trained on huge amounts of existing text, such as the entire content of Wikipedia, thousands of books, news reports and so on. Creating an LLM is extremely expensive and requires advanced technical expertise.

ChatGPT is an application built on top of the GPT-3.5 model. ChatGPT accepts a question (sometimes called a prompt) in natural language form and responds. You can think of ChatGPT as having roughly the capabilities of a 12-year-old child who has mostly mastered English vocabulary and grammar, and who is now ready to learn topics such as algebra and chemistry.

Programmatically fine-tuning a large language model from scratch is possible but extremely difficult. Efforts to make fine-tuning easier fall into two main categories. First, there are efforts to create software tools that automate much of the work involved in fine-tuning. A rough analogy is the way that an Excel spreadsheet automates mathematical calculations. This idea is the one related to the OpenAI fine-tuning tool. Second, there are efforts to do all the fine-tuning for a specific problem domain and then expose the assistant as a software service. An example is the Science Engine Copilot service from Microsoft (currently under development).

Fine-Tuning GPT-3
In late August 2023, OpenAI announced the release of a software tool (it has no fancy name) that can fine-tune a large language model to create a specialized AI assistant. The key ideas are best explained by example.

OpenAI has several (four at the time this article was written, but likely more as you read this) LLMs. The models differ in size and corresponding capability. The current recommended base large language model is named "gpt-3.5-turbo-0613." Two other OpenAI LLMs that can be programmatically fine-tuned are "babbage-002" and "davinci-002."

The first, and most difficult, step is creating new training data. Suppose you want to create an AI assistant that specializes in the history of chess. You would set up data like this:

 [{"role": "system", "content":
  "Kronstein is a chatbot for the history of chess."},

  {"role": "user", "content":
  "How many times has the French Defense been played in
     world championships?"},
  {"role": "assistant", "content":
  "The French Defense has been played 40 times, scoring
     3 wins and 19 draws."},

  {"role": "user", "content":
  "How many world champions have there been?"},
  {"role": "assistant", "content":
  "The consensus is 17 world champions since 1886."}]


In general, more training data is better. Good results can often be obtained with as few as 100 training items. Each training item is limited to 4,096 tokens. A token is a word fragment, typically four characters, so 4,096 tokens is roughly 3,000 words, or about eight to 10 pages of text.
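The data preparation step can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own code. It assumes the training file is JSON Lines (one JSON object per line, each holding a "messages" list), which is our understanding of the layout the fine-tuning service expects for chat models. The helper names make_item, estimate_tokens and write_jsonl are hypothetical, and the token count uses the rough four-characters-per-token rule of thumb rather than a real tokenizer.

```python
import json

MAX_TOKENS_PER_ITEM = 4096  # per-item limit quoted in the article
CHARS_PER_TOKEN = 4         # rough rule of thumb, not a real tokenizer

SYSTEM_PROMPT = "Kronstein is a chatbot for the history of chess."

def make_item(question, answer):
    """Wrap one question/answer pair as a single training item."""
    return {"messages": [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]}

def estimate_tokens(item):
    """Approximate the token count as total characters divided by four."""
    n_chars = sum(len(m["content"]) for m in item["messages"])
    return n_chars // CHARS_PER_TOKEN

def write_jsonl(items, path):
    """Write one JSON object per line; skip items over the rough budget."""
    kept = 0
    with open(path, "w", encoding="utf-8") as f:
        for item in items:
            if estimate_tokens(item) > MAX_TOKENS_PER_ITEM:
                continue  # too long by the estimate; trim or split it by hand
            f.write(json.dumps(item) + "\n")
            kept += 1
    return kept

items = [
    make_item("How many world champions have there been?",
              "The consensus is 17 world champions since 1886."),
]
print(write_jsonl(items, "chess.jsonl"), "item(s) written")
```

Because the estimate is only a heuristic, an item close to the limit could still be rejected by the service. A real tokenizer would give an exact count.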

After the training data has been created, a base language model can be fine-tuned with Python language code that resembles:

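import os
import openai  # legacy interface (package versions before 1.0)
openai.api_key = os.getenv("OPENAI_API_KEY")  # environment variable that holds the key

# upload the training data
result = openai.File.create(
  file=open("chess.jsonl", "rb"),
  purpose="fine-tune")

# start the fine-tuning job (the uploaded file may need a moment to be processed first)
job = openai.FineTuningJob.create(
  training_file=result["id"], model="gpt-3.5-turbo-0613")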


Of course there are a lot of details, but the point is that all the heavy lifting is being performed behind the scenes by the software tools.
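Once a fine-tuning job finishes, the resulting model is queried like any other chat model. The sketch below shows one way that might look. It assumes the legacy interface of the openai Python package (versions before 1.0) and an API key stored in the OPENAI_API_KEY environment variable. The helper names build_messages and ask are hypothetical, and the model name in the usage comment is a placeholder, not a real model.

```python
SYSTEM_PROMPT = "Kronstein is a chatbot for the history of chess."

def build_messages(question):
    """Pair the system prompt used in training with a new user question."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

def ask(model_name, question):
    """Send one question to a fine-tuned model and return the reply text."""
    # Imported here so build_messages can be used without the package installed.
    import openai  # legacy interface; picks up OPENAI_API_KEY from the environment
    response = openai.ChatCompletion.create(
        model=model_name,
        messages=build_messages(question),
    )
    return response["choices"][0]["message"]["content"]

# Hypothetical usage; substitute the model name reported by your own job:
# print(ask("ft:gpt-3.5-turbo-0613:my-org::abc123",
#           "How many world champions have there been?"))
```

Reusing the system prompt from the training data is a design choice made here for consistency with the training items. It is not something the article prescribes.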

Wrapping Up
Dr. James McCaffrey from Microsoft Research is one of the Pure AI technical experts. McCaffrey commented, "The speed with which AI systems are being developed is remarkable. The only thing I can think of that comes close is the explosion of the Internet in the late 1990s."

He continued, "The emergence of tools to fine-tune a large language model is a significant advance."

McCaffrey added, "I don't think it's possible to predict all the direct and indirect consequences of the current AI revolution. But that said, it's likely that new opportunities will arise. Businesses and individuals who spot a niche AI opportunity and move quickly to exploit that opportunity will change things in ways that can't be easily foreseen."