OpenAI Unveils More Powerful GPT-4 'Turbo' at First DevDay Event
- By John K. Waters
Today, at its first-ever developer conference, ChatGPT creator OpenAI unveiled an enhanced version of its flagship text-generating AI model, GPT-4. The new GPT-4 Turbo comes with knowledge of world events up to April 2023, a 128k context window that allows it to fit the equivalent of more than 300 standard book pages of text into a single prompt, and availability at a 3x cheaper price for input tokens and a 2x cheaper price for output tokens compared with GPT-4.
CEO Sam Altman announced the updates at the OpenAI DevDay conference in San Francisco. Speaking to a reported 900 attendees in person and remotely, Altman recounted the meteoric rise of his company's GenAI software, from the launch of ChatGPT roughly one year ago as a "low-key research preview," to the launch of GPT-4 the following March, and recent innovations with voice and vision capabilities.
"Today, we've got about 2 million developers building on our API for a wide variety of use cases doing amazing stuff," Altman said. "Over 92% of fortune 500 companies are building on our products. And we have about 100 million weekly active users now on ChatGPT."
The event focused on the developer community and new features that were the direct result of feedback from ChatGPT users, including:
- Function Calling, which lets developers describe functions of their app or external APIs to models, and have the model intelligently choose to output a JSON object containing arguments to call those functions. As the company explained in a blog post, this upgrade adds the ability to call multiple functions in a single message. Users can send one message requesting multiple actions, such as "open the car window and turn off the A/C," which previously would have required multiple round trips with the model.
- JSON Mode, which ensures the model will respond with valid JSON. The new API parameter, response_format, enables the model to constrain its output to a syntactically correct JSON object. JSON mode is useful for developers generating JSON in the Chat Completions API outside of function calling.
- Reproducible Outputs, which use a new seed parameter to make the model return consistent completions "most of the time." This beta feature is useful for such use cases as replaying requests for debugging, writing more comprehensive unit tests, and generally having a higher degree of control over model behavior. (The company says it's using the feature internally for unit testing.)
- Log Probabilities: The company is launching a feature to return the log probabilities for the most likely output tokens generated by GPT-4 Turbo and GPT-3.5 Turbo in the next few weeks, which it says will be useful for building features such as autocomplete in a search experience.
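In the Python SDK, the features above surface as new request parameters in the Chat Completions API. The sketch below builds a request payload rather than sending it; the parameter names (tools, seed, response_format) and the "gpt-4-1106-preview" model name follow OpenAI's launch materials, while the car-control functions themselves are hypothetical examples inspired by Altman's window/A-C scenario.

```python
def build_request(user_message: str) -> dict:
    """Build a Chat Completions payload that exposes two car-control
    functions the model may call in parallel (hypothetical examples)."""
    tools = [
        {
            "type": "function",
            "function": {
                "name": "set_window",
                "description": "Open or close a car window",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "position": {"type": "string", "enum": ["open", "closed"]},
                    },
                    "required": ["position"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "set_ac",
                "description": "Turn the A/C on or off",
                "parameters": {
                    "type": "object",
                    "properties": {"on": {"type": "boolean"}},
                    "required": ["on"],
                },
            },
        },
    ]
    return {
        "model": "gpt-4-1106-preview",  # GPT-4 Turbo's API name at launch
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,  # the model can emit calls to both functions in one reply
        "seed": 42,      # Reproducible Outputs (beta): same seed, consistent completions
    }

request = build_request("Open the car window and turn off the A/C")

# JSON mode is a separate switch: response_format constrains the reply to
# valid JSON (the prompt must mention JSON for the mode to apply).
json_request = {
    "model": "gpt-4-1106-preview",
    "response_format": {"type": "json_object"},
    "messages": [{"role": "user", "content": "List three colors as JSON."}],
}
```

With the SDK, either payload would be passed as `client.chat.completions.create(**request)`; the model then decides which, if any, of the declared functions to call.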
OpenAI also announced the upcoming release of a new version of GPT-3.5 Turbo that supports a 16K context window by default. The new version supports improved instruction following, JSON mode, and parallel function calling. Developers can access this new model by calling gpt-3.5-turbo-1106 in the API, the company says. Applications using the gpt-3.5-turbo name will automatically be upgraded to the new model on December 11, while older models will continue to be accessible by passing gpt-3.5-turbo-0613 in the API until June 13, 2024.
Altman used the keynote stage to unveil the new Assistants API, which he characterized as his company's first step toward helping developers build "agent-like experiences" within their own applications. An "assistant," he explained, is a purpose-built AI that has specific instructions, leverages extra knowledge, and can call models and tools to perform tasks. The new Assistants API provides capabilities such as Code Interpreter and Retrieval, as well as function calling, to handle much of the heavy lifting developers previously had to do themselves and enable them to build high-quality AI apps, he said.
The Assistants API is in beta and available to all developers starting today.
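Defining an assistant amounts to pairing instructions with a tool list. A minimal sketch, assuming the beta SDK surface announced at DevDay (client.beta.assistants.create); the "Data Helper" name and instructions are illustrative:

```python
# Parameters for creating an assistant that combines the built-in
# Code Interpreter and Retrieval tools (illustrative example).
assistant_params = {
    "name": "Data Helper",
    "instructions": "You answer questions about the user's uploaded CSV files.",
    "tools": [{"type": "code_interpreter"}, {"type": "retrieval"}],
    "model": "gpt-4-1106-preview",
}

# With the SDK this would be sent as:
#   client = openai.OpenAI()
#   assistant = client.beta.assistants.create(**assistant_params)
```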
GPT-4 Turbo can now accept images as inputs in the Chat Completions API, which enables use cases such as generating captions, analyzing real-world images in detail, and reading documents with figures. Altman pointed to Be My Eyes, which uses this technology to help people who are blind or have low vision with their daily tasks, such as identifying products in the grocery store.
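Image inputs ride along in the normal message structure: the content field becomes a list mixing text parts and "image_url" parts. A sketch, using the vision-enabled model name from the launch ("gpt-4-vision-preview") and a placeholder image URL:

```python
# A single user message combining a text question with an image reference.
vision_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What products are on this shelf?"},
        {"type": "image_url",
         "image_url": {"url": "https://example.com/shelf.jpg"}},  # placeholder
    ],
}

vision_request = {
    "model": "gpt-4-vision-preview",  # vision-enabled GPT-4 Turbo at launch
    "messages": [vision_message],
}
```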
Also, developers can now integrate the DALL·E 3 image-generation model directly into their apps via the Images API. Companies like Snap, Coca-Cola, and Shutterstock have used DALL·E 3 to programmatically generate images and designs for their customers and campaigns, the company said in a blog post.
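An Images API request for DALL·E 3 is a short parameter set; this sketch mirrors the client.images.generate call in OpenAI's Python SDK, with a made-up prompt:

```python
# Parameters for a single DALL-E 3 generation (prompt is illustrative).
image_params = {
    "model": "dall-e-3",
    "prompt": "A flat-vector logo of a friendly robot barista",
    "size": "1024x1024",
    "n": 1,  # DALL-E 3 generates one image per request
}

# SDK call: client.images.generate(**image_params)
```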
And developers can now generate human-quality speech from text via the text-to-speech API. The company's new TTS model offers six preset voices to choose from and two model variants.
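The text-to-speech endpoint follows the same pattern. A sketch assuming the SDK's client.audio.speech.create call; "tts-1" and "tts-1-hd" are the two announced model variants, and "alloy" is one of the six preset voices:

```python
# Parameters for a text-to-speech request (input text is illustrative).
speech_params = {
    "model": "tts-1",     # or "tts-1-hd" for the higher-quality variant
    "voice": "alloy",     # one of six preset voices
    "input": "Welcome to DevDay.",
}

# SDK call returns audio bytes:
#   response = client.audio.speech.create(**speech_params)
```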
Altman welcomed Microsoft CEO Satya Nadella to the keynote stage with a reminder of the "instrumental" role Redmond has played in the warp-speed evolution of OpenAI's technology. Microsoft famously invested $10 billion in the company.
"You guys have built something magical," Nadella said to Altman, adding that the shape of Azure, Microsoft's cloud platform "is changing rapidly in support of these models that you're building."
Altman asked Nadella what he saw in terms of the future of their partnership. "The systems that are needed as you aggressively push forward on your roadmap requires us to be on the top of our game," Nadella said, "and we intend to commit ourselves deeply to making sure you (attendees) as builders of these foundation models are not only the best systems for training and inference, but the most compute so that you can keep pushing forward on the frontiers because I think that that's the way we want to make progress."
"And so, our job number one is to build the best systems so that you can build the best models and then make that all available to developers," he added.
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at email@example.com.