OpenAI is launching a faster and cheaper version of the artificial intelligence model that underpins its chatbot, ChatGPT, as the start-up works to hold on to its lead in an increasingly crowded market.
During a live-streamed event on Monday, OpenAI debuted GPT-4o. It’s an updated version of its GPT-4 model, which is now more than a year old. The new large language model, trained on vast amounts of data from the internet, will be better at handling text, audio and images in real time. The updates will be available in the coming weeks.
Asked a question verbally, the system can reply with an audio response in milliseconds, the company said, allowing for a more fluid conversation. In a demonstration of the model, OpenAI researchers and chief technology officer Mira Murati held a conversation with the new ChatGPT using just their voices, showing that the tool could talk back. During the presentation, the chatbot also appeared to translate speech from one language to another almost instantaneously, and at one point sang part of a story upon request.
“This is the first time that we’re making a huge leap in the interaction and ease of use,” Murati told Bloomberg News. “We’re really making it possible for you to collaborate with tools like ChatGPT.”
The update will bring a number of features to free users that previously had been limited to those with a paid subscription to ChatGPT, such as the ability to search the web for answers to queries, speak to the chatbot and hear responses in various voices, and command it to store details that the chatbot can recall in the future.
The release of GPT-4o is poised to shake up the rapidly evolving AI landscape, where GPT-4 remains the gold standard. A growing number of start-ups and Big Tech companies, including Anthropic, Cohere and Alphabet’s Google, have recently pushed out AI models that they say match or surpass the performance of GPT-4 in certain benchmarks.
In a rare blog post on Monday, OpenAI chief executive Sam Altman said that while the original version of ChatGPT gave a hint for how people could use language to interact with computers, using GPT-4o feels “viscerally different.”
“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real,” he said. “Getting to human-level response times and expressiveness turns out to be a big change.”
Rather than relying on different AI models to process different inputs, GPT-4o – the “o” stands for omni – combines voice, text and vision into a single model, allowing it to be faster than its predecessor. For example, if you feed the system an image prompt, it can respond with an image. The company said that the new model is two times faster and significantly more efficient.
“When you have three different models that work together, you introduce a lot of latency in the experience, and it breaks the immersion of the experience,” Murati said. “But when you have one model that natively reasons across audio, text and vision, then you cut all of the latency out and you can interact with ChatGPT more like we’re interacting now.”
But the new model hit some snags. The audio frequently cut out as the researchers spoke during their demo. The AI system also surprised the audience when, after coaching a researcher through the process of solving an algebra problem, it chimed in with a flirtatious-sounding voice: “Wow, that’s quite the outfit you’ve got on.”
OpenAI is beginning to roll out GPT-4o’s new text and image capabilities to some paying ChatGPT Plus and Team users, and will offer those capabilities to enterprise users soon. The company will make the new version of its “voice mode” assistant available to ChatGPT Plus users in the coming weeks.
As part of its updates, OpenAI said it’s also enabling anyone to access its GPT Store, which includes customised chatbots made by users. Previously, it was only available to paying customers.
Speculation about OpenAI’s next launch has become a Silicon Valley parlour game in recent weeks. A mysterious new chatbot caused a stir among AI watchers after it showed up on a benchmarking website and appeared to rival GPT-4’s performance. Altman offered winking references to the chatbot on the social platform X, fuelling rumours that his company was behind it. On Monday, an OpenAI employee confirmed on X that the mystery chatbot was indeed GPT-4o.
The company is working on a wide range of products, including voice technology and video software. OpenAI is also developing a search feature for ChatGPT, Bloomberg previously reported.
On Friday, the company quelled some of the rumours by saying it wouldn’t imminently launch GPT-5, a much-anticipated version of its model that some in the tech world expect to be radically more capable than current AI systems. It also said that Monday’s event wouldn’t unveil a new search product, a tool that could compete with Google. Google’s stock ticked higher on the news.
But after the event wrapped, Altman was quick to keep the speculation going. “We’ll have more stuff to share soon,” he wrote on X. – Bloomberg