Social media is abuzz with speculation about a major OpenAI announcement. The excitement follows the triumph of Meta’s Llama 3 (with a larger model slated for July) and a series of enigmatic images shared by the AI lab featuring the number 22. As April 22 is also OpenAI CEO Sam Altman’s 39th birthday, conjecture is rife that the company is poised to unveil something substantial, potentially Sora or even the long-awaited GPT-5.
If the latter is true and we witness the debut of a significant new AI model, it could mark a pivotal moment in artificial intelligence. Altman has previously hinted that it will be “significantly better” than its predecessor and will take the world by surprise. However, I anticipate a more conservative reveal, perhaps GPT-4.5 or an update to DALL-E, OpenAI’s image generation model. Nonetheless, let’s delve into what we currently know about GPT-5.
What do we know about GPT-5?
Information regarding GPT-5 is scant, as OpenAI has maintained a veil of secrecy around its performance and capabilities. Altman has reiterated in multiple interviews that it will be “materially better.” Each iteration of OpenAI’s large language models has represented a substantial leap forward in reasoning, coding, knowledge, and conversational ability, and GPT-5 is expected to continue that trend.
Reportedly in training since late last year, GPT-5 will either have a significantly higher parameter count than GPT-4’s rumored 1.5 trillion or be a comparably sized model with an improved underlying architecture that delivers major performance gains without increasing the overall size. Much like Meta’s Llama 3 70B, which performs on par with far larger models, GPT-5 is expected to deliver impressive results.
GPT-5 is likely to be multimodal, capable of processing input beyond text. While the extent of this capability remains uncertain, it may encompass text, images, video, speech, code, spatial information, and even music, akin to Google’s Gemini 1.5 models.
What will GPT-5 be able to do?
A notable departure for GPT-5 could be a shift from chatbot to agent. This evolution would allow the model to delegate tasks to sub-models or connect to external services and carry out real-world actions on its own. Full-fledged agents may not arrive immediately, but they align with the industry’s trajectory, especially as interconnected smart devices and systems proliferate. For instance, GPT-5 could handle routine tasks like grocery orders based on user preferences and smart fridge data, streamlining everyday life.
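To make the agent idea concrete, here is a minimal sketch of the tool-calling pattern that already exists in the OpenAI Python SDK. The “gpt-5” model name and the order_groceries function are hypothetical placeholders for illustration only; nothing below reflects a confirmed GPT-5 interface.

```python
# Hypothetical sketch of agent-style tool calling with the OpenAI Python SDK.
# "gpt-5" and order_groceries are stand-ins; today this pattern works with
# models such as gpt-4-turbo, which already support tool calling.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [
    {
        "type": "function",
        "function": {
            "name": "order_groceries",  # hypothetical service connector
            "description": "Place a grocery order for the listed items.",
            "parameters": {
                "type": "object",
                "properties": {
                    "items": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["items"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-5",  # placeholder model name; substitute an available model
    messages=[{"role": "user", "content": "The fridge is out of milk and eggs."}],
    tools=tools,
)

# If the model decides to act, it returns a structured tool call instead of
# plain text; the surrounding agent code is what actually places the order.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```

The point is the division of labor: the model plans and emits structured calls, while surrounding agent code (or sub-models) executes them against real services.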
How different will GPT-5 be?
OpenAI may follow Google’s lead with Gemini and give GPT-5 internet access by default, keeping the model’s knowledge current beyond its training cut-off date. Enhanced multimodality could make interaction via voice, video, or speech the norm, potentially positioning a GPT-5-powered ChatGPT as a comprehensive smart assistant akin to Siri or Google Gemini. Additionally, expect a significant expansion of the context window, which is essential for complex tasks such as video analysis.
Bring out the robots
The emergence of generative AI has spurred the development of humanoid robots equipped with AI brains, enabling autonomous task execution without exhaustive pre-programming. OpenAI’s investment in robotics startup Figure, coupled with the potential spatial awareness data in GPT-5, promises more capable and reliable robots that are attuned to human interaction. Other players, such as Nvidia and Mentee Robotics (founded by AI21 Labs co-founder Amnon Shashua), are also making strides in this domain, ushering in a new era where AI permeates our physical surroundings.
What this all means
As Yann LeCun forecasts, we’re on the brink of a paradigm shift where AI becomes integral to every facet of our digital lives. Agents and enhanced multimodality in GPT-5 would let AI models carry out tasks independently, while robotics integration would bring AI into the physical world.
OpenAI faces mounting competition, yet its forthcoming releases, including possibly GPT-4.5 and Sora, signify continued innovation. Altman’s hints at a suite of exciting models and products suggest an eventful year ahead for OpenAI, reaffirming its pioneering role in the AI landscape.