OpenAI Launches ChatGPT Images 2.0: Multi-Image Generation and Advanced Text Rendering Now Live
"Images 2.0 brings an unprecedented level of specificity and fidelity to image creation." — OpenAI
OpenAI has released ChatGPT Images 2.0, an updated image generation model available globally to all ChatGPT and Codex users as of Tuesday. The model introduces multi-image generation from a single prompt, improved text rendering in multiple languages, and access to reasoning capabilities for more detailed outputs.
Availability and Pricing
The model is available to all ChatGPT and Codex users. Paid subscribers have access to a more powerful version with advanced outputs. OpenAI will also make a gpt-image-2 API available, with pricing dependent on output quality and resolution.
Key Capabilities
- Multi-image generation: The model can generate multiple images from a single prompt, including complete documents such as study booklets and multi-paneled comic strips.
- Text rendering: Improved accuracy for rendering text, including non-Latin scripts such as Japanese, Korean, Hindi, Bengali, and Chinese.
- Resolution and detail: The model can render fine-grained elements including small text, iconography, and UI elements at up to 2K resolution.
- Reasoning integration: OpenAI describes the model as having "thinking capabilities," allowing it to search the web, generate multiple images from a single prompt, and review its own creations.
- Customization: Users can set aspect ratios ranging from 3:1 to 1:3.
- Generation time: Generating complex images, such as multi-paneled comics, takes a few minutes.
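The aspect-ratio and resolution limits above can be sketched as simple arithmetic. This is a minimal illustration, not OpenAI's implementation: the helper names `is_supported_ratio` and `pixel_size` are hypothetical, and reading "2K" as 2048 pixels on the longer edge is an assumption, since the announcement does not define the term.

```python
from fractions import Fraction

# Limits taken from the article: aspect ratios from 3:1 (wide) to 1:3 (tall).
MAX_RATIO = Fraction(3, 1)
MIN_RATIO = Fraction(1, 3)

# Assumption: "2K" is read here as 2048 pixels on the longer edge.
LONG_EDGE_PX = 2048


def is_supported_ratio(width: int, height: int) -> bool:
    """Return True if width:height lies within the 3:1 to 1:3 range."""
    if width <= 0 or height <= 0:
        return False
    return MIN_RATIO <= Fraction(width, height) <= MAX_RATIO


def pixel_size(width: int, height: int, long_edge: int = LONG_EDGE_PX) -> tuple[int, int]:
    """Scale a width:height ratio so its longer edge equals long_edge pixels."""
    if not is_supported_ratio(width, height):
        raise ValueError(f"aspect ratio {width}:{height} is outside 3:1 to 1:3")
    ratio = Fraction(width, height)
    if ratio >= 1:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge / ratio)
    # Portrait: height is the long edge.
    return round(long_edge * ratio), long_edge


if __name__ == "__main__":
    for w, h in [(3, 1), (16, 9), (1, 1), (1, 3)]:
        print(f"{w}:{h} ->", pixel_size(w, h))
```

Under these assumptions, a 16:9 request maps to 2048 by 1152 pixels, while a 4:1 request is rejected as outside the stated range.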
Performance
Early testing showed improved text rendering accuracy compared to previous models. In one test, the model generated an infographic with accurate weather details and recognizable landmarks for San Francisco. OpenAI has focused on reducing errors in text within images, a common issue in earlier image generation models.
Technical Background
AI image generators built on diffusion models have historically struggled to render text accurately. Asmelash Teka Hadgu, founder and CEO of Lesan AI, explained in 2024 that text makes up a very small share of an image's pixels, so a diffusion model prioritizes learning broader visual patterns.
Researchers have explored other mechanisms for image generation, such as autoregressive models, which work more like large language models (LLMs) by predicting an image's content sequentially. During a press briefing, OpenAI declined to specify the type of model powering ChatGPT Images 2.0.
The model's knowledge is current through December 2025, which may affect its accuracy for prompts involving more recent events.
Industry Context
New image models from major AI companies often boost usage and spark social media trends. Last year, Google's Nano Banana model gained popularity for hyperrealistic figurines. Earlier this year, ChatGPT Images saw viral use for AI-generated caricatures.