Native AI Image Generation Comes to ChatGPT

ChatGPT now features native AI image generation via GPT-4o, delivering accurate multimodal outputs with ethical safeguards.

Native AI Image Generation comes to ChatGPT (Image via: Deltia's Gaming, OpenAI)

AI models are evolving constantly, and major players in the industry are hard at work bringing new updates and features to them. OpenAI’s ChatGPT is no exception. Not too long ago, OpenAI introduced native AI image-generation capabilities to ChatGPT.

This feature meaningfully expands what ChatGPT can do. Interestingly, it is powered by the GPT-4o model. In short, it enables users to generate images directly from the ChatGPT interface.

Well, that is certainly some great news in terms of technological advancements. Still, it is important to consider some angles. Basically, there are a lot of questions that need to be answered. Further discussions on accessibility, ethical considerations, and the future of AI-driven creativity would be pretty constructive. Not to worry, though. We plan to address all of that in this post. So strap in, and let’s get to it.

Key Features of Native AI Image Generation in ChatGPT

GPT-4o's Image Generation at work (Image via: OpenAI)

By integrating multimodal capabilities, the GPT-4o model can now create photorealistic images based on user prompts. The more detailed the prompt, the more detailed the generated image.

There is a major difference between GPT-4o and DALL-E, which previously handled image generation in ChatGPT. GPT-4o is designed to handle up to 20 objects within an image, a significant improvement over the usual limit of 8 objects seen in other models. That’s not all, though. Users can also upload images for the AI model to edit and enhance. This feature demonstrates ChatGPT’s versatility when it comes to creative workloads.

OpenAI emphasizes the model’s ability to accurately render text and follow prompts, which is aimed at reducing the likelihood of unintended results. The source of this improvement is training the model on the joint distribution of online images and text. This training is intended to give the model an effective understanding of the relationships between images and language.

Limitations and Ethical Safeguards

An invitation created by ChatGPT according to the user prompts (Image via: OpenAI)

While GPT-4o is highly capable, OpenAI stresses that it is far from perfect. The company is working to address the limitations of native AI image generation in ChatGPT, but these improvements will only come with future updates.

As a wise uncle once said, “With great power comes great responsibility.” Ethical safeguards are of utmost importance to prevent the misuse of AI models. OpenAI has put such safeguards in place to block the generation of harmful content, which includes child sexual abuse material (CSAM) as well as sexual deepfakes. Additionally, to ensure that AI-generated images can be identified as such, OpenAI embeds C2PA metadata in each output.

However, debate has arisen around OpenAI’s decision to allow the generation of images that depict adult public figures. The issue highlights the complexities of balancing creative freedom with privacy concerns. An addendum that was added later states that OpenAI offers adult public figures the option to opt out. Those who choose to opt out will not be depicted in any images generated by GPT-4o.

Accessibility and Availability

A restaurant menu created by ChatGPT according to the user prompts (Image via: OpenAI)

Native AI Image Generation in ChatGPT was initially available to Plus, Pro, Team, and Free users. However, OpenAI’s Sam Altman has since said that demand has been far higher than expected, and the company can’t meet it as of yet.

“images in chatgpt are wayyyy more popular than we expected (and we had pretty high expectations). rollout to our free tier is unfortunately going to be delayed for awhile,” Sam Altman shared on X (formerly Twitter).

As a result, OpenAI has temporarily halted access for free users. They have also limited generation of certain content that was originally allowed (Studio Ghibli style images, we are looking at you). This situation is very reminiscent of when they rolled out Sora.

There has been no official announcement on the daily limit for free users so far. Taya Christianson, an OpenAI spokesperson, told The Verge that the free tier’s usage limit is the same as DALL-E’s (three images a day). However, Christianson added that the company “didn’t have a specific number to share” and that “these may change over time based on demand.”

Implications for the Future of AI Image Generation

The addition of native AI Image Generation to the list of features in ChatGPT definitely marks a step forward in AI’s ability to combine text with visual creativity. Compared to DALL-E, the integration of GPT-4o into ChatGPT results in more seamless and versatile AI interactions. As long as OpenAI works towards bringing more innovations while putting safeguards against malpractices in place, we should be golden.

