This is an archive article published on February 29, 2024

Text within AI-generated images: ‘Ideogram’ gets it right every time

When it comes to nailing text in AI images, a small startup is outperforming heavy hitters like DALL-E 3.

Big AI models still struggle with legible text in generated images. But a startup called Ideogram has quietly become a leader in this challenge. (Express image)

A lesser-known AI startup called Ideogram has seemingly become the leader in generating images with crisp, clear text – a key challenge plaguing even the most advanced AI image generators. This week, the company announced it has raised $80 million in a Series A funding round led by prominent AI investors, according to a Bloomberg report.

The news comes as the red-hot generative AI space continues to see fast innovation. In just the past few months, tools like OpenAI’s DALL-E 3 and Stability AI’s Stable Diffusion 3.0 have improved their ability to render legible text in generated images. But according to Ideogram CEO Mohammad Norouzi, his company’s latest software still has the edge.

Ideogram launched in August 2023 with the goal of solving the notorious text problem that has plagued AI image models. Even as these tools have become adept at generating amazingly realistic scenes and characters, any text included in images – from protest signs to t-shirt slogans – often appears warped beyond recognition.

Last fall, when Ideogram debuted its software, competitors like Midjourney, DALL-E 2, and Stable Diffusion struggled mightily to handle text. But the field has seen rapid advances since then. Stable Diffusion 3.0 focuses heavily on textual improvements, while DALL-E 3 can now produce some legible words and phrases in images.

Even so, Norouzi believes Ideogram’s unique approach outperforms rival models. The latest version boasts higher text accuracy rates overall, he says, and shows special skill at handling lengthy, complex sentences. Just look at this example from last year, produced by giving several image models the prompt “a photograph of an adorable kitten wearing a t-shirt with the words ‘ask me about my AI startup’”.

Clockwise from top left: Ideogram, OpenAI’s DALL-E 2, Stability AI’s Stable Diffusion, and Midjourney. (Image: Bloomberg/Ideogram)

There’s a clear winner here.

The new software also includes a feature called “magic prompt” that automatically expands on the written prompts users submit. For example, it might build on a simple phrase like “a cute pika with bumblebee antennae” by generating additional descriptive sentences about the pika’s stance, expression, and other details.

 
