Qwen-Image Model Shows ‘Significant Advances’ in AI Image Generation



Alibaba, the company behind the Qwen AI platform, has recently lifted the lid on Qwen-Image. 

The Qwen team introduced the model in a recent blog post: “We are thrilled to release Qwen-Image, a 20B MMDiT image foundation model that achieves significant advances in complex text rendering and precise image editing.”

The model demonstrates strong performance in embedding multilingual text, including Chinese and English, while accurately generating visuals based on complex user instructions. 


Demonstrating the platform’s capabilities

To showcase Qwen-Image’s range, the Qwen team asked the generative AI platform to create various images, each with granular instructions and complex requests.

First, the team asked Qwen-Image to create an image based on the anime art style of Hayao Miyazaki. The model successfully replicated Miyazaki’s distinct aesthetic while following the provided instructions.

After trying out a few different designs with both English and Chinese text prompts, the development team tested Qwen-Image’s ability to handle complicated, multi-step instructions. In one test, the model effectively produced bilingual outputs, embedding both English and Chinese text in a single image layout.

These early demonstrations highlighted how Qwen-Image can be used to create cartoonish art, realistic imagery, infographics, posters, and more.

Comparing Qwen-Image’s performance against other AI models

Qwen-Image’s performance was also compared directly against other AI companies’ models on a variety of common benchmarks.

Image generation and editing benchmark scores
(Source: Qwen-Image official benchmark report)
Qwen-Image GPT Image 1 (High) FLUX.1 Kontext (Pro) FLUX.1 (Dev) Seedream 3.0 Bagel
DPG 88.32 85.15 83.84 88.27
One-IG-Bench-ZH 0.548 0.474 0.528
One-IG-Bench-EN 0.539 0.533 0.434 0.530
ImgEdit 4.27 4.20 4.00 3.20
GEdit-CN 7.52 7.30 1.20 6.50
GEdit-EN 7.56 7.53 6.56 6.52
GSO 15.11 12.07 14.50 13.78
GenEval 0.91 0.84 0.66 0.84
Text rendering benchmark scores
(Source: Qwen-Image official benchmark report)
Benchmark                 Qwen-Image   GPT Image 1 (High)   Seedream 3.0
LongText-Bench (ZH)       0.946        0.619                0.878
LongText-Bench (EN)       0.943        0.956                0.896
Chinese Word (ZH)         0.583        0.361                0.331
TextCraft (EN)            0.829        0.857                0.592
One-IG-Bench-Test (ZH)    0.963        0.650                0.928
One-IG-Bench-Test (EN)    0.891        0.857                0.865

While Qwen-Image posts the top score on most benchmarks, it trails GPT Image 1 on two of the three English text rendering tests, LongText-Bench (EN) and TextCraft (EN). On every Chinese text rendering test, however, it leads both GPT Image 1 and Seedream 3.0. Given Alibaba’s strong domestic focus, that top-tier Chinese text performance strengthens Qwen-Image’s appeal for users in multilingual environments.
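
For readers who want to check the text rendering comparison themselves, here is a minimal Python sketch that tallies which model leads each benchmark. The scores are copied from the text rendering table above. The helper names (`leaders`, `wins_by_language`) are hypothetical and not part of any Qwen tooling, and the sketch assumes a higher score is better on every benchmark.

```python
# Text rendering scores copied from the table in this article.
# Column order: Qwen-Image, GPT Image 1 (High), Seedream 3.0.
# Assumption: a higher score is better on every benchmark.
MODELS = ("Qwen-Image", "GPT Image 1 (High)", "Seedream 3.0")

SCORES = {
    "LongText-Bench (ZH)": (0.946, 0.619, 0.878),
    "LongText-Bench (EN)": (0.943, 0.956, 0.896),
    "Chinese Word (ZH)": (0.583, 0.361, 0.331),
    "TextCraft (EN)": (0.829, 0.857, 0.592),
    "One-IG-Bench-Test (ZH)": (0.963, 0.650, 0.928),
    "One-IG-Bench-Test (EN)": (0.891, 0.857, 0.865),
}


def leaders(scores):
    """Return {benchmark: name of the top-scoring model}."""
    return {
        bench: MODELS[max(range(len(row)), key=row.__getitem__)]
        for bench, row in scores.items()
    }


def wins_by_language(scores, model):
    """Count benchmarks led by `model`, keyed by the (ZH)/(EN) suffix."""
    counts = {"ZH": 0, "EN": 0}
    for bench, top in leaders(scores).items():
        lang = bench[-3:-1]  # the two letters inside the trailing parentheses
        if top == model:
            counts[lang] += 1
    return counts


if __name__ == "__main__":
    for bench, top in leaders(SCORES).items():
        print(f"{bench}: {top}")
    for model in MODELS:
        print(model, wins_by_language(SCORES, model))
```

Run as a script, it prints the leading model for each benchmark, followed by a per-language win count for each model.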

Releasing Qwen-Image in a highly competitive landscape

Although the developers of Qwen-Image have demonstrated their model’s artistic capabilities, they’re entering the highly competitive generative AI space. Major players such as OpenAI’s DALL-E, Midjourney, Canva, Adobe Firefly, and Stable Diffusion currently dominate the visual AI market.

It remains to be seen how Qwen-Image will stack up against these established tools, particularly in regions and industries that benefit from bilingual support and open licensing.


