How Does the Same “Space Fleet” Prompt Change? A Deep Comparative Analysis of DALL·E 3 / nanoBanana / Flux / Sora

Before comparing four image-generation models, here is the visual reference point.
This image was generated by DALL·E approximately one year ago, using an early version of the “Space Fleet” concept.

Baseline image generated by DALL·E (approx. one year prior to this experiment).


Introduction

The world of image generation AI has clearly moved beyond the simple question of
“Which one looks better?”

We are now entering a phase where the real comparison lies in
how each model interprets the world.

In this article, I feed the exact same JSON-formatted prompt—depicting a space fleet and a massive megastructure—into four major image-generation models.
By analyzing their outputs, I explore each model’s distinct tendencies and identify key control points for prompt design.


1. The Shared Prompt Used in This Experiment

To ensure a fair comparison, the following detailed specification (JSON) was provided to all models.

  • Main subject: A massive, wedge-shaped Star Destroyer
  • Composition: A low-angle view looking up through a colossal circular megastructure
  • Lighting: Strong backlighting from a nearby star, combined with blue ambient light from a nebula
  • Details: Fine surface greebles on the hull and multiple small starfighters in motion
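The bullet points above could be expressed as a JSON prompt along the following lines. This is an illustrative sketch of the structure, not the literal prompt used in the experiment; the field names are assumptions.

```json
{
  "subject": "massive wedge-shaped Star Destroyer",
  "composition": {
    "camera": "low-angle view",
    "framing": "looking up through a colossal circular megastructure"
  },
  "lighting": {
    "key": "strong backlighting from a nearby star",
    "ambient": "blue ambient light from a nebula"
  },
  "details": [
    "fine surface greebles on the hull",
    "multiple small starfighters in motion"
  ]
}
```

Keeping each concern (subject, composition, lighting, details) in its own key makes it easier to see which part of the specification each model honors or ignores.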

2. Comparison: What Actually Changes Between Models?

After analyzing the four generated images, two major variables stood out.

① Representation of Scale

  • DALL·E 3 / nanoBanana
    These models rely on symbolic contrast. By placing small ships next to a massive one, they convey scale in a clear, immediately readable way.
  • Flux / Sora
    These models express scale through spatial density and atmosphere. Distant objects fade, structural details accumulate, and the viewer’s brain is subtly tricked into perceiving enormity.

② Physical Behavior of Light

  • Sora
    Leveraging its video-model heritage, Sora excels at physically plausible light behavior. Diffraction, indirect reflections, and light scattering along inner structures are particularly convincing.
  • Flux
    Flux focuses on material reflection. Highlights are sharp, metallic, and distinctly industrial, emphasizing the ship as a physical object.

3. Why Do These Differences Occur? (Technical Background)

Why does the same instruction produce such different results?
The answer lies in each model’s origin and training focus.

| Model | Background | Impact on Results |
| --- | --- | --- |
| DALL·E 3 | LLM-centric language understanding | Strong at producing a “correct” interpretation of user intent, but with standardized details |
| nanoBanana | Multimodal editing and composition | Excels at dynamic layouts and dramatic color emphasis |
| Flux.1 | High-resolution, physically grounded training | Exceptional photographic realism and dense industrial detail |
| Sora | Video generation and physical simulation | Superior spatial coherence, depth, and light propagation |

4. Analysis: What to Fix, and What to Let the Model Interpret

Through this experiment, several prompt-design insights became clear.

Elements That Should Be Fixed (Highly Prone to Drift)

  • Camera settings
    “Low angle” alone is too vague. Specifying values such as
    Wide-angle lens (14mm) or Looking up at a 45-degree angle significantly reduces variation.
  • Geometric definition
    “Star Destroyer” is conceptual. Adding geometric constraints like
    isosceles triangular hull helps prevent shape distortion.
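Applied to the shared prompt, these two fixes might look like the following fragment. The field names are illustrative assumptions; the point is that vague terms (“low angle”, “Star Destroyer”) are replaced by measurable constraints.

```json
{
  "camera": {
    "lens": "wide-angle, 14mm",
    "angle": "looking up at 45 degrees"
  },
  "subject": {
    "type": "Star Destroyer",
    "geometry": "isosceles triangular hull"
  }
}
```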

Model-Specific Interpretation Biases

  • nanoBanana
    Always tries to produce a “cool hero shot,” often boosting saturation and contrast automatically.
  • Flux
    Fills any available space with parts and surface detail. Without constraints, complexity escalates quickly.
  • Sora
    Treats the scene as a 3D space, automatically applying depth-of-field effects based on camera distance.

Conclusion: Choosing the Right Tool

This comparison makes one thing clear:
the best model depends entirely on what you want to create.

  • Rapid concept visualization → DALL·E 3
  • Iterative composition control → nanoBanana
  • Industrial realism and density → Flux
  • Cinematic lighting and spatial depth → Sora

Image-generation AI is no longer just about automation.
Each model has evolved into a distinct creative partner, with its own strengths, assumptions, and worldview.


Thank you for reading.
