June 28, 2026
How to Prompt GPT Image Models
By Synthex
Image prompting gets much easier when you stop treating the prompt like a magic sentence.
A good image prompt is closer to a small creative brief. It tells the model what the image is for, what should be visible, what style it should follow, what must stay unchanged, and what should be avoided.
This guide is based on OpenAI's GPT Image Generation Models Prompting Guide. The practical idea is simple: the clearer the visual job, the easier it is for the model to make the right image without five messy retries.
You do not need to write huge prompts. You need to write prompts that make the important decisions visible.
What you'll learn
- How to structure a good GPT Image prompt.
- When to use low, medium, or high quality.
- How to prompt for photorealistic images.
- How to get cleaner text inside generated images.
- How to edit images without drifting away from the original.
- How to use multiple input images.
- How to prompt for ads, infographics, product mockups, logos, and character consistency.
- What to check before using GPT Image models in a production workflow.
What this is really about
Most weak image prompts fail because they leave too much unsaid.
For example:
The model has to guess:
- What product?
- What surface?
- What lighting?
- What camera angle?
- What background?
- What mood?
- Is this for ecommerce, an ad, a pitch deck, or social media?
- Should there be text?
- Should the image feel realistic, premium, casual, playful, technical, or editorial?
A stronger prompt removes the important guesses:
That prompt is not complicated. It is just specific.
The basic prompt structure
Use this order when you are not sure how to start:
You can write this as a paragraph or as labeled sections. For production work, labeled sections are easier to maintain because you can adjust one part without rewriting the whole prompt.
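As a minimal sketch of the labeled-section approach, the helper below assembles a prompt from named parts. The section names and their order are assumptions that mirror the workflow near the end of this guide, and `build_prompt` is a hypothetical helper, not part of any OpenAI library.

```python
# Hypothetical section order; adjust it to match your own brief.
SECTION_ORDER = [
    "Use case",
    "Subject",
    "Composition",
    "Lighting",
    "Style",
    "Text",
    "Constraints",
]


def build_prompt(sections: dict[str, str]) -> str:
    """Join the non-empty sections into one labeled prompt, in a fixed order."""
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise ValueError(f"Unknown sections: {sorted(unknown)}")
    lines = []
    for name in SECTION_ORDER:
        value = sections.get(name, "").strip()
        if value:  # skip sections that were left out or empty
            lines.append(f"{name}: {value}")
    return "\n".join(lines)


prompt = build_prompt({
    "Use case": "Product photo for an ecommerce listing.",
    "Subject": "A matte black ceramic coffee mug with a rounded handle.",
    "Composition": "Centered subject, slight top-down angle, empty space on the right.",
    "Lighting": "Soft natural daylight with realistic contact shadows.",
    "Style": "Photorealistic.",
    "Constraints": "No text, no watermark, no extra objects.",
})
print(prompt)
```

Because each section is a separate dictionary entry, you can change the lighting or the constraints without touching the rest of the prompt.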
The five things that matter most
1. Say the intended use
The model needs to know what kind of image it is making.
An ad, infographic, UI mockup, product photo, children's book illustration, and logo all have different visual rules.
Weak:
Better:
The second prompt tells the model the format, audience, and visual job.
2. Be specific about the subject
If the subject matters, describe it.
For objects, mention:
- Material.
- Shape.
- Color.
- Surface texture.
- Size relationship.
- Condition.
- Important details.
For people, mention:
- Framing.
- Pose.
- Gaze.
- Clothing.
- Action.
- Scale.
- What the person is interacting with.
Example:
Notice the phrase photorealistic. If you want something to feel like a real photograph, say that directly.
3. Control composition
Composition means where things sit in the image.
Good composition instructions include:
- centered subject
- full body visible
- top-down view
- wide landscape shot
- close-up macro detail
- negative space on the left
- logo area top right
- three panels stacked vertically
If the image needs to work as a social graphic, ad, product listing, or blog thumbnail, composition matters as much as style.
Example:
Without placement instructions, the model may make a beautiful image that is awkward to use.
4. Say what must stay unchanged
This matters most for edits.
If you are editing an existing image, separate the change from the invariants.
Use this pattern:
Example:
This is the part beginners often skip. The model needs to know not only what to change, but also what to protect.
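One way to keep the change and the invariants apart is to build the edit prompt from two separate inputs. This is a minimal sketch: `build_edit_prompt` and the example wording are hypothetical and are not taken from the OpenAI guide.

```python
def build_edit_prompt(change: str, preserve: list[str]) -> str:
    """Return an edit prompt that states the change first, then the invariants."""
    if not preserve:
        # With no preserve list, the whole image is left open to change.
        raise ValueError("List at least one thing that must stay unchanged.")
    keep_lines = "\n".join(f"- {item}" for item in preserve)
    return f"Change: {change}\nKeep unchanged:\n{keep_lines}"


edit_prompt = build_edit_prompt(
    change="Replace the red sofa with a light gray fabric sofa of the same size.",
    preserve=[
        "Room layout, walls, and flooring",
        "Window light direction and shadows",
        "Camera angle and framing",
    ],
)
print(edit_prompt)
```

Raising an error on an empty preserve list is a design choice for this sketch: it forces the caller to name the invariants instead of skipping them.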
5. Use constraints without turning the prompt into noise
Constraints are useful:
- no watermark
- no extra text
- no unrelated logos
- no cartoon style
- do not change the background
- preserve product label exactly
But a giant list of negatives can become hard to reason about.
Use constraints for the things that actually matter. If the output must preserve a label, identity, or layout, say that clearly. If the image just needs to be generally clean, keep the negative list short.
Choosing quality settings
GPT Image workflows usually involve a tradeoff between speed, cost, and fidelity.
Use this simple rule:
| Setting | Use when |
|---|---|
| low | You need fast drafts, high-volume variants, internal previews, or quick experiments |
| medium | You want a balanced default for normal production exploration |
| high | You need dense text, polished infographics, close-up portraits, identity-sensitive edits, or final-quality output |
For many workflows, start with low or medium, then move to high only when the image needs extra precision.
Do not use the highest setting automatically. It can be slower and more expensive. The useful question is:
Does this image need more fidelity, or does it need a better prompt?
Often, the better prompt matters more.
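The settings table in this section can be turned into a small decision helper. This is a sketch under assumptions: `choose_quality` and its flags are hypothetical, the model name `gpt-image-1` is a placeholder to replace with the model you actually use, and `generate_image` assumes the official `openai` Python package is installed and `OPENAI_API_KEY` is set.

```python
import base64


def choose_quality(*, draft: bool = False, dense_text: bool = False,
                   identity_sensitive: bool = False,
                   final_output: bool = False) -> str:
    """Map the needs of one image to a quality setting."""
    if dense_text or identity_sensitive or final_output:
        return "high"    # precision matters more than speed
    if draft:
        return "low"     # fast drafts and high-volume variants
    return "medium"      # balanced default


def generate_image(prompt: str, quality: str, out_path: str) -> None:
    """Send one generation request and save the returned image bytes."""
    # Imported here so choose_quality can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI()
    result = client.images.generate(
        model="gpt-image-1",  # assumption: swap in the model you use
        prompt=prompt,
        quality=quality,
        size="1024x1024",
    )
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(result.data[0].b64_json))


print(choose_quality(draft=True))                   # low
print(choose_quality(draft=True, dense_text=True))  # high
```

The precision flags win over `draft` on purpose: a draft that contains dense text is still likely to need the higher setting to be readable.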
Prompting for text inside images
GPT Image models are much better at text than older image models, but text still needs careful prompting.
If text must appear in the image:
- Put the exact text in quotes.
- Say it should appear once.
- Say it should be verbatim.
- Describe placement.
- Describe font style.
- Ask for strong contrast.
- Avoid asking for too much small text in one image.
Example:
For tricky brand names or unusual spellings, spell the word out letter by letter in the prompt. For small text, dense layouts, or multi-panel designs, use a higher quality setting and expect to review closely.
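The checklist above can be captured in a small helper that writes the text portion of a prompt. This is a sketch only: `text_instruction` is a hypothetical helper, and "Lumora" is a made-up brand name used for illustration.

```python
def text_instruction(text: str, placement: str, font_style: str,
                     spell_out: bool = False) -> str:
    """Build the part of a prompt that controls text rendered in the image."""
    parts = [
        f'Include the text "{text}" exactly once, verbatim.',
        f"Placement: {placement}.",
        f"Font: {font_style}, high contrast against the background.",
    ]
    if spell_out:
        # Spell each word letter by letter to help with unusual names.
        spelled = " ".join("-".join(word) for word in text.split())
        parts.append(f"Spelling: {spelled}.")
    return " ".join(parts)


instruction = text_instruction(
    "Lumora",
    placement="centered in the top third",
    font_style="bold sans-serif",
    spell_out=True,
)
print(instruction)
```

Append the returned string to the rest of your prompt, and still check the rendered text by eye before using the image.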
Prompting for photorealism
For realistic images, do not only say "realistic."
Describe the image like a real photo:
- Lighting.
- Lens or viewpoint.
- Surface texture.
- Natural imperfections.
- Framing.
- Depth of field.
- Material behavior.
- Everyday detail.
Good photorealistic prompts often include phrases like:
- photorealistic
- real photograph
- natural daylight
- realistic contact shadows
- subtle skin texture
- worn material detail
- unposed candid moment
Use camera language for the overall look, not as a guarantee of exact physics. A phrase like 50mm lens can help communicate a natural portrait feel, but it should not be treated as a precise simulation setting.
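To make sure none of the photographic details are forgotten, you could describe the shot as a small record and render it into a prompt. This is a minimal sketch: the `PhotoBrief` class and its fields are hypothetical, chosen to mirror the list above.

```python
from dataclasses import dataclass


@dataclass
class PhotoBrief:
    """The visible decisions for one photorealistic image."""
    subject: str
    lighting: str
    viewpoint: str
    texture: str
    depth_of_field: str

    def to_prompt(self) -> str:
        return (
            f"Photorealistic real photograph of {self.subject}. "
            f"Lighting: {self.lighting}. "
            f"Viewpoint: {self.viewpoint}. "
            f"Texture: {self.texture}. "
            f"Depth of field: {self.depth_of_field}."
        )


brief = PhotoBrief(
    subject="a worn leather backpack on a wooden bench",
    lighting="natural daylight from the left, realistic contact shadows",
    viewpoint="eye level, 50mm lens look",
    texture="scuffed leather, slightly frayed stitching",
    depth_of_field="shallow, background softly blurred",
)
print(brief.to_prompt())
```

Every field is required, so a missing lighting or viewpoint decision fails when the brief is created rather than showing up later as a vague image.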
Prompting for infographics and diagrams
Infographics need structure more than mood.
Tell the model:
- The topic.
- The audience.
- The layout.
- The parts or steps.
- The hierarchy.
- Whether labels are needed.
- How much detail should appear.
Example:
For dense infographics, use medium or high quality and keep the text concise. If the graphic needs exact wording, write the wording explicitly.
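When the wording must be exact, it helps to generate the label list from data rather than typing it into the prompt by hand. The sketch below assumes a simple step-by-step infographic; `infographic_prompt` and the coffee example are hypothetical.

```python
def infographic_prompt(topic: str, audience: str, layout: str,
                       steps: list[str]) -> str:
    """Build an infographic prompt with explicit, numbered, quoted labels."""
    numbered = "\n".join(
        f'{i}. "{label}"' for i, label in enumerate(steps, start=1)
    )
    return (
        f"Infographic about {topic} for {audience}.\n"
        f"Layout: {layout}.\n"
        f"Use exactly these {len(steps)} labels, verbatim, in this order:\n"
        f"{numbered}\n"
        "Hierarchy: large title at the top, one short label per step, "
        "no body text."
    )


diagram_prompt = infographic_prompt(
    topic="how pour-over coffee is brewed",
    audience="beginners",
    layout="five panels in a single horizontal row, left to right",
    steps=["Heat water", "Grind beans", "Bloom", "Pour", "Serve"],
)
print(diagram_prompt)
```

Stating the label count in the prompt gives you a quick check when reviewing the output: count the labels in the image and compare.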
Prompting for ads and marketing images
Ad prompts work best when they read like a creative brief.
Include:
- Brand or product context.
- Audience.
- Cultural feel.
- Scene.
- Composition.
- Tagline.
- Text constraints.
- What should not appear.
Example:
The model can make better creative choices when it understands the campaign, not just the objects.
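A brief written as data also makes it easy to produce several variants of the same campaign, for example one per audience. This is a sketch with hypothetical names: `ad_prompt`, the product, and the tagline are all invented for illustration, and the fixed composition line is an assumption you would normally tailor.

```python
def ad_prompt(product: str, audience: str, scene: str, tagline: str,
              avoid: list[str]) -> str:
    """Write an ad prompt as a short creative brief."""
    return "\n".join([
        f"Use case: social media ad for {product}.",
        f"Audience: {audience}.",
        f"Scene: {scene}.",
        "Composition: product in the lower third, clear space at the top "
        "for the tagline.",
        f'Tagline: "{tagline}", exactly once, verbatim, no other text.',
        "Avoid: " + ", ".join(avoid) + ".",
    ])


audiences = ["college students", "remote workers", "new parents"]
variants = [
    ad_prompt(
        product="a cold brew coffee can",
        audience=audience,
        scene="a bright kitchen counter in the morning",
        tagline="Mornings, sorted",
        avoid=["watermarks", "unrelated logos"],
    )
    for audience in audiences
]
print(variants[0])
```

Only the audience line differs between variants, so any difference in the generated images can be traced back to that one change.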
Prompting for edits
Editing is where prompt discipline matters most.
Use this pattern:
Example:
The phrase replace only is useful. It reduces the chance that the model restyles the whole image when you wanted one contained change.
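The sketch below shows one way to pair a "replace only" prompt with an edit request. It rests on assumptions: `replace_only_prompt` and `edit_image` are hypothetical helpers, `gpt-image-1` is a placeholder model name, and the request assumes the official `openai` Python package with `OPENAI_API_KEY` set.

```python
import base64


def replace_only_prompt(target: str, replacement: str,
                        preserve: list[str]) -> str:
    """State one contained change and protect everything else."""
    keep = ", ".join(preserve)
    return (
        f"Replace only {target} with {replacement}. "
        f"Do not change anything else. Keep unchanged: {keep}."
    )


def edit_image(image_path: str, prompt: str, out_path: str) -> None:
    """Send one edit request and save the returned image bytes."""
    # Imported here so the prompt helper works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()
    with open(image_path, "rb") as image_file:
        result = client.images.edit(
            model="gpt-image-1",  # assumption: swap in the model you use
            image=image_file,
            prompt=prompt,
        )
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(result.data[0].b64_json))


mug_edit = replace_only_prompt(
    target="the white mug",
    replacement="a matte black mug of the same size and position",
    preserve=["the table", "the background", "lighting", "camera angle"],
)
print(mug_edit)
```

A call such as `edit_image("desk.png", mug_edit, "desk_edited.png")` would then send the request; the file names here are placeholders.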
Prompting with multiple images
When using multiple input images, label them in the prompt.
Do not assume the model will infer which image is the product, which is the style reference, and which is the scene.
Use:
For compositing, specify:
- Which object moves.
- Which scene stays.
- Where the object should go.
- How scale, lighting, shadows, and perspective should match.
- What should remain unchanged.
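A simple way to label inputs is to number them in the order they are supplied and give each a role. This is a minimal sketch: `multi_image_prompt` is a hypothetical helper, and the "Image N" labeling convention is an assumption, not a required syntax.

```python
def multi_image_prompt(roles: list[str], instruction: str) -> str:
    """Prefix the instruction with one numbered role line per input image."""
    if len(roles) < 2:
        raise ValueError("Expected at least two input images.")
    labels = "\n".join(
        f"Image {i}: {role}" for i, role in enumerate(roles, start=1)
    )
    return f"{labels}\n{instruction}"


composite_prompt = multi_image_prompt(
    roles=[
        "the product, a glass perfume bottle",
        "the scene, a marble bathroom counter",
    ],
    instruction=(
        "Place the bottle from Image 1 on the counter in Image 2, "
        "to the left of the sink. Match the scene's scale, lighting, "
        "shadows, and perspective. Keep everything else in Image 2 unchanged."
    ),
)
print(composite_prompt)
```

The numbering only helps if the images are supplied in the same order as the `roles` list, so keep the two together in your code.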
Use cases that work especially well
The table below lists common GPT Image workflows and the details each one depends on.
| Use case | What to focus on |
|---|---|
| Product mockups | Preserve geometry, label clarity, shadows, and background intent |
| Virtual try-on | Preserve identity, pose, body shape, face, hair, and lighting |
| Logos | Ask for original, simple, scalable marks with strong silhouette |
| Infographics | Define structure, labels, audience, and hierarchy |
| Ads | Write like a creative brief with exact copy and audience context |
| Comics | Define panels, characters, sequence, and consistency |
| Interior edits | Change one object while preserving the room |
| Style transfer | Separate content to preserve from style to apply |
| Character consistency | Create a character anchor, then reuse it with strict continuity notes |
The common thread is control. Say what the image should become, but also say what should not drift.
Common misunderstandings
"Longer prompts are always better"
No.
Long prompts can work, but only when they are organized. A short structured prompt is usually better than a long pile of adjectives.
"The model should know what I mean by premium"
Maybe, but it helps to define it.
Instead of only saying premium, describe the visible cues:
- Clean composition.
- Soft realistic shadows.
- Tactile materials.
- Balanced negative space.
- Sharp product edges.
- Controlled color palette.
"Edits only need the new thing"
No.
For edits, the preserve list matters as much as the change. If you do not say what should stay the same, the model may treat the whole image as flexible.
"Text rendering means I can fill the image with copy"
Not always.
Text works best when it is short, explicit, high contrast, and placed clearly. Dense layouts need stricter prompting and closer review.
"One perfect prompt should solve everything"
Not usually.
The better workflow is to start clean, review the output, then make one small change at a time. If the image drifts, restate the critical constraints.
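The iterate-and-restate habit can be sketched as a helper that appends exactly one change and repeats the constraints that must not drift. `revise` and the example prompt are hypothetical, and the "Revision" and "Still required" wording is an assumption.

```python
def revise(prompt: str, change: str, critical: list[str]) -> str:
    """Add one change to a prompt and restate the critical constraints."""
    restated = " ".join(f"Still required: {item}." for item in critical)
    return f"{prompt}\nRevision: {change}\n{restated}"


base = "Use case: blog thumbnail.\nSubject: a red bicycle against a white wall."
critical = ["no text in the image", "bicycle stays centered"]

second = revise(base, "Make the wall light blue instead of white.", critical)
print(second)
```

Keeping the original prompt intact at the top of each revision makes it easier to see which single change produced a given output.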
What to do first
Start with this simple workflow:
- Choose the use case: photo, ad, infographic, edit, mockup, logo, or character art.
- Write the subject and intended use.
- Add composition and lighting.
- Add style or medium.
- Add exact text only if text is needed.
- Add constraints and preserve rules.
- Generate at low or medium for exploration.
- Move to high when detail, identity, or small text matters.
- Review the output against the prompt.
- Iterate with one clear change at a time.
Here is a reusable template:
If you are editing an image, add:
Final takeaway
Good GPT Image prompts are not about sounding artistic.
They are about making the visual task clear.
Name the use case. Describe the subject. Control the composition. Choose the quality level intentionally. Put exact text in quotes. For edits, say what changes and what stays unchanged. For multi-image workflows, label each input and explain how they relate.
The model can handle a lot, but it should not have to guess the parts that matter most.
Further reading
- OpenAI GPT Image Generation Models Prompting Guide: https://developers.openai.com/cookbook/examples/multimodal/image-gen-models-prompting-guide
- OpenAI image generation guide: https://developers.openai.com/api/docs/guides/image-generation
- GPT Image 2 model page: https://developers.openai.com/api/docs/models/gpt-image-2
- Image edits API reference: https://developers.openai.com/api/reference/python/resources/images/methods/edit/
