June 27, 2026
How to Use Codex Goals for Longer, Evidence-Based Work
By Synthex
Codex is easy to use for a small task.
Ask it to inspect a file, fix one bug, explain one error, or write one test, and the shape is simple: you ask, Codex works, you review the result.
The harder cases are different. A performance problem may need several attempts. A flaky test may need reproduction before repair. A research task may need a final report that separates confirmed evidence from uncertainty. In those cases, the problem is not always the first prompt.
The problem is that the objective gets fuzzy after a few turns.
This guide is based on OpenAI's Cookbook example on using Goals in Codex. The practical idea is simple: a Goal gives Codex a persistent objective for the current thread, so it can keep working toward a defined outcome instead of waiting for you to keep saying "continue."
What you'll learn
- What a Codex Goal is in plain language.
- When a Goal is better than a normal prompt.
- How to start, pause, resume, view, and clear a Goal.
- How to write Goals with a clear finish line.
- Why strong Goals need evidence, not just intention.
- How Goals differ from memory, AGENTS.md, and automation.
- What to avoid when a task is too vague or too small.
What this is really about
A normal prompt tells Codex what to do next.
A Goal tells Codex what should be true when the work is finished.
That difference matters.
If you ask Codex, in a normal prompt, to fix a failing test, it can try one repair, report what happened, and wait.
If you set a Goal for the same test, Codex has a more durable target. It can reproduce the failure, inspect the test, try a contained fix, rerun the test, and decide whether the evidence is good enough.
A Goal is not "do work forever." A Goal is a scoped finish line for the current thread.
The best mental model is:
| Mode | What it means |
|---|---|
| Normal prompt | Do this next thing, then wait |
| Goal | Keep working toward this defined outcome until it is complete, paused, blocked, cleared, or stopped by budget |
Goals are most useful when the next step depends on what Codex learns while working.
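The Goal row in the table can be read as a loop with explicit stop states. The Python sketch below is only a mental model, not how Codex implements Goals. `GoalStatus`, `run_goal`, and the `step` callback are hypothetical names invented for illustration, and the step budget stands in for whatever budget applies in practice.

```python
from enum import Enum
from typing import Callable


class GoalStatus(Enum):
    """Hypothetical states mirroring the stop conditions in the table."""
    ACTIVE = "active"
    COMPLETE = "complete"
    PAUSED = "paused"
    BLOCKED = "blocked"
    CLEARED = "cleared"
    BUDGET_EXHAUSTED = "budget_exhausted"


def run_goal(step: Callable[[int], GoalStatus], max_steps: int) -> GoalStatus:
    """Call step() until it reports a non-active status or the budget runs out."""
    for attempt in range(1, max_steps + 1):
        status = step(attempt)
        if status is not GoalStatus.ACTIVE:
            return status
    # Running out of budget is its own outcome, not a form of success.
    return GoalStatus.BUDGET_EXHAUSTED


def demo_step(attempt: int) -> GoalStatus:
    """Hypothetical unit of work: the third attempt meets the target."""
    return GoalStatus.COMPLETE if attempt >= 3 else GoalStatus.ACTIVE


print(run_goal(demo_step, max_steps=5).value)  # complete
print(run_goal(demo_step, max_steps=2).value)  # budget_exhausted
```

The second call shows why the budget matters as a separate state: the loop stopped, but the outcome was never reached.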
What a Codex Goal is
A Goal is a persistent objective attached to a Codex thread.
That means it belongs to the conversation where the work is happening. It is not global memory. It is not a permanent project rule. It is not the same as AGENTS.md.
Use a Goal when the thread needs a clear completion contract:
- What should be true at the end.
- How Codex should check that it is true.
- What must not break along the way.
- What Codex should do if the work becomes blocked.
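For example, take a Goal to bring report generation below 2 seconds, verified by the local benchmark, with the export tests still passing.
That gives Codex three useful anchors: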
- Outcome: report generation below 2 seconds.
- Verification: the local benchmark.
- Constraint: export tests must still pass.
Without those anchors, Codex may improve something, but it has no reliable way to know whether the work is actually finished.
When to use a Goal
Use a Goal when the task has a finish line but the route is uncertain.
Good candidates:
- Performance tuning.
- Flaky test investigation.
- Bug hunts that require reproduction.
- Dependency upgrades with tests and follow-up fixes.
- Multi-step refactors with a clear verification command.
- Research work that needs a final evidence-backed report.
- Long audits where findings need to be checked against source material.
A Goal is useful when you would otherwise keep writing "continue" turn after turn.
That kind of repeated instruction is a sign that the task needs a persistent objective.
When not to use a Goal
Do not use a Goal for every Codex task.
A normal prompt is better for:
- A one-line edit.
- A simple explanation.
- A short code review.
- A quick command.
- A single file cleanup.
- A question where you want one answer and then a stop.
Also avoid Goals when the finish line is too vague.
Weak:
Better:
The second version gives Codex something it can inspect. The first one mostly gives Codex a mood.
How to start and manage a Goal
Goals are available in Codex builds that support the /goal command. If the command is missing, update Codex first.
For the CLI, the OpenAI Cookbook example lists Goals as available starting in Codex 0.128.0. You can update and check your version with:
Or with Homebrew:
To set a Goal, use /goal followed by the desired outcome:
To manage the Goal:
| Command | What it does |
|---|---|
| /goal | View the current Goal |
| /goal pause | Pause the active Goal |
| /goal resume | Resume a paused Goal |
| /goal clear | Remove the current Goal |
The important part is control. A Goal gives Codex continuation context, but you can still pause, resume, or clear it.
How to write a strong Goal
A strong Goal is not long because it is fancy. It is specific because Codex needs to know what counts as done.
The strongest Goals usually include six parts.
| Part | Plain meaning | Example |
|---|---|---|
| Outcome | What should be true at the end | Reduce p95 latency below 120 ms |
| Verification surface | How Codex should prove it | verified by the checkout benchmark |
| Constraints | What must not regress | while keeping correctness tests green |
| Boundaries | Where Codex may work | use only the checkout service and related tests |
| Iteration policy | How to choose the next attempt | after each run, compare results and try the smallest defensible change |
| Blocked stop condition | When to stop and report | if the benchmark cannot run, report the blocker and next input needed |
Here is the basic pattern:
You do not need to use this exact grammar every time. The point is to define the finish line and the evidence.
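One way to keep all six parts in view is to draft them as separate fields and join them only at the end. The Python sketch below does that with the example column from the table. `GoalSpec` and `to_command` are hypothetical helpers, and the sentence layout they produce is an assumption made for illustration, not a grammar that Codex requires.

```python
from dataclasses import dataclass


@dataclass
class GoalSpec:
    """Hypothetical container for the six parts of a Goal."""
    outcome: str
    verification: str
    constraints: str
    boundaries: str
    iteration_policy: str
    blocked_stop: str

    def to_command(self) -> str:
        """Join the six parts into a single /goal line."""
        # Outcome, verification, and constraints read naturally as one sentence.
        headline = ", ".join([self.outcome, self.verification, self.constraints])
        sentences = [headline, self.boundaries, self.iteration_policy, self.blocked_stop]
        cleaned = [s.strip().rstrip(".") for s in sentences]
        cleaned = [s[:1].upper() + s[1:] for s in cleaned]
        return "/goal " + ". ".join(cleaned) + "."


spec = GoalSpec(
    outcome="Reduce p95 latency below 120 ms",
    verification="verified by the checkout benchmark",
    constraints="while keeping correctness tests green",
    boundaries="use only the checkout service and related tests",
    iteration_policy="after each run, compare results and try the smallest defensible change",
    blocked_stop="if the benchmark cannot run, report the blocker and next input needed",
)
print(spec.to_command())
```

Because every field is required, leaving one out fails at construction time, which is a cheap reminder that the part is missing.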
Weak vs strong examples
Performance
Weak:
Strong:
Why it works:
- The target is measurable.
- The verification surface is named.
- The constraints are visible.
- Codex knows when to stop instead of guessing.
Flaky tests
Weak:
Strong:
Why it works:
- It asks for reproduction before repair.
- It protects the assertion from being watered down.
- It defines what "reliably" means.
- It has a clean blocked condition.
Documentation
Weak:
Strong:
Why it works:
- It names the final artifact.
- It defines what the page should cover.
- It asks Codex to check commands instead of inventing confidence.
Research
Weak:
Strong:
Why it works:
- It does not pretend every claim can be proven.
- It asks for a structured final artifact.
- It keeps uncertainty visible.
What changes when a Goal is active
When a Goal is active, Codex can keep the objective in view across turns.
That does not mean it should ignore you. It means the thread has a stronger sense of what it is trying to finish.
Three things change.
1. The objective stays visible
If a test fails, Codex can compare the failure to the Goal.
If a benchmark improves but misses the target, Codex can keep going.
If a research path hits missing data, Codex can adjust the evidence plan without losing the final report standard.
2. Continuation becomes structured
Goals are designed for idle continuation, not chaotic parallel work.
Codex should continue only when the thread is ready for more work. If there is queued user input, active work, or an interruption, the Goal should not bulldoze through it.
This matters because useful autonomy needs boundaries. Continuation should happen at safe points, after Codex has evidence from the previous step.
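The idle-continuation rule can be written as a small guard. This is a sketch of the rule as described here, not Codex's actual scheduler. `ThreadState` and `may_continue` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    """Hypothetical snapshot of a thread at a possible continuation point."""
    goal_active: bool = True
    queued_user_input: bool = False
    work_in_progress: bool = False
    interrupted: bool = False


def may_continue(state: ThreadState) -> bool:
    """Continue only when a Goal is active and the thread is idle."""
    if not state.goal_active:
        return False
    busy = state.queued_user_input or state.work_in_progress or state.interrupted
    return not busy


print(may_continue(ThreadState()))                        # True
print(may_continue(ThreadState(queued_user_input=True)))  # False
print(may_continue(ThreadState(goal_active=False)))       # False
```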
3. Completion has to be checked
A Goal should not be treated as complete because Codex feels done.
It should be complete because the objective was checked against concrete evidence:
- Tests passed.
- A benchmark reached the target.
- A build succeeded.
- A generated artifact exists.
- A report separates confirmed, approximate, blocked, and uncertain claims.
The evidence is what makes the Goal trustworthy.
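The same idea applies when you review a finished Goal yourself: rerun the verification commands and let exit codes, not summaries, decide. The Python sketch below is a minimal illustration. `Evidence`, `collect_evidence`, and `goal_complete` are hypothetical names, and the two commands are stand-ins so the example runs anywhere; in a real project they would be your own test or benchmark commands.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    """Record of one verification command and its result."""
    command: list[str]
    returncode: int
    output: str

    @property
    def passed(self) -> bool:
        return self.returncode == 0


def collect_evidence(command: list[str], timeout_s: float = 60.0) -> Evidence:
    """Run a verification command and record its exit code and output."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    return Evidence(command, result.returncode, result.stdout + result.stderr)


def goal_complete(checks: list[Evidence]) -> bool:
    """Complete only if at least one check ran and every check passed."""
    return bool(checks) and all(check.passed for check in checks)


# Stand-in commands: one exits with 0, the other with 1.
passing = collect_evidence([sys.executable, "-c", "print('ok')"])
failing = collect_evidence([sys.executable, "-c", "raise SystemExit(1)"])

print(goal_complete([passing]))           # True
print(goal_complete([passing, failing]))  # False
print(goal_complete([]))                  # False: no evidence is not success
```

Treating an empty list as incomplete is a deliberate choice in this sketch: a Goal with no checks run has not been verified.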
Goals are not memory, AGENTS.md, or automations
These concepts are related, but they are not interchangeable.
| Feature | Scope | Use it for |
|---|---|---|
| Goal | Current thread | A persistent outcome for one long task |
| Memory | Cross-thread recall, when available | Stable preferences or context that may help future work |
| AGENTS.md | Project or folder guidance | Durable instructions Codex should follow in that workspace |
| Automation | Scheduled or recurring work | Running a known workflow later or repeatedly |
Use a Goal when the current thread needs to keep working toward a defined outcome.
Use AGENTS.md when the rule should apply every time Codex works in a folder.
Use memory for stable personal or project preferences, if memory is available and appropriate.
Use automation only after the workflow is already clear enough to run on a schedule.
A practical workflow for beginners
If Goals feel abstract, use this order.
Step 1: Write the task in normal language
Start messy:
Step 2: Ask Codex to turn it into a Goal
Use Codex to draft the Goal before activating it:
Step 3: Tighten the evidence
Before you start, check whether the Goal answers:
- What does done mean?
- How will Codex prove it?
- What should not change?
- What files, tools, or data are allowed?
- When should Codex stop and ask?
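The checklist above can double as a quick gap check on a draft. The Python sketch below is a hypothetical helper, not a Codex feature: `CHECKLIST` and `unanswered` are invented names, and the draft reuses the report-generation example from earlier in this guide.

```python
CHECKLIST = {
    "done": "What does done mean?",
    "proof": "How will Codex prove it?",
    "unchanged": "What should not change?",
    "allowed": "What files, tools, or data are allowed?",
    "stop": "When should Codex stop and ask?",
}


def unanswered(draft: dict[str, str]) -> list[str]:
    """Return the checklist questions the draft omits or leaves blank."""
    return [
        question
        for key, question in CHECKLIST.items()
        if not draft.get(key, "").strip()
    ]


draft = {
    "done": "report generation below 2 seconds",
    "proof": "the local benchmark",
    "unchanged": "export tests must still pass",
}

for question in unanswered(draft):
    print(question)
# What files, tools, or data are allowed?
# When should Codex stop and ask?
```

Here the draft covers the outcome, verification, and constraint but still needs boundaries and a blocked stop condition before it is activated.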
Step 4: Activate the Goal
Then paste the cleaned-up version:
Step 5: Review the evidence
Do not review only the final sentence. Review the proof:
- Which commands ran?
- Which files changed?
- Which tests passed?
- Which claims are still uncertain?
- Did Codex stay inside the boundaries?
This is where Goals become genuinely useful. They do not remove your review role. They make the review target clearer.
Common misunderstandings
"A Goal means Codex can work without supervision"
No.
A Goal gives Codex a persistent objective. It does not remove your responsibility to review changes, inspect evidence, and manage permissions.
"A longer Goal is always better"
No.
A strong Goal is specific, not bloated. If the Goal becomes a giant requirements document, Codex may struggle to tell which parts are essential. Keep it focused on outcome, evidence, constraints, and blockers.
"Goals are only for code"
Not necessarily.
They are especially useful for coding tasks because tests and benchmarks make verification easier. But the same pattern can help with research, audits, documentation, and file-heavy work when the final artifact has a clear evidence standard.
"If Codex reaches the budget, the Goal is complete"
No.
A budget limit is a stopping condition, not proof of success. If Codex runs out of budget or time, the useful output is a progress summary, blockers, and the next best step.
"A Goal can fix a vague task"
Only if you make the Goal less vague.
"Make this better" is still weak. A Goal should define what better means, how to check it, and what should stay intact.
What to do first
Try Goals on a contained task before using them on something large.
Good first Goals:
Avoid starting with:
That is too broad to audit.
Start with one narrow task, one verification surface, and one clear constraint. Let the Goal teach you the rhythm: work, check, continue, or stop honestly.
Final takeaway
Codex Goals are for work where the objective should persist longer than one prompt.
Use them when the task has a clear finish line, but the route to that finish line may require investigation. Write the Goal like a compact contract: outcome, evidence, constraints, boundaries, iteration policy, and blocked stop condition.
The point is not to make Codex run endlessly. The point is to keep the work tied to evidence until it is either complete or honestly blocked.
