Self-Improvement
Automatically improve your prompts based on test results.
Once you've defined tests for your prompts, use the improve command to automatically fix failing tests. The CLI analyzes failures and suggests improvements to make your prompts work better in practice.
The improve command works best with capable models like gpt-4o or claude-3.5-sonnet. Weaker models may struggle to generate valid JSON suggestions.
See Testing Prompts to learn how to add tests.
Auto-Improving Prompts
The improve command automatically fixes failing tests:
# Improve all prompts with failing tests
npx @nudge-ai/cli improve
# Control iterations
npx @nudge-ai/cli improve --max-iterations 5
# Target specific prompts
npx @nudge-ai/cli improve --prompt-ids summarizer,analyzer
# Show detailed analysis
npx @nudge-ai/cli improve --verbose
# Use LLM judge for string assertions
npx @nudge-ai/cli improve --judge
How Improvement Works
- Analyzes failing tests
- Requests suggestions from the AI to fix issues
- Applies the suggestions to the generated prompt
- Re-runs tests to verify improvements
- Repeats until tests pass or plateau
The process is iterative: it stops when all tests pass, the maximum number of iterations is reached, or no further improvements are possible.
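The loop described above can be sketched as follows. This is a toy simulation with stand-in functions (`runTests`, `applySuggestion` and the hard-coded "model" behavior are all illustrative, not the CLI's actual internals):

```typescript
// Minimal simulation of the improve loop: evaluate, suggest, apply, repeat.
// runTests and applySuggestion are illustrative stand-ins, not the CLI's internals.

type TestCase = { input: string; assert: (output: string) => boolean };

// Stand-in "model": honors a length constraint only if the prompt mentions one.
function runTests(promptText: string, tests: TestCase[]): TestCase[] {
  const output = promptText.includes("under 100")
    ? "Short summary."
    : "long ".repeat(30); // 150 characters: fails a length < 100 assertion
  return tests.filter((t) => !t.assert(output)); // keep only the failing tests
}

// Stand-in for the AI-generated suggestion.
function applySuggestion(promptText: string): string {
  return promptText + " Keep responses under 100 characters.";
}

function improve(promptText: string, tests: TestCase[], maxIterations = 3): string {
  for (let i = 1; i <= maxIterations; i++) {
    if (runTests(promptText, tests).length === 0) break; // all tests pass: stop
    promptText = applySuggestion(promptText); // apply the fix, then re-run
  }
  return promptText;
}

const tests: TestCase[] = [
  { input: "The quick brown fox...", assert: (o) => o.length < 100 },
];
const improved = improve("Summarize the input text.", tests);
console.log(improved);
```

Here the loop converges after one suggestion; with a real model, each iteration re-runs the full test suite before deciding whether to continue.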
Output
The improve command shows:
Iteration 1/3 for "summarizer"
1 failing test:
Input: "The quick brown fox..."
Expected: output.length < 100
Got: "A fox is an animal that jumps..."
Analyzing failures and generating improvements...
Suggested Changes:
+ Added: "keep responses under 100 characters"
Results:
✓ Test: "The quick brown fox..." (fixed)
✓ summarizer: improved in 1 iteration(s)
Changes from improve are made to prompts.gen.ts only. Run npx @nudge-ai/cli generate to reset, or apply source hints to your .prompt.ts files for permanent changes.
Workflow Example
Here's a typical testing and improvement workflow:
1. Add tests to your prompt
First, define tests in your prompt file. See Testing Prompts for details on test types and syntax:
// src/translator.prompt.ts
export const translator = prompt("translator", (p) =>
p
.persona("professional translator")
.input("text to translate to {{language}}")
.output("translated text")
.do("preserve meaning and tone")
.test(
"The meeting is at 3 PM tomorrow.",
(output) => output.length > 0,
"Should produce output"
)
.test(
"Hello, how are you?",
"should be a natural greeting in the target language",
"Natural translation"
)
);
2. Generate prompts
npx @nudge-ai/cli generate
3. Evaluate tests
npx @nudge-ai/cli eval --judge
See which tests fail.
4. Auto-improve
npx @nudge-ai/cli improve --verbose
Watch the AI fix failing tests iteratively.
5. Apply suggestions
The improve command suggests changes to your source file:
💡 Source Hint: Consider these changes in translator.prompt.ts:
add: .constraint("preserve punctuation and formatting")
Reason: Tests show lost punctuation
modify: change "preserve meaning and tone" to "preserve exact meaning, tone, and style"
Reason: Tone-specific issues in translation
Apply these to your .prompt.ts file, then regenerate.
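Applying both hints to the source file from step 1 might look like this. This is a sketch: it assumes the same prompt() builder API as the earlier example, plus the .constraint() method named in the hint above:

```typescript
// src/translator.prompt.ts — after applying the suggested source hints.
// Sketch only: assumes the prompt() builder API shown earlier, plus the
// .constraint() method mentioned in the source hint.
export const translator = prompt("translator", (p) =>
  p
    .persona("professional translator")
    .input("text to translate to {{language}}")
    .output("translated text")
    .do("preserve exact meaning, tone, and style") // modified per hint
    .constraint("preserve punctuation and formatting") // added per hint
    .test(
      "The meeting is at 3 PM tomorrow.",
      (output) => output.length > 0,
      "Should produce output"
    )
    .test(
      "Hello, how are you?",
      "should be a natural greeting in the target language",
      "Natural translation"
    )
);
```

After editing the source file, regenerate with npx @nudge-ai/cli generate and re-run eval to confirm the tests now pass from a clean build.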
CLI Options Reference
improve
| Option | Default | Description |
|---|---|---|
| `--max-iterations <n>` | 3 | Maximum improvement iterations per prompt |
| `--prompt-ids <ids>` | — | Comma-separated list of specific prompt IDs to improve |
| `--verbose` | — | Show detailed improvement analysis |
| `--judge` | — | Use LLM to evaluate string assertions |
See Testing Prompts for the eval command options.
Best Practices
Run eval first to understand what's failing before running improve. Use --verbose to see detailed analysis.
- Iterate gradually: Use improvement in small batches rather than all at once
- Review suggestions: Check the source hints carefully before applying them to your .prompt.ts files
- Re-baseline: After improvements, regenerate and run tests again to ensure changes stuck
- Version control: Commit your .prompt.ts files and suggested changes separately from generated files