Prompting Like An Expert: Cognitive Behaviors That Matter
The formatting of this post is not great. I chose to publish now and fix it later rather than fix it first and maybe never publish.
Summary
I recently created the following prompt using what I learned from this arXiv paper:
**When solving problems for me:**
1. **Decompose and Understand:**
- Break complex questions into smaller, manageable subproblems.
- Restate the problem in your own words, listing all constraints and objectives clearly.
2. **Plan Your Approach:**
- Outline your strategy by setting clear, logical subgoals.
- Identify possible methods for tackling each subproblem before diving in.
3. **Detailed, Transparent Reasoning:**
- Provide a complete, step-by-step chain-of-thought that includes:
- **Restatement and Clarification:** Rephrase the problem and articulate every constraint.
- **Subgoal Breakdown:** Enumerate intermediate steps and milestones.
- **Methodical Execution:** Solve each subproblem systematically.
- **Verification:** Rigorously check and validate intermediate results to ensure accuracy.
4. **Adopt Key Cognitive Strategies:**
- **Backtracking:** If you encounter a dead end or contradiction, clearly acknowledge it,
retrace your steps, and revise your approach.
- **Backward Chaining:** When the final outcome is known, work backward to identify the
necessary steps to reach it.
- **Subgoal Setting:** Continuously break down larger tasks into actionable, smaller parts.
- **Verification:** Consistently double-check each step to confirm that your reasoning
remains sound.
5. **Iterative Self-Reflection:**
- Regularly review your reasoning process to catch and correct mistakes.
- Reflect on dead ends or corrections to refine your approach and consider alternative
strategies if needed.
- Don't hesitate to restart or reframe your strategy if your initial path proves suboptimal.
6. **Prioritize Thoroughness and Quality:**
- Ensure your explanation is comprehensive and detailed—the quality of reasoning is
paramount.
- Double-check all calculations, facts, and logic against the original constraints and
requirements.
- Conclude with a summary that reaffirms how your solution meets the problem's objectives.
For my current request: [INSERT REQUEST HERE]
This is a meta prompt for prompt chaining: you can copy and paste an existing prompt into the [INSERT REQUEST HERE] section.
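If you'd rather fill the placeholder programmatically than by copy/paste, here's a minimal sketch. The function name `fill_request` and the shortened `META_PROMPT` string are my own placeholders for illustration; in practice you'd paste in the full prompt from above.

```python
# Hypothetical helper: substitutes a request into the meta prompt's placeholder.
PLACEHOLDER = "[INSERT REQUEST HERE]"

# Shortened stand-in for the full meta prompt shown above.
META_PROMPT = (
    "**When solving problems for me:**\n"
    "1. **Decompose and Understand:** ...\n"
    "For my current request: " + PLACEHOLDER
)


def fill_request(meta_prompt: str, request: str, placeholder: str = PLACEHOLDER) -> str:
    """Return the meta prompt with the placeholder replaced by the actual request."""
    if placeholder not in meta_prompt:
        # Fail loudly rather than silently sending a prompt with no request in it.
        raise ValueError(f"placeholder {placeholder!r} not found in meta prompt")
    return meta_prompt.replace(placeholder, request.strip())


if __name__ == "__main__":
    print(fill_request(META_PROMPT, "Summarize this report in under 2000 words."))
```

Raising on a missing placeholder is a deliberate choice here: it catches the case where the template was edited and the marker was accidentally removed.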
Exploring Expert Behavior Driven Prompting
This paper, recently published on arXiv, caught my attention because it looked like a great resource for prompt design.
Just before reviewing it, I'd been struggling with a certain prompt with a strict set of output requirements. I couldn't keep the output under 2000 words.
It kept creating well formatted outputs about 2500 words long.
This wasn't surprising as LLMs often "think" in terms of token length as opposed to word length.
After summarizing the paper, I knew it could help, and once I had finished, it did: I was able to get high-quality output in the 1400-word range.
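For checking a limit like this, a rough word counter is enough. The sketch below is my own illustration, not something from the paper; it splits on whitespace, which only approximates how a word processor counts words, and the names `word_count` and `check_length` are made up for the example.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words (an approximation of word length)."""
    return len(text.split())


def check_length(text: str, max_words: int = 2000) -> tuple[bool, int]:
    """Return (is_within_limit, actual_word_count) for a model output."""
    count = word_count(text)
    return count <= max_words, count


if __name__ == "__main__":
    draft = "word " * 2500          # stand-in for an over-long model output
    ok, count = check_length(draft)
    print(ok, count)                # False 2500
```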
I asked Claude 3.7 Sonnet to summarize the paper for me.
Summarize this research report for me.
[THE HTML VERSION OF THE arXiv PAPER (https://arxiv.org/html/2503.01307v1)]
# Research Summary: Cognitive Behaviors that Enable Self-Improving Reasoners
This research investigates why some language models (LMs) show significant self-improvement
through reinforcement learning (RL) while others plateau, even when they have similar
capabilities initially.
## Key Findings
1. **Different Improvement Trajectories**: The researchers found that Qwen-2.5-3B dramatically
improved through RL training on mathematical reasoning tasks (reaching ~60% accuracy), while
Llama-3.2-3B showed limited gains (only reaching ~30%).
2. **Four Critical Cognitive Behaviors**: The difference in improvement stems from the presence
of four key reasoning behaviors in the base models:
- **Verification**: Systematically checking results for errors
- **Backtracking**: Abandoning failing approaches to try alternatives
- **Subgoal Setting**: Breaking complex problems into manageable steps
- **Backward Chaining**: Working backward from desired outcomes
3. **Behavior Over Correctness**: Models primed with examples containing these behaviors
improved significantly, even when the examples contained incorrect solutions. This suggests
the reasoning process matters more than solution accuracy.
4. **Targeted Data Curation**: By filtering pretraining data to amplify these reasoning
behaviors, the researchers enabled Llama to match Qwen's self-improvement trajectory.
## Methodology
The researchers used the Countdown game (a mathematical puzzle) as their primary testbed,
comparing Qwen-2.5-3B and Llama-3.2-3B models. They:
1. Observed the natural emergence of cognitive behaviors during RL training
2. Created synthetic priming datasets emphasizing different reasoning behaviors
3. Experimented with various behavioral combinations to isolate their impact
4. Curated specialized pretraining data from OpenWebMath to amplify reasoning behaviors
## Implications
This study establishes a fundamental relationship between initial reasoning behaviors and
capacity for improvement. It explains why some models effectively use additional computation
while others plateau. Understanding these dynamics may be key to developing AI systems that can
meaningfully improve their problem-solving abilities through extended reasoning.
How would the learnings from this paper be used when designing system prompts for llm chat assistants using the direct foundation model API (ex: openai, anthropic)
# Applying Research Insights to System Prompt Design
When designing system prompts for LLM chat assistants using direct foundation model APIs (like
OpenAI or Anthropic), you can leverage the key findings from this research in several practical
ways:
## 1. Encourage Verification Behaviors
Design prompts that explicitly instruct the model to verify its work:
- "Always double-check your calculations and reasoning before providing a final answer"
- "After reaching a conclusion, verify it by testing it against the original constraints"
- "For each step in your reasoning, validate that it follows logically from previous steps"
## 2. Promote Backtracking
Include instructions that allow the model to abandon unproductive paths:
- "If you realize an approach isn't working, explicitly acknowledge this and try a different
method"
- "When you encounter a contradiction or dead end, step back and reconsider alternative
approaches"
- "Don't be afraid to start over if your initial strategy seems suboptimal"
## 3. Facilitate Subgoal Setting
Guide the model to break down complex problems:
- "For complex questions, break them down into smaller, manageable subproblems"
- "Identify the key components needed to solve the problem before attempting a solution"
- "Create a step-by-step plan before diving into detailed reasoning"
## 4. Incorporate Backward Chaining
Encourage working backward from desired outcomes:
- "Consider what the final state would look like and work backward to the current state"
- "For goal-oriented problems, analyze what would need to be true for the goal to be achieved"
- "When appropriate, start with the target solution and identify the steps needed to reach it"
## 5. Implement Reasoning Templates
Provide structured templates that incorporate these behaviors:
```
1. Understand: Restate the problem and identify key constraints
2. Plan: Break the problem into subgoals
3. Execute: Work through each subgoal methodically
4. Verify: Check each intermediate result
5. Reflect: If issues are found, backtrack and try alternative approaches
6. Conclude: Verify the final answer against all original constraints
```
## 6. Use Few-Shot Examples
Include examples that demonstrate these cognitive behaviors:
- Show reasoning processes that include verification steps
- Demonstrate backtracking when an initial approach fails
- Illustrate how to break complex problems into subgoals
## 7. Balance Process and Brevity
Since the research shows that the reasoning process is more important than brevity:
- "Take your time to thoroughly reason through problems before providing answers"
- "Show your complete thought process, including verification steps"
- "Quality of reasoning is more important than conciseness for complex problems"
By incorporating these elements into your system prompts, you can potentially unlock better
reasoning capabilities in foundation models, especially on complex tasks that require
systematic problem-solving approaches.
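To make the suggestions above concrete, here's a small sketch that assembles a system prompt from one instruction per behavior. The instruction strings are quoted from the response above; the dictionary, the function name `build_system_prompt`, and the numbering format are assumptions of mine. It only builds a string and does not call any provider API.

```python
# One instruction per cognitive behavior, quoted from the suggestions above.
BEHAVIOR_INSTRUCTIONS = {
    "verification": "Always double-check your calculations and reasoning "
                    "before providing a final answer",
    "backtracking": "If you realize an approach isn't working, explicitly "
                    "acknowledge this and try a different method",
    "subgoal_setting": "For complex questions, break them down into smaller, "
                       "manageable subproblems",
    "backward_chaining": "Consider what the final state would look like and "
                         "work backward to the current state",
}


def build_system_prompt(behaviors=None) -> str:
    """Build a numbered system prompt from the selected behaviors (default: all four)."""
    selected = list(BEHAVIOR_INSTRUCTIONS) if behaviors is None else behaviors
    lines = ["When solving problems for me:"]
    for number, name in enumerate(selected, start=1):
        # An unknown behavior name raises KeyError, which surfaces typos early.
        lines.append(f"{number}. {BEHAVIOR_INSTRUCTIONS[name]}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_system_prompt(["verification", "backtracking"]))
```

Keeping the behaviors in a dictionary makes it easy to try different combinations, loosely mirroring how the paper experimented with subsets of behaviors.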
I then explored what a prompt incorporating this research would look like.
Here's what I got.
When solving problems for me:
1. Break down complex questions into smaller, manageable subproblems before attempting a
solution.
2. Show your complete reasoning process, including:
- Restating the problem and identifying constraints
- Planning your approach with clear subgoals
- Executing each step methodically
- Verifying intermediate results
3. If you encounter a dead end or contradiction:
- Explicitly acknowledge it
- Step back and reconsider alternative approaches
- Don't hesitate to start over if your initial strategy seems suboptimal
4. For appropriate problems, consider what the desired outcome would look like and work
backward.
5. Always double-check your calculations, facts, and reasoning before providing your final
answer.
6. Verify your final answer against all original constraints and requirements.
7. Quality of reasoning is more important than conciseness when dealing with complex topics.
For my current request: [INSERT TASK HERE]
Now I could use this prompt to make it even better.
The following exploration was with ChatGPT 4.5:
You are an expert prompt engineer and your task is to help me with some meta prompting to
improve a prompt.
When solving problems for me:
1. Break down complex questions into smaller, manageable subproblems before attempting a
solution.
2. Show your complete reasoning process, including:
- Restating the problem and identifying constraints
- Planning your approach with clear subgoals
- Executing each step methodically
- Verifying intermediate results
3. If you encounter a dead end or contradiction:
- Explicitly acknowledge it
- Step back and reconsider alternative approaches
- Don't hesitate to start over if your initial strategy seems suboptimal
4. For appropriate problems, consider what the desired outcome would look like and work
backward.
5. Always double-check your calculations, facts, and reasoning before providing your final
answer.
6. Verify your final answer against all original constraints and requirements.
7. Quality of reasoning is more important than conciseness when dealing with complex topics.
For my current request: follow the instructions in the <task> tags.
<task>
Take all the information contained within the <research> tags, summarize it, and come up with an amazing prompt to use which will begin much like the very beginning of this prompt. As a matter of fact, almost all the text outside of this task tag was developed as a prompt to use based on this research. You are now going to create an optimal prompt.
<research>
[THE HTML VERSION OF THE arXiv PAPER (https://arxiv.org/html/2503.01307v1)]
</research>
</task>
And here is the prompt you will be improving. It is contained within the prompt tag below.
<prompt>
When solving problems for me:
1. Break down complex questions into smaller, manageable subproblems before attempting a
solution.
2. Show your complete reasoning process, including:
- Restating the problem and identifying constraints
- Planning your approach with clear subgoals
- Executing each step methodically
- Verifying intermediate results
3. If you encounter a dead end or contradiction:
- Explicitly acknowledge it
- Step back and reconsider alternative approaches
- Don't hesitate to start over if your initial strategy seems suboptimal
4. For appropriate problems, consider what the desired outcome would look like and work
backward.
5. Always double-check your calculations, facts, and reasoning before providing your final
answer.
6. Verify your final answer against all original constraints and requirements.
7. Quality of reasoning is more important than conciseness when dealing with complex topics.
For my current request: follow the instructions in the <request> tags.</prompt>
Go ahead and improve this prompt.
GPT-4.5 created an improved result which I was able to use for another round of improvement.
Carefully review the following prompt based on the research I provided you. It's in prompt
tags.
<prompt>
**When solving problems for me:**
1. **Break Down the Problem:** Decompose complex questions into smaller, manageable subproblems
before attempting a solution.
2. **Detail Your Reasoning Process:** Provide a complete, transparent chain-of-thought that
includes:
- **Restating the Problem:** Summarize the task in your own words and list all constraints.
- **Planning:** Outline your approach by defining clear subgoals.
- **Step-by-Step Execution:** Methodically work through each subproblem.
- **Verification:** Systematically check and confirm intermediate results using
error-checking methods.
3. **Addressing Dead Ends:** If you hit a contradiction or an unproductive path:
- Explicitly acknowledge the issue.
- Backtrack to identify and correct mistakes.
- Reconsider and try alternative approaches—don't hesitate to start over if needed.
4. **Work Backwards When Appropriate:** If the desired outcome is clear, analyze the final goal
first and work backward to outline the necessary steps.
5. **Prioritize Thoroughness Over Brevity:** Ensure that your reasoning is comprehensive and
detailed—even if that means longer explanations. Quality of thought is more important than
conciseness.
6. **Adopt Key Cognitive Behaviors:** Incorporate these self-improving strategies into your
process:
- **Verification:** Continuously check each step for accuracy.
- **Backtracking:** Retrace your steps and revise your approach when encountering obstacles.
- **Subgoal Setting:** Break down complex tasks into smaller, actionable parts.
- **Backward Chaining:** Start from the desired outcome and work back to the initial
problem.
7. **Self-Reflection and Iteration:** Reflect on your reasoning process as you progress. Learn
from any mistakes or dead ends, and use that insight to improve your final answer.
For my current request: follow the instructions in the <request> tags.
</prompt>
Make it even better.
And I ended up with the final prompt to use.
**When solving problems for me:**
1. **Decompose and Understand:**
- Break complex questions into smaller, manageable subproblems.
- Restate the problem in your own words, listing all constraints and objectives clearly.
2. **Plan Your Approach:**
- Outline your strategy by setting clear, logical subgoals.
- Identify possible methods for tackling each subproblem before diving in.
3. **Detailed, Transparent Reasoning:**
- Provide a complete, step-by-step chain-of-thought that includes:
- **Restatement and Clarification:** Rephrase the problem and articulate every constraint.
- **Subgoal Breakdown:** Enumerate intermediate steps and milestones.
- **Methodical Execution:** Solve each subproblem systematically.
- **Verification:** Rigorously check and validate intermediate results to ensure accuracy.
4. **Adopt Key Cognitive Strategies:**
- **Backtracking:** If you encounter a dead end or contradiction, clearly acknowledge it,
retrace your steps, and revise your approach.
- **Backward Chaining:** When the final outcome is known, work backward to identify the
necessary steps to reach it.
- **Subgoal Setting:** Continuously break down larger tasks into actionable, smaller parts.
- **Verification:** Consistently double-check each step to confirm that your reasoning
remains sound.
5. **Iterative Self-Reflection:**
- Regularly review your reasoning process to catch and correct mistakes.
- Reflect on dead ends or corrections to refine your approach and consider alternative
strategies if needed.
- Don't hesitate to restart or reframe your strategy if your initial path proves suboptimal.
6. **Prioritize Thoroughness and Quality:**
- Ensure your explanation is comprehensive and detailed—the quality of reasoning is
paramount.
- Double-check all calculations, facts, and logic against the original constraints and
requirements.
- Conclude with a summary that reaffirms how your solution meets the problem's objectives.
For my current request: [INSERT TASK HERE]
I used both Claude 3.7 Sonnet and ChatGPT 4.5 to evaluate all three versions of the prompt.
When solving problems for me:
1. Break down complex questions into smaller, manageable subproblems before attempting a
solution.
2. Show your complete reasoning process, including:
- Restating the problem and identifying constraints
- Planning your approach with clear subgoals
- Executing each step methodically
- Verifying intermediate results
3. If you encounter a dead end or contradiction:
- Explicitly acknowledge it
- Step back and reconsider alternative approaches
- Don't hesitate to start over if your initial strategy seems suboptimal
4. For appropriate problems, consider what the desired outcome would look like and work
backward.
5. Always double-check your calculations, facts, and reasoning before providing your final
answer.
6. Verify your final answer against all original constraints and requirements.
7. Quality of reasoning is more important than conciseness when dealing with complex topics.
For my current request: execute the task within the task tags.
<task>
You are an expert prompt engineer.
Analyze these three prompts, <prompt_one>, <prompt_two>, and <prompt_three>. Using your own
world class prompt scoring system, give each prompt a rating out of 100 and concisely describe
why you gave it that rating.
<prompt_one> [INITIAL PROMPT USED ABOVE IN THIS EVALUATION] </prompt_one>
<prompt_two> [IMPROVED VERSION] </prompt_two>
<prompt_three> [FINAL VERSION] </prompt_three>
</task>
And while evaluating with GPT-4.5, I swapped the values of <prompt_two> and <prompt_three>.
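The tag wrapping and the swap can be made mechanical. This is only a sketch of how I'd script it; the helper names `wrap` and `build_eval_block` and the `swap_last_two` flag are hypothetical, and the short strings stand in for the full prompt texts.

```python
def wrap(tag: str, body: str) -> str:
    """Wrap a prompt body in an XML-style tag, matching the layout used above."""
    return f"<{tag}> {body} </{tag}>"


def build_eval_block(initial: str, improved: str, final: str,
                     swap_last_two: bool = False) -> str:
    """Build the three tagged prompts, optionally swapping the second and third."""
    second, third = (final, improved) if swap_last_two else (improved, final)
    return "\n".join([
        wrap("prompt_one", initial),
        wrap("prompt_two", second),
        wrap("prompt_three", third),
    ])


if __name__ == "__main__":
    # Placeholder bodies; in practice these are the three full prompts.
    print(build_eval_block("INITIAL", "IMPROVED", "FINAL", swap_last_two=True))
```

If you swap positions, remember to map the scores back to the right prompt before comparing them.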
| Prompt | Eval 1 (Sonnet 3.7) | Eval 2 (GPT-4.5) | Eval 3 (GPT-4.5) | Average |
| --------------- | ------------------- | ---------------- | ---------------- | -------- |
| Initial Prompt | 78 / 100 | 85 / 100 | 80 / 100 | 81 / 100 |
| Improved Prompt | 92 / 100 | 97 / 100 | 90 / 100 | 93 / 100 |
| Final Prompt | 95 / 100 | 100 / 100 | 95 / 100 | 97 / 100 |
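The Average column is the mean of the three evaluations, rounded to the nearest whole number. A quick sketch to reproduce it from the scores in the table (the variable names are my own):

```python
# Scores copied from the table above, in order: Eval 1, Eval 2, Eval 3.
scores = {
    "Initial Prompt": [78, 85, 80],
    "Improved Prompt": [92, 97, 90],
    "Final Prompt": [95, 100, 95],
}

# Mean of the three evaluations, rounded to the nearest integer.
averages = {name: round(sum(vals) / len(vals)) for name, vals in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg} / 100")
```

The final prompt's exact mean is about 96.7, which rounds to the 97 shown in the table.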
Published March 8th, 2025 at 3:19 pm