Context Engineering: Beyond Prompt Engineering
What is Context Engineering?
Context engineering extends prompt engineering from writing a single instruction to managing everything the model sees. Andrej Karpathy describes it as "the delicate art and science of filling the context window with just the right information for the next step."
Key Definitions:
From Manus Team:
Context engineering allows us to ship improvements in hours instead of weeks, keeping our product orthogonal to the underlying models: If model progress is the rising tide, we want to be the boat, not the pillar stuck to the seabed.
From LangChain:
Context engineering is building dynamic systems to provide the right information and tools in the right format such that the LLM can plausibly accomplish the task.
From Tobi Lutke (Shopify CEO):
The art of providing all the context for the task to be plausibly solvable by the LLM.
Context Engineering vs Prompt Engineering
Prompt Engineering    │ Context Engineering
        ↓             │         ↓
"What you say"        │ "Everything else the model sees"
(Single instruction)  │ (Examples, memory, retrieval,
                      │  tools, state, control flow)
Core Components of Context Engineering
1. System Architecture Design
- Dynamic context assembly from multiple sources
- Context from developer, user, previous interactions, tool calls, external data
- Real-time context adaptation based on task requirements
2. Information Management
- Right Information: Ensuring LLMs have necessary context (garbage in, garbage out)
- Right Tools: Providing appropriate tools for information lookup and action taking
- Right Format: Optimizing how information is presented to LLMs
3. Context Optimization Principles
- Token Budget Management: Optimizing every token for cost and performance
- KV-Cache Optimization: Maximizing cache hit rates for better latency and cost
- Context Pruning: Removing irrelevant information while preserving essential context
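The budget and pruning principles above can be sketched as a greedy selection. This is a minimal illustration under stated assumptions, not a production method: the `ContextItem` type, the relevance scores, and the whitespace-based `count_tokens` are all hypothetical, and a real system would count tokens with the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    text: str
    relevance: float  # hypothetical score in [0, 1], higher is more relevant
    essential: bool = False  # essential items are never pruned


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def prune_to_budget(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Keep essential items, then add the most relevant others that still fit."""
    kept = [item for item in items if item.essential]
    used = sum(count_tokens(item.text) for item in kept)
    optional = sorted(
        (item for item in items if not item.essential),
        key=lambda item: item.relevance,
        reverse=True,
    )
    for item in optional:
        cost = count_tokens(item.text)
        if used + cost <= budget:
            kept.append(item)
            used += cost
    # Restore the original order so the assembled context reads naturally.
    kept_ids = {id(item) for item in kept}
    return [item for item in items if id(item) in kept_ids]
```

Essential items are kept even if they alone exceed the budget, so a caller that needs a hard limit would have to check the total separately.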
Context Engineering Templates and Patterns
Template 1: KV-Cache Optimization Pattern
# Stable Prefix Template
system_prompt:
  role: "You are an expert assistant"
  guidelines: |
    - Keep consistent formatting
    - Avoid timestamps in prefix
    - Use deterministic serialization
context_management:
  - append_only: true
  - stable_prefix: true
  - cache_breakpoints: ["system_end", "tools_end"]
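To make "deterministic serialization" and "append-only" concrete, here is a small sketch. The helper names (`serialize_tools`, `build_prefix`, `prefix_fingerprint`, `append_turn`) and the tool dictionaries are assumptions for illustration; how a given provider actually caches prefixes is not shown here.

```python
import hashlib
import json

SYSTEM_PROMPT = "You are an expert assistant"  # no timestamps or per-request values


def serialize_tools(tools: list[dict]) -> str:
    # sort_keys plus fixed separators give byte-identical output for equal inputs;
    # sorting by name makes the result independent of registration order.
    ordered = sorted(tools, key=lambda tool: tool["name"])
    return json.dumps(ordered, sort_keys=True, separators=(",", ":"))


def build_prefix(tools: list[dict]) -> str:
    return SYSTEM_PROMPT + "\n" + serialize_tools(tools)


def prefix_fingerprint(prefix: str) -> str:
    # A hash of the prefix is a cheap way to detect accidental prefix changes.
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()


def append_turn(context: str, turn: str) -> str:
    # Append-only: earlier bytes are never rewritten, so a cached prefix stays valid.
    return context + "\n" + turn
```

Logging the fingerprint per request is one simple way to notice when a code change silently alters the prefix and invalidates the cache.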
Template 2: Tool Masking Pattern
# Instead of removing tools, mask them
class ContextAwareAgent:
    def __init__(self):
        # The full tool list stays fixed; only the allowed subset changes per state
        self.all_tools = ["browser_search", "browser_click", "shell_run", "file_write"]

    def get_available_tools(self, context_state):
        """Return the names of the tools the model may call in this state."""
        if context_state == "user_input":
            return []  # Must reply, not use tools
        elif context_state == "web_research":
            return [tool for tool in self.all_tools if tool.startswith("browser_")]
        elif context_state == "file_operations":
            return [tool for tool in self.all_tools if tool.startswith("file_")]
        # Unknown state: leave every tool unmasked (an assumption; adjust as needed)
        return list(self.all_tools)
Template 3: File System as Context Pattern
# Treat file system as unlimited context
context_strategy:
  primary_context: "working_memory"  # Limited context window
  extended_context: "file_system"    # Unlimited persistent storage
  compression_rules:
    - web_content: "drop_content_keep_url"
    - documents: "drop_content_keep_path"
    - observations: "summarize_and_store"
  restoration_strategy:
    - restorable: true
    - on_demand_loading: true
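The "drop content, keep path" rule can be sketched as follows. This is a minimal example, assuming a hypothetical `FileBackedContext` class and a `[stored: ...]` reference format invented for the illustration; it shows only that the compression is restorable, not how an agent decides when to restore.

```python
import tempfile
from pathlib import Path


class FileBackedContext:
    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def compress(self, name: str, content: str) -> str:
        """Store content on disk; return a short reference to keep in context."""
        path = self.root / name
        path.write_text(content, encoding="utf-8")
        return f"[stored: {path}]"

    def restore(self, reference: str) -> str:
        """Load the full content back on demand from a reference."""
        path = Path(reference.removeprefix("[stored: ").removesuffix("]"))
        return path.read_text(encoding="utf-8")


# Usage: the context keeps only the short reference, yet nothing is lost.
with tempfile.TemporaryDirectory() as tmp:
    store = FileBackedContext(Path(tmp) / "context")
    page = "long page body " * 1000
    reference = store.compress("page_1.txt", page)
    restored = store.restore(reference)
```

Because the reference points at the stored file, the full content can be dropped from the context window and reloaded later.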
Template 4: Attention Manipulation Pattern
# Todo List Recitation Pattern
## Current Task: [TASK_NAME]
### Progress Tracker (todo.md)
- [x] Step 1: Initial analysis completed
- [x] Step 2: Data collection finished
- [ ] Step 3: Processing data (IN PROGRESS)
- [ ] Step 4: Generate report
- [ ] Step 5: Review and finalize
### Key Objectives (Recited)
1. Primary goal: [MAIN_OBJECTIVE]
2. Success criteria: [CRITERIA]
3. Current focus: [CURRENT_STEP]
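One way to produce the tracker above programmatically is sketched below. The function names and the list-of-strings context are assumptions for the example; the point is that the plan is re-rendered and appended at the end of the context, where recent tokens sit, rather than edited in place.

```python
def render_todo(task_name: str, steps: list[tuple[str, bool]]) -> str:
    """Render the plan as a markdown checklist, marking the first open step."""
    lines = [f"## Current Task: {task_name}", "### Progress Tracker (todo.md)"]
    current_marked = False
    for number, (description, done) in enumerate(steps, start=1):
        box = "x" if done else " "
        suffix = ""
        if not done and not current_marked:
            suffix = " (IN PROGRESS)"
            current_marked = True
        lines.append(f"- [{box}] Step {number}: {description}{suffix}")
    return "\n".join(lines)


def recite(context: list[str], task_name: str, steps: list[tuple[str, bool]]) -> list[str]:
    # Appending (not editing in place) puts the plan in the most recent
    # part of the context and leaves earlier entries untouched.
    return context + [render_todo(task_name, steps)]
```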
Template 5: Error Preservation Pattern
# Keep errors in context for learning
class ErrorAwareContext:
    def __init__(self):
        self.preserve_errors = True
        self.error_history = []
        self.state = "idle"  # Placeholder; a real agent would track its own state

    def get_current_state(self):
        return self.state

    def handle_action_result(self, action, result):
        # `result` is assumed to expose is_error, error, timestamp, and success_message
        if result.is_error:
            # Don't clean up - keep for learning
            error_context = {
                "action": action,
                "error": result.error,
                "timestamp": result.timestamp,
                "context_state": self.get_current_state(),
            }
            self.error_history.append(error_context)
            return f"Action failed: {result.error}\nPrevious context preserved for learning."
        return result.success_message
Template 6: Cognitive Tools Pattern
# Structured reasoning tools as function calls
cognitive_tools:
  understanding:
    name: "understand_problem"
    description: "Break down and comprehend the core problem"
    template: |
      1. Identify main concepts
      2. Extract relevant information
      3. Highlight key constraints
      4. Map to known patterns
  reasoning:
    name: "apply_reasoning"
    description: "Apply logical reasoning steps"
    template: |
      1. Generate hypotheses
      2. Test against constraints
      3. Eliminate invalid options
      4. Verify solution path
  verification:
    name: "verify_solution"
    description: "Check and validate the solution"
    template: |
      1. Review solution steps
      2. Check against original problem
      3. Identify potential issues
      4. Confirm correctness
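As a rough illustration of how such a tool might be dispatched, the sketch below expands a tool name into a structured sub-prompt. The `COGNITIVE_TOOLS` table mirrors two of the templates above; `call_cognitive_tool` and the output layout are assumptions for the example, not a specific framework's function-calling API.

```python
COGNITIVE_TOOLS = {
    "understand_problem": [
        "Identify main concepts",
        "Extract relevant information",
        "Highlight key constraints",
        "Map to known patterns",
    ],
    "verify_solution": [
        "Review solution steps",
        "Check against original problem",
        "Identify potential issues",
        "Confirm correctness",
    ],
}


def call_cognitive_tool(name: str, problem: str) -> str:
    """Expand a cognitive tool call into a structured sub-prompt."""
    if name not in COGNITIVE_TOOLS:
        raise KeyError(f"Unknown cognitive tool: {name}")
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(COGNITIVE_TOOLS[name], start=1)
    )
    return f"Tool: {name}\nProblem: {problem}\nFollow these steps:\n{steps}"
```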
Template 7: Multi-Agent Context Orchestration
# Context engineering for multi-agent systems
agent_context_template:
  search_planner:
    system_prompt: |
      You are a search planning specialist.
      Generate comprehensive search strategies.
    context_components:
      - current_datetime: "{{current_time}}"
      - user_query: "{{delimited_query}}"
      - output_format: "structured_json"
      - examples: "{{few_shot_examples}}"
  researcher:
    system_prompt: |
      You are a research execution specialist.
      Execute search plans and gather information.
    context_components:
      - search_plan: "{{from_planner}}"
      - available_tools: ["web_search", "document_retrieval"]
      - output_format: "structured_findings"
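The `{{placeholder}}` slots in the template above need to be filled before each agent runs. A minimal sketch of that step follows; the `render` and `planner_context` helpers, the `<query>` delimiters, and the prompt wording are assumptions for illustration rather than part of any particular framework.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} with its value; fail loudly on missing names."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"No value supplied for placeholder '{key}'")
        return values[key]
    return PLACEHOLDER.sub(substitute, template)


def planner_context(user_query: str, current_time: str) -> str:
    template = (
        "You are a search planning specialist.\n"
        "Current time: {{current_time}}\n"
        "<query>\n{{delimited_query}}\n</query>\n"
        "Respond with structured JSON."
    )
    return render(template, {"current_time": current_time, "delimited_query": user_query})
```

Substitution is a single pass, so a user query that itself contains `{{...}}` is inserted verbatim and not expanded again. Note that a per-request timestamp changes the text from that point on, so placing it after the stable parts of the prompt keeps more of the prefix cacheable.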
Template 8: Dynamic Context Assembly
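# Dynamic context engineering system
class ContextEngineer:
    def __init__(self):
        self.context_sources = {
            "system": self.get_system_context,
            "user": self.get_user_context,
            "memory": self.get_memory_context,
            "tools": self.get_tool_context,
            "retrieval": self.get_retrieval_context,
        }

    # Placeholder sources (illustrative only): replace each with a real lookup
    def get_system_context(self, task):
        return "<system>\nYou are an expert assistant\n</system>"

    def get_user_context(self, user_input):
        return f"<user_input>\n{user_input}\n</user_input>"

    def get_memory_context(self, state):
        return f"<memory>\n{state}\n</memory>"

    def get_tool_context(self, tool_requirements):
        return f"<tools>\n{tool_requirements}\n</tools>"

    def get_retrieval_context(self, query):
        return f"<retrieval>\n{query}\n</retrieval>"

    def engineer_context(self, task, user_input, state):
        # `task` is assumed to expose requires_memory, requires_tools,
        # requires_knowledge, and tool_requirements
        context_parts = []
        # Always include system context
        context_parts.append(self.context_sources["system"](task))
        # Add user context with proper formatting
        context_parts.append(self.context_sources["user"](user_input))
        # Conditionally add other context based on task requirements
        if task.requires_memory:
            context_parts.append(self.context_sources["memory"](state))
        if task.requires_tools:
            context_parts.append(self.context_sources["tools"](task.tool_requirements))
        if task.requires_knowledge:
            context_parts.append(self.context_sources["retrieval"](user_input))
        return "\n\n".join(context_parts)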
Best Practices for Context Engineering
1. Design Around KV-Cache
- Keep prompt prefixes stable
- Make context append-only
- Mark cache breakpoints explicitly
- Use consistent serialization
2. Information Architecture
- Provide complete context (LLMs can't read minds)
- Format information clearly and concisely
- Structure inputs and outputs consistently
- Use delimiters and clear formatting
3. Tool and Memory Management
- Mask tools instead of removing them
- Use file system as extended context
- Implement restorable compression strategies
- Maintain context persistence across interactions
4. Error and Learning Integration
- Preserve errors for learning
- Avoid over-cleaning context traces
- Include failure examples in context
- Enable error recovery patterns
5. Attention and Focus Management
- Use recitation to maintain focus
- Implement todo-list patterns
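- Avoid uniform context: introduce small, controlled variation in phrasing and formatting
- Break repetitive action-observation patterns so the model does not simply imitate its earlier steps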
Evaluation and Optimization
Key Metrics:
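- KV-cache hit rate: Share of input tokens served from cache; the Manus team calls it the single most important metric for a production agent because it drives both latency and cost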
- Token efficiency: Cost per successful task completion
- Context relevance: How much of the supplied context is actually useful for the task at hand
- Task success rate: Overall system effectiveness
- Latency: Time to first token and total response time
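Several of these metrics can be computed directly from per-run logs. The sketch below assumes a hypothetical `RunRecord` shape; the field names (in particular `cached_tokens`) are illustrative, since providers report cached-token counts under different names. Context relevance and latency are left out because they need task-specific scoring and timing data.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    input_tokens: int
    cached_tokens: int  # input tokens served from the KV-cache
    output_tokens: int
    succeeded: bool


def kv_cache_hit_rate(runs: list[RunRecord]) -> float:
    total = sum(run.input_tokens for run in runs)
    return sum(run.cached_tokens for run in runs) / total if total else 0.0


def task_success_rate(runs: list[RunRecord]) -> float:
    return sum(run.succeeded for run in runs) / len(runs) if runs else 0.0


def tokens_per_success(runs: list[RunRecord]) -> float:
    """Total tokens spent (including failed runs) per successful task."""
    successes = sum(run.succeeded for run in runs)
    spent = sum(run.input_tokens + run.output_tokens for run in runs)
    return spent / successes if successes else float("inf")
```

Counting the tokens of failed runs in `tokens_per_success` is a deliberate choice here: it makes retries and dead ends show up as cost.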
Optimization Process:
- Measure baseline performance across all metrics
- Identify bottlenecks in context assembly and processing
- Apply targeted optimizations using templates above
- A/B test changes with proper evaluation frameworks
- Iterate based on data rather than intuition
Future Directions
Context engineering is evolving toward:
- Automated context optimization using ML techniques
- Context compression without information loss
- Multi-modal context engineering for vision and audio
- Context safety and security considerations
- Real-time context adaptation based on performance feedback
The field represents a shift from simple prompting to sophisticated information architecture for AI systems, requiring both technical skill and creative problem-solving.