Context Engineering: Beyond Prompt Engineering

What is Context Engineering?

Context engineering extends prompt engineering from writing a single instruction to managing everything the model sees. Andrej Karpathy describes it as "the delicate art and science of filling the context window with just the right information for the next step."

Key Definitions:

From Manus Team:

Context engineering allows us to ship improvements in hours instead of weeks, keeping our product orthogonal to the underlying models: If model progress is the rising tide, we want to be the boat, not the pillar stuck to the seabed.

From LangChain:

Context engineering is building dynamic systems to provide the right information and tools in the right format such that the LLM can plausibly accomplish the task.

From Tobi Lütke (Shopify CEO):

The art of providing all the context for the task to be plausibly solvable by the LLM.

Context Engineering vs Prompt Engineering

Prompt Engineering  │  Context Engineering
       ↓            │            ↓                      
 "What you say"     │  "Everything else the model sees"
(Single instruction)│    (Examples, memory, retrieval,
                    │     tools, state, control flow)

Core Components of Context Engineering

1. System Architecture Design

  • Dynamic context assembly from multiple sources
  • Context drawn from the developer, the user, previous interactions, tool calls, and external data
  • Real-time context adaptation based on task requirements

2. Information Management

  • Right Information: Ensuring LLMs have necessary context (garbage in, garbage out)
  • Right Tools: Providing appropriate tools for information lookup and action taking
  • Right Format: Optimizing how information is presented to LLMs

3. Context Optimization Principles

  • Token Budget Management: Optimizing every token for cost and performance
  • KV-Cache Optimization: Maximizing cache hit rates for better latency and cost
  • Context Pruning: Removing irrelevant information while preserving essential context
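
The budget and pruning principles above can be sketched as follows. This is a minimal illustration under stated assumptions, not a production implementation: `estimate_tokens` approximates tokens as whitespace-separated words (a real system would use the model's tokenizer), and `ContextItem`, its `priority` score, and its `required` flag are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    text: str
    priority: int           # higher = more relevant (hypothetical scoring)
    required: bool = False  # required items are never pruned


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def prune_to_budget(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Keep required items, then add optional ones by priority while they fit."""
    keep = {i for i, item in enumerate(items) if item.required}
    used = sum(estimate_tokens(items[i].text) for i in keep)

    optional = sorted(
        (i for i in range(len(items)) if i not in keep),
        key=lambda i: items[i].priority,
        reverse=True,
    )
    for i in optional:
        cost = estimate_tokens(items[i].text)
        if used + cost <= budget:
            keep.add(i)
            used += cost

    # Return survivors in their original order so the context still reads naturally.
    return [item for i, item in enumerate(items) if i in keep]
```

Required items are counted first, so a budget smaller than the required content is exceeded rather than silently dropping essential context; whether that is the right trade-off depends on the application.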

Context Engineering Templates and Patterns

Template 1: KV-Cache Optimization Pattern

yaml
# Stable Prefix Template
system_prompt:
  role: "You are an expert assistant"
  guidelines: |
    - Keep consistent formatting
    - Avoid timestamps in prefix
    - Use deterministic serialization
  
context_management:
  - append_only: true
  - stable_prefix: true
  - cache_breakpoints: ["system_end", "tools_end"]
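
One way the stable-prefix rules above might look in code is sketched below. The class and method names (`AppendOnlyContext`, `prefix_fingerprint`) are hypothetical and introduced only for illustration; the sketch shows deterministic serialization and append-only growth, and uses a hash of the prefix as a cheap check that nothing upstream has changed.

```python
import hashlib
import json


def serialize(obj) -> str:
    # sort_keys and fixed separators make the output independent of
    # dict insertion order, so identical data yields identical bytes.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


class AppendOnlyContext:
    def __init__(self, system_prompt: str, tools: list[dict]):
        # The prefix (system prompt + tool definitions) is built once
        # and never edited afterwards.
        self._segments = [system_prompt, serialize(tools)]

    def append(self, event: dict) -> None:
        # Volatile values such as timestamps belong here, not in the prefix.
        self._segments.append(serialize(event))

    def render(self) -> str:
        return "\n".join(self._segments)

    def prefix_fingerprint(self, n_segments: int = 2) -> str:
        # Hash of the first n segments; it should stay constant across turns.
        prefix = "\n".join(self._segments[:n_segments])
        return hashlib.sha256(prefix.encode("utf-8")).hexdigest()
```

Logging the fingerprint on every request makes accidental prefix changes visible before they show up as a drop in cache hit rate.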

Template 2: Tool Masking Pattern

python
# Instead of removing tools, mask them: the full tool list stays defined,
# and only the subset the model may call changes with the agent's state.
class ContextAwareAgent:
    def __init__(self):
        self.all_tools = ["browser_search", "browser_click", "shell_run", "file_write"]

    def get_available_tools(self, context_state):
        """Return the tool names the model may call in the given state."""
        if context_state == "user_input":
            return []  # Must reply, not use tools
        if context_state == "web_research":
            return [tool for tool in self.all_tools if tool.startswith("browser_")]
        if context_state == "file_operations":
            return [tool for tool in self.all_tools if tool.startswith("file_")]
        # Assumption: in any other state, every tool stays available.
        return list(self.all_tools)
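
A variant of the same idea, sketched under assumptions: rather than returning a shorter list, the function below returns an allow/deny flag for every tool, so the set of tool definitions sent to the model never changes. The `STATE_PREFIXES` mapping and the `shell` state are hypothetical, and how the mask is enforced (for example, by constraining decoding) depends on the serving stack and is not shown.

```python
ALL_TOOLS = ["browser_search", "browser_click", "shell_run", "file_write"]

# Hypothetical mapping from agent state to allowed tool-name prefixes.
STATE_PREFIXES = {
    "user_input": (),              # must reply, no tool calls allowed
    "web_research": ("browser_",),
    "file_operations": ("file_",),
    "shell": ("shell_",),
}


def tool_mask(state: str) -> dict[str, bool]:
    """Return {tool_name: allowed}; the tool list itself is never modified."""
    prefixes = STATE_PREFIXES.get(state)
    if prefixes is None:
        raise ValueError(f"unknown state: {state!r}")
    # str.startswith accepts a tuple of prefixes; an empty tuple matches nothing.
    return {name: name.startswith(prefixes) for name in ALL_TOOLS}
```

Consistent name prefixes such as `browser_` and `file_` are what make this kind of group-level masking cheap to express.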

Template 3: File System as Context Pattern

yaml
# Treat file system as unlimited context
context_strategy:
  primary_context: "working_memory"  # Limited context window
  extended_context: "file_system"   # Unlimited persistent storage
  
compression_rules:
  - web_content: "drop_content_keep_url"
  - documents: "drop_content_keep_path"
  - observations: "summarize_and_store"
  
restoration_strategy:
  - restorable: true
  - on_demand_loading: true
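
The "drop content, keep a pointer" rules above can be sketched as follows. `FileBackedContext` and its method names are hypothetical, introduced for illustration only; the point is that each compression step leaves behind a path or URL from which the full content can be restored on demand.

```python
from pathlib import Path


class FileBackedContext:
    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def compress_document(self, name: str, content: str) -> dict:
        """Write the full content to disk and return a small stub for the context window."""
        path = self.root / name
        path.write_text(content, encoding="utf-8")
        return {"type": "document", "path": str(path), "chars": len(content)}

    @staticmethod
    def compress_web_content(url: str) -> dict:
        """Keep only the URL; the page can be fetched again if it is needed."""
        return {"type": "web_content", "url": url}

    def restore(self, stub: dict) -> str:
        """Load a document's full content back from its stub."""
        if stub["type"] != "document":
            raise ValueError("only document stubs can be restored from disk")
        return Path(stub["path"]).read_text(encoding="utf-8")
```

Only the stub goes into the model's context; the agent calls `restore` when a later step actually needs the content.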

Template 4: Attention Manipulation Pattern

markdown
# Todo List Recitation Pattern
## Current Task: [TASK_NAME]

### Progress Tracker (todo.md)
- [x] Step 1: Initial analysis completed
- [x] Step 2: Data collection finished  
- [ ] Step 3: Processing data (IN PROGRESS)
- [ ] Step 4: Generate report
- [ ] Step 5: Review and finalize

### Key Objectives (Recited)
1. Primary goal: [MAIN_OBJECTIVE]
2. Success criteria: [CRITERIA]
3. Current focus: [CURRENT_STEP]
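
A minimal sketch of how the recitation template above might be generated and appended on each turn. The function names `render_todo` and `recite` are hypothetical; the sketch assumes the first unfinished step is the one in progress.

```python
def render_todo(task_name: str, steps: list[tuple[str, bool]]) -> str:
    """Render (description, done) pairs as a markdown checklist."""
    lines = [f"## Current Task: {task_name}", "", "### Progress Tracker (todo.md)"]
    current_marked = False
    for description, done in steps:
        if done:
            lines.append(f"- [x] {description}")
        elif not current_marked:
            # Assumption: the first unfinished step is the one being worked on.
            lines.append(f"- [ ] {description} (IN PROGRESS)")
            current_marked = True
        else:
            lines.append(f"- [ ] {description}")
    return "\n".join(lines)


def recite(context: list[str], task_name: str, steps: list[tuple[str, bool]]) -> list[str]:
    """Return a new context with a fresh todo rendering appended at the end."""
    return context + [render_todo(task_name, steps)]
```

Appending the list at the end, rather than editing an earlier copy, keeps the objectives in the most recent part of the context without rewriting anything before it.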

Template 5: Error Preservation Pattern

python
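# Keep errors in context so the model can see what failed and adapt
class ErrorAwareContext:
    def __init__(self):
        self.preserve_errors = True
        self.error_history = []
        # Assumed placeholder: the original does not show how state is tracked.
        self.current_state = "idle"

    def get_current_state(self):
        return self.current_state

    def handle_action_result(self, action, result):
        # `result` is assumed to expose is_error, error, timestamp and success_message.
        if result.is_error:
            if self.preserve_errors:
                # Don't clean up - keep the failed action and its error for learning
                error_context = {
                    "action": action,
                    "error": result.error,
                    "timestamp": result.timestamp,
                    "context_state": self.get_current_state()
                }
                self.error_history.append(error_context)
            return f"Action failed: {result.error}\nPrevious context preserved for learning."
        return result.success_message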

Template 6: Cognitive Tools Pattern

yaml
# Structured reasoning tools as function calls
cognitive_tools:
  understanding:
    name: "understand_problem"
    description: "Break down and comprehend the core problem"
    template: |
      1. Identify main concepts
      2. Extract relevant information
      3. Highlight key constraints
      4. Map to known patterns
  
  reasoning:
    name: "apply_reasoning"
    description: "Apply logical reasoning steps"
    template: |
      1. Generate hypotheses
      2. Test against constraints
      3. Eliminate invalid options
      4. Verify solution path
  
  verification:
    name: "verify_solution"
    description: "Check and validate the solution"
    template: |
      1. Review solution steps
      2. Check against original problem
      3. Identify potential issues
      4. Confirm correctness
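
As a rough sketch, the cognitive tools above could be exposed as function-call definitions like this. The schema layout is a generic JSON-Schema-style shape similar to what many function-calling APIs accept; it is not tied to a specific provider, and the single `problem` parameter is an assumption made for illustration.

```python
COGNITIVE_TOOLS = {
    "understand_problem": {
        "description": "Break down and comprehend the core problem",
        "steps": [
            "Identify main concepts",
            "Extract relevant information",
            "Highlight key constraints",
            "Map to known patterns",
        ],
    },
    "verify_solution": {
        "description": "Check and validate the solution",
        "steps": [
            "Review solution steps",
            "Check against original problem",
            "Identify potential issues",
            "Confirm correctness",
        ],
    },
}


def to_function_schema(name: str) -> dict:
    """Build a generic function-call definition for one cognitive tool."""
    spec = COGNITIVE_TOOLS[name]
    return {
        "name": name,
        "description": spec["description"],
        "parameters": {
            "type": "object",
            "properties": {
                "problem": {"type": "string", "description": "Problem statement to work on"}
            },
            "required": ["problem"],
        },
    }


def render_tool_prompt(name: str, problem: str) -> str:
    """Expand a tool call into the step-by-step prompt the tool stands for."""
    spec = COGNITIVE_TOOLS[name]
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(spec["steps"], start=1))
    return f"{spec['description']}.\n\nProblem:\n{problem}\n\nFollow these steps:\n{numbered}"
```

When the model calls `understand_problem`, the host would run `render_tool_prompt` as a separate model call and return its output as the tool result.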

Template 7: Multi-Agent Context Orchestration

yaml
# Context engineering for multi-agent systems
agent_context_template:
  search_planner:
    system_prompt: |
      You are a search planning specialist.
      Generate comprehensive search strategies.
    
    context_components:
      - current_datetime: "{{current_time}}"
      - user_query: "{{delimited_query}}"
      - output_format: "structured_json"
      - examples: "{{few_shot_examples}}"
      
  researcher:
    system_prompt: |
      You are a research execution specialist.
      Execute search plans and gather information.
    
    context_components:
      - search_plan: "{{from_planner}}"
      - available_tools: ["web_search", "document_retrieval"]
      - output_format: "structured_findings"
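
The `{{...}}` placeholders in the template above imply a fill step before each agent runs. A minimal sketch follows; the `fill` and `delimit` helpers, the `<query>` tag, and the exact prompt wording are assumptions for illustration, not part of the original template.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill(template: str, values: dict[str, str]) -> str:
    """Replace {{name}} placeholders; fail loudly if a value is missing."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder {key!r}")
        return values[key]
    return PLACEHOLDER.sub(substitute, template)


def delimit(query: str) -> str:
    # Wrap user text in explicit tags so it cannot be mistaken for instructions.
    return f"<query>\n{query}\n</query>"


PLANNER_TEMPLATE = (
    "You are a search planning specialist.\n"
    "Current time: {{current_time}}\n"
    "User query: {{delimited_query}}\n"
    "Respond with structured JSON."
)

RESEARCHER_TEMPLATE = (
    "You are a research execution specialist.\n"
    "Search plan: {{from_planner}}\n"
    "Available tools: web_search, document_retrieval"
)


def build_researcher_context(planner_output: str) -> str:
    """Hand the planner's output to the researcher as its search plan."""
    return fill(RESEARCHER_TEMPLATE, {"from_planner": planner_output})
```

Raising on a missing placeholder is a deliberate choice here: an unfilled `{{from_planner}}` reaching the researcher would otherwise fail silently.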

Template 8: Dynamic Context Assembly

python
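# Dynamic context engineering system
class ContextEngineer:
    def __init__(self):
        self.context_sources = {
            "system": self.get_system_context,
            "user": self.get_user_context,
            "memory": self.get_memory_context,
            "tools": self.get_tool_context,
            "retrieval": self.get_retrieval_context
        }

    # Placeholder getters: the template references these methods without
    # defining them. Override each in a subclass; each should return a string.
    def get_system_context(self, task):
        raise NotImplementedError

    def get_user_context(self, user_input):
        raise NotImplementedError

    def get_memory_context(self, state):
        raise NotImplementedError

    def get_tool_context(self, tool_requirements):
        raise NotImplementedError

    def get_retrieval_context(self, query):
        raise NotImplementedError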
    
    def engineer_context(self, task, user_input, state):
        context_parts = []
        
        # Always include system context
        context_parts.append(self.context_sources["system"](task))
        
        # Add user context with proper formatting
        context_parts.append(f"<user_input>\n{user_input}\n</user_input>")
        
        # Conditionally add other context based on task requirements
        if task.requires_memory:
            context_parts.append(self.context_sources["memory"](state))
        
        if task.requires_tools:
            context_parts.append(self.context_sources["tools"](task.tool_requirements))
        
        if task.requires_knowledge:
            context_parts.append(self.context_sources["retrieval"](user_input))
        
        return "\n\n".join(context_parts)

Best Practices for Context Engineering

1. Design Around KV-Cache

  • Keep prompt prefixes stable
  • Make context append-only
  • Mark cache breakpoints explicitly
  • Use consistent serialization

2. Information Architecture

  • Provide complete context (LLMs can't read minds)
  • Format information clearly and concisely
  • Structure inputs and outputs consistently
  • Use delimiters and clear formatting

3. Tool and Memory Management

  • Mask tools instead of removing them
  • Use file system as extended context
  • Implement restorable compression strategies
  • Maintain context persistence across interactions

4. Error and Learning Integration

  • Preserve errors for learning
  • Avoid over-cleaning context traces
  • Include failure examples in context
  • Enable error recovery patterns

5. Attention and Focus Management

  • Use recitation to maintain focus
  • Implement todo-list patterns
  • Avoid uniform context: add small, structured variation so the model does not lock onto one repeated pattern
  • Break repetitive patterns

Evaluation and Optimization

Key Metrics:

  • KV-cache hit rate: Often the most important metric for production agents, because it drives both latency and cost
  • Token efficiency: Cost per successful task completion
  • Context relevance: Information utility score
  • Task success rate: Overall system effectiveness
  • Latency: Time to first token and total response time
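
Several of the metrics above reduce to simple ratios over logged runs. The sketch below assumes a hypothetical `TaskRun` record and defines the hit rate as a token-weighted share of cached input tokens; other definitions (for example, per-request) are equally plausible.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    succeeded: bool
    input_tokens: int
    cached_input_tokens: int  # portion of input_tokens served from the KV-cache
    cost_usd: float


def kv_cache_hit_rate(runs: list[TaskRun]) -> float:
    """Share of all input tokens that were served from cache."""
    total = sum(r.input_tokens for r in runs)
    if total == 0:
        return 0.0
    return sum(r.cached_input_tokens for r in runs) / total


def task_success_rate(runs: list[TaskRun]) -> float:
    if not runs:
        return 0.0
    return sum(r.succeeded for r in runs) / len(runs)


def cost_per_success(runs: list[TaskRun]) -> float:
    """Total spend divided by successful runs; failed runs still count as cost."""
    successes = sum(r.succeeded for r in runs)
    if successes == 0:
        return float("inf")
    return sum(r.cost_usd for r in runs) / successes
```

Charging failed runs to the cost of each success keeps the token-efficiency figure honest: a cheaper configuration that fails more often does not automatically look better.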

Optimization Process:

  1. Measure baseline performance across all metrics
  2. Identify bottlenecks in context assembly and processing
  3. Apply targeted optimizations using templates above
  4. A/B test changes with proper evaluation frameworks
  5. Iterate based on data rather than intuition
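
For the A/B testing step, one common way to compare task success rates between two context variants is a two-proportion z-test. The sketch below is an assumption about how such a check could be done, not a method prescribed by the process above; it relies on the normal approximation, which is only reasonable for fairly large samples.

```python
import math


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in success rates B minus A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled success rate under the null hypothesis that A and B perform the same.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # All runs succeeded or all failed in both arms: no evidence of a difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A positive `z` means variant B succeeded more often than A; the p-value indicates how surprising that gap would be if the two variants were actually equivalent.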

Future Directions

Context engineering is evolving toward:

  • Automated context optimization using ML techniques
  • Context compression without information loss
  • Multi-modal context engineering for vision and audio
  • Context safety and security considerations
  • Real-time context adaptation based on performance feedback

The field represents a shift from simple prompting to sophisticated information architecture for AI systems, requiring both technical skill and creative problem-solving.

Released under the MIT License.