# Tool Use Patterns: Building Reliable Agent-Tool Interfaces
Your agent called a tool and got back a 40-line JSON blob — raw API response, nested objects, error codes buried inside a status field. The model read it, picked a plausible-looking value, and continued. The value was wrong. Three steps later, the agent confidently wrote a report based on bad data.
The tool worked. The interface failed.
Tool use is the mechanism that turns a language model into an agent. Every capability your agent has — searching databases, writing files, calling APIs, querying services — arrives through a tool interface. If the interface is poorly designed, the model makes worse decisions even when the underlying service is functioning correctly. This guide covers five patterns for building tool interfaces that are precise, reliable, and production-ready.
Prerequisites: Familiarity with Python and the Claude API. For background on MCP as a tool transport layer, see Building Your First MCP Server.
## Why Tool Interface Design Matters
When an agent chooses and uses a tool, it makes two decisions:

- **Which tool to call**, driven by the tool's `name` and `description`
- **What arguments to pass**, driven by the tool's `input_schema`
Ambiguous descriptions lead to wrong tool selection. Loose schemas let the model pass malformed inputs. Unstructured results make the model guess at what happened. Most agent bugs don’t live in the reasoning — they live at the tool boundary.
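To make the tool-selection point concrete, here is the same hypothetical lookup tool described two ways; the names, fields, and descriptions are illustrative, not from any real catalog:

```python
# A description the model cannot act on: which data? when should it call this?
vague = {
    "name": "lookup",
    "description": "Looks up data.",
    "input_schema": {
        "type": "object",
        "properties": {"q": {"type": "string"}},
    },
}

# A description that tells the model what comes back and when to call it.
precise = {
    "name": "lookup_customer",
    "description": (
        "Look up a customer record by email address. "
        "Returns account status, plan, and signup date. "
        "Use this when the user asks about a specific customer's account."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "email": {"type": "string", "description": "Customer email address"}
        },
        "required": ["email"],
    },
}
```

Given both definitions, a model deciding between them has an unambiguous trigger ("when the user asks about a specific customer's account") instead of a guess.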
## Pattern 1: Schema-First Tool Design
Write the JSON schema before writing the implementation. A tight schema constrains model behavior at the input stage — before anything executes.
```python
import anthropic

client = anthropic.Anthropic()

# Tight schema: enum for category, bounded integer for max_results
tools = [
    {
        "name": "search_products",
        "description": (
            "Search the product catalog by keyword. "
            "Returns a list of matching products with IDs, names, and prices. "
            "Use this when the user wants to find or browse products."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Keywords to search for"
                },
                "category": {
                    "type": "string",
                    "enum": ["electronics", "clothing", "food", "home", "all"],
                    "description": "Product category to filter by. Use 'all' if unspecified."
                },
                "max_results": {
                    "type": "integer",
                    "minimum": 1,
                    "maximum": 20,
                    "description": "Number of results to return. Default: 5"
                }
            },
            "required": ["query", "category"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Find me electronics under $100"}],
)
```

Schema rules that reduce errors:
- Use `enum` for any field with a fixed set of valid values; the model cannot invent an invalid option
- Set `minimum`/`maximum` on numeric fields to prevent out-of-range inputs
- Mark fields `required` only when the tool genuinely cannot run without them
- Write descriptions from the model's perspective: "Use this when…" tells the model when to call the tool, not just what it does
**When to use:** Every tool definition. Schema design is not an optimization; it is the primary control surface over model behavior.
## Pattern 2: Structured Tool Results
Return typed, machine-readable results. Never return raw API responses or prose descriptions.
```python
import json
from dataclasses import dataclass, asdict
from typing import Any, Optional

@dataclass
class ToolResult:
    success: bool
    data: Optional[Any] = None
    error: Optional[str] = None

    def to_content(self) -> str:
        return json.dumps(asdict(self))

def search_products(
    query: str,
    category: str,
    max_results: int = 5,
) -> ToolResult:
    try:
        # Call your actual data source here
        raw_results = _query_database(query, category, limit=max_results)

        # Shape the response before returning it
        products = [
            {"id": r["product_id"], "name": r["title"], "price": r["price_usd"]}
            for r in raw_results
        ]
        return ToolResult(success=True, data={"products": products, "count": len(products)})

    except ConnectionError as e:
        return ToolResult(success=False, error=f"Database unavailable: {e}")
    except Exception as e:
        return ToolResult(success=False, error=f"Search failed: {type(e).__name__}: {e}")

def _query_database(query, category, limit):
    # Placeholder: replace with real implementation
    return []
```

The consistent `{success, data, error}` envelope means the model always knows where to look. It never has to interpret whether an empty list means "no results" or "error." For handling failures gracefully, see Agent Error Recovery Patterns.
**When to use:** Every tool implementation. Structure beats prose for machine consumption.
## Pattern 3: Parallel Tool Calls
Claude can request multiple tools in a single response. Process them in parallel rather than sequentially — it’s faster and often what the model intended.
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

TOOL_REGISTRY = {
    "search_products": search_products,
    # register other tools here
}

def dispatch_tool(name: str, inputs: dict) -> ToolResult:
    handler = TOOL_REGISTRY.get(name)
    if not handler:
        return ToolResult(success=False, error=f"Unknown tool: {name}")
    return handler(**inputs)

def process_tool_calls(response: anthropic.types.Message) -> list[dict]:
    """Execute all tool_use blocks from a model response, in parallel."""
    tool_uses = [
        block for block in response.content if block.type == "tool_use"
    ]
    if not tool_uses:
        return []

    def execute(tool_use):
        result = dispatch_tool(tool_use.name, tool_use.input)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use.id,
            "content": result.to_content(),
        }

    with ThreadPoolExecutor(max_workers=len(tool_uses)) as executor:
        futures = {executor.submit(execute, tu): tu for tu in tool_uses}
        results = []
        for future in as_completed(futures):
            results.append(future.result())

    return results
```

To continue the agent loop after tool execution:
```python
def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )

        if response.stop_reason == "end_turn":
            # Extract final text response
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return ""

        if response.stop_reason == "tool_use":
            tool_results = process_tool_calls(response)

            # Append assistant response and tool results to conversation
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        else:
            break

    return ""
```

For orchestrating multiple agents that each call tools, see Multi-Agent Patterns.
**When to use:** Any agent loop. Parallel execution cuts latency when the model requests multiple independent tools at once.
## Pattern 4: Safe Tool Call Wrapper
Never let tool exceptions reach the agent loop unhandled. A crashing tool should produce a structured error, not a Python traceback.
```python
import signal

def timeout_handler(signum, frame):
    raise TimeoutError("Tool execution timed out")

def safe_tool_call(
    name: str,
    inputs: dict,
    timeout_seconds: int = 30,
) -> ToolResult:
    """
    Execute a tool with a timeout, catching all exceptions.
    Always returns a ToolResult; never raises.
    """
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(timeout_seconds)
    try:
        return dispatch_tool(name, inputs)
    except TimeoutError:
        return ToolResult(
            success=False,
            error=f"Tool '{name}' timed out after {timeout_seconds}s"
        )
    except Exception as e:
        return ToolResult(
            success=False,
            error=f"Tool '{name}' raised {type(e).__name__}: {e}"
        )
    finally:
        signal.alarm(0)  # Cancel the alarm

# Use safe_tool_call in your executor:
def execute(tool_use):
    result = safe_tool_call(tool_use.name, tool_use.input)
    return {
        "type": "tool_result",
        "tool_use_id": tool_use.id,
        "content": result.to_content(),
    }
```

When a tool returns `success: false`, the model can decide whether to retry, try an alternative, or report the failure to the user. This keeps control in the model's hands instead of crashing the pipeline. For broader retry strategies, see Agent Error Recovery Patterns.
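One caveat worth knowing: `signal.SIGALRM` is Unix-only and fires only in the main thread, so a signal-based timeout cannot run inside the `ThreadPoolExecutor` from Pattern 3. A thread-safe alternative bounds the wait with `Future.result(timeout=...)` instead. This is a sketch: the `call_with_timeout` name is illustrative, and it returns a plain `{success, data, error}` dict for self-containment where a real version would return `ToolResult`:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# A small dedicated pool for bounding tool execution time.
_timeout_pool = ThreadPoolExecutor(max_workers=8)

def call_with_timeout(fn, timeout_seconds=30, **kwargs):
    """Run fn(**kwargs) on a worker thread; stop waiting after the timeout."""
    future = _timeout_pool.submit(fn, **kwargs)
    try:
        return {"success": True, "data": future.result(timeout=timeout_seconds), "error": None}
    except FutureTimeout:
        return {"success": False, "data": None,
                "error": f"timed out after {timeout_seconds}s"}
    except Exception as e:
        return {"success": False, "data": None,
                "error": f"{type(e).__name__}: {e}"}
```

Note that the worker thread keeps running after a timeout; the agent simply stops waiting for it. For tools that must actually be cancelled, run them in a subprocess instead.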
**When to use:** Replace bare `dispatch_tool` calls with `safe_tool_call` in any production agent.
## Pattern 5: Result Validation and Truncation
Validate tool results before returning them to the model. Two problems to catch: schema drift (the underlying API changed its response shape) and context overflow (the result is too large for the model’s window).
```python
MAX_TOOL_RESULT_CHARS = 8000  # ~2000 tokens; adjust to your model's context

def validate_result(result: ToolResult, expected_keys: list[str]) -> ToolResult:
    """Check that expected keys are present in the result data."""
    if not result.success or not isinstance(result.data, dict):
        return result
    missing = [k for k in expected_keys if k not in result.data]
    if missing:
        return ToolResult(
            success=False,
            error=f"Tool response missing expected fields: {missing}"
        )
    return result

def truncate_result(result: ToolResult) -> ToolResult:
    """Truncate oversized results to avoid context overflow."""
    content = result.to_content()
    if len(content) <= MAX_TOOL_RESULT_CHARS:
        return result

    truncated_data = {
        "truncated": True,
        "chars_omitted": len(content) - MAX_TOOL_RESULT_CHARS,
        "content": content[:MAX_TOOL_RESULT_CHARS],
    }
    return ToolResult(
        success=result.success,
        data=truncated_data,
        error="Result truncated: too large for context window",
    )

def safe_tool_call_validated(
    name: str,
    inputs: dict,
    expected_keys: list[str] | None = None,
) -> ToolResult:
    result = safe_tool_call(name, inputs)
    if expected_keys:
        result = validate_result(result, expected_keys)
    result = truncate_result(result)
    return result
```

Validation catches API contract violations before they propagate through the agent. For how to observe and debug these failures in production, see Debugging and Observability.
**When to use:** Tools that call external APIs (which can change), or tools that return variable-size payloads (search results, file contents, database rows).
## Common Mistakes

### Mistake 1: Returning Raw API Responses
The model receives a nested object with 30 fields, most irrelevant. It picks the wrong one.
**Fix:** Shape the response before returning it. Return only what the model needs to make its next decision.
### Mistake 2: Tools with Side Effects and No Confirmation
A tool that sends an email, deletes a record, or charges a card should not execute silently. The model may call it speculatively.
**Fix:** For irreversible actions, use a two-tool pattern: `plan_email` returns a preview, `send_email` actually sends it. The model must call both and gets a natural confirmation step.
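A minimal sketch of that two-tool pattern, using the `plan_email`/`send_email` names from above; the in-memory draft store and uuid confirmation token are illustrative assumptions, and a real implementation would persist drafts and expire tokens:

```python
import uuid

# Drafts staged by plan_email, keyed by confirmation token (illustrative).
_pending_emails: dict[str, dict] = {}

def plan_email(to: str, subject: str, body: str) -> dict:
    """Stage an email and return a preview plus a confirmation token."""
    token = str(uuid.uuid4())
    _pending_emails[token] = {"to": to, "subject": subject, "body": body}
    return {
        "success": True,
        "data": {
            "preview": {"to": to, "subject": subject, "body": body[:200]},
            "confirmation_token": token,
        },
        "error": None,
    }

def send_email(confirmation_token: str) -> dict:
    """Send only an email previously staged by plan_email."""
    draft = _pending_emails.pop(confirmation_token, None)
    if draft is None:
        return {"success": False, "data": None,
                "error": "Unknown or expired confirmation token. Call plan_email first."}
    # _deliver(draft)  # real delivery would happen here
    return {"success": True, "data": {"sent_to": draft["to"]}, "error": None}
```

Because `send_email` accepts only a token minted by `plan_email`, a speculative call without the planning step fails safely, and the preview gives a human (or the model itself) a checkpoint before anything irreversible happens.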
### Mistake 3: Overlapping Tool Responsibilities
Two tools that do similar things force the model to guess which to use. It will sometimes pick wrong.
**Fix:** Each tool should have a distinct, non-overlapping purpose. If you have `search_products` and `find_items`, merge them.
### Mistake 4: No Timeout on External Tools
A slow third-party API call blocks the entire agent loop indefinitely.
**Fix:** Always set a timeout (Pattern 4). For long-running operations, return a job ID and poll with a separate `check_job_status` tool.
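A sketch of the job-ID approach, assuming hypothetical `start_export` and `check_job_status` tools and an in-memory job table; a real implementation would persist jobs and run the work out of process:

```python
import threading
import time
import uuid

# Job table (illustrative; use a durable store in production).
_jobs: dict[str, dict] = {}

def _run_export(job_id: str) -> None:
    time.sleep(0.1)  # stand-in for slow work
    _jobs[job_id] = {"status": "done", "result": {"rows_exported": 1234}}

def start_export() -> dict:
    """Kick off a long-running export and return immediately with a job ID."""
    job_id = str(uuid.uuid4())
    _jobs[job_id] = {"status": "running", "result": None}
    threading.Thread(target=_run_export, args=(job_id,), daemon=True).start()
    return {"success": True, "data": {"job_id": job_id, "status": "running"}, "error": None}

def check_job_status(job_id: str) -> dict:
    """Poll a previously started job."""
    job = _jobs.get(job_id)
    if job is None:
        return {"success": False, "data": None, "error": f"Unknown job: {job_id}"}
    return {"success": True, "data": {"job_id": job_id, **job}, "error": None}
```

The agent loop stays responsive: `start_export` returns in milliseconds, and the model decides when and how often to poll `check_job_status`.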
## Production Checklist

- Every tool has a description written from the model's perspective ("Use this when…")
- Enum fields used for all fixed-value inputs
- Every tool returns `{success, data, error}`, never raw API responses
- Tool calls wrapped in `safe_tool_call`, no unhandled exceptions
- Parallel tool execution for multi-tool responses
- Timeout set on every external tool call
- Result truncation for variable-size payloads
- Validation for tools calling external APIs
## Next Steps

- Start with the schema: write your `input_schema` before writing the function body
- Add the `ToolResult` wrapper to every existing tool
- Drop in `safe_tool_call` to harden your agent loop
- Set up result validation for any tool that calls an API you don't control
Related guides:
- Building Your First MCP Server — exposing tools via the MCP protocol
- Multi-Agent Patterns — tools in orchestrated multi-agent workflows
- Agent Memory Systems — using memory as a tool
- Agent Error Recovery Patterns — retry and fallback for tool failures
- Debugging and Observability — tracing tool calls in production