# Tool Use Patterns: Building Reliable Agent-Tool Interfaces
Your agent called a tool and got back a 40-line JSON blob — raw API response, nested objects, error codes buried inside a status field. The model read it, picked a plausible-looking value, and continued. The value was wrong. Three steps later, the agent confidently wrote a report based on bad data.
The tool worked. The interface failed.
Tool use is the mechanism that turns a language model into an agent. Every capability your agent has — searching databases, writing files, calling APIs, querying services — arrives through a tool interface. If the interface is poorly designed, the model makes worse decisions even when the underlying service is functioning correctly. This guide covers five patterns for building tool interfaces that are precise, reliable, and production-ready.
Prerequisites: Familiarity with Python and the Claude API. For background on MCP as a tool transport layer, see Building Your First MCP Server.
## Why Tool Interface Design Matters
When an agent chooses and uses a tool, it makes two decisions:

- **Which tool to call**, driven by the tool's `name` and `description`
- **What arguments to pass**, driven by the tool's `input_schema`
Ambiguous descriptions lead to wrong tool selection. Loose schemas let the model pass malformed inputs. Unstructured results make the model guess at what happened. Most agent bugs don’t live in the reasoning — they live at the tool boundary.
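To make the tool-selection point concrete, here is the same hypothetical lookup tool described two ways; the names, fields, and descriptions are illustrative, not from any real catalog:

```python
# A description the model cannot act on: which data? when should it call this?
vague = {
    "name": "lookup",
    "description": "Looks up data.",
    "input_schema": {
        "type": "object",
        "properties": {"q": {"type": "string"}},
    },
}

# A description that tells the model what comes back and when to call it.
precise = {
    "name": "lookup_customer",
    "description": (
        "Look up a customer record by email address. "
        "Returns account status, plan, and signup date. "
        "Use this when the user asks about a specific customer's account."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "email": {"type": "string", "description": "Customer email address"}
        },
        "required": ["email"],
    },
}
```

Given both definitions, a model deciding between them has an unambiguous trigger ("when the user asks about a specific customer's account") instead of a guess.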
## Pattern 1: Schema-First Tool Design
Write the JSON schema before writing the implementation. A tight schema constrains model behavior at the input stage — before anything executes.
```python
import anthropic

client = anthropic.Anthropic()

# Tight schema: enum for category, bounded integer for max_results
tools = [
    {
        "name": "search_products",
        "description": (
            "Search the product catalog by keyword. "
            "Returns a list of matching products with IDs, names, and prices. "
            "Use this when the user wants to find or browse products."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Keywords to search for"
                },
                "category": {
                    "type": "string",
                    "enum": ["electronics", "clothing", "food", "home", "all"],
                    "description": "Product category to filter by. Use 'all' if unspecified."
                },
                "max_results": {
                    "type": "integer",
                    "minimum": 1,
                    "maximum": 20,
                    "description": "Number of results to return. Default: 5"
                }
            },
            "required": ["query", "category"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Find me electronics under $100"}],
)
```

Schema rules that reduce errors:
- Use `enum` for any field with a fixed set of valid values; the model cannot invent an invalid option
- Set `minimum`/`maximum` on numeric fields to prevent out-of-range inputs
- Mark fields `required` only when the tool genuinely cannot run without them
- Write descriptions from the model's perspective: "Use this when…" tells the model when to call the tool, not just what it does
**When to use:** Every tool definition. Schema design is not an optimization; it is the primary control surface over model behavior.
## Pattern 2: Structured Tool Results
Return typed, machine-readable results. Never return raw API responses or prose descriptions.
```python
import json
from dataclasses import dataclass, asdict
from typing import Any, Optional

@dataclass
class ToolResult:
    success: bool
    data: Optional[Any] = None
    error: Optional[str] = None

    def to_content(self) -> str:
        return json.dumps(asdict(self))

def search_products(
    query: str,
    category: str,
    max_results: int = 5,
) -> ToolResult:
    try:
        # Call your actual data source here
        raw_results = _query_database(query, category, limit=max_results)

        # Shape the response before returning it
        products = [
            {"id": r["product_id"], "name": r["title"], "price": r["price_usd"]}
            for r in raw_results
        ]
        return ToolResult(success=True, data={"products": products, "count": len(products)})

    except ConnectionError as e:
        return ToolResult(success=False, error=f"Database unavailable: {e}")
    except Exception as e:
        return ToolResult(success=False, error=f"Search failed: {type(e).__name__}: {e}")

def _query_database(query, category, limit):
    # Placeholder: replace with real implementation
    return []
```

The consistent `{success, data, error}` envelope means the model always knows where to look. It never has to interpret whether an empty list means "no results" or "error." For handling failures gracefully, see Agent Error Recovery Patterns.
**When to use:** Every tool implementation. Structure beats prose for machine consumption.
## Pattern 3: Parallel Tool Calls
Claude can request multiple tools in a single response. Process them in parallel rather than sequentially — it’s faster and often what the model intended.
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

TOOL_REGISTRY = {
    "search_products": search_products,
    # register other tools here
}

def dispatch_tool(name: str, inputs: dict) -> ToolResult:
    handler = TOOL_REGISTRY.get(name)
    if not handler:
        return ToolResult(success=False, error=f"Unknown tool: {name}")
    return handler(**inputs)

def process_tool_calls(response: anthropic.types.Message) -> list[dict]:
    """Execute all tool_use blocks from a model response, in parallel."""
    tool_uses = [
        block for block in response.content if block.type == "tool_use"
    ]
    if not tool_uses:
        return []

    def execute(tool_use):
        result = dispatch_tool(tool_use.name, tool_use.input)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use.id,
            "content": result.to_content(),
        }

    with ThreadPoolExecutor(max_workers=len(tool_uses)) as executor:
        futures = {executor.submit(execute, tu): tu for tu in tool_uses}
        results = []
        for future in as_completed(futures):
            results.append(future.result())

    return results
```

To continue the agent loop after tool execution:
```python
def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages,
        )

        if response.stop_reason == "end_turn":
            # Extract final text response
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return ""

        if response.stop_reason == "tool_use":
            tool_results = process_tool_calls(response)

            # Append assistant response and tool results to conversation
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        else:
            break

    return ""
```

For orchestrating multiple agents that each call tools, see Multi-Agent Patterns.
**When to use:** Any agent loop. Parallel execution cuts latency when the model requests multiple independent tools at once.
## Pattern 4: Safe Tool Call Wrapper
Never let tool exceptions reach the agent loop unhandled. A crashing tool should produce a structured error, not a Python traceback.
```python
import signal

def timeout_handler(signum, frame):
    raise TimeoutError("Tool execution timed out")

def safe_tool_call(
    name: str,
    inputs: dict,
    timeout_seconds: int = 30,
) -> ToolResult:
    """
    Execute a tool with a timeout, catching all exceptions.
    Always returns a ToolResult; never raises.
    """
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(timeout_seconds)
    try:
        return dispatch_tool(name, inputs)
    except TimeoutError:
        return ToolResult(
            success=False,
            error=f"Tool '{name}' timed out after {timeout_seconds}s"
        )
    except Exception as e:
        return ToolResult(
            success=False,
            error=f"Tool '{name}' raised {type(e).__name__}: {e}"
        )
    finally:
        signal.alarm(0)  # Cancel the alarm

# Use safe_tool_call in your executor:
def execute(tool_use):
    result = safe_tool_call(tool_use.name, tool_use.input)
    return {
        "type": "tool_result",
        "tool_use_id": tool_use.id,
        "content": result.to_content(),
    }
```

When a tool returns `success: false`, the model can decide whether to retry, try an alternative, or report the failure to the user. This keeps control in the model's hands instead of crashing the pipeline. For broader retry strategies, see Agent Error Recovery Patterns.
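One caveat worth knowing: `signal.SIGALRM` is Unix-only and fires only in the main thread, so a signal-based timeout cannot run inside the `ThreadPoolExecutor` from Pattern 3. A thread-safe alternative bounds the wait with `Future.result(timeout=...)` instead. This is a sketch: the `call_with_timeout` name is illustrative, and it returns a plain `{success, data, error}` dict for self-containment where a real version would return `ToolResult`:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# A small dedicated pool for bounding tool execution time.
_timeout_pool = ThreadPoolExecutor(max_workers=8)

def call_with_timeout(fn, timeout_seconds=30, **kwargs):
    """Run fn(**kwargs) on a worker thread; stop waiting after the timeout."""
    future = _timeout_pool.submit(fn, **kwargs)
    try:
        return {"success": True, "data": future.result(timeout=timeout_seconds), "error": None}
    except FutureTimeout:
        return {"success": False, "data": None,
                "error": f"timed out after {timeout_seconds}s"}
    except Exception as e:
        return {"success": False, "data": None,
                "error": f"{type(e).__name__}: {e}"}
```

Note that the worker thread keeps running after a timeout; the agent simply stops waiting for it. For tools that must actually be cancelled, run them in a subprocess instead.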
**When to use:** Replace bare `dispatch_tool` calls with `safe_tool_call` in any production agent.
## Pattern 5: Result Validation and Truncation
Validate tool results before returning them to the model. Two problems to catch: schema drift (the underlying API changed its response shape) and context overflow (the result is too large for the model’s window).
```python
MAX_TOOL_RESULT_CHARS = 8000  # ~2000 tokens; adjust to your model's context

def validate_result(result: ToolResult, expected_keys: list[str]) -> ToolResult:
    """Check that expected keys are present in the result data."""
    if not result.success or not isinstance(result.data, dict):
        return result
    missing = [k for k in expected_keys if k not in result.data]
    if missing:
        return ToolResult(
            success=False,
            error=f"Tool response missing expected fields: {missing}"
        )
    return result

def truncate_result(result: ToolResult) -> ToolResult:
    """Truncate oversized results to avoid context overflow."""
    content = result.to_content()
    if len(content) <= MAX_TOOL_RESULT_CHARS:
        return result

    truncated_data = {
        "truncated": True,
        "chars_omitted": len(content) - MAX_TOOL_RESULT_CHARS,
        "content": content[:MAX_TOOL_RESULT_CHARS],
    }
    return ToolResult(
        success=result.success,
        data=truncated_data,
        error="Result truncated: too large for context window",
    )

def safe_tool_call_validated(
    name: str,
    inputs: dict,
    expected_keys: list[str] | None = None,
) -> ToolResult:
    result = safe_tool_call(name, inputs)
    if expected_keys:
        result = validate_result(result, expected_keys)
    result = truncate_result(result)
    return result
```

Validation catches API contract violations before they propagate through the agent. For how to observe and debug these failures in production, see Debugging and Observability.
**When to use:** Tools that call external APIs (which can change), or tools that return variable-size payloads (search results, file contents, database rows).
## Common Mistakes

### Mistake 1: Returning Raw API Responses
The model receives a nested object with 30 fields, most irrelevant. It picks the wrong one.
**Fix:** Shape the response before returning it. Return only what the model needs to make its next decision.
### Mistake 2: Tools with Side Effects and No Confirmation
A tool that sends an email, deletes a record, or charges a card should not execute silently. The model may call it speculatively.
**Fix:** For irreversible actions, use a two-tool pattern: `plan_email` returns a preview, `send_email` actually sends it. The model must call both and gets a natural confirmation step.
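A minimal sketch of that two-tool pattern, using the `plan_email`/`send_email` names from above; the in-memory draft store and uuid confirmation token are illustrative assumptions, and a real implementation would persist drafts and expire tokens:

```python
import uuid

# Drafts staged by plan_email, keyed by confirmation token (illustrative).
_pending_emails: dict[str, dict] = {}

def plan_email(to: str, subject: str, body: str) -> dict:
    """Stage an email and return a preview plus a confirmation token."""
    token = str(uuid.uuid4())
    _pending_emails[token] = {"to": to, "subject": subject, "body": body}
    return {
        "success": True,
        "data": {
            "preview": {"to": to, "subject": subject, "body": body[:200]},
            "confirmation_token": token,
        },
        "error": None,
    }

def send_email(confirmation_token: str) -> dict:
    """Send only an email previously staged by plan_email."""
    draft = _pending_emails.pop(confirmation_token, None)
    if draft is None:
        return {"success": False, "data": None,
                "error": "Unknown or expired confirmation token. Call plan_email first."}
    # _deliver(draft)  # real delivery would happen here
    return {"success": True, "data": {"sent_to": draft["to"]}, "error": None}
```

Because `send_email` accepts only a token minted by `plan_email`, a speculative call without the planning step fails safely, and the preview gives a human (or the model itself) a checkpoint before anything irreversible happens.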
### Mistake 3: Overlapping Tool Responsibilities
Two tools that do similar things force the model to guess which to use. It will sometimes pick wrong.
**Fix:** Each tool should have a distinct, non-overlapping purpose. If you have `search_products` and `find_items`, merge them.
### Mistake 4: No Timeout on External Tools
A slow third-party API call blocks the entire agent loop indefinitely.
**Fix:** Always set a timeout (Pattern 4). For long-running operations, return a job ID and poll with a separate `check_job_status` tool.
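A sketch of the job-ID approach, assuming hypothetical `start_export` and `check_job_status` tools and an in-memory job table; a real implementation would persist jobs and run the work out of process:

```python
import threading
import time
import uuid

# Job table (illustrative; use a durable store in production).
_jobs: dict[str, dict] = {}

def _run_export(job_id: str) -> None:
    time.sleep(0.1)  # stand-in for slow work
    _jobs[job_id] = {"status": "done", "result": {"rows_exported": 1234}}

def start_export() -> dict:
    """Kick off a long-running export and return immediately with a job ID."""
    job_id = str(uuid.uuid4())
    _jobs[job_id] = {"status": "running", "result": None}
    threading.Thread(target=_run_export, args=(job_id,), daemon=True).start()
    return {"success": True, "data": {"job_id": job_id, "status": "running"}, "error": None}

def check_job_status(job_id: str) -> dict:
    """Poll a previously started job."""
    job = _jobs.get(job_id)
    if job is None:
        return {"success": False, "data": None, "error": f"Unknown job: {job_id}"}
    return {"success": True, "data": {"job_id": job_id, **job}, "error": None}
```

The agent loop stays responsive: `start_export` returns in milliseconds, and the model decides when and how often to poll `check_job_status`.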
## Production Checklist

- Every tool has a description written from the model's perspective ("Use this when…")
- Enum fields used for all fixed-value inputs
- Every tool returns `{success, data, error}`, never raw API responses
- Tool calls wrapped in `safe_tool_call`, no unhandled exceptions
- Parallel tool execution for multi-tool responses
- Timeout set on every external tool call
- Result truncation for variable-size payloads
- Validation for tools calling external APIs
## Next Steps

- Start with the schema: write your `input_schema` before writing the function body
- Add the `ToolResult` wrapper to every existing tool
- Drop in `safe_tool_call` to harden your agent loop
- Set up result validation for any tool that calls an API you don't control
Related guides:
- Building Your First MCP Server — exposing tools via the MCP protocol
- Multi-Agent Patterns — tools in orchestrated multi-agent workflows
- Agent Memory Systems — using memory as a tool
- Agent Error Recovery Patterns — retry and fallback for tool failures
- Debugging and Observability — tracing tool calls in production