Universal Tool Design Cheatsheet for AI Agentic Engineering
I've built a lot of AI tools. The pattern I use doesn't change whether I'm working with raw JSON, Langchain, Strands, Anthropic SDK, or Pydantic. Only the syntax varies.
The thinking is always the same: understand the problem, design the boundary, handle errors, define returns, implement. This cheatsheet is that thinking, applicable to any framework.
Use it as a reference when building tools. Use it to review other people's work. Use it to catch mistakes before they become expensive bugs.
Phase 1: Understand the Problem (Before You Write Anything)
Seriously—don't write code yet.
Answer these questions first. Write them down:
- What business problem does this tool solve? (not “what does it do”, but why does it matter?)
- Why can't Claude do this without calling a tool? (what's the gap you're filling?)
- Who will use this? (LLM? Humans? Both?)
- What data must the tool receive to make a decision?
- What data must the tool return?
Worked example: an upload_thumbnail tool for an e-commerce platform
- Problem: Product images can't go live without validation. Bad dimensions break layouts. Corrupted files break pipelines.
- Why tool: Claude can't directly access S3 or validate pixel dimensions. The system needs to.
- User: E-commerce AI assistant managing product catalogs. Or a human who needs Claude to handle uploads.
- Needs: File location, dimensions, product ID, file format metadata
- Returns: CDN URL (where image lives), thumbnail ID (for tracking), status (success/failure)
This takes 10 minutes. Skipping it costs you hours later.
Phase 2: Design the Boundary (Validate at Entry)
The boundary is where the LLM hands data to your system. Validate aggressively here.
Why? Because catching errors at the boundary is far cheaper than discovering them after they've propagated through your system.
For each parameter, ask:
- Is this required or optional? (Use the decision tree below)
- What format is valid? (enum values? regex pattern? numeric range?)
- What constraints prevent disasters? (min/max file size, date ranges, format validation)
- Can I validate this immediately? (at the boundary, not deep in processing)
Required vs. Optional: The Real Logic
This is the distinction that trips people up. Here's the actual decision:
Make it REQUIRED if:
- You can validate it at the boundary (immediately, without calling other services)
- It prevents logical errors downstream (like invoice amount mismatches)
- The LLM can reliably provide it (has access to the information)
Make it OPTIONAL if:
- It can be generated or extracted asynchronously (e.g., OCR on an image for alt_text)
- It's a nice-to-have that improves validation but isn't critical
- The LLM might not have access to it
Quick decision tree:
Is this parameter critical to prevent errors?
├─ YES → Make it REQUIRED + add constraints
│ Example: invoice_amount (catches PO mismatches before processing)
│
└─ NO → Can it be generated later?
├─ YES → OPTIONAL (compute async)
│ Example: alt_text (from image analysis after upload)
│
└─ NO → Stop. Does the LLM really need to provide this?
Maybe it's not a parameter at all.
Phase 3: Define Error States (What Can Go Wrong)
Most tool designs fail here. They define errors like: { status: "error", message: "something failed" }. That's useless.
List every way your tool can fail:
- Invalid input (user error, LLM hallucination)
- Resource not found (the thing doesn't exist)
- Permission denied (auth error)
- Service unavailable (downstream system down)
- Timeout (performance)
- Partial success (batch operation: some succeeded, some failed)
For each error state, define:
- Error code (machine-readable: SCREAMING_SNAKE_CASE)
- Human message (so the LLM understands what went wrong)
- Suggested action (what should the LLM do next?)
- Retryable: can it try again, or is the failure terminal?
Real Example: upload_thumbnail Errors
DIMENSION_MISMATCH
Message: "Image dimensions 500x400 do not match required 600x400"
Action: "Re-upload with correct dimensions or use image scaling"
Retry: Yes
PRODUCT_NOT_FOUND
Message: "Product ID 'Product-99999999' does not exist in database"
Action: "Verify product ID with user and retry"
Retry: No (need valid product ID from user)
FILE_CORRUPTED
Message: "File size mismatch: expected 524288 bytes, got 262144"
Action: "Re-upload from original source"
Retry: Yes
SIZE_EXCEEDS_LIMIT
Message: "File size (3.5 MB) exceeds maximum (2 MB)"
Action: "Compress image and retry"
Retry: Yes
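The four error states above share a shape, so in code they can come from one small helper. A hedged sketch: `error_response` and its field names are illustrative, not a fixed API.

```python
# Illustrative helper: build a structured error payload the LLM can act on.
# The function name and field names are assumptions, not a standard API.
def error_response(code: str, message: str, action: str, retryable: bool) -> dict:
    return {
        "status": "error",
        "error_code": code,          # machine-readable, SCREAMING_SNAKE_CASE
        "error_message": message,    # human-readable explanation
        "suggested_action": action,  # what the LLM should do next
        "retryable": retryable,      # can the LLM try again?
    }

resp = error_response(
    "SIZE_EXCEEDS_LIMIT",
    "File size (3.5 MB) exceeds maximum (2 MB)",
    "Compress image and retry",
    True,
)
```

A helper like this keeps every tool in the system emitting the same four fields, so the LLM never has to guess which keys to read.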
Each one tells the LLM what to do next. That's the point. Bad errors make the LLM guess.
Phase 4: Design the Return Contract (What Claude Gets Back)
Be explicit about what happens.
On success:
- What's the primary result? (what the user wanted)
- What metadata is useful? (ID, timestamp, URL)
- What can Claude do next with this result?
On failure:
- Error code (machine-readable)
- Error message (human-readable)
- Suggested action
- Retryable flag
If async:
- Job ID (for polling)
- Status (pending/processing/complete/failed)
- Polling URL
- Estimated completion time
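For the async case, a return shaped like this gives the LLM everything it needs to poll (a sketch; the field names and values are illustrative):

```json
{
  "status": "pending",
  "job_id": "JOB-20250214-XYZ789",
  "polling_url": "https://api.example.com/jobs/JOB-20250214-XYZ789",
  "estimated_completion_seconds": 30
}
```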
Critical: Make Success Explicit
Don't assume the LLM understands what happened. Be obvious:
{
"status": "success",
"thumbnail_id": "THUMB-20250214-ABC123",
"cdn_url": "https://cdn.example.com/thumbnails/...",
"alt_text": "Red running shoe, side view"
}
The LLM will key off status. Make it explicit, not implicit.
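The failure case deserves the same explicitness. A hedged sketch of the matching error shape, reusing the fields defined in Phase 3:

```json
{
  "status": "error",
  "error_code": "DIMENSION_MISMATCH",
  "error_message": "Image dimensions 500x400 do not match required 600x400",
  "suggested_action": "Re-upload with correct dimensions or use image scaling",
  "retryable": true
}
```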
Phase 5: Write the Manifest (Implementation)
Whether you're using JSON, Langchain, Pydantic, or Strands—follow this structure:
1. Name
– snake_case, verb-based
– ✅ upload_thumbnail, delete_invoice, fetch_user_data
– ❌ thumbnail, data, processor
2. Description
– 2-3 sentences: what it does, why Claude would use it, when
– Be specific, not generic
– ✅ “Upload product thumbnail to CDN. Validates 600x400px, <2MB, jpg/png/webp. Returns CDN URL.”
– ❌ “Gets data”
3. Parameters
– Type, format, constraints, description, and an example for each
– Required array: which params MUST be present?
– Constraints: min/max, enum, regex
4. Returns
– Success schema: all fields with descriptions
– Error schema: error_code, error_message, suggested_action, retryable
– Async schema (if needed): job_id, status, polling info
Implementation: Raw JSON (OpenAI Format)
{
"name": "upload_thumbnail",
"description": "Upload product thumbnail to CDN. Validates 600x400px, <2MB, jpg/png/webp. Returns CDN URL.",
"parameters": {
"type": "object",
"properties": {
"thumbnail_url": {
"type": "string",
"description": "S3 presigned URL of the thumbnail file",
"format": "uri",
"pattern": "^https://s3\\.amazonaws\\.com/.*\\.(jpg|jpeg|png|webp)$"
},
"product_id": {
"type": "string",
"description": "Product ID: 'Product-' + 6-8 digits",
"pattern": "^Product-[0-9]{6,8}$"
},
"image_width": {
"type": "integer",
"description": "Image width in pixels (must be 600)",
"minimum": 600,
"maximum": 600
}
},
"required": ["thumbnail_url", "product_id", "image_width"]
}
}
Implementation: Langchain (Python)
from langchain.tools import tool
from typing import Optional
@tool
def upload_thumbnail(
thumbnail_url: str,
product_id: str,
image_width: int,
image_height: int,
file_size_bytes: int,
file_format: str,
alt_text: Optional[str] = None
) -> dict:
"""
Upload product thumbnail to CDN.
Validates dimensions (600x400px), file size (<2MB), and format.
Returns CDN URL on success.
Args:
thumbnail_url: S3 presigned URL. Example: https://s3.amazonaws.com/thumb.jpg
product_id: Format: Product-123456 to Product-12345678
image_width: Must be exactly 600 pixels
image_height: Must be exactly 400 pixels
file_size_bytes: Between 1KB and 2MB
file_format: One of: jpg, jpeg, png, webp
alt_text: Optional accessibility text
Returns:
dict with: status, thumbnail_id (success), error_details (failure)
"""
    # Implementation omitted: validate at the boundary, then upload and
    # return the dict described in the docstring.
    raise NotImplementedError
Implementation: Anthropic SDK (Python)
upload_tool = {
"name": "upload_thumbnail",
"description": "Upload product thumbnail. Validates 600x400px, <2MB, jpg/png/webp.",
"input_schema": {
"type": "object",
"properties": {
"thumbnail_url": {
"type": "string",
"description": "S3 presigned URL"
},
"product_id": {
"type": "string",
"description": "Format: Product-123456"
},
"image_width": {
"type": "integer",
"description": "Must be 600 pixels"
},
"image_height": {
"type": "integer",
"description": "Must be 400 pixels"
}
},
"required": ["thumbnail_url", "product_id", "image_width", "image_height"]
}
}
Implementation: Pydantic (Python)
from pydantic import BaseModel, Field
from typing import Optional
class UploadThumbnailInput(BaseModel):
thumbnail_url: str = Field(
...,
description="S3 presigned URL",
pattern="^https://s3\\.amazonaws\\.com/.*"
)
product_id: str = Field(
...,
description="Format: Product-123456",
pattern="^Product-[0-9]{6,8}$"
)
image_width: int = Field(
...,
description="Must be 600 pixels",
ge=600, le=600
)
image_height: int = Field(
...,
description="Must be 400 pixels",
ge=400, le=400
)
alt_text: Optional[str] = Field(
None,
description="Optional accessibility text",
min_length=10, max_length=500
)
Validation Strategy: Boundary vs. Async
Two approaches. Know when to use each.
Boundary Validation (Immediate)
Validate at entry using constraints. Catch errors before they propagate.
When: Parameters you can check without external services
Example: invoice_amount must match PO amount range
"invoice_amount": {
"type": "number",
"minimum": 0.01,
"maximum": 999999.99,
"description": "Must match PO amount within ±tolerance"
}
Async Validation (Later)
Validate after processing. Makes sense for expensive operations (OCR, image analysis).
When: Parameters requiring computation or external services
Example: alt_text semantic validation against image content (after upload)
"alt_text": {
"type": "string",
"description": "Optional. Validated asynchronously against image."
}
Common Mistakes (Don't Do These)
❌ Vague descriptions
"gets data" instead of "Retrieves invoice history for past 90 days"
❌ Missing constraints
Unbounded string allows 100,000 character input
❌ Required parameters LLM can't provide
Making "file_hash_sha256" required when only metadata is known
❌ Useless error states
"error" instead of "PRODUCT_NOT_FOUND: verify product ID"
❌ Missing return schema
Forgetting to document what success looks like
❌ Async tools with no job IDs
"will process in background" but no way to check status
❌ No examples for complex params
Pattern without showing what's valid
❌ Computing in LLM, not system
Asking LLM for SHA-256 hash instead of extracting from file
❌ Late boundary validation
Discovering PO mismatch after days of processing
❌ State management gaps
Allowing duplicate uploads without handling overwrites
Real-World Patterns
Pattern 1: File Upload Tools
Required:
- file_url (validate S3 access immediately)
- file_format (enum: jpg, png, webp)
- file_size_bytes (validate < 10MB at boundary)
Optional:
- alt_text (generated from image async)
- metadata (extracted from file async)
Error cases:
- INVALID_URL
- SIZE_EXCEEDS_LIMIT
- UNSUPPORTED_FORMAT
- FILE_CORRUPTED
Pattern 2: Data Reconciliation Tools
Required:
- reference_id (must exist)
- amount (must match expected value ±tolerance)
- date (must be within acceptable range)
Optional:
- notes (context, not critical)
Error cases:
- RECORD_NOT_FOUND
- AMOUNT_MISMATCH
- DATE_OUT_OF_RANGE
- DUPLICATE_DETECTED
Pattern 3: Action Tools (Delete, Update)
Required:
- resource_id (must exist)
- confirm_action (true to proceed)
- reason (audit trail)
Optional:
- cascade (delete related records? yes/no)
Error cases:
- RESOURCE_NOT_FOUND
- PERMISSION_DENIED
- CONFIRMATION_REQUIRED
- CASCADING_FAILED
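The confirmation gate in Pattern 3 can be sketched directly. Hypothetical names throughout; a real handler would also check existence, permissions, and write the audit trail.

```python
# Hypothetical action-tool handler: refuse to act without explicit confirmation.
def delete_resource(resource_id: str, confirm_action: bool, reason: str,
                    cascade: bool = False) -> dict:
    if not confirm_action:
        return {
            "status": "error",
            "error_code": "CONFIRMATION_REQUIRED",
            "error_message": f"Deleting {resource_id} requires confirm_action=true",
            "suggested_action": "Confirm with the user, then retry with confirm_action=true",
            "retryable": True,
        }
    # ... existence check, permission check, audit log using `reason` ...
    return {"status": "success", "deleted": resource_id, "cascaded": cascade}

blocked = delete_resource("RES-123", False, "cleanup")
done = delete_resource("RES-123", True, "cleanup")
```

The gate turns a destructive action into a two-step protocol: the LLM must surface the decision to the user before the system will act.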
The Quick Checklist
Before considering a tool “done”:
□ Name is clear and actionable (verb-based)
□ Description explains the why, not just the what
□ Required parameters prevent logical errors
□ Constraints prevent invalid inputs
□ Error cases are comprehensive (not just "error")
□ Error messages tell LLM what to do next
□ Return schema is complete (success + error + async)
□ Complex parameters have examples
□ Validation happens at boundary
□ Async operations return job IDs + polling
□ No vague descriptions
□ No unbounded strings/integers
□ No required params LLM can't reliably provide
□ Error codes clearly indicate next steps
□ Return fields are documented
When to Add Parameters vs. Handle Internally
Add as Parameter If:
- The LLM should decide this value
- Different values change behavior
- You want boundary validation
- It affects business logic
Handle Internally If:
- Only the system decides (backend concern)
- It's implementation detail (encryption, compression)
- It's derived from other parameters
- It's infrastructure config (database ID, S3 bucket)
Examples
✅ Parameter: po_number (LLM decides which PO to reconcile)
❌ Parameter: database_id (system decides internally)
✅ Parameter: invoice_amount (LLM provides, system validates)
❌ Parameter: encrypted_at_rest (backend concern)
✅ Parameter: file_url (LLM knows where file is)
❌ Parameter: s3_bucket_name (hardcoded in backend)
Quick Reference: Parameter Types
| Type | Constraints | Example |
|---|---|---|
| string | minLength, maxLength, pattern, enum | “user@example.com” |
| integer | minimum, maximum, enum | 42 |
| number | minimum, maximum | 3.14 |
| boolean | (none) | true |
| array | minItems, maxItems, items schema | [1, 2, 3] |
| object | properties, required | {“name”: “John”} |
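As a JSON Schema fragment, a few of those constraints in combination (values illustrative):

```json
{
  "tags": {
    "type": "array",
    "items": {"type": "string", "maxLength": 30},
    "minItems": 1,
    "maxItems": 10
  },
  "priority": {
    "type": "integer",
    "minimum": 1,
    "maximum": 5
  }
}
```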
Quick Reference: Standard Error Codes
INPUT_ERRORS:
INVALID_FORMAT
MISSING_REQUIRED_FIELD
VALUE_OUT_OF_RANGE
DATA_ERRORS:
NOT_FOUND
ALREADY_EXISTS
DUPLICATE_DETECTED
AUTH_ERRORS:
PERMISSION_DENIED
UNAUTHORIZED
ACCESS_REVOKED
SYSTEM_ERRORS:
SERVICE_UNAVAILABLE
TIMEOUT
INTERNAL_ERROR
BUSINESS_LOGIC_ERRORS:
AMOUNT_MISMATCH
STATE_INVALID_FOR_TRANSITION
QUOTA_EXCEEDED
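If you centralize these codes in code, a string-valued enum keeps them machine-readable and typo-proof. A sketch; the grouping comments simply mirror the taxonomy above.

```python
from enum import Enum

# Central registry of error codes, mirroring the taxonomy above.
class ErrorCode(str, Enum):
    # Input errors
    INVALID_FORMAT = "INVALID_FORMAT"
    MISSING_REQUIRED_FIELD = "MISSING_REQUIRED_FIELD"
    VALUE_OUT_OF_RANGE = "VALUE_OUT_OF_RANGE"
    # Data errors
    NOT_FOUND = "NOT_FOUND"
    ALREADY_EXISTS = "ALREADY_EXISTS"
    DUPLICATE_DETECTED = "DUPLICATE_DETECTED"
    # Auth errors
    PERMISSION_DENIED = "PERMISSION_DENIED"
    UNAUTHORIZED = "UNAUTHORIZED"
    ACCESS_REVOKED = "ACCESS_REVOKED"
    # System errors
    SERVICE_UNAVAILABLE = "SERVICE_UNAVAILABLE"
    TIMEOUT = "TIMEOUT"
    INTERNAL_ERROR = "INTERNAL_ERROR"
    # Business-logic errors
    AMOUNT_MISMATCH = "AMOUNT_MISMATCH"
    STATE_INVALID_FOR_TRANSITION = "STATE_INVALID_FOR_TRANSITION"
    QUOTA_EXCEEDED = "QUOTA_EXCEEDED"

code = ErrorCode.NOT_FOUND
```

Because the enum subclasses str, members compare equal to their string values, so they serialize cleanly into the error payloads shown earlier.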
Test Before You Build
Mental walkthrough. If you can't answer all six, your design isn't complete:
- Happy Path — LLM provides correct data → system processes → clear success response
- Invalid Input — LLM provides wrong type → system rejects at boundary → actionable error message
- Missing Required — LLM forgets a parameter → system says which one
- Not Found — LLM provides valid but non-existent ID → system clearly indicates it
- Async Operation — LLM calls async tool → gets job_id immediately → can poll for status
- Partial Failure — Batch operation: some succeed, some fail → LLM sees both with reasons
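Three of the six walkthroughs can even be run as plain assertions against a toy validator (`validate_upload` is hypothetical and deliberately minimal):

```python
import re

# Toy validator covering three of the six walkthrough cases.
def validate_upload(params: dict) -> dict:
    required = ["thumbnail_url", "product_id", "image_width"]
    missing = [k for k in required if k not in params]
    if missing:  # Missing Required: say which one
        return {"status": "error", "error_code": "MISSING_REQUIRED_FIELD",
                "error_message": f"Missing required: {', '.join(missing)}"}
    if not re.match(r"^Product-[0-9]{6,8}$", params["product_id"]):
        return {"status": "error", "error_code": "INVALID_FORMAT",
                "error_message": "product_id must match Product-######"}
    if params["image_width"] != 600:  # Invalid Input: reject at the boundary
        return {"status": "error", "error_code": "VALUE_OUT_OF_RANGE",
                "error_message": "image_width must be exactly 600"}
    return {"status": "success"}  # Happy Path: explicit success

good = {"thumbnail_url": "https://s3.amazonaws.com/thumb.jpg",
        "product_id": "Product-123456", "image_width": 600}
happy = validate_upload(good)                                # 1. Happy Path
missing = validate_upload({"product_id": "Product-123456"})  # 3. Missing Required
bad_width = validate_upload({**good, "image_width": 500})    # 2. Invalid Input
```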
Universal Principle
The thinking is the same across every framework.
- Understand the problem (Phase 1)
- Design the boundary (Phase 2)
- Define error states (Phase 3)
- Design return contract (Phase 4)
- Write the manifest (Phase 5)
Whether you use JSON, Langchain, Strands, Anthropic SDK, or Pydantic—only the syntax changes. The thinking doesn't.
Build smarter tools. Design the boundary first. The rest follows.
Last updated: February 2025
References:
– Anthropic Tool Use
– OpenAI Function Calling
– JSON Schema
– Pydantic