Pattern · Execution · Knowledge · Discovery
An agent can reason about anything. But it can only execute what its skills allow. If the action doesn't exist, it's architecturally impossible.
The core problem
Imagine handing someone admin access to your database. Your cloud credentials. Your domain password. Without telling them what they can and can't do. No limits. Just: "do what you think is necessary".
That's exactly what we do today with AI agents. We give them unlimited reasoning and unlimited execution. And then we're surprised when things blow up.
The Agent Skills pattern solves this: the agent thinks without limits, but only executes bounded, reusable and safe actions.
The idea is simple: an agent can reason about anything — analyze, plan, make decisions. But when it comes to executing, it can only do what its skills allow.
If the action isn't in the skill catalog, it simply can't do it. It's not that it chooses not to. It doesn't exist for it.
That's an Agent Skill: a bounded, reusable and safe function an agent can invoke. The pattern wasn't invented by Anthropic, Google, or OpenAI. It's over 40 years old. It comes from robotics.
The mental model
Skill catalog
If it doesn't exist → can't do it. Period.
This is the most common confusion when studying Agent Skills. There are 4 terms used interchangeably on the internet, but they operate at completely different levels. They are parallel, not sequential.
The universal idea of separating reasoning from execution. Comes from 1980s robotics. Not from Anthropic, not from Google, not from anyone. It's proven engineering. When we talk about "the Agent Skills pattern", we mean this.
Concept: an agent reasons freely, but only executes bounded, reusable and safe capabilities.
The Python function, API, or endpoint the agent invokes at runtime.
It's what the agent executes internally.
Each framework has its own name: ADK calls it a tool, Claude calls it tool use, OpenAI calls it a function.
When a tool is well designed — bounded, reusable and safe — it's following the skill pattern.
Anthropic took the pattern and built a product: folders with packaged procedural knowledge. Not tool use. Doesn't replace executable functions. It teaches the agent how to approach complex tasks, which flow to follow, how to combine its tools.
Claude Skill structure:
my-skill/
├── SKILL.md ← instructions + recommended flows
├── scripts/ ← executable helpers
└── references/ ← additional docs
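As an illustration of what goes inside that folder, here is a minimal sketch of a SKILL.md for a hypothetical dns-management skill. The frontmatter fields follow the published Agent Skills examples; the flow content and script names are invented for this example.

```markdown
---
name: dns-management
description: Manage DNS records for a single domain. Use when the user asks to create, update, or validate DNS entries.
---

# DNS Management

## Flow
1. Always list existing records before creating.
2. Create the record with the smallest possible scope.
3. After creating, validate resolution (scripts/validate.sh).

## References
See references/record-types.md for supported record types.
```

Note that this file is Layer 1 material: instructions and recommended flows, not executable constraints.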
Published as an open standard at agentskills.io. Adopted by more than 27 tools: Cursor, VS Code, GitHub, Gemini CLI, OpenAI Codex, Goose, Databricks and more.
In Google's A2A protocol, AgentSkill is something completely different. It's not an executable function. It's a high-level declaration of what an agent knows how to do — the "business card" other agents can discover and use to delegate tasks.
A single AgentSkill can group multiple internal tools. For example: our DNS Agent has 6 internal tools, but at the A2A level it's published as a single AgentSkill: "DNS Management".
Doesn't execute anything. It's high-level metadata for the agent ecosystem.
The synthesis
Anthropic solved procedural knowledge with SKILL.md. Google solved discovery with AgentSkill in A2A. Parallel paths, same underlying concept. Not sequential layers — different implementations of the same pattern.
Whether an execution tool, a Claude Skill, or an A2A AgentSkill — all share these three principles.
Bounded: does ONE thing with clear inputs and outputs. No unexpected side effects.
create_dns_record(domain, type, name, value) → record_id. Creates. Doesn't modify, doesn't delete, doesn't touch other domains.
Reusable: the same skill works in different contexts without rewriting code.
To create a subdomain you use create_record. To migrate DNS you use find + update. You compose, you don't duplicate.
Safe: internal validations. It cannot do more than what is defined.
If you ask it for something outside its scope, it refuses. The agent doesn't get creative with unplanned actions.
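The three principles can be made concrete in a few lines of Python. This is an illustrative sketch, not any framework's API: the provider call is stubbed out, and names like ALLOWED_TYPES and MANAGED_DOMAIN are our own.

```python
# A bounded, reusable, safe skill: does ONE thing, validates its inputs,
# and refuses anything outside its defined scope.
ALLOWED_TYPES = {"A", "CNAME", "MX", "TXT"}   # whitelist: safety by design
MANAGED_DOMAIN = "nicolasneira.dev"           # scope: one domain only

def create_dns_record(domain: str, record_type: str, name: str, value: str) -> str:
    """Creates a single DNS record. Never modifies, never deletes."""
    if domain != MANAGED_DOMAIN:
        raise PermissionError(f"Out of scope: this skill only manages {MANAGED_DOMAIN}")
    if record_type not in ALLOWED_TYPES:
        raise ValueError(f"Record type {record_type!r} not allowed")
    # A real implementation would call the DNS provider's API here (stubbed).
    return f"{record_type}:{name}.{domain}"
```

Because the validations live inside the skill itself, even a "creative" agent invoking it cannot step outside the boundary: the refusal is in the code, not in a prompt.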
The most useful analogy to understand the pattern is not technical. It's organizational.
The CEO as agent:
The CEO can think about anything: strategy, markets, expansion, products. No limits. But when something needs to be executed, they call a department.
HR can only hire. Finance only approves budgets. Legal only reviews contracts. DevOps only deploys. Each department is a skill: clear responsibility, clear inputs, clear outputs.
What happens if the CEO wants to do something for which there's no department? It simply can't be done. That's security by absence.
Direct mapping
When you build an agent, there are two ways to control its behavior. Most tutorials only teach you one. The second is the one that really matters.
How it should think
⚠️ Suggestion
The agent reads them, understands them, and generally follows them. But technically it can ignore them. They're like a procedures manual.
What it can do
🔒 Real constraint
If the function doesn't exist, the agent cannot execute that action. Doesn't matter how intelligent it is. It's like giving it specific keys to a building — doors without keys don't exist for it.
Example: DNS Agent with both layers
LAYER 1 — Instructions:
"You are a DNS agent for nicolasneira.dev.
Always list records BEFORE creating.
After creating, ALWAYS validate."
→ The agent should follow this flow.
→ Could skip steps if it "reasons" it's not necessary.
LAYER 2 — Capabilities:
tools = [
    list_dns_records,   # ✓
    create_dns_record,  # ✓
    update_dns_record,  # ✓
    validate_dns,       # ✓
    # delete_record → DOESN'T EXIST
]
→ Can only use these functions.
→ Cannot delete. Ever. No matter what.
Pure Python, Google ADK and Claude Code implement the two layers in different ways. The pattern is the same. The tool changes.
Pure Python
Layer 1 — Instructions
instructions = "You are a DNS agent..."
Layer 2 — Capabilities
skills = [list_dns, create_dns, validate_dns]
Full control. You explicitly define both layers. Maximum flexibility, no magic.
Google ADK
Layer 1 — Instructions
instruction = "Generate infrastructure..."
Layer 2 — Capabilities
tools = [get_config, save_compose, gen_jenkins]
Both layers explicit: instruction guides reasoning, tools restrict execution.
Claude Code
Layer 1 — Instructions
SKILL.md with flows, rules and references
Layer 2 — Capabilities
allowed-tools: [Bash(curl *), Read]
An enriched Layer 1: not just text, but a folder with scripts and references. Claude Code already has its tools built-in.
A prompt that says "don't delete anything" is Layer 1: a suggestion the agent can ignore. A skill that doesn't exist is Layer 2: an architectural constraint the agent can't bypass.
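In pure Python, the two layers can be made explicit with nothing more than a string and a dict. This is an illustrative sketch under our own naming, not any framework's API; the provider calls are stubbed.

```python
# Layer 1 — Instructions: guidance the model reads but could ignore.
INSTRUCTIONS = "You are a DNS agent. Always list records BEFORE creating."

# Layer 2 — Capabilities: the only functions that exist for the agent.
def list_dns_records():
    return ["A:www", "MX:@"]          # stubbed provider call

def create_dns_record(record_type, name, value):
    return f"{record_type}:{name}"    # stubbed provider call

SKILLS = {
    "list_dns_records": list_dns_records,
    "create_dns_record": create_dns_record,
    # "delete_dns_record" is deliberately absent: architecturally impossible.
}

def execute(skill_name, *args):
    """The only execution path. Unknown skills are refused, not improvised."""
    if skill_name not in SKILLS:
        return f"I don't have the capability to perform '{skill_name}'."
    return SKILLS[skill_name](*args)
```

Calling execute("delete_dns_record") returns a refusal, not an error from a half-executed action: the capability simply does not exist in the catalog.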
# dns_skills.py — DNS Agent skill catalog
# Only these 6 functions exist. There is no delete_dns_record.
DNS_SKILLS = [
    list_dns_records,    # ✓ can list
    create_dns_record,   # ✓ can create (only A, CNAME, MX, TXT)
    search_dns_record,   # ✓ can search
    update_dns_record,   # ✓ can update
    validate_dns,        # ✓ can validate resolution
    check_propagation,   # ✓ can verify propagation
    # delete_dns_record  ← doesn't exist → impossible to execute
    # update_nameservers ← doesn't exist → impossible
]
# When the agent tries to delete all records:
# → "I don't have the capability to delete DNS records."
# → Executes nothing. Doesn't get creative. Period.
Agent without skills
"I want to optimize DNS" → agent executes arbitrary code →
can do rm -rf, modify nameservers, access the entire account.
Agent with skills
"I want to optimize DNS" → agent checks its catalog → can only list, create, search, update, validate, verify. Impossible to delete even if it tries.
Three specialized agents. The orchestrator (DevOps Agent) has no tools of its own — discovers the other two via the A2A protocol and delegates to them.
Here you see the two security patterns operating simultaneously: SCOPE-based (DNS Agent, restricted to one domain) and CAPABILITY-based (Monitoring Agent, read-only regardless of domain).
SCOPE-based
Manages DNS records against the real Cloudflare API.
Scope: only nicolasneira.dev. Rejects any operation on other domains.
CAPABILITY-based
Infrastructure health check. Observes any domain, but cannot modify anything.
Capability: any domain, but strictly read-only. It has no write skills at all.
A2A Orchestrator
No tools of its own. Discovers the others via AgentCard and delegates tasks.
Reasons about the full flow, but executes zero direct actions. Everything goes through the other agents.
Agent Skills weren't invented by any one company. The pattern already existed and the industry converged on the name. That means you're not learning a feature that could disappear tomorrow. You're learning an engineering pattern proven over decades.
1980s · Robotics
Behavior-Based Robotics: simple, bounded and reusable behaviors. The robot decides which to activate. First separation between reasoning and execution.
1990s · Multi-agent systems
Each agent publishes a capability directory. Other agents discover what each one can do. Already had TWO levels: internal execution and external publication.
2010s · Microservices
Netflix, Amazon. Each service has a defined API. Clear contracts, defined boundaries, composition via API. Bounded, reusable, safe.
2023 · Massive Function Calling
LLMs can reason about when to use a function and call it directly. The bridge between free reasoning and bounded execution.
2025–2026 · Convergence on "Skills"
Google includes AgentSkill in the A2A protocol (discovery). Anthropic formalizes SKILL.md as an open standard at agentskills.io (procedural knowledge). More than 27 tools adopted. The name converges.
The most common questions about Agent Skills, Tools, Claude Skills and the A2A protocol.
An Agent Skill is a bounded, reusable and safe capability that an AI agent can invoke. The agent can reason freely about any problem, but can only execute actions defined in its skill catalog. If an action doesn't exist, the agent cannot execute it — it's not a decision, it's an architectural constraint of the system.
They are the same concept at different levels. "Skill" is the universal design pattern — the idea of separating reasoning from execution, which comes from 1980s robotics. "Tool" or "Function" is the concrete runtime implementation: the Python function, API or endpoint the agent invokes. Google ADK calls it a tool, Claude calls it tool use, OpenAI calls it function. When a tool is well designed — bounded, reusable and safe — it's following the skill pattern.
Claude Skills is Anthropic's standard for packaging procedural knowledge. A folder with a SKILL.md file (instructions and recommended flows), executable scripts and references. It doesn't replace tool use — it complements it. It teaches the agent how to approach complex tasks, which flow to follow and how to combine its tools. Published as an open standard at agentskills.io and adopted by more than 27 tools: Cursor, VS Code, Gemini CLI, OpenAI Codex and more.
In Google's A2A (Agent-to-Agent) protocol, AgentSkill is a high-level declaration of what an agent knows how to do. It's not an executable function — it's metadata used for other agents to discover it and delegate tasks to it. A single AgentSkill can group multiple internal tools. For example, a DNS Agent with 6 internal tools is published as a single AgentSkill called "DNS Management" in its AgentCard.
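The grouping described above can be sketched as a card. Field names below are modeled loosely on the A2A AgentCard schema but simplified; treat this as an illustrative sketch, not the canonical format.

```python
# One public AgentSkill ("DNS Management") fronts six internal tools.
# Other agents only ever see the skill, never the tools behind it.
DNS_AGENT_CARD = {
    "name": "DNS Agent",
    "description": "Manages DNS records for nicolasneira.dev",
    "skills": [
        {
            "id": "dns-management",
            "name": "DNS Management",
            "description": "List, create, search, update and validate DNS records",
            "tags": ["dns", "infrastructure"],
        }
    ],
}

# The internal tools stay private to the agent: not part of the card.
INTERNAL_TOOLS = [
    "list_dns_records", "create_dns_record", "search_dns_record",
    "update_dns_record", "validate_dns", "check_propagation",
]
```

The asymmetry is the point: six tools for internal execution, one skill for external discovery.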
Every agent has two control layers. Layer 1 is instructions: system prompts, markdown files, SKILL.md. They guide how the agent should think, but they're a suggestion — the agent can technically ignore them. Layer 2 is capabilities: Python functions, registered tools, defined APIs. These are real constraints — if the function doesn't exist, the agent cannot execute that action regardless of how intelligent it is. For critical systems (infrastructure, sensitive data) use Layer 2. For flexible flows Layer 1 is enough.
Google ADK (Agent Development Kit) is Google's official framework for building AI agents with explicit two layers: the instruction parameter defines how the agent should reason (Layer 1), and the tools parameter defines what it can execute (Layer 2). ADK also facilitates publishing agents in the A2A protocol, allowing multiple specialized agents to discover each other and delegate tasks without direct coupling.