
The Complete Guide to Agent Skills in 2026: Build, Share, and Secure AI Agent Capabilities

Everything you need to know about Agent Skills — the SKILL.md format, the 351K+ skill ecosystem, security risks, and best practices for Claude Code, Codex CLI, and Copilot.

Danny Huang

What Are Agent Skills?

An Agent Skill is a folder containing a SKILL.md file — plus optional scripts, references, and examples — that teaches an AI coding agent how to perform a specific task. Think of it as a self-contained instruction manual that the agent loads on demand. When you ask Claude Code to "create a React component following our design system," a skill can provide the exact conventions, templates, and validation steps the agent should follow.

The analogy that stuck: skills are the npm of AI agents. Where npm packages give your application reusable code, skills give your AI agent reusable knowledge. A package exports functions. A skill exports procedures, constraints, and domain expertise.

Why do skills matter? Because raw model intelligence is not enough. Claude Opus 4.6 is extraordinarily capable, but it does not know your team's naming conventions, your deployment pipeline, or your compliance requirements. Skills close the gap between general intelligence and project-specific execution. Without skills, you repeat the same context in every prompt. With skills, you encode that context once and the agent applies it every time.

The format originated at Anthropic, released as an open standard, and has been adopted by OpenAI (Codex CLI), Microsoft (GitHub Copilot), and the broader ecosystem. As of March 2026, over 351,000 skills exist across three major marketplaces. The trajectory from "interesting idea" to "standard infrastructure" took less than six months.

The SKILL.md Format

Every skill starts with a single file: SKILL.md. The format is deliberately simple — YAML frontmatter for machine-readable metadata, followed by Markdown instructions for the agent.

Here is a real, working skill:

````markdown
---
name: react-component
description: Create React components following our team conventions with TypeScript, Tailwind CSS, and proper test coverage.
---

## Instructions

When creating a new React component:

1. Use functional components with TypeScript interfaces for props
2. Place the component in `src/components/[ComponentName]/`
3. Create three files:
   - `index.tsx` — the component implementation
   - `index.test.tsx` — unit tests with Vitest
   - `index.stories.tsx` — Storybook story (if the component is visual)

## Conventions

- Props interface name: `[ComponentName]Props`
- Export as named export, never default export
- Use Tailwind CSS utility classes, no inline styles
- All interactive components must handle keyboard navigation

## Example

```tsx
interface ButtonProps {
  variant: 'primary' | 'secondary' | 'ghost';
  size?: 'sm' | 'md' | 'lg';
  children: React.ReactNode;
  onClick?: () => void;
}

export function Button({ variant, size = 'md', children, onClick }: ButtonProps) {
  return (
    <button
      className={cn(baseStyles, variantStyles[variant], sizeStyles[size])}
      onClick={onClick}
    >
      {children}
    </button>
  );
}
```
````

The frontmatter has two required fields. `name` must be lowercase with hyphens, maximum 64 characters, and must match the parent directory name. `description` is the trigger — the agent reads it to decide whether this skill is relevant to the current task. It has a maximum of 1024 characters.

Optional frontmatter fields include `allowed-tools` (restricts which tools the agent can call when the skill is active), `metadata` (author, version), and `license`. The `description` field is the single most important line in your skill. If the description does not clearly match the types of requests your skill handles, the agent will never load it.
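As an illustration, a frontmatter block using the optional fields might look like the following. The tool names and metadata values here are hypothetical — check your agent's documentation for the exact tool identifiers and the list syntax it accepts:

```yaml
---
name: code-review
description: Perform thorough code review with focus on security, performance, and maintainability. Flag issues by severity.
allowed-tools: Read, Grep, Glob   # hypothetical tool names; restricts the skill to read-only access
metadata:
  author: your-team
  version: 1.0.0
license: MIT
---
```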

Progressive disclosure is the core design principle. The frontmatter is always loaded (tiny cost). The full SKILL.md body is loaded only when the agent decides the skill is relevant. For large skills, the body can reference external files that the agent reads on demand — keeping the initial context window cost low.
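As a rough sketch of the mechanism (illustrative code, not any agent's actual implementation), a runtime can parse only the frontmatter up front and defer loading the body until the skill is judged relevant:

```python
def read_frontmatter(skill_md: str) -> dict:
    """Parse only the YAML frontmatter block — the tiny, always-loaded part."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

def load_skill_body(skill_md: str) -> str:
    """Load the full Markdown body only after the skill is judged relevant."""
    parts = skill_md.split("---", 2)
    return parts[2].strip() if len(parts) == 3 else skill_md

skill = """---
name: code-review
description: Perform thorough code review. Flag issues by severity.
---

## Review Process
1. Security scan
"""

meta = read_frontmatter(skill)        # always in context: just name + description
if "review" in meta["description"].lower():
    body = load_skill_body(skill)     # loaded on demand when a request matches
```

The asymmetry is the point: a catalog of hundreds of skills costs only a few lines of frontmatter each until one actually fires.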

The Ecosystem: 351K Skills and Growing

Six months ago, the skill ecosystem barely existed. Today, three marketplaces serve over 351,000 skills to millions of developers.

| Marketplace | Skills | Installs | Launched | Focus |
| --- | --- | --- | --- | --- |
| SkillsMP | 351K+ | Not disclosed | Mid-2025 | Volume. Crawls GitHub for SKILL.md files, indexes with semantic search. |
| Skills.sh (Vercel) | 83K+ | 8M+ | Jan 20, 2026 | Curated. CLI-native install, leaderboard, security scanning via Snyk. |
| ClawHub (OpenClaw) | ~50K | Not disclosed | Late 2025 | Open platform. Hit by ClawHavoc malware campaign (see Security section). |

SkillsMP is the volume leader — a community-run discovery platform that crawls GitHub repositories for SKILL.md files and indexes them with AI-powered semantic search. The catalog spans 89K tools skills, 70K development skills, and 60K business skills. Growth has been exponential: a few thousand in December 2025, tens of thousands in January 2026, and a near-vertical climb to 351K by early March.

Skills.sh is Vercel's entry. Launched January 20, 2026, it reached 20,000 installs within six hours. Stripe shipped their own skills within hours of the launch. Skills.sh differentiates through curated quality, a CLI-native install experience (npx skills install vercel-labs/react-best-practices), and integrated security scanning via a partnership with Snyk. As of March 2026, top skills like vercel-react-best-practices have exceeded 100K installs each.

ClawHub is the cautionary tale. OpenClaw's marketplace grew fast but suffered the ClawHavoc campaign — 341 malicious skills distributing Atomic macOS Stealer (details in the Security section below). Trust damage was significant.

The velocity of ecosystem growth is remarkable. In the AI CLI tools landscape, we track how quickly developer tooling moves in 2026. Skills are following the same compressed timeline: from concept to standard infrastructure in months, not years.


Creating Your First Skill

Building a skill takes five minutes. Here is the complete process.

Step 1: Create the directory.

Skills live in ~/.claude/skills/ for personal use or your project's .claude/skills/ directory for team use.

```bash
mkdir -p .claude/skills/code-review
```

Step 2: Write the SKILL.md file.

```markdown
---
name: code-review
description: Perform thorough code review with focus on security, performance, and maintainability. Flag issues by severity.
---

## Review Process

When asked to review code, follow this process:

1. **Security scan** — Check for injection vulnerabilities, exposed secrets, improper auth checks
2. **Performance** — Identify N+1 queries, unnecessary re-renders, missing indexes
3. **Maintainability** — Flag functions over 30 lines, nesting deeper than 3 levels, missing types
4. **Testing** — Verify test coverage for critical paths

## Output Format

For each issue found:

- **Severity**: Critical / Warning / Suggestion
- **Location**: File path and line range
- **Issue**: One-sentence description
- **Fix**: Concrete code change or direction

## Rules

- Never approve code with Critical issues
- If no issues found, explicitly state the code passes review
- Do not nitpick formatting — assume a formatter handles that
```

Step 3: Test it.

Open your terminal and ask the agent a question that should trigger the skill:

```bash
claude "review the changes in my last commit"
```

The agent should load the code-review skill based on the description match, then follow the review process you defined.

Step 4: Iterate on the description.

If the skill does not trigger when you expect it to, refine the description field. Be specific about what tasks should match. Anthropic's Skill Creator tool (available at claude.com/plugins/skill-creator) automates this — it runs an eval loop that tests trigger phrases against your description, iterates up to 5 times, and produces an optimized description with measurable trigger rates.

The Skill Creator operates in four modes: Create, Eval, Improve, and Benchmark. The Improve mode is where the magic happens — it splits your test cases into 60% training and 40% held-out test, evaluates trigger rates across 3 runs per query, proposes description improvements based on failures, and re-evaluates iteratively.
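The loop is easy to reproduce by hand. Here is a toy sketch of the split-and-evaluate idea — note that the real Skill Creator asks the model itself whether a skill would trigger; the keyword-overlap stand-in below is purely illustrative:

```python
import random

def would_trigger(description: str, query: str) -> bool:
    """Stand-in relevance check via shared-word overlap. The real judgment is
    made by the agent reading the description, not by this heuristic."""
    def strip(w):
        return w.strip(".,").lower()
    desc_words = {strip(w) for w in description.split()}
    return len(desc_words & {strip(w) for w in query.split()}) >= 2

def trigger_rate(description: str, queries: list, runs: int = 3) -> float:
    """Average trigger rate over several runs per query. Real agents are
    stochastic; this stand-in is deterministic, so the runs only illustrate
    the shape of the loop."""
    hits = sum(would_trigger(description, q) for q in queries for _ in range(runs))
    return hits / (len(queries) * runs)

queries = [
    "review the changes in my last commit",
    "do a security review of this function",
    "check this code for performance problems",
    "review this pull request for issues",
    "is this code maintainable and well tested",
]
random.seed(0)
random.shuffle(queries)
cut = int(len(queries) * 0.6)            # 60% training, 40% held-out test
train, held_out = queries[:cut], queries[cut:]

description = "Perform thorough code review with focus on security and performance."
score = trigger_rate(description, held_out)   # always evaluate on held-out queries
```

The held-out split matters for the same reason it does in any evaluation: a description tuned against its own test queries will overfit to their phrasing.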

Skills Across Agents: Claude Code, Codex CLI, Copilot

The SKILL.md format is an open standard. Three major agents support it, each with slightly different conventions for where skills live and how they are discovered.

Claude Code

Claude Code reads skills from three locations:

  • Personal: ~/.claude/skills/ — Available across all your projects
  • Project: .claude/skills/ — Shared with the team via version control
  • Marketplace: Installed via Anthropic's plugin system or manually

Claude Code loads skill descriptions into context at session start and selectively loads full skill bodies when a request matches a description. The allowed-tools frontmatter field works here — you can restrict a skill to only use specific tools (e.g., only read files, never execute shell commands).

Codex CLI

OpenAI adopted the Agent Skills standard and maintains an official Skills Catalog at github.com/openai/skills — 13K+ GitHub stars and 35 curated skills as of March 2026. Codex CLI reads skills from:

  • Project: .agents/skills/ or .codex/skills/
  • Catalog: Install via the built-in $skill-installer command

The OpenAI catalog doubles as both a distribution mechanism and a reference implementation. Installation requires no config files, no package managers, and no build steps — just $skill-installer plus a restart.

GitHub Copilot

GitHub announced Agent Skills support in December 2025. Copilot reads skills from .github/skills/ and supports the same SKILL.md format. Skills work across Copilot coding agent, Copilot CLI, and agent mode in VS Code.

Copilot's discovery mechanism matches the standard approach: read name and description from frontmatter, match against user requests, load full body on match. The .github/skills/ directory convention means skills naturally live alongside GitHub Actions, issue templates, and other GitHub-specific configuration.

Cross-Agent Compatibility

A well-written skill works across all three agents without modification. The SKILL.md format is the common denominator. The only differences are directory conventions and optional agent-specific frontmatter fields. If you need cross-agent compatibility, stick to the core spec: name, description, and Markdown body. Avoid agent-specific extensions in frontmatter.

For teams using multiple AI CLI tools — which is increasingly common in 2026 — placing skills in a shared directory and symlinking to each agent's expected path is a practical workaround:

```bash
# Canonical location
mkdir -p .skills/code-review

# Symlinks for each agent (targets resolve relative to the symlink's directory)
mkdir -p .claude .agents .github
ln -s ../.skills .claude/skills
ln -s ../.skills .agents/skills
ln -s ../.skills .github/skills
```

Superpowers and the Skills Framework

The most adopted skills framework is Superpowers by Jesse Vincent — over 82K GitHub stars and growing at roughly 2K stars per week as of March 2026. Superpowers is not a single skill. It is a complete software development methodology expressed as composable skills.

The framework ships with 10+ core skills covering the full development lifecycle:

  • Brainstorming — Structured ideation that forces the agent to explore multiple approaches before committing
  • Planning — Break tasks into subtasks with dependencies, estimate complexity, identify risks
  • Test-Driven Development — Write failing tests first, implement to pass, refactor. The agent follows the red-green-refactor cycle automatically
  • Code Review — Multi-pass review covering security, performance, correctness, and style
  • Debugging — Systematic hypothesis-driven debugging with logging and bisection strategies
  • Documentation — Generate docs from code, not the other way around

What makes Superpowers effective is the enforcement mechanism. The skills do not just suggest TDD — they refuse to write implementation code without tests. They do not just recommend planning — they halt and produce a plan before touching any files. This turns a general-purpose LLM into a disciplined developer with a repeatable methodology.

Superpowers was accepted into the official Anthropic Claude Code plugin marketplace in January 2026. It works with Claude Code out of the box and is compatible with Codex CLI and Copilot with minimal adaptation.

For teams evaluating whether to build custom skills or adopt a framework, the answer is usually both: adopt Superpowers for general methodology, then build custom skills for domain-specific tasks (your API conventions, your deployment pipeline, your compliance requirements).


Security: 13.4% of Skills Have Critical Issues

The speed of ecosystem growth outpaced security practices. The numbers are sobering.

The Snyk ToxicSkills Study

Snyk's ToxicSkills research, published February 5, 2026, scanned 3,984 skills and found:

  • 1,467 skills (36.8%) had at least one security flaw
  • 534 skills (13.4%) contained critical-level issues
  • 76 skills were confirmed malicious payloads — credential theft, backdoor installation, data exfiltration
  • 91% of malicious skills combined prompt injection with traditional malware

The attack surface is inherent to how skills work. A SKILL.md file contains instructions that the AI agent follows. Those instructions can include "run this shell command" or "read this file and send its contents to this URL." The agent, trusting the skill as legitimate context, executes the instructions.

The ClawHavoc Campaign

The worst incident so far: researchers found 341 malicious skills on ClawHub (OpenClaw's marketplace) distributing Atomic macOS Stealer (AMOS). The campaign started January 27, 2026, surged on January 31, and was named ClawHavoc by Koi Security on February 1.

The attack method was clever. Malicious instructions hidden in SKILL.md files exploited AI agents as trusted intermediaries. The agent would present a fake setup dialog to the user, requesting their system password to "complete installation." The AMOS variant harvested browser credentials, keychain passwords, cryptocurrency wallet data, SSH keys, and files from common user directories.

A subsequent wave brought the total to over 1,184 malicious skills before ClawHub implemented mandatory scanning.

The Threat Model

Skills have three attack vectors:

  1. Shell execution — Skills can instruct agents to run arbitrary shell commands. A malicious skill can download and execute payloads.
  2. Filesystem access — Skills can instruct agents to read sensitive files (.env, SSH keys, credentials) and exfiltrate data.
  3. Prompt injection — Skills can embed instructions that override the agent's safety guidelines, using the agent's trusted position to social-engineer the user.

Mitigation

Snyk and Vercel partnered to build integrated security scanning for Skills.sh. Snyk's Agent Scan tool (available at labs.snyk.io) analyzes SKILL.md files for known malicious patterns with 90-100% recall on confirmed malicious skills and 0% false positives on the top-100 legitimate skills.

Practical steps:

  1. Only install skills from verified sources. Skills.sh with Snyk scanning is the safest marketplace. SkillsMP is acceptable for discovery but verify before installing.
  2. Read the SKILL.md before installing. It is a Markdown file — reading it takes 2 minutes and reveals everything the skill instructs the agent to do.
  3. Use allowed-tools restrictions. Lock skills to the minimum required tool set. A code review skill does not need shell execution access.
  4. Never enter credentials when an agent asks. No legitimate skill requires your system password.
  5. Audit installed skills periodically. Run snyk agent-scan .claude/skills/ on your project.
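Steps 2 and 5 can be partially automated. Below is a rough first-pass filter; the pattern list is illustrative and far shallower than a real scanner such as Snyk's — treat it as a reading aid that tells you which files to inspect first, not a verdict:

```python
import re
from pathlib import Path

# Illustrative red flags only; a real scanner goes much deeper.
RISKY_PATTERNS = {
    "pipes a download into a shell": re.compile(r"curl[^\n]*\|\s*(ba)?sh"),
    "decodes a base64 payload": re.compile(r"base64\s+(-d|--decode)"),
    "touches .env files": re.compile(r"\.env\b"),
    "touches SSH keys": re.compile(r"\.ssh/|id_rsa"),
    "mentions passwords": re.compile(r"password", re.IGNORECASE),
}

def audit_skill(text: str) -> list:
    """Return the red flags found in one SKILL.md's text."""
    return [name for name, pattern in RISKY_PATTERNS.items() if pattern.search(text)]

def audit_skills_dir(skills_dir: Path) -> dict:
    """Audit every installed skill under a directory; maps path -> findings."""
    return {str(p): found
            for p in skills_dir.glob("**/SKILL.md")
            if (found := audit_skill(p.read_text()))}
```

A hit is not proof of malice — a legitimate security skill will mention passwords — but any hit means you read that SKILL.md before the agent does.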

Skill Architecture Best Practices

Building one skill is easy. Building skills that remain maintainable as your team grows requires discipline.

Keep SKILL.md Under 500 Lines

Every line in SKILL.md is context window cost when loaded. Keep the main file focused on instructions and constraints. Move large reference material — API schemas, extensive examples, template libraries — into separate files that the skill references.

```markdown
---
name: api-design
description: Design REST API endpoints following our conventions.
---

## Instructions

Follow the API design guidelines in `./references/api-standards.md`.
Use the OpenAPI template in `./templates/endpoint.yaml` as a starting point.

## Key Rules

- All endpoints must use JSON:API format
- Authentication via Bearer token
- Rate limiting headers on every response
- Pagination via cursor, not offset
```

The agent reads the 20-line SKILL.md first. Only when it needs the full API standards or the template does it read the referenced files. Progressive disclosure in action.

Scripts for Deterministic Tasks

Some tasks should not be left to LLM interpretation. Linting, formatting, running test suites, deploying — these have exact commands. Put them in scripts that the skill invokes:

In your SKILL.md, include:

```markdown
## After Making Changes

Run the validation script: `./scripts/validate.sh`
```

This script runs:

  1. TypeScript type checking (`tsc --noEmit`)
  2. ESLint with auto-fix (`eslint --fix`)
  3. Unit tests (`vitest run`)
  4. Build verification (`next build`)

The agent calls the script rather than improvising each step. Deterministic where possible, intelligent where necessary.
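The same fail-fast shape can be expressed in any language. Here is a minimal sketch of such a runner; the stand-in commands below only simulate the real `tsc`/`eslint`/`vitest`/`next build` steps, which you would substitute for your project:

```python
import subprocess
import sys

# Stand-in commands; in a real validate script these would be
# tsc --noEmit, eslint --fix, vitest run, next build, etc.
CHECKS = [
    ("types", [sys.executable, "-c", "print('type check ok')"]),
    ("tests", [sys.executable, "-c", "print('tests ok')"]),
]

def run_checks(checks) -> bool:
    """Run each check in order and stop at the first failure (fail fast)."""
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"FAIL {name}\n{result.stdout}{result.stderr}", file=sys.stderr)
            return False
        print(f"ok   {name}")
    return True
```

Because the command list is fixed, the agent cannot "helpfully" skip a step or reorder the pipeline — exactly the determinism the skill is meant to guarantee.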

When to Split Into Multiple Skills

Split when a skill tries to serve two different trigger patterns. A "code review" skill and a "PR description writer" skill share some context but trigger on different requests and produce different outputs. If your SKILL.md has sections that only apply to some triggers, it is time to split.

Rule of thumb: one skill, one verb. Review code. Generate tests. Write documentation. Deploy to staging. Each gets its own skill.

Team Skill Sharing

Skills have three scopes, and the boundaries matter.

Personal skills live in ~/.claude/skills/ (or equivalent for other agents). These are your private workflow optimizations — how you like commit messages formatted, your personal code review checklist, your debugging approach. They travel with you across projects.

Project skills live in .claude/skills/ inside the repository. These are committed to version control, shared with the team, and version-tracked like any other code. Project skills encode team conventions: API design standards, component templates, deployment procedures.

Marketplace skills are installed from external sources. They provide general-purpose capabilities — Superpowers for methodology, language-specific best practices, framework-specific patterns.

Governance

For project skills, apply the same rigor you apply to code:

  • Code review skill changes. A SKILL.md change affects every developer on the team. Review it like you would review a CI/CD pipeline change.
  • Pin marketplace skill versions. Document which version of external skills your project uses. Do not auto-update.
  • Test skill behavior. When a skill changes, verify it still produces the expected output on representative tasks.

The governance overhead is low because skills are small. A typical project has 5-15 custom skills. Each is a single Markdown file. The review burden is manageable.

Context Engineering: Skills vs CLAUDE.md vs AGENTS.md

Skills exist alongside other context mechanisms. Understanding when to use which prevents duplication and keeps context costs low.

| Mechanism | Scope | Loaded When | Best For |
| --- | --- | --- | --- |
| CLAUDE.md | Project | Always (every session) | Architecture, conventions, hard constraints |
| AGENTS.md | Project | Always (cross-agent) | Same as CLAUDE.md but for multi-agent teams |
| SKILL.md | Task | On demand (description match) | Specific procedures, templates, checklists |
| MCP Servers | External | On tool call | Live data access (databases, APIs, services) |

CLAUDE.md is your project's constitution — always loaded, always in context. Keep it to 200-500 words of high-signal information. Architecture decisions, critical conventions, hard constraints. If something applies to every task in the project, it belongs in CLAUDE.md.

AGENTS.md serves the same role as CLAUDE.md but is recognized by Codex CLI, Copilot, and other tools. If your team uses multiple AI CLI tools, maintain AGENTS.md as the canonical source.

Skills are task-level capabilities. They load only when relevant, so they can be longer and more detailed than CLAUDE.md without wasting context on unrelated tasks. The code review skill only loads when you ask for a code review. The API design skill only loads when you design an API.

MCP Servers provide live data. Skills tell the agent how to do something. MCP servers give the agent access to something — a database, a GitHub repo, a monitoring dashboard.

The anti-pattern is stuffing everything into CLAUDE.md. If your CLAUDE.md exceeds 500 words, extract task-specific sections into skills. The ETH Zurich research on context engineering (covered in our AI CLI tools guide) confirmed that overly detailed context files degrade agent performance.

Building and Testing Skills

Anthropic's Skill Creator plugin (claude.com/plugins/skill-creator) is the fastest path from idea to working skill. It handles the create-eval-improve loop that manual skill development requires.

The workflow:

  1. Create — Describe what the skill should do. The Skill Creator generates a SKILL.md draft.
  2. Eval — Provide 5-10 test queries that should trigger the skill. The tool measures trigger rate and response quality.
  3. Improve — The tool analyzes failures, proposes description and instruction changes, and re-evaluates. Iterates up to 5 times automatically.
  4. Benchmark — Compare the optimized skill against the original. The tool produces an HTML report showing trigger rate, response quality, and iteration-by-iteration improvement.

For manual development, the eval loop is straightforward: write the skill, test 5 trigger phrases, check if the agent loads the skill and follows the instructions, refine description if trigger rate is low, refine instructions if behavior is wrong.

Description optimization is the highest-leverage improvement. A description that says "code review" triggers on fewer requests than one that says "Perform thorough code review with focus on security, performance, and maintainability. Flag issues by severity." Be specific about the task, the approach, and the output format.

Testing skills effectively means editing the SKILL.md in one pane while running test prompts in a terminal pane next to it. See the instructions, run the prompt, check the output, iterate. This edit-test loop is where a multi-pane terminal environment pays off — you need simultaneous visibility into the skill file and the agent's response.

Getting Started Checklist

  1. Understand the format. Read the SKILL.md specification at agentskills.io/specification. It takes 10 minutes.
  2. Install Superpowers. Clone github.com/obra/superpowers and follow the setup instructions. This gives you a battle-tested skill framework immediately.
  3. Create one personal skill. Pick your most repeated workflow — code review, component creation, PR descriptions — and encode it as a SKILL.md in ~/.claude/skills/.
  4. Create one project skill. Pick your team's most important convention — API design, test patterns, deployment procedures — and commit it to .claude/skills/ in your repository.
  5. Test trigger reliability. Ask 5 different phrasings of the same request. If the skill does not load consistently, improve the description.
  6. Scan for security. Run Snyk Agent Scan on any marketplace skills before installing. Read the SKILL.md file. If it instructs the agent to run shell commands you do not understand, do not install it.
  7. Set up your terminal for skill development. Download Termdock — edit SKILL.md files in the integrated file manager, test skills in the terminal pane, and use workspace switching to maintain separate skill configurations per project. AST analysis can parse skill file structure, and session recovery preserves your skill development context across restarts.
  8. Establish team governance. Add SKILL.md files to code review requirements. Pin marketplace skill versions. Document your skill inventory.
  9. Iterate weekly. Skills improve with use. Review which skills trigger correctly, which produce good output, and which need refinement. Treat skills like code — they are living artifacts that evolve with your project.

The agent skills ecosystem went from zero to 351K in six months. The pace is not slowing. The developers who invest in skill literacy now — understanding the format, building team conventions, securing their supply chain — will compound that advantage through 2026 and beyond.
