
The Complete Guide to Agent Skills in 2026: Build, Share, and Secure AI Agent Capabilities

Everything you need to know about Agent Skills — the SKILL.md format, the 351K+ skill ecosystem, security risks, and best practices for Claude Code, Codex CLI, and Copilot.

Danny Huang

What Are Agent Skills?

An Agent Skill is a folder containing a SKILL.md file — plus optional scripts, references, and examples — that teaches an AI coding agent how to perform a specific task. Think of it as a self-contained instruction manual that the agent loads on demand. When you ask Claude Code to "create a React component following our design system," a skill can provide the exact conventions, templates, and validation steps the agent should follow.

The analogy that stuck: skills are the npm of AI agents. Where npm packages give your application reusable code, skills give your AI agent reusable knowledge. A package exports functions. A skill exports procedures, constraints, and domain expertise.

Why do skills matter? Because raw model intelligence is not enough. Claude Opus 4.6 is extraordinarily capable, but it does not know your team's naming conventions, your deployment pipeline, or your compliance requirements. Skills close the gap between general intelligence and project-specific execution. Without skills, you repeat the same context in every prompt. With skills, you encode that context once and the agent applies it every time.

The format originated at Anthropic, released as an open standard, and has been adopted by OpenAI (Codex CLI), Microsoft (GitHub Copilot), and the broader ecosystem. As of March 2026, over 351,000 skills exist across three major marketplaces. The trajectory from "interesting idea" to "standard infrastructure" took less than six months.

The SKILL.md Format

Every skill starts with a single file: SKILL.md. The format is deliberately simple — YAML frontmatter for machine-readable metadata, followed by Markdown instructions for the agent.

Here is a real, working skill:

````markdown
---
name: react-component
description: Create React components following our team conventions with TypeScript, Tailwind CSS, and proper test coverage.
---

## Instructions

When creating a new React component:

1. Use functional components with TypeScript interfaces for props
2. Place the component in `src/components/[ComponentName]/`
3. Create three files:
   - `index.tsx` — the component implementation
   - `index.test.tsx` — unit tests with Vitest
   - `index.stories.tsx` — Storybook story (if the component is visual)

## Conventions

- Props interface name: `[ComponentName]Props`
- Export as named export, never default export
- Use Tailwind CSS utility classes, no inline styles
- All interactive components must handle keyboard navigation

## Example

```tsx
interface ButtonProps {
  variant: 'primary' | 'secondary' | 'ghost';
  size?: 'sm' | 'md' | 'lg';
  children: React.ReactNode;
  onClick?: () => void;
}

export function Button({ variant, size = 'md', children, onClick }: ButtonProps) {
  return (
    <button
      className={cn(baseStyles, variantStyles[variant], sizeStyles[size])}
      onClick={onClick}
    >
      {children}
    </button>
  );
}
```
````

The frontmatter has two required fields. `name` must be lowercase with hyphens, maximum 64 characters, and must match the parent directory name. `description` is the trigger — the agent reads it to decide whether this skill is relevant to the current task. It has a maximum of 1024 characters.

Optional frontmatter fields include `allowed-tools` (restricts which tools the agent can call when the skill is active), `metadata` (author, version), and `license`. The `description` field is the single most important line in your skill. If the description does not clearly match the types of requests your skill handles, the agent will never load it.
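As an illustration, a frontmatter block using the optional fields might look like the following. The tool names and metadata values here are hypothetical — check your agent's documentation for the exact tool identifiers and the list syntax it accepts:

```yaml
---
name: code-review
description: Perform thorough code review with focus on security, performance, and maintainability. Flag issues by severity.
allowed-tools: Read, Grep, Glob   # hypothetical tool names; restricts the skill to read-only access
metadata:
  author: your-team
  version: 1.0.0
license: MIT
---
```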

Progressive disclosure is the core design principle. The frontmatter is always loaded (tiny cost). The full SKILL.md body is loaded only when the agent decides the skill is relevant. For large skills, the body can reference external files that the agent reads on demand — keeping the initial context window cost low.
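As a rough sketch of the mechanism (illustrative code, not any agent's actual implementation), a runtime can parse only the frontmatter up front and defer loading the body until the skill is judged relevant:

```python
def read_frontmatter(skill_md: str) -> dict:
    """Parse only the YAML frontmatter block — the tiny, always-loaded part."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

def load_skill_body(skill_md: str) -> str:
    """Load the full Markdown body only after the skill is judged relevant."""
    parts = skill_md.split("---", 2)
    return parts[2].strip() if len(parts) == 3 else skill_md

skill = """---
name: code-review
description: Perform thorough code review. Flag issues by severity.
---

## Review Process
1. Security scan
"""

meta = read_frontmatter(skill)        # always in context: just name + description
if "review" in meta["description"].lower():
    body = load_skill_body(skill)     # loaded on demand when a request matches
```

The asymmetry is the point: a catalog of hundreds of skills costs only a few lines of frontmatter each until one actually fires.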

The Ecosystem: 351K Skills and Growing

Six months ago, the skill ecosystem barely existed. Today, three marketplaces serve over 351,000 skills to millions of developers.

| Marketplace | Skills | Installs | Launched | Focus |
| --- | --- | --- | --- | --- |
| SkillsMP | 351K+ | Not disclosed | Mid-2025 | Volume. Crawls GitHub for SKILL.md files, indexes with semantic search. |
| Skills.sh (Vercel) | 83K+ | 8M+ | Jan 20, 2026 | Curated. CLI-native install, leaderboard, security scanning via Snyk. |
| ClawHub (OpenClaw) | ~50K | Not disclosed | Late 2025 | Open platform. Hit by ClawHavoc malware campaign (see Security section). |

SkillsMP is the volume leader — a community-run discovery platform that crawls GitHub repositories for SKILL.md files and indexes them with AI-powered semantic search. The catalog spans 89K tools skills, 70K development skills, and 60K business skills. Growth has been exponential: a few thousand in December 2025, tens of thousands in January 2026, and a near-vertical climb to 351K by early March.

Skills.sh is Vercel's entry. Launched January 20, 2026, it reached 20,000 installs within six hours. Stripe shipped their own skills within hours of the launch. Skills.sh differentiates through curated quality, a CLI-native install experience (npx skills install vercel-labs/react-best-practices), and integrated security scanning via a partnership with Snyk. As of March 2026, top skills like vercel-react-best-practices have exceeded 100K installs each.

ClawHub is the cautionary tale. OpenClaw's marketplace grew fast but suffered the ClawHavoc campaign — 341 malicious skills distributing Atomic macOS Stealer (details in the Security section below). Trust damage was significant.

The velocity of ecosystem growth is remarkable. In the AI CLI tools landscape, we track how quickly developer tooling moves in 2026. Skills are following the same compressed timeline: from concept to standard infrastructure in months, not years.


Creating Your First Skill

Building a skill takes five minutes. Here is the complete process.

Step 1: Create the directory.

Skills live in ~/.claude/skills/ for personal use or your project's .claude/skills/ directory for team use.

```bash
mkdir -p .claude/skills/code-review
```

Step 2: Write the SKILL.md file.

```markdown
---
name: code-review
description: Perform thorough code review with focus on security, performance, and maintainability. Flag issues by severity.
---

## Review Process

When asked to review code, follow this process:

1. **Security scan** — Check for injection vulnerabilities, exposed secrets, improper auth checks
2. **Performance** — Identify N+1 queries, unnecessary re-renders, missing indexes
3. **Maintainability** — Flag functions over 30 lines, nesting deeper than 3 levels, missing types
4. **Testing** — Verify test coverage for critical paths

## Output Format

For each issue found:

- **Severity**: Critical / Warning / Suggestion
- **Location**: File path and line range
- **Issue**: One-sentence description
- **Fix**: Concrete code change or direction

## Rules

- Never approve code with Critical issues
- If no issues found, explicitly state the code passes review
- Do not nitpick formatting — assume a formatter handles that
```

Step 3: Test it.

Open your terminal and ask the agent a question that should trigger the skill:

```bash
claude "review the changes in my last commit"
```

The agent should load the code-review skill based on the description match, then follow the review process you defined.

Step 4: Iterate on the description.

If the skill does not trigger when you expect it to, refine the description field. Be specific about what tasks should match. Anthropic's Skill Creator tool (available at claude.com/plugins/skill-creator) automates this — it runs an eval loop that tests trigger phrases against your description, iterates up to 5 times, and produces an optimized description with measurable trigger rates.

The Skill Creator operates in four modes: Create, Eval, Improve, and Benchmark. The Improve mode is where the magic happens — it splits your test cases into 60% training and 40% held-out test, evaluates trigger rates across 3 runs per query, proposes description improvements based on failures, and re-evaluates iteratively.
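The loop is easy to reproduce by hand. Here is a toy sketch of the split-and-evaluate idea — note that the real Skill Creator asks the model itself whether a skill would trigger; the keyword-overlap stand-in below is purely illustrative:

```python
import random

def would_trigger(description: str, query: str) -> bool:
    """Stand-in relevance check via shared-word overlap. The real judgment is
    made by the agent reading the description, not by this heuristic."""
    def strip(w):
        return w.strip(".,").lower()
    desc_words = {strip(w) for w in description.split()}
    return len(desc_words & {strip(w) for w in query.split()}) >= 2

def trigger_rate(description: str, queries: list, runs: int = 3) -> float:
    """Average trigger rate over several runs per query. Real agents are
    stochastic; this stand-in is deterministic, so the runs only illustrate
    the shape of the loop."""
    hits = sum(would_trigger(description, q) for q in queries for _ in range(runs))
    return hits / (len(queries) * runs)

queries = [
    "review the changes in my last commit",
    "do a security review of this function",
    "check this code for performance problems",
    "review this pull request for issues",
    "is this code maintainable and well tested",
]
random.seed(0)
random.shuffle(queries)
cut = int(len(queries) * 0.6)            # 60% training, 40% held-out test
train, held_out = queries[:cut], queries[cut:]

description = "Perform thorough code review with focus on security and performance."
score = trigger_rate(description, held_out)   # always evaluate on held-out queries
```

The held-out split matters for the same reason it does in any evaluation: a description tuned against its own test queries will overfit to their phrasing.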

Skills Across Agents: Claude Code, Codex CLI, Copilot

The SKILL.md format is an open standard. Three major agents support it, each with slightly different conventions for where skills live and how they are discovered.

Claude Code

Claude Code reads skills from three locations:

  • Personal: ~/.claude/skills/ — Available across all your projects
  • Project: .claude/skills/ — Shared with the team via version control
  • Marketplace: Installed via Anthropic's plugin system or manually

Claude Code loads skill descriptions into context at session start and selectively loads full skill bodies when a request matches a description. The allowed-tools frontmatter field works here — you can restrict a skill to only use specific tools (e.g., only read files, never execute shell commands).

Codex CLI

OpenAI adopted the Agent Skills standard and maintains an official Skills Catalog at github.com/openai/skills — 13K+ GitHub stars and 35 curated skills as of March 2026. Codex CLI reads skills from:

  • Project: .agents/skills/ or .codex/skills/
  • Catalog: Install via the built-in $skill-installer command

The OpenAI catalog doubles as both a distribution mechanism and a reference implementation. Installation requires no config files, no package managers, and no build steps — just $skill-installer plus a restart.

GitHub Copilot

GitHub announced Agent Skills support in December 2025. Copilot reads skills from .github/skills/ and supports the same SKILL.md format. Skills work across Copilot coding agent, Copilot CLI, and agent mode in VS Code.

Copilot's discovery mechanism matches the standard approach: read name and description from frontmatter, match against user requests, load full body on match. The .github/skills/ directory convention means skills naturally live alongside GitHub Actions, issue templates, and other GitHub-specific configuration.

Cross-Agent Compatibility

A well-written skill works across all three agents without modification. The SKILL.md format is the common denominator. The only differences are directory conventions and optional agent-specific frontmatter fields. If you need cross-agent compatibility, stick to the core spec: name, description, and Markdown body. Avoid agent-specific extensions in frontmatter.

For teams using multiple AI CLI tools — which is increasingly common in 2026 — placing skills in a shared directory and symlinking to each agent's expected path is a practical workaround:

```bash
# Canonical location
mkdir -p .skills/code-review

# Symlinks for each agent (targets resolve relative to the symlink's directory)
mkdir -p .claude .agents .github
ln -s ../.skills .claude/skills
ln -s ../.skills .agents/skills
ln -s ../.skills .github/skills
```

Superpowers and the Skills Framework

The most adopted skills framework is Superpowers by Jesse Vincent — over 82K GitHub stars and growing at roughly 2K stars per week as of March 2026. Superpowers is not a single skill. It is a complete software development methodology expressed as composable skills.

The framework ships with 10+ core skills covering the full development lifecycle:

  • Brainstorming — Structured ideation that forces the agent to explore multiple approaches before committing
  • Planning — Break tasks into subtasks with dependencies, estimate complexity, identify risks
  • Test-Driven Development — Write failing tests first, implement to pass, refactor. The agent follows the red-green-refactor cycle automatically
  • Code Review — Multi-pass review covering security, performance, correctness, and style
  • Debugging — Systematic hypothesis-driven debugging with logging and bisection strategies
  • Documentation — Generate docs from code, not the other way around

What makes Superpowers effective is the enforcement mechanism. The skills do not just suggest TDD — they refuse to write implementation code without tests. They do not just recommend planning — they halt and produce a plan before touching any files. This turns a general-purpose LLM into a disciplined developer with a repeatable methodology.

Superpowers was accepted into the official Anthropic Claude Code plugin marketplace in January 2026. It works with Claude Code out of the box and is compatible with Codex CLI and Copilot with minimal adaptation.

For teams evaluating whether to build custom skills or adopt a framework, the answer is usually both: adopt Superpowers for general methodology, then build custom skills for domain-specific tasks (your API conventions, your deployment pipeline, your compliance requirements).


Security: 13.4% of Skills Have Critical Issues

The speed of ecosystem growth outpaced security practices. The numbers are sobering.

The Snyk ToxicSkills Study

Snyk's ToxicSkills research, published February 5, 2026, scanned 3,984 skills and found:

  • 1,467 skills (36.8%) had at least one security flaw
  • 534 skills (13.4%) contained critical-level issues
  • 76 skills were confirmed malicious payloads — credential theft, backdoor installation, data exfiltration
  • 91% of malicious skills combined prompt injection with traditional malware

The attack surface is inherent to how skills work. A SKILL.md file contains instructions that the AI agent follows. Those instructions can include "run this shell command" or "read this file and send its contents to this URL." The agent, trusting the skill as legitimate context, executes the instructions.

The ClawHavoc Campaign

The worst incident so far: researchers found 341 malicious skills on ClawHub (OpenClaw's marketplace) distributing Atomic macOS Stealer (AMOS). The campaign started January 27, 2026, surged on January 31, and was named ClawHavoc by Koi Security on February 1.

The attack method was clever. Malicious instructions hidden in SKILL.md files exploited AI agents as trusted intermediaries. The agent would present a fake setup dialog to the user, requesting their system password to "complete installation." The AMOS variant harvested browser credentials, keychain passwords, cryptocurrency wallet data, SSH keys, and files from common user directories.

A subsequent wave brought the total to over 1,184 malicious skills before ClawHub implemented mandatory scanning.

The Threat Model

Skills have three attack vectors:

  1. Shell execution — Skills can instruct agents to run arbitrary shell commands. A malicious skill can download and execute payloads.
  2. Filesystem access — Skills can instruct agents to read sensitive files (.env, SSH keys, credentials) and exfiltrate data.
  3. Prompt injection — Skills can embed instructions that override the agent's safety guidelines, using the agent's trusted position to social-engineer the user.

Mitigation

Snyk and Vercel partnered to build integrated security scanning for Skills.sh. Snyk's Agent Scan tool (available at labs.snyk.io) analyzes SKILL.md files for known malicious patterns with 90-100% recall on confirmed malicious skills and 0% false positives on the top-100 legitimate skills.

Practical steps:

  1. Only install skills from verified sources. Skills.sh with Snyk scanning is the safest marketplace. SkillsMP is acceptable for discovery but verify before installing.
  2. Read the SKILL.md before installing. It is a Markdown file — reading it takes 2 minutes and reveals everything the skill instructs the agent to do.
  3. Use allowed-tools restrictions. Lock skills to the minimum required tool set. A code review skill does not need shell execution access.
  4. Never enter credentials when an agent asks. No legitimate skill requires your system password.
  5. Audit installed skills periodically. Run snyk agent-scan .claude/skills/ on your project.
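Steps 2 and 5 can be partially automated. Below is a rough first-pass filter; the pattern list is illustrative and far shallower than a real scanner such as Snyk's — treat it as a reading aid that tells you which files to inspect first, not a verdict:

```python
import re
from pathlib import Path

# Illustrative red flags only; a real scanner goes much deeper.
RISKY_PATTERNS = {
    "pipes a download into a shell": re.compile(r"curl[^\n]*\|\s*(ba)?sh"),
    "decodes a base64 payload": re.compile(r"base64\s+(-d|--decode)"),
    "touches .env files": re.compile(r"\.env\b"),
    "touches SSH keys": re.compile(r"\.ssh/|id_rsa"),
    "mentions passwords": re.compile(r"password", re.IGNORECASE),
}

def audit_skill(text: str) -> list:
    """Return the red flags found in one SKILL.md's text."""
    return [name for name, pattern in RISKY_PATTERNS.items() if pattern.search(text)]

def audit_skills_dir(skills_dir: Path) -> dict:
    """Audit every installed skill under a directory; maps path -> findings."""
    return {str(p): found
            for p in skills_dir.glob("**/SKILL.md")
            if (found := audit_skill(p.read_text()))}
```

A hit is not proof of malice — a legitimate security skill will mention passwords — but any hit means you read that SKILL.md before the agent does.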

Skill Architecture Best Practices

Building one skill is easy. Building skills that remain maintainable as your team grows requires discipline.

Keep SKILL.md Under 500 Lines

Every line in SKILL.md is context window cost when loaded. Keep the main file focused on instructions and constraints. Move large reference material — API schemas, extensive examples, template libraries — into separate files that the skill references.

```markdown
---
name: api-design
description: Design REST API endpoints following our conventions.
---

## Instructions

Follow the API design guidelines in `./references/api-standards.md`.
Use the OpenAPI template in `./templates/endpoint.yaml` as a starting point.

## Key Rules

- All endpoints must use JSON:API format
- Authentication via Bearer token
- Rate limiting headers on every response
- Pagination via cursor, not offset
```

The agent reads the 20-line SKILL.md first. Only when it needs the full API standards or the template does it read the referenced files. Progressive disclosure in action.

Scripts for Deterministic Tasks

Some tasks should not be left to LLM interpretation. Linting, formatting, running test suites, deploying — these have exact commands. Put them in scripts that the skill invokes:

In your SKILL.md, include:

```markdown
## After Making Changes

Run the validation script: `./scripts/validate.sh`
```

This script runs:

  1. TypeScript type checking (`tsc --noEmit`)
  2. ESLint with auto-fix (`eslint --fix`)
  3. Unit tests (`vitest run`)
  4. Build verification (`next build`)

The agent calls the script rather than improvising each step. Deterministic where possible, intelligent where necessary.
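The same fail-fast shape can be expressed in any language. Here is a minimal sketch of such a runner; the stand-in commands below only simulate the real `tsc`/`eslint`/`vitest`/`next build` steps, which you would substitute for your project:

```python
import subprocess
import sys

# Stand-in commands; in a real validate script these would be
# tsc --noEmit, eslint --fix, vitest run, next build, etc.
CHECKS = [
    ("types", [sys.executable, "-c", "print('type check ok')"]),
    ("tests", [sys.executable, "-c", "print('tests ok')"]),
]

def run_checks(checks) -> bool:
    """Run each check in order and stop at the first failure (fail fast)."""
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"FAIL {name}\n{result.stdout}{result.stderr}", file=sys.stderr)
            return False
        print(f"ok   {name}")
    return True
```

Because the command list is fixed, the agent cannot "helpfully" skip a step or reorder the pipeline — exactly the determinism the skill is meant to guarantee.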

When to Split Into Multiple Skills

Split when a skill tries to serve two different trigger patterns. A "code review" skill and a "PR description writer" skill share some context but trigger on different requests and produce different outputs. If your SKILL.md has sections that only apply to some triggers, it is time to split.

Rule of thumb: one skill, one verb. Review code. Generate tests. Write documentation. Deploy to staging. Each gets its own skill.

Team Skill Sharing

Skills have three scopes, and the boundaries matter.

Personal skills live in ~/.claude/skills/ (or equivalent for other agents). These are your private workflow optimizations — how you like commit messages formatted, your personal code review checklist, your debugging approach. They travel with you across projects.

Project skills live in .claude/skills/ inside the repository. These are committed to version control, shared with the team, and version-tracked like any other code. Project skills encode team conventions: API design standards, component templates, deployment procedures.

Marketplace skills are installed from external sources. They provide general-purpose capabilities — Superpowers for methodology, language-specific best practices, framework-specific patterns.

Governance

For project skills, apply the same rigor you apply to code:

  • Code review skill changes. A SKILL.md change affects every developer on the team. Review it like you would review a CI/CD pipeline change.
  • Pin marketplace skill versions. Document which version of external skills your project uses. Do not auto-update.
  • Test skill behavior. When a skill changes, verify it still produces the expected output on representative tasks.

The governance overhead is low because skills are small. A typical project has 5-15 custom skills. Each is a single Markdown file. The review burden is manageable.

Context Engineering: Skills vs CLAUDE.md vs AGENTS.md

Skills exist alongside other context mechanisms. Understanding when to use which prevents duplication and keeps context costs low.

| Mechanism | Scope | Loaded When | Best For |
| --- | --- | --- | --- |
| CLAUDE.md | Project | Always (every session) | Architecture, conventions, hard constraints |
| AGENTS.md | Project | Always (cross-agent) | Same as CLAUDE.md but for multi-agent teams |
| SKILL.md | Task | On demand (description match) | Specific procedures, templates, checklists |
| MCP Servers | External | On tool call | Live data access (databases, APIs, services) |

CLAUDE.md is your project's constitution — always loaded, always in context. Keep it to 200-500 words of high-signal information. Architecture decisions, critical conventions, hard constraints. If something applies to every task in the project, it belongs in CLAUDE.md.

AGENTS.md serves the same role as CLAUDE.md but is recognized by Codex CLI, Copilot, and other tools. If your team uses multiple AI CLI tools, maintain AGENTS.md as the canonical source.

Skills are task-level capabilities. They load only when relevant, so they can be longer and more detailed than CLAUDE.md without wasting context on unrelated tasks. The code review skill only loads when you ask for a code review. The API design skill only loads when you design an API.

MCP Servers provide live data. Skills tell the agent how to do something. MCP servers give the agent access to something — a database, a GitHub repo, a monitoring dashboard.

The anti-pattern is stuffing everything into CLAUDE.md. If your CLAUDE.md exceeds 500 words, extract task-specific sections into skills. The ETH Zurich research on context engineering (covered in our AI CLI tools guide) confirmed that overly detailed context files degrade agent performance.

Building and Testing Skills

Anthropic's Skill Creator plugin (claude.com/plugins/skill-creator) is the fastest path from idea to working skill. It handles the create-eval-improve loop that manual skill development requires.

The workflow:

  1. Create — Describe what the skill should do. The Skill Creator generates a SKILL.md draft.
  2. Eval — Provide 5-10 test queries that should trigger the skill. The tool measures trigger rate and response quality.
  3. Improve — The tool analyzes failures, proposes description and instruction changes, and re-evaluates. Iterates up to 5 times automatically.
  4. Benchmark — Compare the optimized skill against the original. The tool produces an HTML report showing trigger rate, response quality, and iteration-by-iteration improvement.

For manual development, the eval loop is straightforward: write the skill, test 5 trigger phrases, check if the agent loads the skill and follows the instructions, refine description if trigger rate is low, refine instructions if behavior is wrong.

Description optimization is the highest-leverage improvement. A description that says "code review" triggers on fewer requests than one that says "Perform thorough code review with focus on security, performance, and maintainability. Flag issues by severity." Be specific about the task, the approach, and the output format.

Testing skills effectively means editing the SKILL.md in one pane while running test prompts in a terminal pane next to it. See the instructions, run the prompt, check the output, iterate. This edit-test loop is where a multi-pane terminal environment pays off — you need simultaneous visibility into the skill file and the agent's response.

Getting Started Checklist

  1. Understand the format. Read the SKILL.md specification at agentskills.io/specification. It takes 10 minutes.
  2. Install Superpowers. Clone github.com/obra/superpowers and follow the setup instructions. This gives you a battle-tested skill framework immediately.
  3. Create one personal skill. Pick your most repeated workflow — code review, component creation, PR descriptions — and encode it as a SKILL.md in ~/.claude/skills/.
  4. Create one project skill. Pick your team's most important convention — API design, test patterns, deployment procedures — and commit it to .claude/skills/ in your repository.
  5. Test trigger reliability. Ask 5 different phrasings of the same request. If the skill does not load consistently, improve the description.
  6. Scan for security. Run Snyk Agent Scan on any marketplace skills before installing. Read the SKILL.md file. If it instructs the agent to run shell commands you do not understand, do not install it.
  7. Set up your terminal for skill development. Download Termdock — edit SKILL.md files in the integrated file manager, test skills in the terminal pane, and use workspace switching to maintain separate skill configurations per project. AST analysis can parse skill file structure, and session recovery preserves your skill development context across restarts.
  8. Establish team governance. Add SKILL.md files to code review requirements. Pin marketplace skill versions. Document your skill inventory.
  9. Iterate weekly. Skills improve with use. Review which skills trigger correctly, which produce good output, and which need refinement. Treat skills like code — they are living artifacts that evolve with your project.

The agent skills ecosystem went from zero to 351K in six months. The pace is not slowing. The developers who invest in skill literacy now — understanding the format, building team conventions, securing their supply chain — will compound that advantage through 2026 and beyond.
