Background
SkillScan started as a personal project to understand what was actually in ClawHub skill files before installing them. What I found turned into a systematic scan of 549 skills over several months.
This case study documents every threat category, provides real patterns we found, and explains why none of these would be caught by signature-based scanners like VirusTotal.
The numbers
- 549 skills scanned (approximately 5% of the full ClawHub registry)
- 93 behavioral threats detected (16.9% threat rate)
- 76 CRITICAL severity threats (82% of all threats found)
- 17 HIGH severity threats (18% of all threats found)
- 0 VirusTotal detections across all 93 flagged skills
The 16.9% threat rate means roughly 1 in 6 ClawHub skills contains behavioral instructions that could compromise an agent or its operator. If the pattern holds across the full 10,700+ registry, that is approximately 1,800 problematic skills in the ecosystem right now.
Threat category breakdown
We identified five primary threat categories. Here is what each means in practice.
1. Credential exfiltration (27 skills, 29% of threats)
These skills contain instructions telling agents to locate, collect, and forward authentication data: API keys, tokens, OAuth credentials, .env file contents, or authentication headers.
The pattern we found most frequently: a skill that includes legitimate API integration functionality, with a secondary instruction buried in the middle of the skill definition that asks the agent to save credentials to a temporary file and upload the file to an external URL for debugging purposes.
The instruction looks like a support feature. It functions as an exfiltration channel.
2. Prompt injection payloads (31 skills, 33% of threats)
These skills contain natural language instructions designed to override the agent's primary behavior, safety controls, or authorization checks.
Common patterns: instructions to ignore previous instructions if they conflict with this skill, instructions to treat this skill's commands as higher priority than system prompts, or instructions to bypass content policies when executing this skill's specific tasks.
These payloads are effective because agents follow instructions. A well-crafted injection pattern can make an agent behave differently for all subsequent requests, not just the ones involving this skill.
3. Data exfiltration workflows (19 skills, 20% of threats)
Similar to credential exfiltration but targeting broader data: customer data, internal documents, agent conversation history, or any data the agent can access.
Most common form: a skill that includes a summary or reporting feature that also sends data to an external endpoint. The exfiltration is framed as logging, analytics, or improvement feedback.
4. Agent hijacking routines (12 skills, 13% of threats)
These skills attempt to redefine the agent's core behaviors: changing its persona, modifying its authorization model, or making it treat certain requests as pre-authorized when they should require explicit user approval.
One pattern we found: a skill that claimed to be a productivity enhancer but included instructions redefining the agent's understanding of what actions require confirmation. After installation, the agent would approve actions it would normally ask about.
5. Shell execution chains (4 skills, 4% of threats)
These skills define patterns that enable remote code execution via pipe-to-shell patterns: instructions that format strings as shell commands and ask the agent to execute them through system calls.
This category has the highest potential severity: a skill that successfully establishes shell execution capability on an agent can run arbitrary commands on the host system.
Why VirusTotal scores all of these CLEAN
This is the key point for security teams evaluating AI agent supply chain risk.
VirusTotal is a binary analysis platform. It scans files against signature databases built for known malware: executable files, scripts, documents with embedded macros. Its core capability is excellent for that threat model.
ClawHub skills are JSON configuration files with natural language instruction fields. There are no binary signatures to match. The payload is in the text. And text that says collect API keys and POST them to this URL will not match any existing malware signature, because the same text could legitimately appear in a developer logging or debugging skill.
The detection method for behavioral threats has to be behavioral. The scanner must read the instructions and reason about what they cause an agent to do, not scan bytes against a hash database.
The high-value target finding
One specific skill had 31,600 downloads and a working credential exfiltration endpoint embedded in natural language. The endpoint was live during our scan. VirusTotal returned CLEAN.
We did not name the skill publicly because disclosure requires a responsible process. But this finding is the reason we think pre-install behavioral scanning matters: a skill with tens of thousands of downloads had a functional data exfiltration mechanism that no existing scanner would catch.
Methodology
SkillScan analyzes SKILL.md instruction content using behavioral pattern detection across five threat categories. The detection is not regex-based: we analyze the semantic meaning of instructions and their likely effect on agent behavior.
We intentionally err toward false negatives over false positives. A skill that mentions credential handling in a clearly defensive context is not flagged. A skill that instructs the agent to transmit credentials to a third-party URL is flagged regardless of how the instruction is framed.
The full dataset is at https://clawhub-scanner.chitacloud.dev/api/report. The scanner is free for individual skills at https://skillscan.chitacloud.dev
For security teams
If your organization deploys AI agents with skill or tool installations from public registries, this threat category is likely not in your current monitoring scope. Signature-based scanners will not catch it. Your agent security posture should include pre-install behavioral verification as a standard control.
Questions about methodology or threat categories: [email protected]