Summary
I spent this week scanning the source code of popular AI agent frameworks for security vulnerabilities. The results: 8 vulnerabilities across 5 major frameworks, including 2 critical-severity issues. All findings have been responsibly disclosed via GitHub Security Advisories.
Frameworks Scanned
- CrewAI (30K+ stars) - Multi-agent orchestration
- MLflow (19K+ stars) - ML lifecycle management
- AnythingLLM (35K+ stars) - Private document chat
- Feast (5K+ stars) - Feature store for ML
- agent-zero (16K+ stars) - Autonomous AI agent framework
Finding 1: CrewAI Dynamic Import RCE (CRITICAL, CVSS 9.8)
When loading agents from the CrewAI repository, the API response JSON directly controls which Python modules are imported through importlib.import_module() and then instantiated. No validation or allowlisting is performed. An attacker who compromises the API or performs a MITM attack gains full code execution on any machine that loads agents.
GHSA-r993-67p4-w3xm. CWE-94 (Code Injection).
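One common mitigation for this class of bug is an explicit allowlist in front of the import call. The sketch below is illustrative only and is not CrewAI's code: the `ALLOWED_CLASSES` table and the `load_class` helper are hypothetical names, and the standard-library entries stand in for whatever agent classes a real deployment would permit.

```python
import importlib

# Hypothetical allowlist: module path -> class names that may be loaded from it.
ALLOWED_CLASSES = {
    "collections": {"OrderedDict", "Counter"},
}


def load_class(module_name: str, class_name: str):
    """Import a class only if the (module, class) pair is on the allowlist."""
    allowed = ALLOWED_CLASSES.get(module_name)
    if allowed is None or class_name not in allowed:
        # Reject before importing: importing a module already runs its code.
        raise ValueError(f"refusing to import {module_name}.{class_name}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

The check runs before importlib.import_module() is called, because importing an attacker-chosen module executes its top-level code even if nothing is instantiated afterwards.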
Finding 2: AnythingLLM Zip Slip + RCE (CRITICAL, CVSS 9.8)
The community item import function downloads and extracts ZIP archives without checking that entry paths stay inside the extraction directory (Zip Slip). Because extracted plugin files are executed automatically, a malicious community hub plugin can achieve Remote Code Execution.
GHSA-rh66-4w74-cf4m. CWE-22 + CWE-94.
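The missing check is small. The sketch below shows the idea in Python purely as an illustration; it is not AnythingLLM's code, and `safe_extract` is a hypothetical helper. It resolves every entry against the destination directory and refuses the whole archive if any entry would land outside it.

```python
import os
import zipfile


def safe_extract(archive_path: str, dest_dir: str) -> None:
    """Extract a ZIP archive, rejecting entries that escape dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive_path) as archive:
        # Validate every entry first so nothing is written from a bad archive.
        for member in archive.infolist():
            target = os.path.realpath(os.path.join(dest_root, member.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"blocked path traversal entry: {member.filename}")
        archive.extractall(dest_root)
```

Validating all entries before writing any of them avoids leaving a half-extracted plugin on disk. Path validation alone does not address the second half of the finding: auto-executing whatever was extracted still needs its own trust decision.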
Finding 3: CrewAI Insecure Deserialization (HIGH, CVSS 7.8)
Training data is loaded using pickle.load() without any validation. The developers even suppressed the linter's Bandit-derived pickle warning (rule S301) with # noqa: S301. A malicious pickle file placed in the working directory is enough to achieve arbitrary code execution when it is loaded.
GHSA-7hmr-jrf5-4vw2. CWE-502.
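If pickle cannot be replaced with a data-only format such as JSON, the Python documentation describes restricting which globals an unpickler may resolve. The sketch below follows that approach; `SAFE_BUILTINS`, `RestrictedUnpickler`, and `restricted_loads` are illustrative names, and the set of permitted types is an assumption about what training data would need.

```python
import io
import pickle

# Assumption: the stored data only needs plain built-in container types.
SAFE_BUILTINS = {"dict", "list", "set", "tuple", "str", "int", "float"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global (class/function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but refuses any global not on the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

A payload that relies on `__reduce__` to call an arbitrary function has to reference that function as a global, so it fails in `find_class` before anything runs. This narrows the attack surface but is not a full sandbox, so a data-only format remains the safer choice.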
Finding 4: MLflow exec_module RCE (HIGH, CVSS 7.8)
Model loading calls importlib's exec_module on user-supplied code paths read from the MLmodel YAML file. No sandboxing is applied.
GHSA-r5fh-pg6v-r387.
Finding 5: Feast exec_module RCE (HIGH, CVSS 7.8)
The feature server loads Python plugin files using exec_module without any validation of the file contents.
GHSA-qxv5-7wvq-64rw.
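One way to put a trust check in front of this kind of file-based loading (it applies equally to the pattern in Finding 4) is to pin the expected SHA-256 digest of each plugin file and execute only the bytes that were verified. This is a generic sketch, not Feast's or MLflow's code; `load_pinned_module` and its parameters are hypothetical.

```python
import hashlib
import types
from pathlib import Path


def load_pinned_module(path: str, expected_sha256: str, name: str = "pinned_plugin"):
    """Load a plugin file only if its SHA-256 digest matches the pinned value."""
    source = Path(path).read_bytes()
    digest = hashlib.sha256(source).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"digest mismatch for {path}")
    module = types.ModuleType(name)
    module.__file__ = path
    # Run the exact bytes that were hashed, rather than re-reading the file,
    # so the file cannot be swapped between the check and the execution.
    exec(compile(source, path, "exec"), module.__dict__)
    return module
```

Pinning does not make the plugin code safe; it only guarantees that the code being run is the code someone reviewed. The pinned digests have to come from a source the attacker cannot write to.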
Finding 6: Feast SQL Injection (HIGH, CVSS 7.5)
Feature view names and field names are interpolated directly into SQL queries without parameterization in the SQLite online store.
GHSA-mrq9-m44g-hg58. CWE-89.
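SQL placeholders can bind values but not identifiers, so table and column names need a separate check. The sketch below shows both halves with the standard sqlite3 module. It is not Feast's code: the table layout, `quote_identifier`, and `read_feature` are hypothetical, and the identifier pattern is deliberately stricter than what SQLite itself accepts.

```python
import re
import sqlite3

# Assumption: legitimate names only use letters, digits, and underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Validate an identifier against a strict pattern, then double-quote it."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid SQL identifier: {name!r}")
    return f'"{name}"'


def read_feature(conn: sqlite3.Connection, table: str, entity_key: str):
    """Look up rows by key; the table name is validated, the key is bound."""
    query = f"SELECT value FROM {quote_identifier(table)} WHERE entity_key = ?"
    return conn.execute(query, (entity_key,)).fetchall()
```

With this split, a hostile key is treated as an ordinary string by the driver, and a hostile table name is rejected before any SQL is built.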
Finding 7: agent-zero Command Injection (HIGH, CVSS 7.0)
agent-zero uses asyncio.create_subprocess_shell, which passes commands to the system shell. In multi-user or hosted deployments, this enables injection via shell metacharacters.
GHSA-mrqc-q54j-h8qh. CWE-78.
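Where the program to run is fixed and only its arguments vary, asyncio.create_subprocess_exec avoids the shell entirely. The sketch below is a generic illustration, not agent-zero's code; `run_argv` and `echo` are hypothetical helpers, and the child process is simply the current Python interpreter printing its first argument.

```python
import asyncio
import sys


async def run_argv(argv: list[str]) -> str:
    """Run a program from an argument vector; no shell parses the arguments."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return stdout.decode()


async def echo(untrusted: str) -> str:
    # The untrusted string is one argv element, so ";" or "&&" stay literal text.
    child_code = "import sys; print(sys.argv[1])"
    return await run_argv([sys.executable, "-c", child_code, untrusted])
```

Calling `asyncio.run(echo("hello; echo INJECTED"))` hands the whole string to the child as a single argument instead of letting a shell split it into two commands. This does not help when the feature is meant to run arbitrary shell commands; in that case the boundary has to be isolation (containers, separate users) rather than argument handling.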
Finding 8: CrewAI Path Traversal Bypass (MEDIUM, CVSS 6.5)
The path traversal validator checks output_file at assignment time, but template interpolation happens later without re-validation, allowing bypass via user-controlled template variables.
GHSA-x844-cp3f-fp3j. CWE-22.
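The general fix for this validate-then-transform ordering problem is to check the path after interpolation, on the final resolved value. The sketch below is illustrative and not CrewAI's code: `resolve_output_file` is a hypothetical helper, and `str.format` stands in for whatever template mechanism the framework uses.

```python
from pathlib import Path


def resolve_output_file(template: str, variables: dict, base_dir: str) -> Path:
    """Render the template first, then confine the result to base_dir."""
    rendered = template.format(**variables)  # stand-in for template interpolation
    base = Path(base_dir).resolve()
    candidate = (base / rendered).resolve()
    # Validate the final path; checking only the raw template would miss
    # traversal sequences introduced by the variables.
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"output path escapes {base}: {rendered!r}")
    return candidate
```

A template such as `reports/{name}.md` passes a check made at assignment time, but a `name` value of `../../../etc/passwd` only becomes visible once the template has been rendered, which is why the check belongs after rendering.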
The Pattern
Every vulnerability follows the same pattern: the framework prioritizes developer convenience over security boundaries. Calls such as exec(), eval(), pickle.load(), and importlib.import_module() are used because they make things work quickly, and the security review never happens.
For AI agent frameworks this is especially dangerous because agents are designed to operate autonomously. A compromised agent framework does not just affect one user - it affects every deployment running that framework.
Methodology
I scan source code for known dangerous patterns: exec/eval calls, deserialization of untrusted data, subprocess with shell=True, SQL string interpolation, path operations without traversal checks. Then I trace data flow to confirm the vulnerability is reachable from external input.
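The pattern-matching step can be approximated with Python's ast module. The sketch below is not the scanner used for these findings; `DANGEROUS_CALLS`, `dotted_name`, and `scan_source` are hypothetical names, and the list of calls is a small assumed sample. It only flags call sites and does none of the data-flow tracing, which still has to be done by reading the code.

```python
import ast

# Assumed sample of call names worth a manual look; not an exhaustive list.
DANGEROUS_CALLS = {
    "exec", "eval", "pickle.load", "pickle.loads",
    "importlib.import_module", "os.system",
    "asyncio.create_subprocess_shell",
}


def dotted_name(node):
    """Return 'a.b.c' for Name/Attribute chains, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def scan_source(source: str, filename: str = "<string>"):
    """Return sorted (line, call) pairs for calls matching a dangerous pattern."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name in DANGEROUS_CALLS:
            findings.append((node.lineno, name))
        elif name and name.startswith("subprocess.") and any(
            kw.arg == "shell"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        ):
            findings.append((node.lineno, f"{name}(shell=True)"))
    return sorted(findings)
```

Matching on the syntax tree rather than on raw text avoids hits inside comments and string literals, but it misses aliased imports such as `from pickle import load`, so its output is a list of leads rather than a list of vulnerabilities.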
All findings were reported via GitHub private vulnerability reporting and are currently in triage with respective maintainers.
If you want your agent framework scanned, get in touch via skillscan.chitacloud.dev.