DeerFlow: ByteDance's Open-Source SuperAgent
A deep look at DeerFlow 2.0, ByteDance's open-source SuperAgent harness that orchestrates sub-agents, sandboxes, memory, and skills to handle complex tasks lasting minutes to hours. Covers architecture, setup, MCP integration, Claude Code bridging, and how to monitor parallel sub-agents with Termdock.
The Problem with a Single Agent
Imagine asking one person to research a topic, write a report, generate the slides, produce supporting images, and publish a webpage -- all at once. Even a brilliant person would buckle under the task-switching overhead. They would lose context jumping between research notes and slide layouts, forget details between the report draft and the code for the webpage.
This is what happens when you send a complex request to a single AI agent. The agent does one thing at a time. It holds one context window. It loses earlier reasoning as the conversation grows longer. And it has no way to step back and coordinate parallel efforts.
DeerFlow is ByteDance's answer to this bottleneck: stop making one agent do everything and start orchestrating many.
What DeerFlow Actually Is
DeerFlow (Deep Exploration and Efficient Research Flow) is an open-source SuperAgent harness. Not a single agent, but an orchestration layer that spawns, coordinates, and synthesizes the work of multiple sub-agents. It shipped as version 2.0 on February 28, 2026, hit #1 on GitHub Trending within hours, and has accumulated over 89,000 stars since.
The word "harness" matters. A harness does not do the pulling -- the horses do. DeerFlow does not write your code or generate your slides directly. It decides which sub-agents to spawn, what tools each one gets, how much context they carry, and how their results merge into a coherent output.
Think of it as the difference between a chef and a kitchen manager. A single AI agent is one chef doing everything. DeerFlow is the manager who assigns the prep cook to chop vegetables, the pastry chef to handle dessert, and the line cook to fire the entrees -- then plates the final dish from everyone's work.
The project is built on LangGraph and LangChain, runs a Python backend with a React frontend, and ships under the MIT license. It supports any LangChain-compatible model: GPT-4, DeepSeek v3.2, Gemini 2.5 Flash via OpenRouter, ByteDance's own Doubao-Seed models, and more.
Architecture: How the Pieces Fit
DeerFlow's architecture has four core layers that stack on each other like floors in a building. Understanding each floor clarifies why the system works the way it does.
The Lead Agent
At the top sits the lead agent. It receives your request, breaks it into sub-tasks, decides which sub-agents to spawn, and synthesizes their outputs. The lead agent is the only component that sees the full picture. It maintains the high-level plan and adjusts it as sub-agents report back.
Sub-Agents
Each sub-agent gets a scoped assignment: a specific question to research, a piece of code to write, an image to generate. Sub-agents receive only the context they need -- not the full conversation history. This is deliberate context engineering. A sub-agent researching "current battery technology trends" does not need to know about the slide design sub-agent's layout preferences.
Sub-agents run in parallel when their tasks are independent. When sub-agent A needs the output of sub-agent B, the lead agent sequences them. Once all sub-agents report back, the lead agent synthesizes their structured results into a final output.
The fan-out / fan-in pattern is the core idea. A research task might spawn twelve sub-agents exploring different angles, then converge into a single report. A website-building task might spawn separate sub-agents for content, styling, and deployment.
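The fan-out / fan-in pattern can be sketched in a few lines of Python with asyncio. The sub-agent function and task list here are hypothetical placeholders standing in for scoped LLM calls, not DeerFlow's actual API:

```python
import asyncio

async def run_subagent(task: str) -> str:
    # Placeholder for a scoped sub-agent: in DeerFlow this would be an
    # LLM call carrying only the context this task needs.
    await asyncio.sleep(0)  # simulate independent async work
    return f"findings for: {task}"

async def lead_agent(request: str) -> str:
    # Fan out: spawn one sub-agent per independent sub-task.
    subtasks = [f"{request}, angle {i}" for i in range(3)]
    results = await asyncio.gather(*(run_subagent(t) for t in subtasks))
    # Fan in: synthesize the structured results into one output.
    return "\n".join(results)

report = asyncio.run(lead_agent("battery technology trends"))
print(report)
```

Sequenced dependencies (sub-agent A waiting on B) would simply `await` one result before spawning the next, instead of gathering everything at once.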
The Sandbox
Every task runs in an isolated environment. DeerFlow supports three execution modes:
- Local execution -- code runs directly on the host machine. Fast, no setup overhead, but no isolation.
- Docker execution -- code runs in isolated containers. Each task gets a dedicated filesystem at /mnt/user-data/ with uploads/, workspace/, and outputs/ directories.
- Kubernetes execution -- code runs in K8s pods via a provisioner service. For teams running DeerFlow at scale.
The sandbox is not cosmetic. Sub-agents can read files, write files, execute bash commands, and view images -- all within their sandbox. If a sub-agent's code crashes or misbehaves, it stays contained. Your host machine is untouched.
Skills and Memory
Skills are structured capability modules defined in Markdown files with YAML frontmatter. Each skill describes a workflow: how to write a research report, how to generate a slide deck, how to build a web page. DeerFlow loads skills progressively -- only pulling in the skill definition when a sub-agent actually needs it. This keeps context windows lean instead of front-loading every possible capability.
Built-in skills cover research, report generation, slide creation, web pages, and image/video generation. You can install external skills via .skill archives through the Gateway API.
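A skill file in this style might look like the following. The frontmatter fields and workflow steps are illustrative guesses at the format, not copied from DeerFlow's actual skill definitions:

```markdown
---
name: research-report
description: Produce a structured research report from gathered sources
tools: [web_search, web_fetch, file_write]
---

# Research Report Skill

1. Break the question into sub-questions.
2. Search and fetch sources for each sub-question.
3. Draft sections, citing sources inline.
4. Write the final report to outputs/report.md.
```

Because skills load progressively, a definition like this costs zero context until a sub-agent is actually assigned a report-writing task.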
Memory is persistent across sessions. DeerFlow's memory system automatically extracts user context, facts, and preferences from conversations. It stores them locally with confidence scores and uses debounced updates to minimize LLM calls. When you return to DeerFlow after a week, it remembers your project context, your preferred output format, and accumulated knowledge from prior sessions.
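The debouncing idea can be sketched minimally: batch extracted facts and consolidate them only after writes go quiet for a short interval, so a burst of facts triggers one (hypothetical) LLM-backed consolidation instead of many. The class and field names below are invented for illustration:

```python
import time

class DebouncedMemory:
    """Batch fact updates; flush only after a quiet period."""

    def __init__(self, quiet_seconds: float = 2.0):
        self.quiet_seconds = quiet_seconds
        self.pending: list[tuple[str, float]] = []  # (fact, confidence)
        self.store: list[tuple[str, float]] = []
        self.last_write = 0.0

    def add(self, fact: str, confidence: float) -> None:
        self.pending.append((fact, confidence))
        self.last_write = time.monotonic()

    def maybe_flush(self) -> bool:
        # Flush only if no new facts arrived within the quiet window.
        if self.pending and time.monotonic() - self.last_write >= self.quiet_seconds:
            self.store.extend(self.pending)  # one consolidation instead of N
            self.pending.clear()
            return True
        return False

mem = DebouncedMemory(quiet_seconds=0.1)
mem.add("prefers Markdown reports", 0.9)
mem.add("project uses Python 3.12", 0.8)
time.sleep(0.2)
flushed = mem.maybe_flush()
```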
Key Features Worth Knowing
InfoQuest Search Integration
DeerFlow integrates BytePlus's InfoQuest, an intelligent search and crawling toolset. This is not just a web search wrapper. InfoQuest handles structured crawling, content extraction, and result ranking specifically tuned for research tasks. When a sub-agent needs to answer a factual question, it gets search results that are already filtered and structured rather than raw web pages.
Claude Code Bridge
The claude-to-deerflow skill deserves special attention for developers already using Claude Code. This skill lets you interact with a running DeerFlow instance directly from your terminal -- submit research tasks, check status, manage threads, and upload files without leaving Claude Code.
The workflow looks like this: you are deep in a coding session in Claude Code and realize you need background research on an API migration strategy. Instead of context-switching to DeerFlow's web UI, you send the research task from your terminal. DeerFlow's sub-agents fan out, research the topic, and compile a report while you continue coding. When the report is ready, you pull it back into your Claude Code session.
MCP Server Support
DeerFlow ships with core tools -- web search, web fetch, file operations, bash execution -- and extends via the Model Context Protocol (MCP). You can connect any MCP-compatible server to give DeerFlow additional capabilities: database access, API integrations, custom data sources. HTTP/SSE MCP servers support OAuth token flows for authenticated access.
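Connecting an MCP server would typically come down to a small config entry. The block below is a hypothetical shape to show the moving parts (transport, endpoint, OAuth), not DeerFlow's documented schema; check the project's docs for the exact keys:

```yaml
# Hypothetical MCP server entry -- field names are illustrative.
mcp_servers:
  - name: postgres-readonly
    transport: http_sse
    url: https://mcp.example.com/sse
    oauth:
      token_url: https://auth.example.com/token
      client_id: $MCP_CLIENT_ID
```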
Messaging Channels
DeerFlow connects to Telegram, Slack, and Feishu/Lark via long-polling or WebSocket -- no public IP required. You can send tasks from your phone's Telegram, have DeerFlow's sub-agents work for an hour, and receive the completed report as a message. Commands include /new, /status, /models, /memory, and /help.
Setting Up DeerFlow
Prerequisites
- Python 3.12+
- Node.js 22+
- pnpm
- Docker (recommended for sandbox isolation)
Docker Setup (Recommended)
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow
# Generate config and set up environment
make config
# Edit config.yaml to add your model and API key
# Example model configuration:
# models:
# - name: gpt-4
# display_name: GPT-4
# use: langchain_openai:ChatOpenAI
# model: gpt-4
# api_key: $OPENAI_API_KEY
# Pull sandbox image and start all services
make docker-init
make docker-start
Access DeerFlow at http://localhost:2026. The Docker path bundles everything -- backend, frontend, Nginx reverse proxy, and sandbox -- so you skip installing individual dependencies.
Local Development Setup
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow
make config # Generate configuration
make check # Validate environment
make install # Install dependencies
make setup-sandbox # Optional: set up Docker sandbox
make dev # Start all services in dev mode
Model Configuration
DeerFlow works with any LangChain-compatible model. The config.yaml file takes a list of models with provider-specific parameters. For OpenAI-compatible providers (including local models via Ollama or vLLM), set the base_url field:
models:
- name: deepseek-v3
display_name: DeepSeek v3.2
use: langchain_openai:ChatOpenAI
model: deepseek-chat
api_key: $DEEPSEEK_API_KEY
base_url: https://api.deepseek.com
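For a local model served through Ollama's OpenAI-compatible endpoint, the same pattern applies. Assuming Ollama's default port, an entry might look like this (the model name and key value are illustrative):

```yaml
models:
  - name: llama3-local
    display_name: Llama 3 (local)
    use: langchain_openai:ChatOpenAI
    model: llama3
    api_key: ollama  # Ollama ignores the key, but the client requires one
    base_url: http://localhost:11434/v1
```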
DeerFlow vs. Running Multiple CLI Agents Manually
You could, in theory, achieve something similar to DeerFlow by opening five terminals, running five separate Claude Code or Codex CLI sessions, and manually coordinating their work. Developers have been doing this since multi-agent workflows became practical in 2025.
The problem is coordination. When you run five independent agents:
- No shared plan. Each agent works from its own understanding of the task. You are the orchestrator, relaying information between them, resolving conflicts, deciding what to do next.
- No context isolation. Each agent accumulates its full conversation history. No aggressive summarization, no offloading intermediate results to the filesystem. Context windows fill up faster.
- No structured synthesis. Merging five agents' outputs into one coherent deliverable is manual labor. DeerFlow's lead agent does this automatically.
- No persistent memory. If you close the terminals, the context is gone. DeerFlow persists memory across sessions.
DeerFlow's value is the orchestration layer itself -- the planning, the scoped context distribution, the parallel execution, and the automatic synthesis. It is the difference between five musicians playing in the same room and five musicians playing in an orchestra with a conductor.
That said, DeerFlow is not a replacement for dedicated CLI coding agents. Claude Code is still the better tool for deep, single-codebase refactoring. Codex CLI is still the better tool for fast, sandboxed code generation. DeerFlow sits above them as a coordination layer for tasks that span research, content creation, and code generation simultaneously.
Monitoring Sub-Agents in Parallel
When DeerFlow fans out into multiple sub-agents, each running its own task in a sandbox, the practical challenge is visibility. Which sub-agent is progressing? Which one is stuck? What is each one doing right now?
This is where terminal layout matters. You want DeerFlow's web UI in one pane, the backend logs in another, and individual sub-agent outputs visible in their own terminals. Being able to resize and rearrange these panes on the fly -- expanding the log pane when debugging, shrinking it when things are stable -- turns a chaotic multi-process workflow into something manageable.
Limitations and Honest Considerations
Resource Requirements
DeerFlow is not lightweight. Running the full stack (LangGraph agent server, Gateway API, frontend, Nginx, Docker sandbox) requires meaningful resources. On a laptop, expect noticeable CPU and memory usage. The Docker setup is cleaner but heavier than running a single CLI agent.
Complexity Budget
DeerFlow solves a real problem -- multi-agent orchestration -- but it adds a layer of abstraction that you need to understand and maintain. The config.yaml, model configuration, skill system, MCP server setup, and sandbox modes are all surfaces that can break. For simple coding tasks, this is overkill. DeerFlow earns its complexity on tasks that genuinely require parallel research, content generation, and code execution across multiple sub-agents.
ByteDance Origin
DeerFlow is MIT-licensed open source. The code is auditable. But it is worth noting that it originated at ByteDance, and some enterprise environments have policies about ByteDance-origin software. Review the source code and make your own assessment. The MIT license means you can fork, modify, and self-host without restrictions.
Security Surface
Any system that executes code in sandboxes, connects to external MCP servers, and installs third-party skills has a broader attack surface than a single-process CLI agent. The Docker sandbox provides isolation, but the skill installation mechanism and MCP OAuth flows are trust boundaries worth auditing for production deployments.
Model Cost
DeerFlow amplifies model usage. A task that fans out into ten sub-agents makes ten times the LLM calls. If you are on pay-per-token pricing, a complex research task can burn through significant API credits. Monitor your token usage, especially when experimenting with high sub-agent counts.
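The multiplication is worth making concrete. A back-of-the-envelope estimate, with made-up call counts, token sizes, and a placeholder per-token rate:

```python
# Rough cost model for a fan-out task. All numbers are illustrative.
subagents = 10
calls_per_subagent = 4          # e.g. plan, search, draft, refine
tokens_per_call = 8_000         # prompt + completion, combined
price_per_million_tokens = 3.0  # USD, placeholder rate

total_tokens = subagents * calls_per_subagent * tokens_per_call
cost = total_tokens / 1_000_000 * price_per_million_tokens
print(f"{total_tokens} tokens at that rate is about ${cost:.2f}")
```

Even with these modest assumptions, one task consumes 320,000 tokens; a single agent handling the same request serially would use a fraction of that.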
Who Should Use DeerFlow
DeerFlow is not for everyone. It is for developers and teams who:
- Regularly handle tasks spanning research + code + content. If your work involves investigating a topic, writing code based on findings, and producing a report or presentation, DeerFlow's multi-agent orchestration genuinely saves time.
- Need persistent context across long-running tasks. Tasks that take hours, not minutes. DeerFlow's memory and intermediate-result offloading handle context window limits that single agents hit.
- Want to integrate multiple AI tools into one workflow. The MCP support and skill system let you combine web search, code execution, image generation, and custom tools under one orchestrator.
- Run teams that need asynchronous AI task submission. The Telegram/Slack/Feishu channels let team members submit tasks and receive results without accessing the web UI directly.
For quick coding tasks, stick with Claude Code or Codex CLI. For multi-step projects that cross the boundary between research, code, and content, DeerFlow is worth the setup cost.