March 22, 2026 · 13 min read · ai-agent-workflows

Build a Research Assistant Workflow Entirely in Your Terminal

A step-by-step guide to building a terminal-native research workflow: web search, AI summarization, structured notes, and final report generation — all scriptable and repeatable.

Danny Huang

The Fourteen-Tab Problem

You have been there. Twenty minutes into a research task, your browser looks like a crime scene. Fourteen tabs open. A half-written note in Notion. Two PDFs downloaded but unread. A nagging feeling that you already found the answer somewhere -- three tabs ago? Five? You cannot remember.

Your context is scattered across tools that do not talk to each other. There is no record of your search path. Tomorrow, you will return to this task and retrace half your steps because the trail went cold overnight.

This article builds a different way to work: a terminal-native research workflow where every step -- search, extract, summarize, cite, synthesize -- happens in one environment, driven by an AI agent, with structured output at each stage. Think of it as replacing your chaotic desk covered in sticky notes with a well-organized filing cabinet that an assistant keeps tidy for you.

The workflow is scriptable. It is repeatable. And it produces a citable report, not a pile of browser tabs.

By the end, you will have:

  • A CLAUDE.md configuration that turns Claude Code into a research assistant
  • A multi-step workflow: decompose question, search, extract, synthesize, report
  • Structured notes with source citations at every step
  • A repeatable process you can apply to any research topic

The concrete example: researching "AI CLI tools market landscape 2026."

Why the Terminal Works for Research

Research is a pipeline. You start with a broad question. Break it into sub-questions. Gather sources for each. Extract relevant findings. Synthesize them into a coherent output. Every step transforms the previous step's output into the next step's input.

That is exactly what terminal workflows do well. Pipes, structured files, and scriptable commands turn a messy process into a reproducible one. Add an AI agent that can search the web, read documents, and generate structured output, and you have a research assistant that keeps its context in files you can inspect, edit, and version-control.

Three properties make terminal-based research better than tab-hopping:

  1. Persistent context. Your research state lives in files, not in browser session memory. Close the terminal, reopen it tomorrow, and everything is exactly where you left it. Like a physical notebook versus a whiteboard that gets erased every night.
  2. Structured output at every stage. Instead of mental notes about what you found, each step produces a markdown file with citations. The synthesis step reads those files, not your memory.
  3. Reproducibility. Run the same workflow on a different topic by changing one variable. The process is the same; only the question changes. Like a lab protocol you can hand to a colleague.

Prerequisites

  • Claude Code installed and authenticated. See First Hour with Claude Code if you need setup help.
  • Web search MCP configured. Claude Code's built-in web search works, or configure a dedicated search MCP server.
  • A project directory. Create a folder for your research. This is where all notes, sources, and reports will live.
mkdir -p ~/research/ai-cli-landscape-2026
cd ~/research/ai-cli-landscape-2026
git init

Version-controlling your research directory means you can track how your understanding evolved over time. It also means the AI agent can use git history as context.
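To lean on that history, one lightweight convention is to commit after each research stage so the log reads as a research journal. A minimal sketch, run inside the research repo -- the `commit_stage` helper is my own invention, not from this workflow:

```shell
# commit_stage: snapshot the research directory with a stage-prefixed message,
# so `git log --oneline` reads as a research journal.
# (hypothetical helper -- run inside the research repo)
commit_stage() {
  stage="$1"; msg="$2"
  git add research/
  git commit -q -m "$stage: $msg"
}

# usage: commit_stage sources "add major-players.md (sub-question 1)"
```

The stage prefix (sources, audit, report) makes it easy to see later which commits changed the evidence and which changed the synthesis.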

Step 1: Configure CLAUDE.md for Research

Every good research team has a methodology document. This is yours -- except it is machine-readable. The CLAUDE.md tells the agent how to behave as a research assistant. It is not a generic instruction file. It encodes the specific methodology you want the agent to follow, like handing a new hire an SOP on their first day.

Create CLAUDE.md in the research project root:

## Role
You are a research assistant. Your job is to help me investigate topics
systematically. Every claim must have a source. Never state facts without
citing where the information came from.

## Research Methodology

### Question Decomposition
When given a broad research question, break it into 3-7 specific sub-questions
that collectively answer the original. Each sub-question should be answerable
with a focused search. Present the sub-questions for approval before proceeding.

### Source Collection
For each sub-question:
1. Search the web for recent, authoritative sources (prefer primary sources,
   official docs, peer-reviewed papers, reputable industry reports).
2. For each useful source, record: title, URL, date, and a 2-3 sentence summary
   of what it contributes to our question.
3. Save source notes to research/sources/[topic-slug].md

### Extraction Rules
- Always include direct quotes with page/section references when possible.
- Distinguish between facts, claims, and opinions in your notes.
- Flag contradictions between sources explicitly.
- When two sources disagree, note both positions and the evidence each provides.

### Synthesis
When asked to synthesize, read all files in research/sources/ and produce a
structured report in research/reports/. The report must:
- Answer each sub-question with evidence from collected sources.
- Include a full reference list at the end.
- Flag areas where evidence is thin or contradictory.
- Use inline citations in [Author, Year] or [Source Title] format.

## Output Conventions
- All notes go in research/sources/
- All reports go in research/reports/
- Use markdown. No HTML.
- File names use kebab-case: market-size-estimates.md, not Market Size.md
- Every file starts with a YAML-style header: topic, date, status (draft/reviewed/final)

This CLAUDE.md encodes three patterns that matter for research quality.

First, question decomposition. A broad question like "AI CLI tools market landscape" is too vague for a single search. It is like trying to catch fish with a net the size of a lake -- you need smaller nets for smaller ponds. Breaking it into sub-questions -- market size, key players, adoption rates, pricing models, technical differentiation -- means each search is focused and the results are more relevant. This is the same principle behind sub-query decomposition in retrieval-augmented generation (RAG), where a complex query answered by combining several focused retrievals outperforms a single monolithic search.

Second, mandatory citations. The instruction "never state facts without citing" forces the agent to ground every claim in a source. Without this constraint, language models confidently generate plausible-sounding but unverifiable statements. The citation requirement turns the agent from a text generator into a research tool.

Third, structured file output. Instead of dumping everything into one long conversation, the agent writes files. Each file is a discrete unit of research that can be reviewed, edited, and referenced independently. When synthesis time comes, the agent reads the files -- not its conversation memory, which may have been compacted. Think of it as the difference between a shoebox of receipts and a spreadsheet. Both contain the same data. Only one is usable.
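These conventions are mechanical enough to lint. A small sketch that checks every source note for the required header fields -- the `lint_notes` helper is hypothetical, but the field names follow the CLAUDE.md above:

```shell
# lint_notes: flag source notes missing the topic/date/status header fields
# required by the CLAUDE.md conventions (hypothetical helper, not from the article)
lint_notes() {
  for f in research/sources/*.md; do
    [ -e "$f" ] || continue           # skip if the glob matched nothing
    for field in topic date status; do
      if ! head -n 6 "$f" | grep -q "^$field:"; then
        echo "$f: missing '$field' header"
      fi
    done
  done
}
```

Run it before synthesis; a note without a status header is a note nobody has reviewed.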

Step 2: Decompose the Research Question

Start a Claude Code session in your research directory and give it the broad question:

Research question: What is the AI CLI tools market landscape in 2026?
I need a comprehensive analysis covering market players, technical capabilities,
adoption patterns, and where the space is heading.

The agent, following the CLAUDE.md methodology, will decompose this into sub-questions. A good decomposition for this topic looks like:

## Sub-Questions

1. What are the major AI CLI tools available in 2026, and who builds them?
2. How do these tools differ technically — context windows, model access,
   extensibility (MCP, plugins, skills)?
3. What is the adoption trajectory — developer survey data, download counts,
   GitHub stars as a proxy for mindshare?
4. How do pricing and cost models compare across tools?
5. What workflows are developers actually using these tools for beyond
   code generation?
6. What are the main limitations and unresolved problems in the current
   generation of AI CLI tools?

Review the sub-questions. Add any you think are missing. Remove any that are out of scope. Then tell the agent to proceed. This review step is critical -- it is cheaper to adjust the telescope before looking through it than to realize you were pointed at the wrong star after an hour of observation.

Step 3: Search and Extract Per Sub-Question

Now the agent works through each sub-question. For each one, it searches the web, evaluates sources, and writes a structured notes file.

Start with sub-question 1. Search for authoritative sources on the major
AI CLI tools in 2026. Write your findings to research/sources/major-players.md

The agent will search, read results, and produce a file like this:

---
topic: Major AI CLI Tools in 2026
date: 2026-03-22
status: draft
---

## Key Players

### Anthropic — Claude Code
- Release: 2025. Current version as of March 2026 supports extended thinking,
  MCP tool use, multi-agent orchestration via subagents.
- Source: [Anthropic documentation](https://docs.anthropic.com/en/docs/claude-code)
- Notable: First major CLI tool to ship with native MCP support.

### Google — Gemini CLI
- Release: June 2025. Open-source, 1M token context window via Gemini 2.5 Pro.
- Source: [Google Developers Blog](https://developers.googleblog.com/)
- Notable: Free tier with generous rate limits. Strong for large-codebase analysis.

### OpenAI — Codex CLI
- Release: April 2025. Open-source, runs on codex-mini model by default.
- Source: [OpenAI Codex CLI repository](https://github.com/openai/codex)
- Notable: Lightweight, sandbox-first approach.

...

Repeat for each sub-question. The agent writes a separate file for each topic area. After a few rounds, your directory looks like:

research/
  sources/
    major-players.md
    technical-comparison.md
    adoption-data.md
    pricing-models.md
    developer-workflows.md
    limitations.md

Each file is self-contained, with citations. You can review them individually, mark them as reviewed, or ask the agent to dig deeper on any sub-question where the evidence is thin.
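Because every file carries a status header, checking what still needs review is one grep away -- a sketch, assuming the status values from the CLAUDE.md conventions:

```shell
# List source notes still marked "draft" per the YAML-style header convention
grep -l '^status: draft' research/sources/*.md 2>/dev/null || true
```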

Step 4: Handle Contradictions and Gaps

Real research produces contradictions. One source says adoption is growing at 40% year-over-year. Another says the market is consolidating around two players. Both can be true, but the agent should flag the tension rather than silently picking one.

After the initial source collection, ask the agent to audit:

Review all files in research/sources/. Identify contradictions between sources,
areas where we only have one source (single points of failure), and questions
we have not answered well. Write the audit to research/sources/gap-analysis.md

The gap analysis is your research compass. Maybe the pricing comparison needs primary data from each vendor's pricing page. Maybe the adoption data relies too heavily on one survey. This iterative refinement -- search, extract, audit, search again -- is what separates a rigorous analysis from a first-pass summary. It is the difference between a journalist who double-checks with a second source and one who publishes the first thing they hear.
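You can also pre-screen for the single-source problem before asking the agent. A rough heuristic sketch -- the `flag_single_source` helper is my invention, and counting links is no substitute for the agent's audit:

```shell
# flag_single_source: list notes citing at most one distinct URL -- a crude
# proxy for single-point-of-failure claims (hypothetical helper)
flag_single_source() {
  for f in research/sources/*.md; do
    [ -e "$f" ] || continue
    n=$(grep -oE 'https?://[^) ]*' "$f" | sort -u | wc -l | tr -d ' ')
    if [ "$n" -le 1 ]; then
      echo "$f: $n distinct link(s)"
    fi
  done
}
```

A note flagged here is not necessarily wrong -- but it is a claim resting on one leg.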

Step 5: Synthesize into a Report

Once your source files are solid, ask the agent to synthesize:

Read all files in research/sources/. Synthesize a comprehensive report answering
the original research question. Follow the synthesis rules in CLAUDE.md.
Write to research/reports/ai-cli-landscape-2026.md

The agent reads every source file, cross-references findings, resolves or flags contradictions, and produces a structured report with inline citations and a full reference list.

The synthesis step benefits from having structured, cited source files rather than raw conversation history. The agent processes clean inputs, not a 50-turn conversation where half the context was compacted away. This is the payoff of the file-based approach: by the time you need synthesis, the inputs are already organized. The filing cabinet pays for itself.

Step 6: Iterate and Refine

The first report is a draft. Read it critically. Ask the agent to strengthen weak sections:

Section 4 on pricing is too surface-level. Search for the actual pricing pages
of Claude Code, Gemini CLI, and Codex CLI. Update research/sources/pricing-models.md
with specific tier details. Then regenerate the pricing section of the report.

Because each source file is independent, you can update one without regenerating the entire report. The agent reads the updated file and patches the relevant section. Like replacing one chapter of a book without reprinting the whole thing.

Making the Workflow Repeatable

The entire process above can be templated. Create a shell script that scaffolds a new research project:

#!/bin/bash
# new-research.sh — scaffold a research project
TOPIC_SLUG="$1"
if [ -z "$TOPIC_SLUG" ]; then
  echo "Usage: new-research.sh <topic-slug>"
  exit 1
fi

PROJECT_DIR="$HOME/research/$TOPIC_SLUG"
mkdir -p "$PROJECT_DIR/research/sources"
mkdir -p "$PROJECT_DIR/research/reports"

# Copy the research CLAUDE.md template
cp "$HOME/research/templates/CLAUDE.md" "$PROJECT_DIR/CLAUDE.md"

cd "$PROJECT_DIR"
git init
echo "Research project scaffolded at $PROJECT_DIR"

Now starting a new research project is one command:

./new-research.sh quantum-computing-error-correction-2026

Same directory structure, same CLAUDE.md methodology, same workflow. Only the question changes.
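To make "only the question changes" literal, you can record the question itself at scaffold time so the agent always starts from the same file. A sketch of the idea -- the `new_research` function, the relative base path, and the QUESTION.md convention are all assumptions, not part of the script above:

```shell
# new_research: scaffold a project and record the research question in
# QUESTION.md (function name, base path, and QUESTION.md are assumed
# conventions for illustration)
new_research() {
  slug="$1"; question="$2"
  dir="research-projects/$slug"   # swap in "$HOME/research" to match the script above
  mkdir -p "$dir/research/sources" "$dir/research/reports"
  printf '# Research Question\n\n%s\n' "$question" > "$dir/QUESTION.md"
  echo "$dir"
}

# usage: new_research quantum-computing-error-correction-2026 "How mature is QEC in 2026?"
```

With the question on disk, a fresh agent session can read QUESTION.md and CLAUDE.md and reconstruct the task without you repeating anything.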

The Multi-Pane Research Layout

Research involves constant switching between reading source files, talking to the agent, and reviewing outputs. Doing this in a single terminal pane is like cooking a complex meal on a one-burner stove -- technically possible, but painfully slow.

The layout that works best for research:

  • Left pane (40%): Claude Code session -- this is where you give instructions and the agent works.
  • Top-right pane (30%): File viewer showing the current source file or report being generated. Watch it populate in real time.
  • Bottom-right pane (30%): File tree or a second file viewer for cross-referencing a different source.

When the agent writes to research/sources/adoption-data.md, you see the file update live in the adjacent pane. When you want to compare two source files before synthesis, you open one in each right-side pane. When the report generates, you read it while the agent is still working on the reference list.

This is not about aesthetics. It is about keeping context visible. Research fails when you lose track of what you have found. A multi-pane layout makes the research state tangible -- you see the files, you see the agent working, you see the output. Nothing is hidden behind a tab.

If you run multiple research projects in parallel -- say, one on market landscape and another on technical architecture -- workspace switching lets you flip between them without tearing down your terminal layout. Each workspace has its own pane arrangement, its own CLAUDE.md, its own file state.

Try Termdock: multi-pane layouts work out of the box. Free download →

Comparison: Terminal Research vs. Traditional Research

| Aspect | Browser + Notes App | Terminal Research Workflow |
| --- | --- | --- |
| Context persistence | Lost when tabs close | Files on disk, version-controlled |
| Citation tracking | Manual, error-prone | Enforced by CLAUDE.md rules |
| Reproducibility | None -- depends on your memory | Scripted -- same process, different topic |
| Contradiction handling | Unnoticed until writing | Explicit gap analysis step |
| Collaboration | Share a Google Doc | Share a git repo with full research history |
| Automation | Copy-paste between apps | One command scaffolds the project |

When This Workflow Is Overkill

Not every research task needs this machinery. If you need a quick answer to a specific question, just ask the agent directly. No scaffolding required. This workflow is for the kind of research that takes days, involves multiple sources, and produces a deliverable -- a report, a presentation, a decision document.

Rules of thumb:

  • Under 30 minutes of research: Just ask. No scaffolding needed.
  • 30 minutes to 2 hours: Use the CLAUDE.md methodology but skip the shell script scaffolding.
  • Over 2 hours or multiple sessions: Full workflow. The investment in structure pays for itself when you return to the project after a break and find everything exactly where you left it.

Start Building

Create a research directory. Drop in the CLAUDE.md from this article. Give the agent a real question -- something you actually need to research for work. Run through the full cycle: decompose, search, extract, audit, synthesize. The first time takes longer because you are learning the workflow. The second time, you already have the template and the muscle memory.

The point is not that AI replaces your judgment about what matters. The point is that the mechanical parts of research -- finding sources, extracting relevant passages, tracking citations, organizing notes -- are exactly the kind of structured, repeatable work that an agent in a terminal handles well. Your job becomes reading, thinking, and deciding. The agent handles the filing.

For the broader picture of AI agent workflows and how research fits alongside code generation, automation, and multi-agent orchestration, see the AI Agent Workflow Guide.

Free Download

Ready to streamline your terminal workflow?

Multi-terminal drag-and-drop layout, workspace Git sync, built-in AI integration, AST code analysis — all in one app.

Download Termdock →
#research #workflow #ai-agent #terminal #mcp
