Friday Afternoon. Two Spreadsheets. 900 Rows Each.
Picture this. It is the last Friday of the month. You export a CSV from Stripe. You export another from QuickBooks. You open both in a spreadsheet, and you start matching rows by hand.
Immediately, nothing lines up.
Stripe calls it payment_intent_id. QuickBooks calls it "Reference No." Stripe timestamps are UTC. QuickBooks uses your local timezone. Stripe lists gross amounts in cents -- 4999 means $49.99. QuickBooks shows dollars with decimals -- 49.99. One file has 847 rows. The other has 912.
You squint. You scroll. You build a VLOOKUP that almost works but breaks on the 47th row because QuickBooks truncated a customer email. Three hours later, you have matched most of them. Probably. You are not sure if you missed anything. You are definitely sure you never want to do this again.
This is every indie developer and small team running their own books. Month-end reconciliation -- matching transactions across two systems to make sure nothing fell through the cracks -- is one of the most error-prone recurring tasks in a small business. It is tedious, it scales terribly, and the consequences of getting it wrong range from minor tax headaches to serious accounting discrepancies.
An AI CLI agent eliminates the manual matching. Feed it two CSV files. It normalizes the data (think of it as translating both files into the same language), matches transactions by amount, date, and description, flags anything that does not reconcile, and outputs a structured report. The entire process runs in your terminal. A thousand transactions take under a minute. The cost is a few cents in API tokens.
By the end of this article, you will have:
- A reconciliation script with dry-run mode that works on any two CSV sources
- Matching logic that handles amount tolerances, date offsets, and fuzzy descriptions
- Confidence scores for every match
- A structured JSON report you can pipe into accounting tools or review manually
- A CLAUDE.md workflow block for repeatable one-command reconciliation
Sample Data: Stripe vs. QuickBooks
Before building anything, look at what real reconciliation inputs look like. Two exports from different systems describing the same underlying transactions, but structured as if they were designed to annoy you.
Stripe export (stripe_march.csv):
id,created,amount,currency,fee,net,description,customer_email
pi_3Ox1a2B,2026-03-01 08:14:22 UTC,4999,usd,175,4824,Pro Plan - Monthly,alice@example.com
pi_3Ox1a2C,2026-03-01 14:30:01 UTC,9999,usd,320,9679,Team Plan - Monthly,bob@corp.dev
pi_3Ox1a2D,2026-03-03 09:45:33 UTC,4999,usd,175,4824,Pro Plan - Monthly,carol@startup.io
pi_3Ox1a2E,2026-03-05 16:22:10 UTC,19999,usd,610,19389,Enterprise - Annual,dave@bigco.com
pi_3Ox1a2F,2026-03-07 11:08:44 UTC,4999,usd,175,4824,Pro Plan - Monthly,eve@freelance.dev
pi_3Ox1a2G,2026-03-07 11:09:02 UTC,-4999,usd,0,-4999,Refund - Pro Plan,eve@freelance.dev
QuickBooks export (quickbooks_march.csv):
Date,Num,Name,Memo,Amount,Balance
03/01/2026,1001,Stripe Transfer,alice@example.com - Pro,49.99,10249.99
03/01/2026,1002,Stripe Transfer,bob@corp.dev - Team,99.99,10349.98
03/03/2026,1003,Stripe Transfer,carol - Pro Monthly,49.99,10399.97
03/05/2026,1004,Stripe Transfer,Enterprise annual - dave,199.99,10599.96
03/07/2026,1005,Stripe Transfer,eve - Pro,49.99,10649.95
03/07/2026,1006,Stripe Refund,Refund eve,-49.99,10599.96
03/08/2026,1007,AWS,March infrastructure,-10400.00,199.96
Four differences make manual matching painful:
- Amount format. Stripe stores cents as integers (`4999`). QuickBooks stores dollars with decimals (`49.99`). Same number, different languages.
- Date format. Stripe uses ISO 8601 with UTC timezone. QuickBooks uses `MM/DD/YYYY` with no timezone. A payment at 11 PM UTC on March 7th might land on March 8th in your local books.
- Description fields. Stripe has structured `description` and `customer_email` columns. QuickBooks has a single free-text `Memo` field that sometimes includes the email, sometimes a name, sometimes just a plan label. It is like matching a filing cabinet against a pile of sticky notes.
- Extra rows. QuickBooks has an AWS infrastructure payment with no Stripe counterpart. This is expected -- different payment source entirely. The agent should flag it as unmatched, not as an error.
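The date mismatch is mechanical once you see it in code. A minimal sketch (format strings inferred from the sample exports above; function names are for illustration) that parses both styles to the same ISO date:

```python
from datetime import datetime

def stripe_date(created: str) -> str:
    """Parse a Stripe timestamp like '2026-03-07 11:08:44 UTC' to an ISO date."""
    # Slice off the trailing ' UTC' before parsing
    return datetime.strptime(created[:19], "%Y-%m-%d %H:%M:%S").date().isoformat()

def quickbooks_date(date_str: str) -> str:
    """Parse a QuickBooks date like '03/07/2026' to an ISO date."""
    return datetime.strptime(date_str, "%m/%d/%Y").date().isoformat()

print(stripe_date("2026-03-07 11:08:44 UTC"))  # 2026-03-07
print(quickbooks_date("03/07/2026"))           # 2026-03-07
```

Once both sides speak ISO dates, a date comparison is a string comparison.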
The Matching Logic
Transaction matching works in three passes, each progressively looser. Think of it like airport security with three checkpoints. The first catches the obvious matches. The second catches the ones that shifted a day or two. The third catches the ones where only the description gives them away.
Pass 1: Exact Amount + Date Match
The highest-confidence match. Two transactions with the same normalized amount on the same date are almost certainly the same transaction. Like two puzzle pieces that snap together perfectly.
Normalization rules:
- Convert Stripe cents to dollars: `amount / 100`
- Parse both date formats to `YYYY-MM-DD`
- Match amounts within a tolerance of $0.01 to handle rounding
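Those three rules fit in a few lines of Python. A sketch, with names chosen here for illustration:

```python
def normalize_cents(cents: str) -> float:
    """Convert Stripe's integer cents ('4999') to decimal dollars (49.99)."""
    return round(int(cents) / 100, 2)

def amounts_match(a: float, b: float, tolerance: float = 0.01) -> bool:
    """True when two amounts agree within the rounding tolerance."""
    return abs(a - b) <= tolerance

print(amounts_match(normalize_cents("4999"), 49.99))  # True
```

The tolerance matters: comparing floats for exact equality is how reconciliation scripts produce phantom mismatches.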
Pass 2: Amount Match with Date Window
Some transactions post on different days across systems. A payment Stripe receives on Friday evening UTC might not appear in QuickBooks until Monday. This pass widens the date window to three calendar days while still requiring an exact amount match.
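If you want the window to genuinely skip weekends, a small helper can count business days instead of calendar days (a sketch that assumes weekends are the only non-business days -- holidays would need a real calendar):

```python
from datetime import date, timedelta

def business_days_between(start: date, end: date) -> int:
    """Count Mon-Fri days strictly after the earlier date, up to and including the later one."""
    if start > end:
        start, end = end, start
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Mon=0 .. Fri=4
            days += 1
    return days

# Friday March 6 to Monday March 9, 2026: one business day apart
print(business_days_between(date(2026, 3, 6), date(2026, 3, 9)))  # 1
```

A Friday payment posting on Monday then counts as one day apart instead of three, which keeps the window tight without losing those matches.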
Pass 3: Fuzzy Description Matching
For the remaining unmatched transactions, the agent compares descriptions using text similarity. It extracts entities from both sides -- customer names, email addresses, plan names -- and computes a similarity score. This catches cases where amounts differ slightly (Stripe shows gross, QuickBooks shows net-of-fee) but the description clearly identifies the same transaction.
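Python's standard-library difflib is enough for a first pass at this. A sketch using memos from the sample data above:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Plain character-level similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Stripe description vs. QuickBooks memo for the same Enterprise sale
related = similarity("Enterprise - Annual", "Enterprise annual - dave")
unrelated = similarity("Enterprise - Annual", "March infrastructure")
print(related > unrelated)  # True
```

SequenceMatcher is crude -- it knows nothing about names or emails -- which is why the full script below also boosts the score when a known customer email appears in the memo.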
Each match gets a confidence score:
| Pass | Confidence | Criteria |
|---|---|---|
| Pass 1 | 0.95-1.0 | Exact amount + same date |
| Pass 2 | 0.80-0.94 | Exact amount + date within 3 calendar days |
| Pass 3 | 0.60-0.79 | Fuzzy description match + amount within 5% |
| No match | 0.00 | No corresponding transaction found |
Anything below 0.60 is flagged for manual review. Better to surface a question mark than silently approve a bad match.
The Reconciliation Script
Here is a complete Python script that implements the three-pass matching. Feed it two CSV files, and it outputs a structured JSON report.
#!/usr/bin/env python3
"""
reconcile.py -- Match transactions across two CSV exports.
Usage:
python reconcile.py stripe.csv quickbooks.csv --dry-run
python reconcile.py stripe.csv quickbooks.csv -o report.json
"""
import argparse
import csv
import json
import sys
from datetime import datetime, timedelta
from difflib import SequenceMatcher
from pathlib import Path
def parse_stripe_row(row: dict) -> dict:
"""Normalize a Stripe CSV row into a standard transaction format."""
return {
"source": "stripe",
"id": row["id"],
"date": datetime.strptime(row["created"][:10], "%Y-%m-%d").date().isoformat(),
"amount": round(int(row["amount"]) / 100, 2),
"currency": row["currency"],
"description": row.get("description", ""),
"email": row.get("customer_email", ""),
"raw": row,
}
def parse_quickbooks_row(row: dict) -> dict:
"""Normalize a QuickBooks CSV row into a standard transaction format."""
amount_str = row.get("Amount", "0").replace(",", "")
return {
"source": "quickbooks",
"id": row.get("Num", ""),
"date": datetime.strptime(row["Date"], "%m/%d/%Y").date().isoformat(),
"amount": float(amount_str) if amount_str else 0.0,
"currency": "usd",
"description": row.get("Memo", ""),
"email": "",
"raw": row,
}
def load_csv(filepath: str, parser) -> list[dict]:
"""Load and parse a CSV file using the given row parser."""
rows = []
with open(filepath, newline="", encoding="utf-8") as f:
reader = csv.DictReader(f)
for row in reader:
try:
rows.append(parser(row))
except (ValueError, KeyError) as e:
print(f" Skipping row: {e}", file=sys.stderr)
return rows
def amount_match(a: float, b: float, tolerance: float = 0.01) -> bool:
return abs(a - b) <= tolerance
def date_within_window(d1: str, d2: str, days: int = 0) -> bool:
dt1 = datetime.fromisoformat(d1).date()
dt2 = datetime.fromisoformat(d2).date()
return abs((dt1 - dt2).days) <= days
def description_similarity(desc_a: str, desc_b: str, email: str = "") -> float:
"""Compute text similarity between two descriptions.
Boost score if email or customer name appears in both."""
base = SequenceMatcher(None, desc_a.lower(), desc_b.lower()).ratio()
if email and email.lower() in desc_b.lower():
base = min(base + 0.3, 1.0)
return round(base, 3)
def reconcile(source_a: list[dict], source_b: list[dict],
              tolerance: float = 0.01, date_window: int = 3) -> dict:
    """Three-pass reconciliation. Returns structured results.

    tolerance and date_window correspond to the --tolerance and
    --date-window CLI flags."""
    matched = []
    unmatched_a = list(source_a)
    unmatched_b = list(source_b)
    # --- Pass 1: Exact amount + same date ---
    still_unmatched_a = []
    for txn_a in unmatched_a:
        best = None
        for txn_b in unmatched_b:
            if amount_match(txn_a["amount"], txn_b["amount"], tolerance) and \
                    date_within_window(txn_a["date"], txn_b["date"], days=0):
                best = txn_b
                break
        if best:
            matched.append({
                "source_a": txn_a,
                "source_b": best,
                "confidence": 0.97,
                "match_pass": 1,
            })
            unmatched_b.remove(best)
        else:
            still_unmatched_a.append(txn_a)
    unmatched_a = still_unmatched_a
    # --- Pass 2: Exact amount + date window (default 3 days) ---
    still_unmatched_a = []
    for txn_a in unmatched_a:
        best = None
        for txn_b in unmatched_b:
            if amount_match(txn_a["amount"], txn_b["amount"], tolerance) and \
                    date_within_window(txn_a["date"], txn_b["date"], days=date_window):
                best = txn_b
                break
        if best:
            matched.append({
                "source_a": txn_a,
                "source_b": best,
                "confidence": 0.85,
                "match_pass": 2,
            })
            unmatched_b.remove(best)
        else:
            still_unmatched_a.append(txn_a)
    unmatched_a = still_unmatched_a
# --- Pass 3: Fuzzy description ---
still_unmatched_a = []
for txn_a in unmatched_a:
best = None
best_score = 0.0
for txn_b in unmatched_b:
sim = description_similarity(
txn_a["description"], txn_b["description"], txn_a.get("email", "")
)
pct_diff = abs(txn_a["amount"] - txn_b["amount"]) / max(abs(txn_a["amount"]), 0.01)
if sim > 0.4 and pct_diff < 0.05 and sim > best_score:
best = txn_b
best_score = sim
if best and best_score > 0.4:
matched.append({
"source_a": txn_a,
"source_b": best,
"confidence": round(0.60 + best_score * 0.19, 2),
"match_pass": 3,
})
unmatched_b.remove(best)
else:
still_unmatched_a.append(txn_a)
unmatched_a = still_unmatched_a
# --- Build report ---
total_a = sum(t["amount"] for t in source_a)
total_b = sum(t["amount"] for t in source_b)
return {
"summary": {
"source_a_count": len(source_a),
"source_b_count": len(source_b),
"matched_count": len(matched),
"unmatched_a_count": len(unmatched_a),
"unmatched_b_count": len(unmatched_b),
"source_a_total": round(total_a, 2),
"source_b_total": round(total_b, 2),
"difference": round(total_a - total_b, 2),
},
"matched": [
{
"source_a_id": m["source_a"]["id"],
"source_b_id": m["source_b"]["id"],
"amount": m["source_a"]["amount"],
"date": m["source_a"]["date"],
"confidence": m["confidence"],
"match_pass": m["match_pass"],
}
for m in matched
],
"unmatched_source_a": [
{"id": t["id"], "amount": t["amount"], "date": t["date"], "description": t["description"]}
for t in unmatched_a
],
"unmatched_source_b": [
{"id": t["id"], "amount": t["amount"], "date": t["date"], "description": t["description"]}
for t in unmatched_b
],
"flags": [],
}
def add_flags(report: dict) -> dict:
"""Add human-readable flags for items needing attention."""
flags = []
if report["summary"]["difference"] != 0:
flags.append({
"severity": "warning",
"message": f"Total difference: ${report['summary']['difference']:.2f}",
})
for item in report["unmatched_source_a"]:
flags.append({
"severity": "error",
"message": f"No match in source B for {item['id']} (${item['amount']:.2f} on {item['date']})",
})
for item in report["unmatched_source_b"]:
flags.append({
"severity": "error",
"message": f"No match in source A for {item['id']} (${item['amount']:.2f} on {item['date']})",
})
for m in report["matched"]:
if m["confidence"] < 0.80:
flags.append({
"severity": "warning",
"message": f"Low confidence match ({m['confidence']}) between {m['source_a_id']} and {m['source_b_id']}",
})
report["flags"] = flags
return report
def main():
parser = argparse.ArgumentParser(description="Reconcile two CSV transaction exports.")
parser.add_argument("source_a", help="Path to first CSV (e.g., Stripe export)")
parser.add_argument("source_b", help="Path to second CSV (e.g., QuickBooks export)")
parser.add_argument("-o", "--output", help="Output JSON file path")
parser.add_argument("--dry-run", action="store_true", help="Print summary only, do not write file")
parser.add_argument("--tolerance", type=float, default=0.01, help="Amount match tolerance in dollars")
parser.add_argument("--date-window", type=int, default=3, help="Date window in days for Pass 2")
args = parser.parse_args()
print(f"Loading {args.source_a}...")
txns_a = load_csv(args.source_a, parse_stripe_row)
print(f" {len(txns_a)} transactions loaded")
print(f"Loading {args.source_b}...")
txns_b = load_csv(args.source_b, parse_quickbooks_row)
print(f" {len(txns_b)} transactions loaded")
print("Running reconciliation...")
report = reconcile(txns_a, txns_b, tolerance=args.tolerance, date_window=args.date_window)
report = add_flags(report)
# Print summary
s = report["summary"]
print(f"\n--- Reconciliation Summary ---")
print(f"Source A: {s['source_a_count']} transactions, total ${s['source_a_total']:.2f}")
print(f"Source B: {s['source_b_count']} transactions, total ${s['source_b_total']:.2f}")
print(f"Matched: {s['matched_count']}")
print(f"Unmatched (A): {s['unmatched_a_count']}")
print(f"Unmatched (B): {s['unmatched_b_count']}")
print(f"Difference: ${s['difference']:.2f}")
if report["flags"]:
print(f"\n--- Flags ({len(report['flags'])}) ---")
for flag in report["flags"]:
severity = flag["severity"].upper()
print(f" [{severity}] {flag['message']}")
if args.dry_run:
print("\nDry run complete. No file written.")
return
output_path = args.output or "reconciliation_report.json"
    # The report contains only normalized fields (no raw rows), so it serializes cleanly
    with open(output_path, "w") as f:
        json.dump(report, f, indent=2)
print(f"\nReport written to {output_path}")
if __name__ == "__main__":
main()
Save this as reconcile.py. Run it in dry-run mode first -- look before you leap:
python reconcile.py stripe_march.csv quickbooks_march.csv --dry-run
Output:
Loading stripe_march.csv...
6 transactions loaded
Loading quickbooks_march.csv...
7 transactions loaded
Running reconciliation...
--- Reconciliation Summary ---
Source A: 6 transactions, total $399.96
Source B: 7 transactions, total $-10000.04
Matched: 6
Unmatched (A): 0
Unmatched (B): 1
Difference: $10400.00
--- Flags (2) ---
[WARNING] Total difference: $10400.00
[ERROR] No match in source A for 1007 ($-10400.00 on 2026-03-08)
Dry run complete. No file written.
All six Stripe transactions matched their QuickBooks counterparts. The AWS infrastructure payment was correctly flagged as unmatched. The $10,400 difference is expected -- it is a completely different payment source, not a discrepancy. The script told you exactly what to investigate and what to ignore.
Structured Output: The Report Format
The JSON report follows a strict schema so downstream tools can consume it without guessing. Think of the schema as a contract between the reconciliation engine and everything that reads its output.
{
"summary": {
"source_a_count": "integer",
"source_b_count": "integer",
"matched_count": "integer",
"unmatched_a_count": "integer",
"unmatched_b_count": "integer",
"source_a_total": "number",
"source_b_total": "number",
"difference": "number"
},
"matched": [
{
"source_a_id": "string",
"source_b_id": "string",
"amount": "number",
"date": "string (ISO 8601)",
"confidence": "number (0.0-1.0)",
"match_pass": "integer (1-3)"
}
],
"unmatched_source_a": [
{
"id": "string",
"amount": "number",
"date": "string",
"description": "string"
}
],
"unmatched_source_b": [],
"flags": [
{
"severity": "error | warning | info",
"message": "string"
}
]
}
Why does a strict schema matter? Without one, you get freeform text that a human can read but a machine cannot. With one, you can build reliable automation on top: a script reads the report, creates Jira tickets for every error-severity flag, and posts a summary to Slack. The report becomes a gear in a larger machine, not a dead-end document.
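Here is a sketch of that kind of downstream consumer. The print is a stand-in for whatever notifier you actually use, and the filename assumes the script's default output path:

```python
import json
from pathlib import Path

def error_flags(report: dict) -> list[str]:
    """Pull the messages of all error-severity flags out of a reconciliation report."""
    return [f["message"] for f in report.get("flags", []) if f["severity"] == "error"]

# If a report exists on disk, surface every error flag
report_file = Path("reconciliation_report.json")
if report_file.exists():
    for msg in error_flags(json.loads(report_file.read_text())):
        # Swap this print for a Jira API call or a Slack webhook
        print(f"NEEDS REVIEW: {msg}")
```

Because the schema is fixed, this consumer never has to guess where the errors live.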
When using an AI agent to generate this report (instead of the deterministic script), instruct it to output only valid JSON matching this schema. In your CLAUDE.md:
## Reconciliation Output Rules
When generating a reconciliation report, output ONLY valid JSON.
Follow this exact schema -- no additional fields, no missing fields:
- summary: object with source_a_count, source_b_count, matched_count,
unmatched_a_count, unmatched_b_count, source_a_total, source_b_total, difference
- matched: array of objects with source_a_id, source_b_id, amount, date, confidence, match_pass
- unmatched_source_a: array of objects with id, amount, date, description
- unmatched_source_b: array of objects with id, amount, date, description
- flags: array of objects with severity (error/warning/info) and message
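Trust but verify: a small validator can reject agent output that drifts from the schema before it reaches downstream tools. A sketch whose key lists mirror the schema above:

```python
import json

REQUIRED_TOP = {"summary", "matched", "unmatched_source_a", "unmatched_source_b", "flags"}
REQUIRED_SUMMARY = {"source_a_count", "source_b_count", "matched_count",
                    "unmatched_a_count", "unmatched_b_count",
                    "source_a_total", "source_b_total", "difference"}

def validate_report(text: str) -> list[str]:
    """Return a list of schema problems; an empty list means the report is valid."""
    try:
        report = json.loads(text)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e}"]
    problems = []
    missing = REQUIRED_TOP - report.keys()
    if missing:
        problems.append(f"missing top-level keys: {sorted(missing)}")
    summary = report.get("summary", {})
    if isinstance(summary, dict):
        missing = REQUIRED_SUMMARY - summary.keys()
        if missing:
            problems.append(f"missing summary keys: {sorted(missing)}")
    return problems
```

Run the validator on every agent response and re-prompt on failure rather than letting a malformed report into your pipeline.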
Handling Messy Real-World Data
Production CSV exports are never clean. They arrive with missing columns, duplicate rows, mixed currencies, and creative interpretations of what a "refund" means. Here are the patterns the script handles and how to extend it.
Missing columns. QuickBooks exports sometimes omit the Amount column for journal entries, leaving it blank. The parser defaults to 0.0 and the reconciliation flags it as unmatched. Better to surface a weird row than silently drop it. Think of it as a metal detector beeping -- you investigate, even if it turns out to be a coin.
Duplicate transactions. Stripe retries can create duplicate payment_intent IDs with different charge IDs. The script matches sequentially, so the first occurrence matches and the duplicate falls to the unmatched list. Check your unmatched items for duplicates before investigating further.
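One quick way to spot retry duplicates in the unmatched list is to group by amount and date. A sketch over the script's unmatched output format:

```python
from collections import Counter

def likely_duplicates(unmatched: list[dict]) -> list[tuple]:
    """Return (amount, date) pairs that appear more than once in the unmatched list."""
    counts = Counter((t["amount"], t["date"]) for t in unmatched)
    return [key for key, n in counts.items() if n > 1]

print(likely_duplicates([
    {"amount": 49.99, "date": "2026-03-07"},
    {"amount": 49.99, "date": "2026-03-07"},   # retry created a second row
    {"amount": 99.99, "date": "2026-03-08"},
]))  # [(49.99, '2026-03-07')]
```

Anything this surfaces is worth checking against Stripe's dashboard before treating it as a real discrepancy.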
Multi-currency. The sample script assumes USD. For multi-currency reconciliation, normalize all amounts to a base currency using the exchange rate from the transaction date. Add a currency field to the match criteria -- two transactions with the same dollar amount but different currencies are not a match. They are coincidences.
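A currency-aware check might look like this sketch -- the rate table is a hypothetical stand-in for a real FX source keyed by transaction date:

```python
# Hypothetical fixed rates to USD -- in practice, look these up by transaction date
RATES_TO_USD = {"usd": 1.0, "eur": 1.08, "gbp": 1.27}

def to_usd(amount: float, currency: str) -> float:
    """Normalize an amount to the base currency (USD here)."""
    return round(amount * RATES_TO_USD[currency.lower()], 2)

def currency_aware_match(a: dict, b: dict, tolerance: float = 0.01) -> bool:
    """Different currencies never match; same currency compares normalized USD amounts."""
    if a["currency"].lower() != b["currency"].lower():
        return False
    return abs(to_usd(a["amount"], a["currency"]) - to_usd(b["amount"], b["currency"])) <= tolerance
```

The hard currency check comes first on purpose: it is the cheap test that rules out the coincidences before any arithmetic runs.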
Partial refunds. Stripe records partial refunds as separate negative transactions. QuickBooks might adjust the original transaction amount instead. The fuzzy matching pass handles this: description similarity catches the customer name, and the 5% amount tolerance accommodates the difference between original and adjusted amounts.
Scaling: Batch Processing Hundreds of Transactions
The deterministic script handles thousands of rows in seconds. But when you want the AI agent to analyze the hard cases -- ambiguous descriptions, unusual amounts, multi-step refund chains -- you need a cost-effective approach.
Process transactions in batches of 50-100 rows. Send each batch to the agent with the unmatched items from the previous batch carried forward. This keeps context windows manageable and prevents token costs from exploding on large datasets. Like eating an elephant: one bite at a time.
# Split a large CSV into 100-row chunks, keeping the header on every chunk
head -n 1 stripe_full_year.csv > header.csv
tail -n +2 stripe_full_year.csv | split -l 100 - chunk_
# Process each chunk, carrying forward unmatched items
for chunk in chunk_*; do
  cat header.csv "$chunk" | claude -p "Reconcile these Stripe transactions against quickbooks_2026.csv. \
Here are previously unmatched items: $(cat unmatched_carry.json). \
Output JSON per the reconciliation schema." > "report_${chunk}.json"
done
For most reconciliation tasks, the deterministic script is the right tool. Reserve the AI agent for the hard cases: the 5-10% of transactions the script flags as unmatched or low-confidence. This hybrid approach gives you the speed of a script for the easy 90% and the reasoning ability of an agent for the ambiguous remainder. Use the right tool for the right job.
The CLAUDE.md Workflow Block
Add this to your project's CLAUDE.md for one-command reconciliation:
## Transaction Reconciliation Workflow
When I ask to reconcile transactions:
### Input
- I will provide two CSV file paths
- Identify the source system for each file from column headers
- Supported formats: Stripe, QuickBooks, Xero, bank CSV exports
### Processing
1. Load both files and detect column mappings automatically
2. Normalize: convert amounts to decimal dollars, dates to ISO 8601, descriptions to lowercase
3. Run three-pass matching: exact amount+date, amount+date window, fuzzy description
4. Calculate confidence scores for each match
5. Flag all unmatched transactions and low-confidence matches
### Output
- Print a summary table to stdout
- Write detailed JSON report to reconciliation_report.json
- List every flag with severity and recommended action
- If --dry-run, print summary only
### Rules
- Never silently drop transactions. Every input row must appear in matched or unmatched.
- Round all amounts to 2 decimal places.
- Treat negative amounts as refunds, not errors.
- If both files have identical row counts and all match at 0.95+, print "Clean reconciliation" and exit.
Run it:
claude -p "reconcile stripe_march.csv against quickbooks_march.csv --dry-run"
The agent reads both files, applies the matching logic, and prints the summary. Remove --dry-run to write the full report.
The Split-Pane Advantage
Reconciliation is inherently a side-by-side task. You are always comparing two data sources and asking: do these match?
The productive layout is three panes. Left pane: the Stripe CSV open in less or csvlook, scrolled to a flagged transaction. Right pane: the QuickBooks CSV, scrolled to the corresponding row. Center pane: the AI agent session, where you ask follow-up questions about specific discrepancies.
When the agent flags a low-confidence match -- say, confidence 0.72 between pi_3Ox1a2E and QuickBooks row 1004 -- you glance left and see the Stripe description ("Enterprise - Annual"). You glance right and see the QuickBooks memo ("Enterprise annual - dave"). Match confirmed in two seconds. No window switching. No losing your place in a 900-row spreadsheet. No "wait, which tab was that in?"
For monthly reconciliation runs, save this layout as a workspace. Next month, open the workspace, drop in the new CSV files, run the same command. The entire session -- data, agent, report -- is visible on one screen.
Recap
Transaction reconciliation is data normalization plus pattern matching. The hard parts -- inconsistent formats, timezone mismatches, free-text descriptions that follow no convention -- are exactly what AI agents handle well. The deterministic script covers the easy 90%. The AI agent covers the ambiguous 10%. The structured JSON output makes everything machine-readable and auditable.
The workflow: export two CSVs, run reconcile.py for the fast pass, hand the flagged items to the agent for analysis, review the final report. A thousand transactions drop from hours of manual spreadsheet squinting to under five minutes.
For ongoing reconciliation, wire the script into a monthly cron job and let the agent process only the exceptions. Month-end close becomes a review task instead of a matching task. You spend your Friday evening doing something other than staring at spreadsheets.
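A minimal crontab entry for that monthly run might look like this (paths and schedule are assumptions -- adjust to your setup; note the escaped % signs, which cron otherwise treats as newlines):

```shell
# Run reconciliation at 07:00 on the 1st of every month, keeping a dated report
0 7 1 * * cd /home/you/books && python reconcile.py stripe_latest.csv quickbooks_latest.csv -o "report_$(date +\%Y-\%m).json"
```

Pair it with a notifier that only pings you when the report contains error-severity flags, and the quiet months cost you nothing.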