The toolbox

The stack we build with.

Tools are not the work. The work is closing the gap between your ad click and your closed deal. The toolbox is the leverage that lets one operator do it for SMB prices, on your existing stack, in eight weeks. This page is the working list.

On this page — Applications, each grouping its Tools

What's new — May 2026
How we pick — three tests
Application — Marketing Engine Pilot
Application — Articulate Voice
Application — Hermes Agent (24/7 substrate) · full page
Application — Distribution backbone
Application — Content production
Application — Build surface
Application — Research surface
Application — Operator infrastructure
What we deliberately don't use
Policy on finding what's best

What's new — May 2026

Nine additions in one pass, grouped by where they sit in the stack. The detail is in the relevant section further down — these are the headlines.

Models — three moves

Stronger character work, a second frontier, a landmark we won't ship

Higgsfield Soul Cinema, unlimited tier. Upgraded from Plus on 20 May. Soul Cinema is now the default model for character-scene work — composite multi-character scenes that were impossible at the Plus tier are now routine. Nano Banana Pro retired for character work (magazine-spread bias).

OpenAI API — registered as the second-opinion frontier. Tier 1, GPT-5.5 and the o-series. Sits inside the taste-to-skill pipeline as the dual-frontier debias against Anthropic Opus. We are an Anthropic-first stack; the OpenAI key earns its place by catching what one model alone misses.

Sintra.ai — logged as a landmark, not a tool we use. Every SMB buyer in the next eighteen months will mention it. We needed a thirty-second answer that explains why MEP is a different shape. The full review is on the record — one out of nine on the internal Hype Radar; consumer-tier, credit-capped, no MCPs, no workflows, no agents. The commercial-affiliate angle is a separate question filed in proposals.

Network and storage — two additions

The operator can now reach their own machines, and lose them safely

Tailscale mesh, wired 24 May. Solves a binding constraint that has lived in our hard-rules file for weeks: the agent sandbox could not reach the always-on Mac mini at its LAN address. Every mini deploy was a paste-block out to the operator's laptop. Tailscale gives every node a stable mesh name. The sandbox now SSHes into the mini and runs the deploy itself. It joins on a reusable, ephemeral, tagged auth key, so dead sandboxes evaporate from the mesh.

Backblaze B2, off-mini backup. Single bucket, region-pinned, application key scoped read-and-write to one bucket only — blast radius bounded. rclone mirrors the curated archive and the docker configs from the mini. Roughly fifty-four cents a month over the free tier for a hundred gigabytes. The mini is no longer the single point of failure for anything we wouldn't want to redo from scratch.
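The fifty-four-cent figure can be checked with a few lines of arithmetic. This is a sketch in Python, not a billing tool. The storage rate ($0.006 per GB per month) and the 10 GB free allowance are assumptions brought in for the check, not figures stated on this page, and the function name is illustrative. Egress and transaction charges are left out.

```python
from decimal import Decimal

# Assumed figures, not stated on this page: a storage rate of
# $0.006 per GB per month and a 10 GB free allowance.
RATE_PER_GB_MONTH = Decimal("0.006")
FREE_TIER_GB = Decimal("10")


def monthly_storage_cost(stored_gb: int) -> Decimal:
    """Return the monthly storage charge in dollars for stored_gb gigabytes."""
    # Only the gigabytes above the free allowance are billable.
    billable_gb = max(Decimal(stored_gb) - FREE_TIER_GB, Decimal("0"))
    return (billable_gb * RATE_PER_GB_MONTH).quantize(Decimal("0.01"))


# 100 GB stored means 90 billable GB at the assumed rate.
print(monthly_storage_cost(100))
```

Under those assumptions a hundred gigabytes comes to 0.54, which matches the figure above.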

Distribution — X is now a first-class surface

Wired through a Worker, gated by a brand-voice skill

X / Twitter API, pay-per-use. Half a cent per read, one cent per post, no monthly minimum. Account is @anthony71booth. Two Workers: articulate-x-publish live since 24 May, OAuth 1.0a signed, 280-character validated, bearer-token gated; articulate-x-read queued as the panel-of-experts inflow side. The Worker holds the signing key once so every other script just POSTs JSON.

post-to-x skill — the brand-voice gate in front of the Worker. The Worker accepts any bearer-authed JSON. The skill is the only thing that runs the draft through the de-Claudification rules first. Never POST direct to the Worker from an agent session — always through the skill. Stops the "Claude voice leaks onto the timeline" failure mode at source.
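The order of operations matters more than the code, but a short sketch makes it concrete. This is Python for illustration only and assumes a lot: the banned phrases are placeholders, the {"text": ...} body shape is a guess at what the Worker accepts, and every function name is hypothetical. The real de-Claudification rules are not reproduced here. The length check uses plain len(), which is a simplification of X's weighted character counting.

```python
import json

MAX_POST_CHARS = 280  # the limit the Worker validates against

# Hypothetical placeholder phrases; the real rules live in the skill.
BANNED_PHRASES = ("delve into", "it's worth noting")


class DraftRejected(ValueError):
    """Raised when a draft fails the gate and must not reach the Worker."""


def gate_draft(draft: str) -> str:
    """Return the trimmed draft, or raise DraftRejected."""
    text = draft.strip()
    if not text:
        raise DraftRejected("empty draft")
    # Simplification: len() is only a rough proxy for X's own counting.
    if len(text) > MAX_POST_CHARS:
        raise DraftRejected(f"{len(text)} characters, limit is {MAX_POST_CHARS}")
    lowered = text.lower()
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            raise DraftRejected(f"banned phrase: {phrase!r}")
    return text


def build_publish_request(draft: str, bearer_token: str) -> dict:
    """Build, but do not send, the JSON request for the publish Worker."""
    text = gate_draft(draft)  # the gate always runs before anything is built
    return {
        "method": "POST",
        "headers": {
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
        # Assumed body shape; the Worker's real contract may differ.
        "body": json.dumps({"text": text}),
    }
```

The design point is that there is no code path that builds a request without passing through gate_draft first. A rejected draft raises before a token is ever attached.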

Cloudflare and skills — three smaller landings

Closing placeholders and adding the discipline layer

Cloudflare API token captured. Workers-compute family scope only — KV, R2, Pages, Scripts, Routes, Tail, Builds, Containers, Observability, plus account and user reads. No DNS-edit, no zone-write. Closes the placeholder that's been on the credentials page for two weeks. Worker deploys now run end-to-end from the sandbox.

film-still-composite skill — the discipline layer on top of Soul Cinema. Higgsfield's unlimited tier made composite multi-character scenes routine. This skill wraps every composite-scene brief with the prompt-anchor library prepend and the Roger-discipline lighting/framing/lens register, so the brief stays cinematographer-grade rather than Higgsfield-default. Belinda and Dylan are the standing approval gate.

post-to-x skill — see above. Listed twice because it earns its place in both buckets: a new distribution capability and a new discipline gate.

How we pick

Every tool earns its place on three tests.

Does it do the job in one tool, or three? One beats three.
Does it run on your stack after we hand over, or does it lock you into us? Hand-over beats lock-in.
Is it the best in its category right now, or the most familiar? Best beats familiar.

Test three is the discipline most operators fail. Familiarity is comfortable. The category moves every quarter.

The stack — Applications, each grouping its Tools

Tools alone don't ship anything. Applications do — the assemblies of tools, models, and skills that produce a specific outcome for a client. This page reads top-down: each Application explains what it does and what's underneath it. The same tool can appear under more than one Application — that's a feature, not a duplication. A capable Anthropic key, for example, lives under Voice, MEP, and Build.

Application — flagship offer

Marketing Engine Pilot

Eight-week install — bridges close the gap between an ad click and a closed deal, on the client's existing stack

The productised offer. Five MEP recipes ship as Cloudflare Workers (HTTP-shaped) plus optional Hermes skills (conversation- and daemon-shaped) — the client owns both at week eight. No no-code platform between us and the client's stack. No lock-in beyond DNS, which is portable anyway.

Cloudflare Workers

Automation substrate

Five MEP recipes — lead routing, CRM enrich, ad-spend monitor, form-to-Slack, invoice nudge — all ship as Workers, deployed via Wrangler to the client's own free Cloudflare account at handover.

Hermes Agent

Conversational + daemon backend

Paired with Workers. Anything WhatsApp-shaped, memory-shaped, or self-learning lives here. Full page →

Anthropic API

Inference

Opus / Sonnet / Haiku picked per task. Most MEP recipes use Haiku for triage, Sonnet for drafts, Opus only where reasoning is the work.
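The per-task pick can be as plain as a lookup table. A minimal sketch in Python, assuming three hypothetical task labels and a Sonnet fallback for anything unlabelled. The tier names come from the text above; the labels, the fallback, and the function name are illustrative, and real model identifiers are deliberately left out.

```python
# Task kinds are hypothetical labels; the tier names come from the page.
MODEL_DIAL = {
    "triage": "haiku",     # fast and cheap
    "draft": "sonnet",     # the daily driver
    "reasoning": "opus",   # only where reasoning is the work
}

# Assumption: an unlabelled task falls back to the daily driver.
DEFAULT_TIER = "sonnet"


def pick_tier(task_kind: str) -> str:
    """Return the model tier for a task kind, falling back to the default."""
    return MODEL_DIAL.get(task_kind, DEFAULT_TIER)


for kind in ("triage", "draft", "reasoning", "unknown"):
    print(kind, "->", pick_tier(kind))
```

Keeping the dial in one table means a recipe never hard-codes a tier, so moving a task up or down is a one-line change.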

OpenRouter

Provider router

One key fans out to 100+ models. Used when a tenant needs a non-Anthropic provider for data-residency or cost-shape reasons.

Cloudflare D1 / KV / R2

State + storage

Small relational state in D1, key-value in KV, files in R2. Region-pinnable for PDPL-strict clients.

GitHub repo + Wrangler

Hand-over substrate

One repo per engagement — articulate-mep-<client>. wrangler deploy moves the Workers to the client's account in 20–60 minutes at week 7.

Application — productised surface layer

Articulate Voice — "AI worth talking to"

Phone-native, voice-first, brand-tuned conversational interface — lives in WhatsApp by default, anywhere by configuration

A six-layer stack that turns any AI product, agent, or website into a channel-native voice surface. The Roche-Debbie pilot is the canonical build; BossCouple's WhatsApp agent is the second deployment. Each layer is configurable per tenant — channel, routing, voice-in, brain, context, voice-out.

WhatsApp (Baileys bridge)

Layer 1 — channel

Bot phone number with allowlisted users. Voice notes auto-transcribed; TTS replies sent as MP3 attachments. iMessage, Signal, Telegram all map to the same shape.

ElevenLabs Scribe

Layer 3 — voice in

Transcribes `.opus` voice notes on the way in. Multilingual. Same Creator-plan API key as TTS.

Claude / OpenRouter

Layer 4 — brain

Per-persona system prompt, per-deployment model dial. The mentor for Debbie sounds like Constance; the BossCouple agent sounds like the brand.

Vault + Gmail + M365 + CRM

Layer 5 — context

Per-tenant context loader. The bot reads what it needs to know about the user's life and work before it replies.

ElevenLabs TTS — per-persona voice ID

Layer 6 — voice out

Brand-aligned voice ID per persona. Belinda gate before any voice ID locks. Dylan owns the prompt anchor library per brand.

ElevenLabs Conversational AI

Trial — alternative full-stack path

Streaming voice agent with turn-detection + interruption. On trial as the alternative to self-stitching layers 3–6. Replaces the multi-tool path when low-latency conversation is the requirement.

Hermes Agent on Hostinger

Runtime + memory

The 24/7 substrate. Memory across sessions, self-learning skill library, Curator. Full page →

Application — 24/7 substrate

Hermes Agent

The persistent-daemon agent runtime — pairs with Claude Code (interactive) for everything always-on

Open-source from Nous Research. MIT-licensed. Replaces n8n in the MEP automation pair. Where Claude Code is the interactive build surface, Hermes is the 24/7 daemon — WhatsApp pipeline, self-learning skill loop, Curator-managed library, persistent context per user. BossCouple is pilot #1. Full deploy spec, costs, training loop and risks on the standalone page.

Read the full page →

Hermes runtime (Nous Research)

The daemon

Persistent background process with cron + event triggers. Same SKILL.md format as Anthropic / agentskills.io, so skills are portable in both directions.

Hostinger KVM 2

VPS host

$8.99/mo · 2 vCPU · 8 GB RAM · UAE-reachable PoP. $215.76 over 24 months upfront. The named substrate.

OpenRouter

LLM router

Default model layer for Hermes. One key, 100+ models. Per-call provider selection for cost-shaping or data-residency.

Baileys WhatsApp bridge

Conversation gateway

Dedicated bot number, allowlisted users, conversation history persisted. Strangers get silence. UAE prepaid SIM ~AED 50.

ElevenLabs Scribe

Voice notes in

Auto-transcription path for inbound voice messages. Same Creator plan key.

The Curator

Skill library hygiene

Autonomous 7-day pass that grades, consolidates, and archives agent-created skills. Snapshots before mutating; rollback is one command.

MEMORY.md + USER.md

Pinned context

2,200 + 1,375 chars in every session prompt. Plus FTS5 SQLite search across every past conversation.

Optional — Ollama on the VPS

Data-sovereign inference

For clients who want "no third-party inference, full stop." Quality drops vs frontier; requires KVM 4 minimum for usable Llama 70B.

Application — owning the surface

Distribution backbone

Worker-mediated publishing, brand-voice-gated, one bearer token per surface

X went live as a first-class distribution surface on 24 May — the first of a Worker-plus-skill pattern that generalises to every social channel. The Worker holds the signing key; the skill holds the brand-voice rules. Nothing posts without both.

X / Twitter API

Pay-per-use

$0.005/read, $0.01/post. No monthly minimum. Account is @anthony71booth. App is articulate-radar-reader.
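Pay-per-use is easiest to reason about with a quick estimate. A sketch in Python using the two prices above; the monthly volumes in the example are hypothetical, not our actual usage, and the function name is illustrative.

```python
from decimal import Decimal

READ_PRICE = Decimal("0.005")  # dollars per read, from the page
POST_PRICE = Decimal("0.01")   # dollars per post, from the page


def monthly_x_cost(reads: int, posts: int) -> Decimal:
    """Return the estimated monthly API bill in dollars."""
    return reads * READ_PRICE + posts * POST_PRICE


# Hypothetical month: 1,000 reads and 60 posts.
print(monthly_x_cost(1000, 60))
```

Decimal is used instead of float so half-cent prices add up exactly.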

articulate-x-publish Worker

Outbound — live 24 May

OAuth 1.0a signed. 280-char validated. Bearer-token gated. Returns the X response transparently for debugging.

articulate-x-read Worker

Inbound — queued

Panel-of-experts inflow. Pulls quoted timelines, scrapes specific accounts, feeds the read-side of the content engine. Build trigger: first panel curated.

post-to-x skill

Brand-voice gate

The only thing that runs drafts through de-Claudification before making the bearer-authed call to the Worker. Stops "Claude voice on the timeline" at source.

LinkedIn (manual today)

Anthony's personal

Three-posts-a-week target. A Worker on the same Worker-plus-skill pattern is queued for when volume justifies it.

Cloudflare Tunnel

Public surfaces from the mini

Carries gigreels.articulate-ai.work, famflix.articulate-ai.work, plex.articulate-ai.work. Internet-in routing without exposing the home gateway.

Cloudflare Pages

Static deploy target

For one-pagers and microsites that don't need a Mac mini. Each engagement's case-study page sits here.

Application — what we ship to clients

Content production

The discipline layer + the model layer + the composition layer, in that order

Tools are not the work. The brief is the work. These tools earn their place because they let one operator hold the brief from concept through to ship without losing register. Hard lines: never AI-generate a real property or a third-party brand mark. Real-named likenesses: photography preferred per engagement.

Higgsfield — Soul Cinema, unlimited

Character video + composite stills

Upgraded from Plus on 20 May. Soul Cinema is the default for character-scene work. Composite multi-character scenes that were impossible at the Plus tier are now routine.

Leonardo.ai

Brand stills, hero imagery, ad creative

Custom Element training locks a house style once — every future image lands in the same visual register. Used for everything that isn't a real face, property, or brand-mark.

Canva (via MCP)

Composition + type layouts

Where the type sits on the image. Brand-kit colours, multi-slide layouts, the carousels themselves.

ElevenLabs TTS

Narration + voiceover

British register, broadcast-quality. 100k chars/month covers a working content engine. Daniel + Charlie are the locked British voices.

ai-image skill (wrapper)

Brand-anchor enforcement

The skill that fails LOUDLY if a brand has no anchor library. $10/day cost gate. Belinda + Dylan standing gate before anything ships.
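Failing loudly here means raising rather than quietly falling back to a generic prompt. A minimal sketch of the two checks in Python, assuming the anchor libraries sit in a dict keyed by brand and that spend is tracked per day. The data shapes, exception names, and function name are all hypothetical; only the $10/day cap comes from the text above.

```python
from decimal import Decimal

DAILY_CAP = Decimal("10.00")  # the $10/day cost gate


class MissingAnchorLibrary(RuntimeError):
    """Raised when a brand has no anchor library to prepend."""


class DailyCapExceeded(RuntimeError):
    """Raised when a generation would push spend past the daily cap."""


def check_generation(brand, anchor_libraries, spent_today, estimated_cost):
    """Return the brand's anchors if both gates pass, otherwise raise."""
    anchors = anchor_libraries.get(brand)
    if not anchors:
        # No silent default: a missing or empty library stops the run.
        raise MissingAnchorLibrary(f"no anchor library for brand {brand!r}")
    if spent_today + estimated_cost > DAILY_CAP:
        raise DailyCapExceeded(
            f"{spent_today + estimated_cost} would exceed the {DAILY_CAP} cap"
        )
    return anchors
```

Both checks run before any image call is made, so a failure costs nothing. The human approval gate sits after this and is not modelled here.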

film-still-composite skill

Roger-discipline composite scenes

Wraps Soul Cinema with the cinematography brief — lighting, framing, lens register — so composites stay Roger-grade rather than Higgsfield-default.

get-image skill

Universal image sourcing

One brief → 10–50 candidates across Pinterest, Unsplash, Pexels, Pixabay, Adobe Stock. Live moodboard artifact. $0 baseline.

tufte-viz skill

Data-viz discipline

Auto-fires on chart-design or chart-critique. Loads Tufte's full canon — data-ink ratio, lie factor, small multiples. Every chart for an external surface runs the Tufte test first.

Application — where the work gets done

Build surface — agents, models, plumbing

The interactive runtime + the model dial + the MCP layer that ties them together

Claude Code is the daily driver — files, commands, MCPs, persistent project context. Cowork is the same model family wearing a Word-doc-shaped hat. The MCP layer is the bridge from "the agent thinks" to "the agent acts on real systems."

Claude Code (CLI)

Terminal agent

Reads files, writes files, runs commands, calls connected services. Holds context across a project for as long as the project lives. Lived in daily.

Claude Desktop, Cowork mode

Documents and decks

Same model family, different shape. Better for the things that have to look right on a screen — offer one-pagers, audit reports, fit-call follow-ups, slide decks.

Opus / Sonnet / Haiku

The model dial

Opus for hard reasoning. Sonnet for the daily driver. Haiku for fast and cheap. We pick per task. The same model for everything is the wrong default.

OpenAI API

Second-opinion frontier

GPT-5.5 + o-series. Used by the dual-frontier debias step in the taste-to-skill pipeline. We are Anthropic-first by conviction; the OpenAI key catches what one frontier alone misses.

MCP layer

The plumbing

Gmail, Drive, Calendar, Apple Notes, HubSpot, PayPal, Stripe, Square, Slack, Notion, Atlassian, M365, Box, Gong, Granola, Klaviyo, Figma, Canva, Adobe, Spotify, Chrome, Vercel, plus PDF + shell. Some installed, some on demand.

Skills and plugins

Installable capability

brand-voice, marketing, small-business, PDF tools, deploy, deliverables-scaffold, tufte-viz, ai-image, film-still-composite, post-to-x. A new skill is a one-file commit.

Scheduled tasks

Cron inside Claude

Recurring agent runs that fire while the agent is open. Interactive work; for always-on, see Hermes.

Next.js 15 on Vercel

The site stack

React 19, TypeScript, Tailwind v4, Geist. Single-file deploys for client landers. Full Next.js for routing or data. Client owns the repo.

Application — fetch, parse, browse, watch

Research surface

The fan-out layer for "what's actually out there"

WebSearch alone returned surface-level snippets and missed the live page. The current combination solves it: Firecrawl for deep fetch and JSON extraction; Claude in Chrome for login-walled or JavaScript-heavy pages; Granola for WhatsApp call capture; iMazing for historic-bulk WhatsApp extraction.

Firecrawl

Deep fetch + JSON extraction

1,000 pages/month free. firecrawl_scrape turns a single URL into clean markdown. firecrawl_search returns ranked multi-source results with optional inline scrape. Confirmed live 24 May.

Exa (queued)

Semantic search

1,000 requests/month free. Best for "find me thinkers semantically near X." Not installed yet; install trigger is the first brief where named-target Firecrawl isn't enough.

WebSearch (native)

Cheapest first check

Built into Cowork. Cheapest "is X a known fact" surface. Not the right tool for live-page research — that's Firecrawl's job.

Claude in Chrome

JavaScript-rendered + login-walled

Browser agent. The 10% of the web that has no MCP — government portals, broker dashboards, anything client-rendered.

whatsapp-mcp (lharries)

WhatsApp text + voice notes

whatsmeow-based local bridge. SQLite DB never leaves the Mac. MCP tools surfaced to Claude: list_messages, get_chat, send_message, search_contacts, download_media.

Granola

WhatsApp call audio capture

System-audio capture of WhatsApp calls on Mac. Native transcription. Lands at Clients/<client>/calls/ auto-routed by the Haiku classifier.

iMazing

Historic WhatsApp bulk

One-shot per client. Reads iPhone backup, exports full WhatsApp DB to CSV (~$45 one-time licence). BossCouple Resurrect is the open job.

Application — what the operator owns + runs

Operator infrastructure

The mini, the mesh, the backups, the push channel

Articulate runs on infrastructure Anthony owns end-to-end, not on a fragile chain of someone else's SaaS. One always-on Mac mini, one mesh VPN, one off-host backup, one push channel to the phone in his pocket. Everything else is a SaaS that earns its place by passing the three tests at the top.

Mac mini, always-on

Local server

Review-stage builds, password-gated client deliverables, scheduled jobs that don't need a cloud bill. Caddy + cloudflared + OrbStack + launchd jobs.

Tailscale mesh

Sandbox → mini reach

Wired 24 May. Mini at cowork-mini.tailf7afc5.ts.net. Sandbox joins as an ephemeral node per session, SSHes into the mini for deploys, and evaporates when the session dies.

Backblaze B2

Off-host backup

Single bucket, region-pinned, bucket-scoped key. rclone mirrors curated-flac + docker-configs from mini. ~$0.54/mo over the free tier for 100 GB.

Cloudflare — registrar, DNS, edge

One vendor, no lock-in beyond DNS

Registrar, DNS, Universal SSL, CDN, Tunnel, Workers, Pages. API token captured 24 May — Workers-compute scope only, no DNS-edit, no zone-write.

ntfy.sh

Push to the phone

One-way push channel coworkmini-x7q9p3aw2t. Free; the topic is public, so an obscure name is the only protection. Used by mini-resident monitors and Hermes for alerts.

OrbStack

Docker on the mini

Hosts the *arr stack (Plex, Radarr, Sonarr, Lidarr, Prowlarr, Transmission) for MediaServer. Light-touch alternative to Docker Desktop.

Caddy

Static + auth-walled HTTP

Static file server with bcrypt basic_auth. Hosts the GIG case-study site at port 8000, behind the Cloudflare Tunnel.

cloudflared

Outbound tunnel client

Named tunnels from the mini to Cloudflare edge. No port-forwarding through the Etisalat home gateway. Every public-facing surface routes here.

launchd

macOS service supervisor

Owns every always-on service on the mini — Caddy, cloudflared, transmission-daemon, wifi monitor, throughput agent, Tailscale.

Sintra.ai — landmark, not in the stack

Consumer-tier comparator

Listed here as the landmark every SMB buyer mentions. One out of nine on the internal Hype Radar. The full review is on the record — credit-capped, no MCPs, no workflows, no agents.

What we deliberately don't use

No-code platforms as the system. Zapier, Make. Fine for a small bridge — wrong as the foundation. Locks you into a subscription and a builder shortage. We write code you own. n8n was on this page until 23 May as a visual-editor exception — retired entirely after Hermes replaced it as the conversational + daemon substrate. Workers + Hermes cover both columns.

Single-vendor marketing suites as the brain. HubSpot, Salesforce Marketing Cloud, the rest. We integrate with them. We do not ride them.

AI image generation as a headline tool. Useful occasionally. Never the proposition.

AI video as the deliverable. Different offer, different business. This one is the marketing engine.

Consumer-tier AI Helpers (Sintra, etc.) as the productisation answer. Real tools for solopreneurs; structurally shallow for the MEP shape. Credit caps, no MCPs, no workflows, no agents. They're a landmark on the page, not a tool in the stack.

Our policy on regularly finding what's best

The category moves fast. We move with it. Here is the schedule that holds us to it.

Every month

One peer-benchmark observation, written down. Something a competitor or peer is shipping that we are not — or vice versa. Sourced from a podcast, a conference, a LinkedIn scroll, a client conversation.

Mid-month

Cadence audit. Per tool, per skill — is the stack still firing? Anything we touched fewer than three times this month is a candidate to be retired.
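The three-touch rule is simple enough to run as a filter over a usage log. A sketch in Python, assuming the month's usage is already tallied as a name-to-count mapping. The tally, the function name, and the sample counts are hypothetical; only the threshold comes from the rule above.

```python
MIN_USES_PER_MONTH = 3  # fewer touches than this flags a candidate


def retirement_candidates(usage_counts: dict[str, int]) -> list[str]:
    """Return the tools or skills touched fewer than three times, sorted."""
    return sorted(
        name for name, uses in usage_counts.items() if uses < MIN_USES_PER_MONTH
    )


# Hypothetical tally for one month.
sample = {"firecrawl": 12, "exa": 0, "canva": 2, "granola": 3}
print(retirement_candidates(sample))
```

The output is a list of candidates, not a verdict. A tool touched twice in a quiet month may still earn its place at the quarterly review.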

Every quarter

Three hours blocked. Per tool we use: did anything better ship this quarter? If yes — run a one-week pilot. If the pilot wins, we cut over. If it loses, we log why and don't revisit for a quarter.

Every year

The whole stack gets re-justified from zero. Anything on the list has to earn its place against everything that shipped that year, not against what it replaced two years ago.

In public

Decisions go on the record. We name what we switched from, what we switched to, and what we accepted losing by the swap. Clients reading this page see the working — not just the conclusion.

The toolbox is mine. The system is yours.

You don't need to know what any of this is. You need to know the system we install in your business runs on tools your team can keep using after the eight weeks. Nothing is locked to me. Nothing is on a subscription only I can pay for. Nothing breaks if I disappear.

Last reviewed · 24 May 2026 — nine additions grouped under "What's new"

Next review · 24 June 2026