The Context
Running a strategic consulting practice with six to eight active client engagements creates a specific workflow challenge. Every meeting generates context that needs to flow to the right project. Every insight needs to be captured before it fades.
"I had an essential meeting this morning on Teams and forgot to start transcription until 10 minutes in, then it randomly stopped before the meeting was over."
The core friction was clear: manual, unreliable capture led to manual processing, which led to manual routing, which led to context switching overhead. Every step leaked time and cognitive energy away from strategic thinking.
The goal was straightforward: eliminate the babysitting. Capture every meeting automatically, process it intelligently, and route the summary to the right project without human intervention at each step.
Architecture Evolution
The path from concept to production required abandoning every initial assumption. The original design and final implementation differ substantially:
| Component | Original Design | Final Implementation |
|---|---|---|
| Transcription | Alter (local processing) | Roam (cloud-based, webhook-enabled) |
| Detection | Polling-based daemon via launchd | Folder Actions (event-driven) |
| Output | Three artifacts per meeting | Single context-aware summary per project |
| Routing | Single-client per transcript | Multi-context (one meeting → multiple projects) |
The Alter Problem
The original design assumed Alter would be the transcription engine. Local processing seemed superior: privacy preserved, no API costs, complete control.
In practice, Alter caused severe video degradation - frame rates dropping to 20fps during meetings - despite running on an M5 Pro with 32GB RAM. The local Parakeet transcription model competed for GPU resources with the video feed.
Lesson Learned
Local processing is not always superior. Cloud-based solutions can be more reliable when resource contention matters. The right architecture depends on specific constraints, not ideology.
Solution: Switched to Roam (ro.am), which processes transcription in the cloud and provides webhook APIs for automation. Zero GPU contention. Zero babysitting during meetings.
The launchd Saga
The original plan assumed macOS launchd would handle background processing reliably. A polling daemon watching for new files seemed straightforward.
Two days of debugging revealed the gap between Terminal behavior and background service behavior:
- iCloud paths behave differently for background services. The path `~/Library/Mobile Documents/com~apple~CloudDocs/` that works in Terminal does not always work for launchd daemons.
- Python environment isolation. The Python that runs in Terminal is not necessarily the same Python that launchd invokes. Packages installed via `pip3` may not be available to background services.
- Output buffering. Python buffers stdout by default. Background services showed no log output until we added the `-u` flag for unbuffered output.
- Service restart loops. The service would start, crash silently in the first polling cycle, restart, crash again - showing multiple "Transcript Processor started" messages with no actual processing.
We tried wrapper shell scripts to set environment variables, PYTHONPATH exports in launchd plists, explicit HOME directory declarations, and various path quoting strategies. None produced reliable execution.
What Finally Worked: Automator Folder Actions. Unlike launchd polling, Folder Actions run in the user's context with full environment access, handle iCloud paths correctly, and trigger immediately when files arrive.
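A Folder Action attached to `_inbox` can hand the newly arrived file paths straight to a Python entry point. The script below is an illustrative sketch, not the actual processor: the file name `process_transcript.py`, the accepted suffixes, and the `process` placeholder are all assumptions.

```python
# Hypothetical entry point (process_transcript.py) that a Folder Action's
# "Run Shell Script" step invokes with the newly added file paths as arguments.
import sys
from pathlib import Path

TRANSCRIPT_SUFFIXES = {".md", ".txt"}  # assumed transcript file types

def eligible(paths: list[str]) -> list[Path]:
    """Filter the paths the Folder Action hands over down to transcript files."""
    return [Path(p) for p in paths if Path(p).suffix.lower() in TRANSCRIPT_SUFFIXES]

def process(transcript_path: Path) -> None:
    # Placeholder for the real pipeline: send to Claude, write per-context summaries.
    print(f"Processing {transcript_path.name}")

if __name__ == "__main__":
    # Folder Actions pass every file that landed in the watched folder.
    for path in eligible(sys.argv[1:]):
        process(path)
```

Because the script runs in the user's context, it sees the same Python, packages, and iCloud paths as a Terminal session - exactly the properties launchd failed to provide.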
Final Architecture
┌─────────────────────────────────────────────────────────────────┐
│ CAPTURE LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Roam (ro.am) │
│ - Cloud transcription (no GPU contention) │
│ - Webhook API for automation │
│ - Zero babysitting during meetings │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ WEBHOOK LAYER │
├─────────────────────────────────────────────────────────────────┤
│ ngrok (static domain) → Flask receiver │
│ - Receives transcript-ready webhooks │
│ - Downloads transcript to _inbox folder │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ PROCESSING LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Folder Actions (Automator) │
│ - Watches _inbox folder │
│ - Triggers Python processor │
│ - Runs in user context (full env access) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ INTELLIGENCE LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Claude API (Sonnet) │
│ - Receives transcript + ALL client configs │
│ - Identifies substantive contexts │
│ - Creates separate summary per relevant project │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ OUTPUT LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Single Processed Transcripts/ folder │
│ Naming: [Context]_[FirstNameLastInitial]_[YYYYMMDD].md │
│ Example: MyDriveScore_AdvisorA_20260127.md │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ REVIEW LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Manual import to Claude Projects │
│ (Intentional friction - keeps human engaged with content) │
└─────────────────────────────────────────────────────────────────┘
Multi-Context Routing
The original design routed each transcript to a single client folder. This broke down immediately - a single meeting with an advisor might touch MyDriveScore funding strategy, Proven collective positioning, AI Coaching curriculum ideas, and a passing reference to a client engagement.
New Model:
- Processor sends transcript to Claude with ALL client configs
- Claude identifies which contexts have substantive content (not just passing mentions)
- Creates separate summary file for each substantive context
- All output goes to a single `Processed Transcripts/` folder
Example Output from One Meeting:
MyDriveScore_AdvisorA_20260127.md
AICoaching_AdvisorA_20260127.md
Proven_AdvisorA_20260127.md
Intentional Friction: Files must be manually imported into Claude Projects. This keeps the human engaged with the content rather than letting insights go "out of sight, out of mind." Full automation would route summaries directly, but the brief touch point of manual import ensures nothing important gets auto-filed and forgotten.
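The routing step can be sketched as a prompt builder plus a response parser. The client config descriptions and the exact prompt wording here are illustrative assumptions; the real system sends this prompt, with the full transcript, to the Claude API (Sonnet) and writes one summary file per context returned.

```python
# Sketch of the multi-context routing step: one prompt, all client configs,
# and a JSON response listing every context with substantive content.
import json

CLIENT_CONFIGS = {  # hypothetical config descriptions
    "MyDriveScore": "Funding strategy and product work for MyDriveScore.",
    "Proven": "Collective positioning for the Proven group.",
    "AICoaching": "Curriculum development for the AI Coaching practice.",
}

def build_routing_prompt(transcript: str, configs: dict[str, str]) -> str:
    """Ask the model which contexts have substantive content, not passing mentions."""
    config_block = "\n".join(f"- {name}: {desc}" for name, desc in configs.items())
    return (
        "You route meeting transcripts to consulting projects.\n"
        f"Known contexts:\n{config_block}\n\n"
        "Return JSON: a list of objects with 'context' and 'summary' fields, "
        "one per context with SUBSTANTIVE content (ignore passing mentions).\n\n"
        f"Transcript:\n{transcript}"
    )

def summaries_from_response(response_text: str) -> dict[str, str]:
    """Map each substantive context to its summary; each entry becomes a file."""
    return {item["context"]: item["summary"] for item in json.loads(response_text)}
```

The design choice worth noting: the model sees every config on every call, so "which projects does this meeting touch?" is answered once, holistically, instead of by brittle per-client matching rules.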
The Implementation Journey
What follows is the actual sequence of problems encountered and solutions discovered. This was not a single afternoon - it was two days of debugging spread across a week.
Got Roam transcription working, configured ngrok for webhook receiving, wrote Flask receiver. Each component worked in isolation. Integration began.
Flask wouldn't start. macOS AirPlay Receiver had claimed port 5000. Changed to port 8080. Trivial fix, but an hour lost to diagnosis.
Free tier ngrok assigns random domains. Every restart meant updating the webhook URL in Roam. Upgraded to Hobby plan ($10/mo) for static domain.
Roam webhooks include signature validation. Initial implementation failed silently. Discovered Roam uses Standard Webhooks format, not custom implementation. Fixed validation logic.
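The fix was implementing the Standard Webhooks scheme rather than a custom one. The sketch below follows that spec as published - a `whsec_`-prefixed base64 secret, the signed content `id.timestamp.body`, and `v1,`-prefixed HMAC-SHA256 signatures - but verify the header names against Roam's documentation before relying on it.

```python
# Standard Webhooks signature verification (the format Roam turned out to use).
import base64
import hashlib
import hmac

def verify_signature(secret: str, msg_id: str, timestamp: str, body: str,
                     signature_header: str) -> bool:
    """Recompute the v1 HMAC-SHA256 signature and compare against each candidate."""
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.{body}".encode()
    expected = base64.b64encode(
        hmac.new(key, signed_content, hashlib.sha256).digest()
    ).decode()
    # The header may carry several space-separated signatures (key rotation).
    for candidate in signature_header.split():
        version, _, sig = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(sig, expected):
            return True
    return False
```

A production receiver would also reject stale timestamps to block replay attacks; that check is omitted here for brevity.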
Two full days debugging why the polling daemon wouldn't run. iCloud paths, Python environments, output buffering, restart loops. Eventually abandoned launchd entirely.
Automator Folder Actions work in user context with full environment. Immediate file detection. No polling delays. Problem solved in 30 minutes after two days of launchd debugging.
Old launchd service was still running alongside new Folder Actions. Duplicate output files, conflicting processing. Had to hunt down and kill all zombie processes, remove abandoned plist files.
Multi-context routing working. Single output folder with smart naming. Human review step preserved. System running reliably.
Issues Encountered and Resolutions
The complete log of problems and solutions. Every issue here cost real debugging time.
| Issue | Root Cause | Resolution |
|---|---|---|
| Alter video degradation | Local GPU contention | Switched to Roam (cloud processing) |
| Port 5000 blocked | macOS AirPlay Receiver | Changed to port 8080 |
| ngrok URL changes | Free tier random domains | Upgraded to Hobby plan ($10/mo) for static domain |
| Webhook signature validation | Incorrect spec implementation | Used Standard Webhooks format |
| launchd not finding files | iCloud path resolution in background context | Switched to Folder Actions |
| Python module not found | Different Python installations | Added PYTHONPATH to shell script |
| No log output | Python stdout buffering | Added -u flag for unbuffered output |
| Service restart loops | Silent crash in polling loop | Folder Actions eliminated polling |
| Wrong client routing | First-match-wins logic | Multi-context analysis with Claude |
| Path with spaces breaking | Unquoted path in .env | Added quotes around path value |
| Folder Action not triggering | Pointed to old deleted folder | Re-added folder in Folder Actions Setup |
| Duplicate files created | Multiple processor instances | Killed all processes, removed old launchd plist |
| Path truncation errors ('/U') | Zombie launchd service with bad config | Removed abandoned service configuration |
Critical Lesson: Clean Up Failed Approaches
When iterating through solutions (launchd → wrapper scripts → Folder Actions), always fully remove the previous approach before testing the new one. Zombie processes from abandoned approaches can run for days, causing subtle bugs that only appear intermittently.
```shell
# Before declaring a new approach "working," verify:
ps aux | grep processor | grep -v grep
launchctl list | grep fontanini
# Should only see ngrok and roam-receiver, NOT transcript-processor
```
Key Principles
Put Intelligence Where Context Lives
Roam can transcribe but cannot understand client relationships. Claude can understand context but cannot capture audio. The architecture places each capability where it can perform optimally. When deciding which tool handles which step, always ask: "Where does the context live?"
Event-Driven Beats Polling
Folder Actions trigger immediately when files arrive. Polling introduces delay and complexity. The switch from launchd polling to Folder Actions eliminated an entire class of bugs. When evaluating tools, webhook support is a key differentiator.
Cloud vs. Local is Context-Dependent
Local transcription seemed superior until it degraded video quality. Cloud processing eliminated resource contention entirely. Evaluate each tool placement decision on its specific tradeoffs: privacy, latency, cost, and resource usage all factor in differently for each use case.
Let AI Determine Context
Hard-coded routing rules (title matching, participant matching) fail on edge cases. Sending the full transcript to Claude with all configs and asking "what's substantive here?" produces better results than any rule-based approach.
Intentional Friction Has Value
Full automation would route summaries directly to Projects. Manual import keeps the human engaged with content, preventing important insights from being auto-filed and forgotten.
For each automation decision, ask: "Does removing this friction also remove valuable engagement?"
Single Output Location
Routing to multiple client folders created confusion and misrouting. A single Processed Transcripts/ folder with smart naming ([Context]_[FirstNameLastInitial]_[YYYYMMDD].md) is simpler and more reliable. Before creating complex routing rules, ask: "Could a single well-organized location serve this need better?"
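The naming convention is simple enough to capture in one function. This is a sketch; the function name and the way the person token is derived (first name plus last initial) are assumptions based on the examples above.

```python
# Build [Context]_[FirstNameLastInitial]_[YYYYMMDD].md from meeting metadata.
from datetime import date

def summary_filename(context: str, first_name: str, last_name: str,
                     meeting_date: date) -> str:
    """e.g. summary_filename("Proven", "Jane", "Doe", date(2026, 1, 27))
    -> "Proven_JaneD_20260127.md" """
    person = f"{first_name}{last_name[:1].upper()}"
    return f"{context}_{person}_{meeting_date:%Y%m%d}.md"
```

Because context, person, and date are all in the filename, a flat folder stays searchable and sortable without any per-client directory structure.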
The Implementation Protocol
Based on this experience, a replicable protocol emerged for building personal automation systems.
1 Start with Manual
Understand the workflow before automating it. Know exactly what decisions get made at each step.
2 Isolate Components
Get each piece working independently before integration. Capture → Process → Route → Review.
3 Prefer Event-Driven
Webhooks and file watchers over polling. Immediate response, fewer failure modes.
4 Run in User Context
Background services have different permissions and paths. Prefer tools that run as the user.
5 Let AI Judge Context
Rules fail on edge cases. Give AI full information and let it make the routing decision.
6 Keep One Review Point
Full automation loses visibility. One manual step keeps human engaged with output.
Cost Summary
| Item | Cost | Frequency |
|---|---|---|
| Roam | Included | Existing subscription |
| ngrok Hobby | $10 | Monthly |
| Claude API (Sonnet) | ~$5-15 | Monthly (estimated) |
Total monthly cost: $15-25
Break-even: At $350/hour, system pays for itself if it saves 5 minutes per month. Actual savings: 10+ hours monthly.
Skills Developed
Building this system required learning tools I had never used before:
- Homebrew package management on macOS
- ngrok tunnel configuration and static domains
- Python Flask webhook receivers
- Roam API integration (webhooks, transcript fetching)
- launchd service configuration (and its limitations)
- Automator Folder Actions
- macOS notification scripting via osascript
- Environment variable management for background processes
- Anthropic Claude API for document processing
None of this was in my wheelhouse before the project started.
The AI coaching implication: executives building personal automation develop transferable technical fluency as a side effect.
The Result
A fully automated meeting capture and routing system that transforms raw transcripts into context-appropriate summaries.
- Zero-touch capture: Roam handles transcription automatically
- Intelligent routing: Claude identifies relevant contexts per meeting
- Multi-project output: One meeting can produce summaries for multiple projects
- Smart naming: `[Context]_[Person]_[Date].md` convention
- Intentional review: Manual import preserves human engagement
- 10+ hours saved monthly at $350/hour billing rate
Closing Reflection
The system that works looks nothing like the system that was designed. Local processing became cloud processing. Polling became event-driven. Single-client routing became multi-context analysis. Every assumption was wrong.
But this is the nature of building systems: the gap between concept and production is where the real learning happens. The debugging marathon was not a failure of planning - it was the work itself.
The meta-lesson is broader than workflow automation:
The executive who builds their own systems develops instincts that cannot be delegated. The two days spent debugging launchd produced knowledge about macOS, Python environments, and background services that will transfer to every future automation project.
This is the future of AI-augmented work. Not AI replacing human judgment, but AI accelerating human capability to build systems that would otherwise require a development team.