The Context
Running a strategic consulting practice with six to eight active client engagements creates a specific workflow challenge. Every meeting generates context that needs to flow to the right project. Every insight needs to be captured before it fades.
"I had an essential meeting this morning on Teams and forgot to start transcription until 10 minutes in, then it randomly stopped before the meeting was over."
The core friction was clear: manual, unreliable capture led to manual processing, which led to manual routing, which led to context switching overhead. Every step leaked time and cognitive energy away from strategic thinking.
The goal was straightforward: eliminate the babysitting. Capture every meeting automatically, process it intelligently, and route the summary to the right project without human intervention at each step.
Architecture Evolution
The path from concept to production required abandoning every initial assumption. The original design and final implementation differ substantially:
| Component | Original Design | Final Implementation |
|---|---|---|
| Transcription | Alter (local processing) | Roam (cloud-based, webhook-enabled) |
| Detection | Polling-based daemon via launchd | Folder Actions (event-driven) |
| Output | Three artifacts per meeting | Single context-aware summary per project |
| Routing | Single-client per transcript | Multi-context (one meeting → multiple projects) |
The Alter Problem
The original design assumed Alter would be the transcription engine. Local processing seemed superior: privacy preserved, no API costs, complete control.
In practice, Alter caused severe video degradation - frame rates dropping to 20fps during meetings - despite running on an M5 Pro with 32GB RAM. The local Parakeet transcription model competed for GPU resources with the video feed.
Lesson Learned
Local processing is not always superior. Cloud-based solutions can be more reliable when resource contention matters. The right architecture depends on specific constraints, not ideology.
Solution: Switched to Roam (ro.am), which processes transcription in the cloud and provides webhook APIs for automation. Zero GPU contention. Zero babysitting during meetings.
The launchd Saga
The original plan assumed macOS launchd would handle background processing reliably. A polling daemon watching for new files seemed straightforward.
Two days of debugging revealed the gap between Terminal behavior and background service behavior:
- iCloud paths behave differently for background services. The path `~/Library/Mobile Documents/com~apple~CloudDocs/` that works in Terminal does not always work for launchd daemons.
- Python environment isolation. The Python that runs in Terminal is not necessarily the same Python that launchd invokes. Packages installed via `pip3` may not be available to background services.
- Output buffering. Python buffers stdout by default. Background services showed no log output until we added the `-u` flag for unbuffered output.
- Service restart loops. The service would start, crash silently in the first polling cycle, restart, crash again - showing multiple "Transcript Processor started" messages with no actual processing.
We tried wrapper shell scripts to set environment variables, PYTHONPATH exports in launchd plists, explicit HOME directory declarations, and various path quoting strategies. None produced reliable execution.
What Finally Worked: Automator Folder Actions. Unlike launchd polling, Folder Actions run in the user's context with full environment access, handle iCloud paths correctly, and trigger immediately when files arrive.
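A Folder Action attached to `_inbox` can hand the newly arrived file paths straight to a Python entry point. The script below is an illustrative sketch, not the actual processor: the file name `process_transcript.py`, the accepted suffixes, and the `process` placeholder are all assumptions.

```python
# Hypothetical entry point (process_transcript.py) that a Folder Action's
# "Run Shell Script" step invokes with the newly added file paths as arguments.
import sys
from pathlib import Path

TRANSCRIPT_SUFFIXES = {".md", ".txt"}  # assumed transcript file types

def eligible(paths: list[str]) -> list[Path]:
    """Filter the paths the Folder Action hands over down to transcript files."""
    return [Path(p) for p in paths if Path(p).suffix.lower() in TRANSCRIPT_SUFFIXES]

def process(transcript_path: Path) -> None:
    # Placeholder for the real pipeline: send to Claude, write per-context summaries.
    print(f"Processing {transcript_path.name}")

if __name__ == "__main__":
    # Folder Actions pass every file that landed in the watched folder.
    for path in eligible(sys.argv[1:]):
        process(path)
```

Because the script runs in the user's context, it sees the same Python, packages, and iCloud paths as a Terminal session - exactly the properties launchd failed to provide.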
Final Architecture
┌─────────────────────────────────────────────────────────────────┐
│ CAPTURE LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Roam (ro.am) │
│ - Cloud transcription (no GPU contention) │
│ - Webhook API for automation │
│ - Zero babysitting during meetings │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ WEBHOOK LAYER │
├─────────────────────────────────────────────────────────────────┤
│ ngrok (static domain) → Flask receiver │
│ - Receives transcript-ready webhooks │
│ - Downloads transcript to _inbox folder │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ PROCESSING LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Folder Actions (Automator) │
│ - Watches _inbox folder │
│ - Triggers Python processor │
│ - Runs in user context (full env access) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ INTELLIGENCE LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Claude API (Sonnet) │
│ - Receives transcript + ALL client configs │
│ - Identifies substantive contexts │
│ - Creates separate summary per relevant project │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ OUTPUT LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Single Processed Transcripts/ folder │
│ Naming: [Context]_[FirstNameLastInitial]_[YYYYMMDD].md │
│ Example: MyDriveScore_AdvisorA_20260127.md │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ REVIEW LAYER │
├─────────────────────────────────────────────────────────────────┤
│ Manual import to Claude Projects │
│ (Intentional friction - keeps human engaged with content) │
└─────────────────────────────────────────────────────────────────┘
Multi-Context Routing
The original design routed each transcript to a single client folder. This broke down immediately - a single meeting with an advisor might touch MyDriveScore funding strategy, Proven collective positioning, AI Coaching curriculum ideas, and a passing reference to a client engagement.
New Model:
- Processor sends transcript to Claude with ALL client configs
- Claude identifies which contexts have substantive content (not just passing mentions)
- Creates separate summary file for each substantive context
- All output goes to a single `Processed Transcripts/` folder
Example Output from One Meeting:
MyDriveScore_AdvisorA_20260127.md
AICoaching_AdvisorA_20260127.md
Proven_AdvisorA_20260127.md
Intentional Friction: Files must be manually imported into Claude Projects. This keeps the human engaged with the content rather than letting insights go "out of sight, out of mind." Full automation would route summaries directly, but the brief touch point of manual import ensures nothing important gets auto-filed and forgotten.
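The routing step can be sketched as a prompt builder plus a response parser. The client config descriptions and the exact prompt wording here are illustrative assumptions; the real system sends this prompt, with the full transcript, to the Claude API (Sonnet) and writes one summary file per context returned.

```python
# Sketch of the multi-context routing step: one prompt, all client configs,
# and a JSON response listing every context with substantive content.
import json

CLIENT_CONFIGS = {  # hypothetical config descriptions
    "MyDriveScore": "Funding strategy and product work for MyDriveScore.",
    "Proven": "Collective positioning for the Proven group.",
    "AICoaching": "Curriculum development for the AI Coaching practice.",
}

def build_routing_prompt(transcript: str, configs: dict[str, str]) -> str:
    """Ask the model which contexts have substantive content, not passing mentions."""
    config_block = "\n".join(f"- {name}: {desc}" for name, desc in configs.items())
    return (
        "You route meeting transcripts to consulting projects.\n"
        f"Known contexts:\n{config_block}\n\n"
        "Return JSON: a list of objects with 'context' and 'summary' fields, "
        "one per context with SUBSTANTIVE content (ignore passing mentions).\n\n"
        f"Transcript:\n{transcript}"
    )

def summaries_from_response(response_text: str) -> dict[str, str]:
    """Map each substantive context to its summary; each entry becomes a file."""
    return {item["context"]: item["summary"] for item in json.loads(response_text)}
```

The design choice worth noting: the model sees every config on every call, so "which projects does this meeting touch?" is answered once, holistically, instead of by brittle per-client matching rules.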
The Implementation Journey
What follows is the actual sequence of problems encountered and solutions discovered. This was not a single afternoon - it was two days of debugging spread across a week.
Got Roam transcription working, configured ngrok for webhook receiving, wrote Flask receiver. Each component worked in isolation. Integration began.
Flask wouldn't start. macOS AirPlay Receiver had claimed port 5000. Changed to port 8080. Trivial fix, but an hour lost to diagnosis.
Free tier ngrok assigns random domains. Every restart meant updating the webhook URL in Roam. Upgraded to Hobby plan ($10/mo) for static domain.
Roam webhooks include signature validation. Initial implementation failed silently. Discovered Roam uses Standard Webhooks format, not custom implementation. Fixed validation logic.
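The fix was implementing the Standard Webhooks scheme rather than a custom one. The sketch below follows that spec as published - a `whsec_`-prefixed base64 secret, the signed content `id.timestamp.body`, and `v1,`-prefixed HMAC-SHA256 signatures - but verify the header names against Roam's documentation before relying on it.

```python
# Standard Webhooks signature verification (the format Roam turned out to use).
import base64
import hashlib
import hmac

def verify_signature(secret: str, msg_id: str, timestamp: str, body: str,
                     signature_header: str) -> bool:
    """Recompute the v1 HMAC-SHA256 signature and compare against each candidate."""
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.{body}".encode()
    expected = base64.b64encode(
        hmac.new(key, signed_content, hashlib.sha256).digest()
    ).decode()
    # The header may carry several space-separated signatures (key rotation).
    for candidate in signature_header.split():
        version, _, sig = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(sig, expected):
            return True
    return False
```

A production receiver would also reject stale timestamps to block replay attacks; that check is omitted here for brevity.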
Two full days debugging why the polling daemon wouldn't run. iCloud paths, Python environments, output buffering, restart loops. Eventually abandoned launchd entirely.
Automator Folder Actions work in user context with full environment. Immediate file detection. No polling delays. Problem solved in 30 minutes after two days of launchd debugging.
Old launchd service was still running alongside new Folder Actions. Duplicate output files, conflicting processing. Had to hunt down and kill all zombie processes, remove abandoned plist files.
Multi-context routing working. Single output folder with smart naming. Human review step preserved. System running reliably.
Issues Encountered and Resolutions
The complete log of problems and solutions. Every issue here cost real debugging time.
| Issue | Root Cause | Resolution |
|---|---|---|
| Alter video degradation | Local GPU contention | Switched to Roam (cloud processing) |
| Port 5000 blocked | macOS AirPlay Receiver | Changed to port 8080 |
| ngrok URL changes | Free tier random domains | Upgraded to Hobby plan ($10/mo) for static domain |
| Webhook signature validation | Incorrect spec implementation | Used Standard Webhooks format |
| launchd not finding files | iCloud path resolution in background context | Switched to Folder Actions |
| Python module not found | Different Python installations | Added PYTHONPATH to shell script |
| No log output | Python stdout buffering | Added -u flag for unbuffered output |
| Service restart loops | Silent crash in polling loop | Folder Actions eliminated polling |
| Wrong client routing | First-match-wins logic | Multi-context analysis with Claude |
| Path with spaces breaking | Unquoted path in .env | Added quotes around path value |
| Folder Action not triggering | Pointed to old deleted folder | Re-added folder in Folder Actions Setup |
| Duplicate files created | Multiple processor instances | Killed all processes, removed old launchd plist |
| Path truncation errors ('/U') | Zombie launchd service with bad config | Removed abandoned service configuration |
Critical Lesson: Clean Up Failed Approaches
When iterating through solutions (launchd → wrapper scripts → Folder Actions), always fully remove the previous approach before testing the new one. Zombie processes from abandoned approaches can run for days, causing subtle bugs that only appear intermittently.
```shell
# Before declaring a new approach "working," verify:
ps aux | grep processor | grep -v grep
launchctl list | grep fontanini
# Should only see ngrok and roam-receiver, NOT transcript-processor
```
Key Principles
Put Intelligence Where Context Lives
Roam can transcribe but cannot understand client relationships. Claude can understand context but cannot capture audio. The architecture places each capability where it can perform optimally. When deciding which tool handles which step, always ask: "Where does the context live?"
Event-Driven Beats Polling
Folder Actions trigger immediately when files arrive. Polling introduces delay and complexity. The switch from launchd polling to Folder Actions eliminated an entire class of bugs. When evaluating tools, webhook support is a key differentiator.
Cloud vs. Local is Context-Dependent
Local transcription seemed superior until it degraded video quality. Cloud processing eliminated resource contention entirely. Evaluate each tool placement decision on its specific tradeoffs: privacy, latency, cost, and resource usage all factor in differently for each use case.
Let AI Determine Context
Hard-coded routing rules (title matching, participant matching) fail on edge cases. Sending the full transcript to Claude with all configs and asking "what's substantive here?" produces better results than any rule-based approach.
Intentional Friction Has Value
Full automation would route summaries directly to Projects. Manual import keeps the human engaged with content, preventing important insights from being auto-filed and forgotten.
For each automation decision, ask: "Does removing this friction also remove valuable engagement?"
Single Output Location
Routing to multiple client folders created confusion and misrouting. A single Processed Transcripts/ folder with smart naming ([Context]_[FirstNameLastInitial]_[YYYYMMDD].md) is simpler and more reliable. Before creating complex routing rules, ask: "Could a single well-organized location serve this need better?"
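The naming convention is simple enough to capture in one function. This is a sketch; the function name and the way the person token is derived (first name plus last initial) are assumptions based on the examples above.

```python
# Build [Context]_[FirstNameLastInitial]_[YYYYMMDD].md from meeting metadata.
from datetime import date

def summary_filename(context: str, first_name: str, last_name: str,
                     meeting_date: date) -> str:
    """e.g. summary_filename("Proven", "Jane", "Doe", date(2026, 1, 27))
    -> "Proven_JaneD_20260127.md" """
    person = f"{first_name}{last_name[:1].upper()}"
    return f"{context}_{person}_{meeting_date:%Y%m%d}.md"
```

Because context, person, and date are all in the filename, a flat folder stays searchable and sortable without any per-client directory structure.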
The Implementation Protocol
Based on this experience, a replicable protocol emerged for building personal automation systems.
1 Start with Manual
Understand the workflow before automating it. Know exactly what decisions get made at each step.
2 Isolate Components
Get each piece working independently before integration. Capture → Process → Route → Review.
3 Prefer Event-Driven
Webhooks and file watchers over polling. Immediate response, fewer failure modes.
4 Run in User Context
Background services have different permissions and paths. Prefer tools that run as the user.
5 Let AI Judge Context
Rules fail on edge cases. Give AI full information and let it make the routing decision.
6 Keep One Review Point
Full automation loses visibility. One manual step keeps human engaged with output.
Cost Summary
| Item | Cost | Frequency |
|---|---|---|
| Roam | Included | Existing subscription |
| ngrok Hobby | $10 | Monthly |
| Claude API (Sonnet) | ~$5-15 | Monthly (estimated) |
Total monthly cost: $15-25
Break-even: At $350/hour, system pays for itself if it saves 5 minutes per month. Actual savings: 10+ hours monthly.
Skills Developed
Building this system required learning tools I had never used before:
- Homebrew package management on macOS
- ngrok tunnel configuration and static domains
- Python Flask webhook receivers
- Roam API integration (webhooks, transcript fetching)
- launchd service configuration (and its limitations)
- Automator Folder Actions
- macOS notification scripting via osascript
- Environment variable management for background processes
- Anthropic Claude API for document processing
None of this was in my wheelhouse before the project started.
The AI coaching implication: executives building personal automation develop transferable technical fluency as a side effect.
The Result
A fully automated meeting capture and routing system that transforms raw transcripts into context-appropriate summaries.
- Zero-touch capture: Roam handles transcription automatically
- Intelligent routing: Claude identifies relevant contexts per meeting
- Multi-project output: One meeting can produce summaries for multiple projects
- Smart naming: `[Context]_[Person]_[Date].md` convention
- Intentional review: Manual import preserves human engagement
- 10+ hours saved monthly at $350/hour billing rate
Closing Reflection
The system that works looks nothing like the system that was designed. Local processing became cloud processing. Polling became event-driven. Single-client routing became multi-context analysis. Every assumption was wrong.
But this is the nature of building systems: the gap between concept and production is where the real learning happens. The debugging marathon was not a failure of planning - it was the work itself.
The meta-lesson is broader than workflow automation:
The executive who builds their own systems develops instincts that cannot be delegated. The two days spent debugging launchd produced knowledge about macOS, Python environments, and background services that will transfer to every future automation project.
This is the future of AI-augmented work. Not AI replacing human judgment, but AI accelerating human capability to build systems that would otherwise require a development team.