F5-TTS took 7+ minutes to generate 4 seconds of speech on our production droplet. edge-tts produced voiceover for all 6 scenes in 6.7 seconds. The constraint was never quality; it was compute profile versus deployment target.
We made a technology switch mid-build. This is the decision record.
Our production environment is a 2vCPU/4GB DigitalOcean droplet. No GPU. No specialized ML hardware. A daily content pipeline needs to complete in under 10 minutes to be useful.
F5-TTS is a state-of-the-art open-source TTS system. On our droplet, it took more than 7 minutes to generate 4 seconds of speech. Transformer-based inference without GPU acceleration does not scale to a 2vCPU environment. The model was not the problem. The deployment target was the problem.
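The gap is easiest to see as a real-time factor. A back-of-envelope sketch, approximating "more than 7 minutes" as 420 seconds:

```python
# Real-time factor (RTF): synthesis wall-clock time divided by audio
# duration; RTF > 1 means slower than real time. "7+ minutes" is
# approximated here as 420 seconds.
def real_time_factor(wall_clock_s: float, audio_s: float) -> float:
    return wall_clock_s / audio_s

f5_rtf = real_time_factor(420.0, 4.0)  # F5-TTS on the 2vCPU droplet
print(f"F5-TTS runs roughly {f5_rtf:.0f}x slower than real time")
```

At ~105x slower than real time, even a one-minute voiceover would cost close to two hours of droplet compute.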
Before committing to any ML-backed tool in production:

1. Run the tool on the actual target hardware.
2. Measure wall-clock time for a representative workload.
3. Calculate whether the tool can complete within the pipeline time budget.
4. Only then commit to the integration.
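A minimal sketch of that checklist as a harness; `synthesize` and `fits_budget` are hypothetical names standing in for whatever tool is under evaluation:

```python
import time

PIPELINE_BUDGET_S = 600.0  # the 10-minute pipeline budget from above

def fits_budget(synthesize, workload, budget_s=PIPELINE_BUDGET_S):
    """Time a representative workload and compare wall-clock time
    against the pipeline budget."""
    start = time.perf_counter()
    for item in workload:
        synthesize(item)
    elapsed = time.perf_counter() - start
    return elapsed <= budget_s, elapsed
```

The point is where it runs, not what it does: execute it on the droplet itself, not a dev laptop, since the F5-TTS result above only shows up on the target hardware.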
edge-tts is a Python library that calls Microsoft Neural TTS via a free cloud-backed API. On our droplet, it generated 6 scenes of voiceover in 6.7 seconds. The output quality is Microsoft Neural: professional, natural, consistent. The trade-off: we depend on a free external API with no SLA.
Every ML tool evaluation now includes a deployment target test. If a tool requires GPU acceleration and the deployment target has no GPU, the tool is not suitable for that context — regardless of quality or cost.