The Operating Thesis
Tonal pressure destroys more Thai learners than vocabulary gaps. Every mainstream language app (Duolingo, Babbel, Memrise) teaches Thai as if it were French with stranger letters. It is not. Thai is a five-tone language where the syllable mai can mean new, not, silk, wood, or burn, distinguished mainly by pitch contour. Flashcards do not simulate the moment a vendor in Chiang Mai asks a clarifying question and the learner has 1.5 seconds to answer in the right tone. SawasdeeTalk exists because conversational courage in Thai cannot be self-taught from a quiz; it has to be rehearsed under live pressure, in a place where being wrong is free.
What We Deploy
- Six mission arcs: m1–m6, each a goal-driven scenario (greetings, café, market, travel, food, taxi). Every mission is a vocabulary stage paired with a conversation stage. The AI signals MISSION COMPLETE only when the learner has actually achieved the in-world goal — not when they tap through.
- Five-Tone Practice: A dedicated screen drills mid, low, high, rising, and falling tones with Gemini TTS playback. The ear is trained against the same voice the conversation engine uses — no whiplash between practice and live.
- AI Conversation Partner: Gemini 2.5 Flash with character-specific system instructions, JSON-schema-enforced responses (English text, Thai script, phonetic, tutor note), and a Real Mode toggle that drops the teaching scaffolding mid-mission when the learner is ready to be treated as a peer.
- Translator: Bidirectional Thai ↔ English on demand, wired to the same Gemini pipeline so the voice and register stay consistent with the conversation engine.
- Explore Mode: A cultural-scene browser (96 scene records) covering the attractions and contexts the learner is most likely to walk into: Bangkok temples, Chiang Mai markets, Phuket beaches.
- Mission Map + Persona Journey: Gamified progression with badges and a persona arc, so the learner can see what they have rehearsed and what they have not — the inverse of a streak counter that rewards showing up over performing.
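As a rough illustration of the response contract described for the AI Conversation Partner, the sketch below validates one model reply against the four fields named above and hides the tutor note in Real Mode. The field names (english, thai, phonetic, tutorNote), the helper names, and the idea that the MISSION COMPLETE signal arrives as a marker string in the tutor note are all assumptions made for this example; the app's actual schema and signalling may differ.

```typescript
// Hypothetical shape of one AI turn. Field names are assumptions, not the app's real schema.
interface ConversationTurn {
  english: string;
  thai: string;
  phonetic: string;
  tutorNote: string;
}

type VisibleTurn = Omit<ConversationTurn, "tutorNote"> & { tutorNote?: string };

const REQUIRED_FIELDS = ["english", "thai", "phonetic", "tutorNote"] as const;

// Assumed marker: the text says the AI signals MISSION COMPLETE, but not how.
const MISSION_COMPLETE_MARKER = "MISSION COMPLETE";

// Parse a raw model reply; return null for anything that breaks the contract.
function parseTurn(raw: string): ConversationTurn | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // reply was not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;
  for (const field of REQUIRED_FIELDS) {
    if (typeof record[field] !== "string") return null; // missing or wrong type
  }
  return {
    english: record.english as string,
    thai: record.thai as string,
    phonetic: record.phonetic as string,
    tutorNote: record.tutorNote as string,
  };
}

// Assumption: "dropping the scaffolding" in Real Mode means withholding the tutor note.
function presentTurn(turn: ConversationTurn, realMode: boolean): VisibleTurn {
  const { tutorNote, ...rest } = turn;
  return realMode ? rest : { ...rest, tutorNote };
}

// Assumption: completion is signalled by the marker appearing in the tutor note.
function isMissionComplete(turn: ConversationTurn): boolean {
  return turn.tutorNote.indexOf(MISSION_COMPLETE_MARKER) !== -1;
}
```

Rejecting a malformed reply outright, rather than rendering a partial one, is one way to keep the learner from ever seeing a turn that is missing its Thai script or phonetic line.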
The Architecture
Production-grade infrastructure. The build pipeline is Vite 6 with React 19.2, and screens are lazy-loaded behind Suspense so first paint is fast on a phone over Bangkok 4G. The conversation, translation, and TTS layers all run through a single Gemini service module (services/geminiService.ts), which calls gemini-2.5-flash for text and gemini-2.5-flash-preview-tts with the Kore voice for Thai audio. State lives in a React Context (UserProgressContext): no hidden Redux, no premature backend. The PWA layer (manifest.json + service-worker.js) makes the app installable on Android and iOS without app-store friction. A Supabase scaffold (profiles, user_progress, badges) is wired through scripts/apply-supabase-migrations.mjs, so on the day persistent accounts ship, the schema is already in place, not bolted on.
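The single-service routing described above could look something like the following sketch. Only the two model names and the Kore voice come from the text; the Operation type, ModelRequest shape, buildRequest, createService, and the injected transport are hypothetical names chosen for illustration, and the sketch deliberately makes no claim about the real Gemini SDK's API.

```typescript
// Model and voice names are taken from the text; everything else is a hypothetical shape.
const TEXT_MODEL = "gemini-2.5-flash";
const TTS_MODEL = "gemini-2.5-flash-preview-tts";
const TTS_VOICE = "Kore";

type Operation = "conversation" | "translation" | "tts";

interface ModelRequest {
  model: string;
  input: string;
  voice?: string; // only set for audio requests
}

// One place decides which model (and voice) each kind of call uses.
function buildRequest(op: Operation, input: string): ModelRequest {
  switch (op) {
    case "tts":
      return { model: TTS_MODEL, input, voice: TTS_VOICE };
    case "conversation":
    case "translation":
      return { model: TEXT_MODEL, input };
  }
}

// The transport is injected, so this sketch never touches a real SDK.
// In an app it would presumably return a Promise; R is left generic here.
type Transport<R> = (request: ModelRequest) => R;

function createService<R>(send: Transport<R>) {
  return {
    converse: (text: string) => send(buildRequest("conversation", text)),
    translate: (text: string) => send(buildRequest("translation", text)),
    speak: (text: string) => send(buildRequest("tts", text)),
  };
}
```

Keeping the model choice in one function means a model upgrade is a one-line change, and injecting the transport lets the routing be unit-tested without network access.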
What's Built (Verified 2026-05-07)
- 16 lazy-loaded screens: splash, onboarding, home dashboard, mission map, mission detail, conversation, vocabulary practice, tone practice, translator, explore, profile, settings, pricing, blog, privacy policy, terms of service.
- 7 module sections: core, learning, tools, content, monetization, legal, resources. Each section owns its screens and components — boundaries enforced by folder, not by convention.
- 75 TypeScript source files across sections, components, hooks, services, and screens. Seven custom hooks carry the runtime contract.
- 6 mission arcs and 96 scene records seeded in constants. Five Thai tones modeled in constants/tonePracticeData.ts.
- Single-service AI surface: One geminiService.ts module exports getAITextResponse, getAudioForText, getPronunciationFeedback, and getTranslation, so every model call routes through one file. Schema-validated JSON responses prevent the AI from drifting out of the teaching contract.
- Test scaffolding: Vitest 4 unit harness with jsdom and Testing Library; Playwright 1.58 wired for end-to-end runs.
- PWA shipped: manifest.json declares standalone display, brand theme #f97316, 192px and 512px icons. Service worker registered.
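To make the five-tone model concrete, here is a minimal sketch of what tone practice data might look like. The ToneExample shape, the constant names, and the helper functions are assumptions for illustration and are not taken from constants/tonePracticeData.ts; the example words are the familiar mai set, with one word per tone.

```typescript
// Hypothetical shape; the real constants/tonePracticeData.ts may differ.
type ToneName = "mid" | "low" | "high" | "rising" | "falling";

interface ToneExample {
  tone: ToneName;
  thai: string;
  phonetic: string;
  meaning: string;
}

const TONES: readonly ToneName[] = ["mid", "low", "high", "rising", "falling"];

// One example per tone, all built on the same syllable.
const TONE_EXAMPLES: readonly ToneExample[] = [
  { tone: "mid", thai: "ไมล์", phonetic: "mai", meaning: "mile" },
  { tone: "low", thai: "ใหม่", phonetic: "mài", meaning: "new" },
  { tone: "high", thai: "ไม้", phonetic: "mái", meaning: "wood" },
  { tone: "rising", thai: "ไหม", phonetic: "mǎi", meaning: "silk" },
  { tone: "falling", thai: "ไม่", phonetic: "mâi", meaning: "not" },
];

// All examples that drill a given tone.
function examplesFor(tone: ToneName): ToneExample[] {
  return TONE_EXAMPLES.filter((example) => example.tone === tone);
}

// Check a learner's guess after hearing the example played back.
function isCorrectTone(example: ToneExample, guess: ToneName): boolean {
  return example.tone === guess;
}
```

Typing the tone as a closed union means a drill screen cannot reference a sixth tone by accident; the compiler rejects it.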
Out of scope — by design, forever
- Multi-language expansion: We do not add Vietnamese, Khmer, or Lao. Thai is the product. A second tonal language is a different product.
- Native app store distribution: PWA only. The install friction tax of an app store is paid by every user; the maintenance tax of two native binaries is paid by us. Neither is worth it for a pre-trip practice tool.
- Live human tutors: We do not broker teachers. The product is the AI partner that is patient at 11 PM the night before a flight — not a marketplace.
- Custom voice cloning: We use Gemini's prebuilt Thai voice. We do not run our own TTS pipeline. The model owners ship better Thai voices every quarter than we could build in a year.
- Personalized speech-pathology grading: Tone Practice gives feedback, not a clinical assessment. We are not a speech-language clinic.
Why This Matters for the Platform
SawasdeeTalk is the language-learning instance of the same operating thesis that powers every HavenWizards 88 venture: humans need externalized structure to behave well under pressure. The same systems lens that built CapitalWizards (behavioral discipline for retail investors), Bayanihan Harvest (60+ deployed agricultural systems), and HW88 Education (governance pedagogy) now lives in the conversational-fluency category. The combined Filipino-diaspora and Thai-curious tourism flow between Manila, Bangkok, and Chiang Mai is large enough to support a focused product, and underserved enough that a serious tool feels like relief, not noise.
Roadmap
- Phase 1 — PWA MVP (current): Six missions, five-tone practice, translator, explore, AI conversation with Real Mode, PWA install path, Vitest plus Playwright harness, Supabase schema scaffolded for the next phase.
- Phase 2 — Persistence and Subscriptions (Q3 2026): User accounts and progress persistence on the Supabase tables already scaffolded. Tier-based feature locks, Free → Pro conversion measurement, 5–10 new missions, Explore expansion.
- Phase 3 — Voice Intelligence (Q4 2026 onward): Voice tone analysis with formal/informal/polite register feedback, SRS-driven vocabulary review, real-conversation mode refinement, accessibility hardening.