Real Chinese neural TTS: the default voice is Yaoyao (zh-CN-YaoyaoNeural) for Mandarin (Putonghua). Switch to a Cantonese (Hong Kong) voice such as Huihui in the picker when your script is Cantonese-first. Preview, tune speed and pitch, and download MP3 with no account.
Chinese text to speech converts written Chinese into natural spoken audio. This workflow supports both Mandarin and Cantonese through separate language codes, so each gets authentic tonal pronunciation, natural rhythm, and regional voices that native speakers recognize.
Paste your text, select Mandarin (zh-CN) or Cantonese (zh-HK) and the matching neural voice, then download MP3. No account or card is required; the 800-character cap per clip keeps latency predictable.
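For readers scripting a similar workflow, the language, voice, speed, and pitch choices can be expressed as standard W3C SSML. The sketch below is illustrative only: this page does not say whether the tool accepts SSML, `build_ssml` is a hypothetical helper, and the accepted `rate` and `pitch` values can vary between engines (the defaults here use the standard `medium` label).

```python
from xml.sax.saxutils import escape, quoteattr

# Language codes named on this page.
SUPPORTED_LOCALES = {"zh-CN", "zh-HK"}

def build_ssml(text, locale, voice, rate="medium", pitch="medium"):
    """Wrap text in a minimal SSML document (hypothetical helper)."""
    if locale not in SUPPORTED_LOCALES:
        raise ValueError(f"unsupported locale: {locale}")
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(locale)}>"
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}"  # escape &, <, > so the XML stays well-formed
        "</prosody></voice></speak>"
    )

# Mandarin example using the default voice named on this page.
ssml = build_ssml("你好，欢迎收听。", "zh-CN", "zh-CN-YaoyaoNeural", rate="slow")
```

For Cantonese, pass `"zh-HK"` together with a voice name taken from the picker; the voice must match the locale, or the engine may apply the wrong pronunciation rules.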
Mandarin and Cantonese share a writing system but differ in phonology, tones, and everyday particles. Pick the right language in the tool, not just “Chinese.”
Mandarin and Cantonese are distinct spoken languages; running one engine for both produces wrong audio for the other.
Mandarin has four tones plus a neutral tone, while Cantonese has six (nine in the traditional count), so the engine must apply the right tone rules for each.
Cantonese has sentence-final particles that do not exist in Mandarin; zh-HK voices handle them.
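Because the two languages share a script, a script can only guess which one a text is meant for. One rough heuristic is to look for characters that are common in written Cantonese but rare in standard written Mandarin. The sketch below assumes a small, incomplete marker list chosen for illustration; `suggest_locale` and `CANTONESE_MARKERS` are hypothetical names, not part of this tool.

```python
# Characters typical of written Cantonese (illustrative, not exhaustive).
CANTONESE_MARKERS = set("嘅咗啲喺唔冇佢哋嘢咁嚟㗎啩")

def suggest_locale(text, min_hits=1):
    """Suggest zh-HK when the text contains Cantonese-specific characters."""
    hits = sum(1 for ch in text if ch in CANTONESE_MARKERS)
    return "zh-HK" if hits >= min_hits else "zh-CN"

print(suggest_locale("佢哋食咗飯未呀？"))  # colloquial Cantonese
print(suggest_locale("他们吃饭了吗？"))    # standard Mandarin
```

Formal written Chinese that is meant to be read aloud in Cantonese contains none of these markers, so the heuristic falls back to zh-CN for it. Treat the result as a suggestion and confirm the language manually.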
Use Simplified characters for most Mandarin and Traditional characters for Cantonese or Taiwan-oriented reads, up to 800 characters per generation.
Choose zh-CN for Putonghua or zh-HK for Cantonese, then pick a male, female, or formal voice profile from the list.
Preview, adjust speed and pitch, then export MP3 for any editor or LMS.
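Scripts longer than the 800-character cap have to be generated in several passes. A minimal sketch of one way to pre-split text at sentence boundaries follows. It assumes the cap counts Unicode code points (how the tool actually counts is not stated here), and `chunk_text` is a hypothetical helper.

```python
import re

MAX_CHARS = 800  # per-generation cap stated on this page

# Zero-width split point right after sentence-ending punctuation or a newline.
_SENTENCE_END = re.compile(r"(?<=[。！？!?\n])")

def chunk_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    chunks, current = [], ""
    for piece in _SENTENCE_END.split(text):
        # A single sentence longer than the cap is cut at the cap.
        while len(piece) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece[:max_chars])
            piece = piece[max_chars:]
        # Start a new chunk when the next sentence would overflow this one.
        if len(current) + len(piece) > max_chars:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks

parts = chunk_text("你好。" * 300)  # 900 characters in total
```

Joining the chunks back together reproduces the original text, so nothing is dropped. Each chunk can then be generated separately and the resulting MP3 files placed in order in an editor.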
| Feature | Mandarin TTS | Cantonese TTS |
|---|---|---|
| Tonal system | Four tones plus a neutral tone | Six tones (nine in the traditional count) |
| Primary character set | Simplified (typical) | Traditional (typical) |
| Primary markets | Mainland, Taiwan, Singapore | HK, Macau, diaspora |
| Colloquial particles | Mandarin set | Cantonese-specific |
| Best for | Mainland & global Mandarin | HK-focused & Yue content |
Creators on YouTube, Bilibili, and social — narration without studio overhead.
Learners — tonal reference that replays consistently.
E-learning teams — consistent module VO across libraries.
App developers — prompts, tutorials, accessibility.
Accessibility publishers — audio for Chinese readers.
Enterprises & diaspora orgs — training and community messaging in the right spoken variety.
Chinese text to speech converts Simplified or Traditional Chinese text into spoken audio in Mandarin or Cantonese, depending on the language and voice you select.
No account or card is required. Fair use includes an 800-character cap per generation.
Mandarin and Cantonese have different phonologies and tones; select zh-CN or zh-HK (and a matching voice) so the engine applies the correct rules.
Use Simplified characters for most mainland Mandarin, and Traditional characters for Cantonese and many Taiwanese contexts.
Neural TTS is useful for language study: it provides replayable, correctly toned reference audio for Mandarin or Cantonese.
Output is MP3, which works in Premiere, Resolve, Final Cut, Audacity, GarageBand, podcast hosts, and accessibility tools.
Review current site terms and destination platform rules before monetizing synthetic speech.
Modern mobile browsers are supported; no app install is required.