Chinese Text to Speech: Natural Mandarin and Cantonese Voices, Free

Neural Chinese text to speech. The default voice is Yaoyao (zh-CN-YaoyaoNeural) for Mandarin (Putonghua); switch to a Cantonese (Hong Kong) voice such as Huihui in the picker when your script is Cantonese-first. Preview the result, tune speed and pitch, and download an MP3 with no account.

What is Chinese text to speech?

Chinese text to speech converts written Chinese into natural spoken audio. This tool supports both Mandarin and Cantonese through separate language codes (zh-CN and zh-HK), so each variety gets its own tonal pronunciation, rhythm, and regional voices that native speakers recognize.

Paste your text, select Mandarin (zh-CN) or Cantonese (zh-HK) and a matching neural voice, then download the MP3. No account or card is required; an 800-character limit per clip keeps latency predictable.

Mandarin and Cantonese are not the same language

They share a writing system but differ in phonology, tones, and everyday particles. Pick the right language in the tool — not just “Chinese.”

Mutual unintelligibility

Spoken Mandarin and Cantonese are not mutually intelligible, and speakers learn them separately. Running Cantonese text through a Mandarin voice (or the reverse) produces audio that is wrong for the intended audience.

Tones: 4 vs 9

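Mandarin has four lexical tones plus a neutral tone. Cantonese is traditionally counted as having nine tones: six contrastive pitch contours plus three checked tones on syllables ending in a stop consonant. The engine must apply the tone rules of the language you select.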

Colloquial particles

Cantonese has its own set of sentence-final particles (for example 喎, 咩, 囉) that standard Mandarin does not use. zh-HK voices are built to read them; Mandarin voices are not.
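As a rough illustration of choosing the language before generating audio, the sketch below guesses whether a script is Cantonese-first by counting characters that are common in written Cantonese but rare in standard written Mandarin. The function name `suggest_locale`, the marker list, and the 2% threshold are all assumptions made for this example; they are not part of the tool, and a short or mixed script can easily fool the heuristic.

```python
# Characters typical of written Cantonese (e.g. 嘅, 咗, 唔) that standard
# written Mandarin rarely uses. This list is illustrative, not exhaustive.
CANTONESE_MARKERS = set("嘅咗唔喺哋嘢冇啲佢咁")


def suggest_locale(text: str, threshold: float = 0.02) -> str:
    """Return "zh-HK" if the text looks Cantonese-first, else "zh-CN".

    The threshold (share of Han characters that are Cantonese markers)
    is an assumed value chosen for illustration only.
    """
    # Keep only characters in the main CJK Unified Ideographs block.
    han = [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]
    if not han:
        return "zh-CN"
    hits = sum(1 for ch in han if ch in CANTONESE_MARKERS)
    return "zh-HK" if hits / len(han) >= threshold else "zh-CN"


if __name__ == "__main__":
    print(suggest_locale("我哋今日去咗邊度食嘢"))  # Cantonese-style wording
    print(suggest_locale("我们今天去哪里吃饭"))    # Mandarin-style wording
```

Treat the result as a hint only: formal Cantonese scripts are often written in standard Chinese and contain none of these characters, so the final choice of zh-CN or zh-HK should still be made by whoever knows the audience.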

Languages & voice styles

Mandarin (Putonghua)

  • Standard Putonghua — neutral broadcast clarity
  • Taiwanese Mandarin — select zh-TW voices when listed
  • Formal / conversational registers — match your script tone
  • Female / male Mandarin voices — pick from neural list

Cantonese (Yue)

  • Standard Hong Kong Cantonese — zh-HK voices like Huihui
  • Conversational vs formal — controlled by your wording
  • Female / male HK delivery — choose from loaded zh-HK list
  • Heavy Cantonese projects — also try our dedicated page

How it works

1. Enter your Chinese text

Use Simplified characters for most Mandarin scripts and Traditional characters for Cantonese or Taiwanese Mandarin. Each generation accepts up to 800 characters.

2. Select language & voice

Choose zh-CN for Putonghua defaults, or zh-HK for Cantonese — then pick male/female or formal profiles from the list.

3. Generate & download

Preview, adjust speed/pitch, export MP3 for any editor or LMS.
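For readers who script their own pipeline, the three steps above (text, language and voice, speed and pitch) correspond closely to a W3C SSML document. The sketch below builds one with the standard `speak`, `voice`, and `prosody` elements. Whether this site's backend accepts SSML is an assumption; `build_ssml` is a hypothetical helper, and the only voice name used is the one quoted on this page.

```python
from xml.sax.saxutils import escape, quoteattr

# Label values that SSML defines for prosody rate and pitch.
RATE_LABELS = {"x-slow", "slow", "medium", "fast", "x-fast", "default"}
PITCH_LABELS = {"x-low", "low", "medium", "high", "x-high", "default"}


def build_ssml(text: str, locale: str = "zh-CN",
               voice: str = "zh-CN-YaoyaoNeural",
               rate: str = "medium", pitch: str = "medium") -> str:
    """Wrap text in SSML with a language, a voice, and speed/pitch labels."""
    if rate not in RATE_LABELS:
        raise ValueError(f"unsupported rate label: {rate}")
    if pitch not in PITCH_LABELS:
        raise ValueError(f"unsupported pitch label: {pitch}")
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(locale)}>"
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}"  # escape &, <, > so the XML stays well-formed
        "</prosody></voice></speak>"
    )


if __name__ == "__main__":
    print(build_ssml("你好，欢迎使用语音合成。", rate="slow"))
```

Restricting rate and pitch to the named labels keeps the output valid across SSML versions; engines that accept percentages or semitone offsets would need a looser check.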

Mandarin vs Cantonese TTS

Feature | Mandarin TTS | Cantonese TTS
Tonal system | Four tones | Nine tones
Primary character set | Simplified (typical) | Traditional (typical)
Primary markets | Mainland, Taiwan, Singapore | HK, Macau, diaspora
Colloquial particles | Mandarin set | Cantonese-specific
Best for | Mainland & global Mandarin | HK-focused & Yue content

Who uses Chinese text to speech

Creators on YouTube, Bilibili, and social — narration without studio overhead.

Learners — tonal reference that replays consistently.

E-learning teams — consistent module VO across libraries.

App developers — prompts, tutorials, accessibility.

Accessibility publishers — audio for Chinese readers.

Enterprises & diaspora orgs — training and community messaging in the right spoken variety.

Tips for best output

  1. Match Simplified vs Traditional to your target language to reduce mapping errors.
  2. Add Pinyin for ambiguous Mandarin characters when precision matters.
  3. Add Jyutping for ambiguous Cantonese readings.
  4. Match written register to spoken register (formal vs colloquial).
  5. Split long scripts into clips of at most 800 characters for cleaner edits.
  6. Test proper nouns with a short clip before full production.
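The segmentation tip can be automated. The sketch below splits a long script into clips of at most 800 characters, breaking after Chinese or Western sentence-ending punctuation where possible. `split_into_clips` is a hypothetical helper, and it assumes the tool counts characters the way Python's `len()` does (one per Unicode code point), which this page does not confirm.

```python
import re

# Zero-width split point right after sentence-ending punctuation or a newline.
_SENTENCE_END = re.compile(r"(?<=[。！？!?；;\n])")


def split_into_clips(script: str, limit: int = 800) -> list[str]:
    """Split a script into clips of at most `limit` characters each."""
    clips: list[str] = []
    current = ""
    for sentence in _SENTENCE_END.split(script):
        # A single sentence longer than the limit has to be cut mid-sentence.
        while len(sentence) > limit:
            if current:
                clips.append(current)
                current = ""
            clips.append(sentence[:limit])
            sentence = sentence[limit:]
        # Start a new clip when the next sentence would overflow this one.
        if len(current) + len(sentence) > limit:
            clips.append(current)
            current = ""
        current += sentence
    if current:
        clips.append(current)
    return clips


if __name__ == "__main__":
    demo = "今天天气很好。我们去公园散步吧！" * 120
    for i, clip in enumerate(split_into_clips(demo), start=1):
        print(i, len(clip))
```

Joining the clips back together reproduces the original script exactly, so nothing is lost or reordered; only the cut points are chosen.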

FAQ

What is Chinese text to speech?

It converts Simplified or Traditional Chinese text into spoken audio in Mandarin or Cantonese, depending on the language and voice you select.

Is Chinese text to speech free?

Yes — no account or card. Fair use includes an 800-character cap per generation.

Mandarin vs Cantonese TTS?

Different phonologies and tones — select zh-CN vs zh-HK (and matching voices) so the engine applies the correct rules.

Which character set should I use?

Simplified for most mainland Mandarin; Traditional for Cantonese and many Taiwanese contexts.

Language learning use?

Yes — neural TTS is useful for replayable, correctly toned reference audio for Mandarin or Cantonese study.

Audio format?

MP3 — works in Premiere, Resolve, Final Cut, Audacity, GarageBand, podcast hosts, and accessibility tools.

Commercial use?

Review current site terms and destination platform rules before monetizing synthetic speech.

Mobile?

Yes — modern mobile browsers supported; no app install required.
