TTS Providers Overview

When you’re reviewing an annotated game, your eyes are doing double duty. You’re trying to follow the pieces on the board and read the commentary at the same time. Your gaze bounces between the board and the annotation panel, and every time it does, you lose the position for a split second. You have to re-find the pieces, re-trace the lines, re-build the picture in your head.

Text-to-speech fixes this completely.

With TTS enabled, you step through a game and the annotations are spoken to you. Your eyes stay on the board. You watch the knight land on f3 while a voice tells you why it’s a strong developing move. You see the pawn structure shift while the commentary explains the strategic idea behind it. The board and the words arrive together, the way a coach sitting across from you would teach.

This is especially powerful for:

  • Opening study — hear the ideas behind each move while you watch the position develop
  • Game review — step through your own annotated games and absorb the lessons naturally
  • Endgame practice — keep your focus on the critical squares while the commentary guides you
  • Language immersion — study chess in French, German, Spanish, Russian, Japanese, Chinese, or Korean with all chess terms properly translated. Hear “Cavalier f3, échec” instead of “Knight f3, check.” Learn the game in the language you think in.
  • Accessibility — for players who find it easier to listen than to read, or who want to study away from a desk

Once you try it, going back to silent annotations feels like watching a movie on mute.

En Parlant~ ships with five TTS providers, ranging from cloud APIs with studio-quality voices to fully local options that need no internet at all. You only need one to get started. They’re listed below from best to worst voice quality.

ElevenLabs offers the best voice quality available. It produces expressive, human-like speech with real personality — some voices sound like audiobook narrators, others like broadcasters. There are dozens of unique voices to choose from. It supports 34+ languages, including excellent CJK (Japanese, Chinese, Korean) pronunciation, plus Arabic, Hindi, and all major European languages.

The free tier gives you 10,000 characters per month (enough for 2-5 annotated games). Paid plans start at $5/month for 30,000 characters. Setup is simple: create an account, copy your API key, paste it into En Parlant~.

Requires internet. Best for voice quality enthusiasts.

ElevenLabs Setup Guide

Google Cloud offers the best balance of quality, language support, and value. Google’s WaveNet neural voices sound natural and clear across 30+ languages — including CJK, Arabic, Hindi, Bengali, Filipino, Vietnamese, and all major European languages. The free tier is generous — one million characters per month covers hundreds of annotated games.

Setup takes about 5 minutes: create a Google Cloud account, enable the Text-to-Speech API, generate an API key. No charges unless you exceed the free tier (very hard to do with chess annotations).

Requires internet. Best for most users.

Google Cloud Setup Guide

KittenTTS is a high-quality local AI provider that runs entirely on your machine. It uses a lightweight ~25MB neural model with 8 expressive voices (4 male, 4 female). The quality is remarkably good — natural intonation, clear pronunciation, genuine expressiveness.

The trade-off is hardware: KittenTTS uses PyTorch for CPU inference, so it needs a modern multi-core processor. On an 8-core machine it sounds great; on an older laptop you may notice lag. English only for now.

The first time each annotation is spoken there’s a brief generation delay (1-2 seconds on a fast CPU, longer on slower hardware). After that, the audio is cached in memory and replays instantly — stepping backward and forward through moves you’ve already heard has zero lag. You can also precache an entire game in the background from settings, so every annotation is ready before you start studying.

No internet required. No API keys. Best local quality.

KittenTTS Setup Guide

OpenTTS is an open-source TTS server that runs on your machine via Docker. Nothing leaves your computer. It bundles several TTS engines (Larynx, Festival, eSpeak, Coqui-TTS), giving you 75+ voices for English alone.

The trade-off is voice quality: these are older neural and rule-based engines, so the output sounds more robotic than ElevenLabs or Google. Works best with European languages (English, German, French, Spanish, Russian, Dutch, Swedish, Italian, and more) — CJK is not supported. Honestly, if you’re going to go through the trouble of setting up a local model, KittenTTS gives you better quality with less hassle. Unless there’s significant demand for OpenTTS, we’ll likely deprecate it in a future release.

No internet required. No API keys. Best for maximum privacy with many voice options.

OpenTTS Setup Guide

System TTS uses your operating system’s built-in speech synthesis. Nothing to install, no API keys, no servers. Select it and go. The voice quality is basic — you’ll hear the characteristic robotic tone of OS-level TTS — but it works instantly with zero setup.

On Linux this is typically eSpeak or speech-dispatcher; on macOS it’s the system voice; on Windows it’s SAPI. Language support depends entirely on what voice packs your operating system has installed.

No internet required. Best for quick testing.

System TTS Setup Guide

| Provider | Type | Quality | Setup | Languages |
| --- | --- | --- | --- | --- |
| ElevenLabs | Cloud API | Exceptional | API key | 34+ (incl. CJK) |
| Google Cloud | Cloud API | Very good (WaveNet) | API key | 30+ (incl. CJK) |
| KittenTTS | Local neural AI | Good | Python + venv | English only |
| OpenTTS | Local Docker | Fair | Docker | European only |
| System (OS Native) | OS built-in | Basic | None | OS-dependent |

Hardware note: The local providers (KittenTTS and OpenTTS) run neural inference on your CPU. They need a modern multi-core processor (8+ cores recommended) to generate speech without noticeable lag. Think of it like running yet another chess engine on your machine. If your machine is older or low-power, use one of the cloud providers instead.

Start with ElevenLabs if you want the richest voice quality — the free tier is enough to try it out. For the best balance of quality and free usage, Google Cloud covers hundreds of games per month. For high-quality local TTS with no cloud dependency, KittenTTS is excellent if you have a modern CPU. For zero-setup testing, System TTS works instantly. For maximum privacy with many voice options, OpenTTS runs everything locally via Docker.
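As a rough sanity check on the free-tier figures quoted above, here is a back-of-the-envelope sketch. It assumes that the ElevenLabs estimate (10,000 characters covering 2 to 5 annotated games) implies roughly 2,000 to 5,000 characters of annotation per game. The function name `games_per_month` and the constants are illustrative only and are not part of En Parlant~.

```python
# Rough quota arithmetic based on the figures quoted in this overview.
# Assumption: 10,000 characters ~ 2-5 games, so one game is ~2,000-5,000 characters.
LIGHT_GAME_CHARS = 10_000 // 5   # lightly annotated game (2,000 characters)
HEAVY_GAME_CHARS = 10_000 // 2   # heavily annotated game (5,000 characters)

# Monthly character quotas as stated in the provider descriptions.
QUOTAS = {
    "ElevenLabs free tier": 10_000,
    "ElevenLabs $5/month plan": 30_000,
    "Google Cloud free tier": 1_000_000,
}


def games_per_month(quota_chars: int, chars_per_game: int) -> int:
    """Return how many whole games fit in a monthly character quota."""
    return quota_chars // chars_per_game


if __name__ == "__main__":
    for name, quota in QUOTAS.items():
        low = games_per_month(quota, HEAVY_GAME_CHARS)
        high = games_per_month(quota, LIGHT_GAME_CHARS)
        print(f"{name}: roughly {low} to {high} games per month")
```

Under these assumptions the Google Cloud free tier works out to roughly 200 to 500 games per month, which is consistent with the "hundreds of games" figure. Your own numbers will depend on how heavily your games are annotated.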

All TTS settings are in Settings > Sound:

| Setting | What it does |
| --- | --- |
| Text-to-Speech | Master on/off switch for all TTS features |
| Auto-Narrate on Move | Automatically speak annotations when you step through moves |
| TTS Provider | Switch between the five providers |
| TTS Voice | Provider-specific voice selection |
| TTS Language | Language for narration — chess terms are translated automatically |
| TTS Volume | How loud the narration plays |
| TTS Speed | Playback speed (0.5x to 2x) — adjusts without re-generating audio |
| ElevenLabs API Key | Your ElevenLabs API key (only shown when using ElevenLabs) |
| Google Cloud API Key | Your Google Cloud API key (only shown when using Google) |
| KittenTTS CPU Threads | CPU threads for inference (0 = auto / use all cores) |
| TTS Audio Cache | Clear cached audio to force re-generation |

TTS narration supports many languages with fully translated chess vocabulary. Here are some examples:

| Language | Chess example |
| --- | --- |
| English | Knight f3, check. A strong developing move. |
| Français | Cavalier f3, échec. Un coup de développement fort. |
| Español | Caballo f3, jaque. Un fuerte movimiento. |
| Deutsch | Springer f3, Schach. Ein starker Entwicklungszug. |
| 日本語 | ナイト f3、チェック。強い展開の手。 |
| Русский | Конь f3, шах. Сильный развивающий ход. |
| 中文 | 马 f3,将军。一步控制中心的强力出子。 |
| 한국어 | 나이트 f3, 체크. 중앙을 지배하는 강력한 전개 수. |

Every chess term — piece names, “check”, “checkmate”, “castles”, “takes”, move quality annotations like “Brilliant move” and “Blunder” — is spoken in the selected language. Comments in your PGN files are spoken as written, so annotate your games in the language you want to hear.
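One way to picture the vocabulary translation is a per-language lookup table. The sketch below is only an illustration: the language codes, the dictionary layout, the English fallback, and the `announce` helper are assumptions, while the terms themselves come from the examples on this page.

```python
# Illustrative vocabulary table. The language codes and layout are assumptions;
# the translated terms are taken from the examples in this overview.
VOCAB = {
    "en": {"knight": "Knight", "check": "check"},
    "fr": {"knight": "Cavalier", "check": "échec"},
    "es": {"knight": "Caballo", "check": "jaque"},
    "de": {"knight": "Springer", "check": "Schach"},
    "ru": {"knight": "Конь", "check": "шах"},
}


def announce(lang: str, piece: str, square: str, check: bool = False) -> str:
    """Build the spoken form of a simple piece move in the given language."""
    # Assumed behaviour: unknown language codes fall back to English.
    terms = VOCAB.get(lang, VOCAB["en"])
    spoken = f"{terms[piece]} {square}"
    if check:
        spoken += f", {terms['check']}"
    return spoken
```

For example, `announce("de", "knight", "f3", check=True)` produces the German phrase from the table above. Languages such as Japanese and Chinese use different punctuation between the move and "check", so a fuller version would need a per-language separator as well.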

The TTS engine doesn’t just read raw text — it understands chess notation. Before any text is spoken, a preprocessing step converts PGN notation into natural speech:

| Written in PGN | Spoken aloud |
| --- | --- |
| Nf3 | “Knight f3” |
| Bxe6+ | “Bishop takes e6, check” |
| O-O-O | “castles queenside” |
| e8=Q# | “e8 promotes to Queen, checkmate” |
| Rae1 | “Rook a e1” (disambiguation) |
| 5.Qxd8+ (in comments) | “5, Queen takes d8, check” |
| en prise | “on preez” (French pronunciation) |
| Ra8 is hanging | “Rook on a8 is hanging” |
| R vs R | “Rook versus Rook” |
| 6...Bf5 (move number dots) | “6, Bishop f5” (natural pause, no “dot”) |
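To make the move conversions above concrete, here is a minimal sketch of how a single SAN move could be turned into spoken words. It is not the actual En Parlant~ preprocessor: the regular expression, the `speak_san` name, and the "castles kingside" wording (inferred by analogy with "castles queenside") are assumptions for illustration.

```python
import re

PIECES = {"K": "King", "Q": "Queen", "R": "Rook", "B": "Bishop", "N": "Knight"}

# piece letter, optional disambiguation, optional capture, target square, optional promotion
SAN_RE = re.compile(
    r"^(?P<piece>[KQRBN])?"
    r"(?P<origin>[a-h]?[1-8]?)"
    r"(?P<capture>x)?"
    r"(?P<to>[a-h][1-8])"
    r"(?:=(?P<promo>[QRBN]))?$"
)


def speak_san(san: str) -> str:
    """Convert one SAN move such as 'Bxe6+' into a speakable phrase."""
    original = san
    suffix = ""
    if san.endswith("#"):
        suffix, san = ", checkmate", san[:-1]
    elif san.endswith("+"):
        suffix, san = ", check", san[:-1]

    if san in ("O-O-O", "0-0-0"):
        return "castles queenside" + suffix
    if san in ("O-O", "0-0"):
        return "castles kingside" + suffix  # assumed wording

    match = SAN_RE.match(san)
    if match is None:
        return original  # not a move we recognise: leave the text untouched

    words = []
    if match["piece"]:
        words.append(PIECES[match["piece"]])
    if match["origin"]:
        words.append(match["origin"])  # disambiguating file or rank, e.g. the 'a' in Rae1
    if match["capture"]:
        words.append("takes")
    words.append(match["to"])
    if match["promo"]:
        words += ["promotes to", PIECES[match["promo"]]]
    return " ".join(words) + suffix
```

This sketch only covers the move rows of the table. Phrases such as "en prise", "R vs R", and "Ra8 is hanging" are not moves and would need separate rules.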

Comments are cleaned before speaking: [%eval], [%cal], [%csl] tags are stripped. Leading quality words that duplicate the NAG symbol are removed (so ?? {BLUNDER. The rook hangs} doesn’t stutter “Blunder. Blunder.”).
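The cleaning step described above could be sketched as follows. This is an illustration rather than the plugin's real code: the function names and the glyph-to-word mapping are assumptions (the text confirms "Blunder" for `??`; pairing "Brilliant move" with `!!` is a guess).

```python
import re
from typing import Optional

# Matches embedded tags such as [%eval 0.3], [%cal Ge2e4] and [%csl Rd5].
TAG_RE = re.compile(r"\[%(?:eval|cal|csl)\b[^\]]*\]")

# Assumed mapping from NAG glyph to spoken quality word.
NAG_WORDS = {"!!": "Brilliant move", "??": "Blunder"}


def clean_comment(comment: str, nag: Optional[str] = None) -> str:
    """Strip tags and drop a leading quality word that repeats the NAG."""
    text = TAG_RE.sub("", comment)
    text = re.sub(r"\s+", " ", text).strip()  # collapse gaps left by removed tags
    word = NAG_WORDS.get(nag or "")
    if word:
        # Remove the duplicated word (any letter case) and the punctuation after it.
        pattern = rf"^{re.escape(word)}\b[\s.!:,-]*"
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return text


def narrate(comment: str, nag: Optional[str] = None) -> str:
    """Return the text to speak: the quality word once, then the cleaned comment."""
    text = clean_comment(comment, nag)
    word = NAG_WORDS.get(nag or "")
    return f"{word}. {text}".strip() if word else text
```

With this sketch, `narrate("BLUNDER. The rook hangs", "??")` and `narrate("The rook hangs", "??")` both yield the same narration, so the quality word is only heard once.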

Every narration is cached in memory after the first generation. Stepping backward and forward through a game replays instantly from cache — no API calls, no re-generation delay. You can also precache an entire game tree in the background so there are zero pauses during playback.

The cache is keyed by provider:voiceId:lang:text, so changing the voice or provider creates separate cache entries. Changing playback speed does not invalidate the cache — speed is applied client-side on the audio element.

A Clear Audio Cache button in Settings lets you force re-generation after editing annotations.
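The caching behaviour described above can be pictured with a small in-memory sketch. Only the key format `provider:voiceId:lang:text` comes from this page; the `NarrationCache` class, its method names, and the `synthesize` callback signature are hypothetical.

```python
from typing import Callable, Dict, Iterable

# Hypothetical synthesizer signature: (provider, voice_id, lang, text) -> audio bytes.
Synthesizer = Callable[[str, str, str, str], bytes]


def cache_key(provider: str, voice_id: str, lang: str, text: str) -> str:
    """Build the cache key. Playback speed is deliberately not part of it."""
    return f"{provider}:{voice_id}:{lang}:{text}"


class NarrationCache:
    """In-memory audio cache: generate once, replay from memory afterwards."""

    def __init__(self, synthesize: Synthesizer) -> None:
        self._synthesize = synthesize
        self._audio: Dict[str, bytes] = {}

    def get(self, provider: str, voice_id: str, lang: str, text: str) -> bytes:
        key = cache_key(provider, voice_id, lang, text)
        if key not in self._audio:
            # First request for this key: call the provider and remember the result.
            self._audio[key] = self._synthesize(provider, voice_id, lang, text)
        return self._audio[key]

    def precache(self, provider: str, voice_id: str, lang: str,
                 texts: Iterable[str]) -> None:
        """Generate audio for every annotation up front (could run on a worker thread)."""
        for text in texts:
            self.get(provider, voice_id, lang, text)

    def clear(self) -> None:
        """Drop all cached audio so the next request re-generates it."""
        self._audio.clear()
```

Because speed is left out of the key, changing the playback rate reuses the same cached audio, while switching voice or provider produces a new entry.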

  • Use Auto-Narrate. Turn on “Auto-Narrate on Move” and just use your arrow keys to step through games. The commentary arrives naturally as you move, like having a coach at your shoulder.

  • Annotate your own games. TTS really shines when you’re listening to commentary on your games. Annotate your games, then step through them with narration. Hearing “Grabbing the pawn looks tempting, but your entire kingside is still asleep” while staring at the position hits different than reading it.

  • Try different speeds. Some players like 1x for careful study, others prefer 1.3x for faster review. The speed slider adjusts playback in real-time without using additional API characters.

  • Use the speaker icon. Every comment in the move list has a small speaker icon. Click it to hear just that one annotation.

  • Switch languages to learn chess vocabulary. If you’re studying chess in a second language, set the TTS language to match. You’ll naturally pick up terms like “Cavalier” (Knight), “échec” (check), and “mat” (checkmate) just by listening.

These guidelines produce the best spoken narration from your PGN annotations.

Use standard SAN notation. The preprocessor expands it automatically:

  • "After 7.Nf3, White controls e5" becomes “After 7, Knight f3, White controls e5”
  • "The Bg5 pins the knight" becomes “The Bishop g5 pins the knight”

The NAG glyph (!, ??, !?, etc.) generates spoken quality words automatically. Don’t duplicate them in the comment:

  • Bad: ?? {BLUNDER. A terrible move...} — TTS says “Blunder. Blunder. A terrible move”
  • Good: ?? {A terrible move...} — TTS says “Blunder. A terrible move”

Standard PGN notation works: 6...Bf5. The preprocessor converts dots to commas for natural pauses instead of “dot dot dot.”
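For notation embedded in comment prose, the expansion might look roughly like the sketch below. It is a simplified illustration under stated assumptions: the regular expressions and the `expand_notation` name are hypothetical, pawn moves and bare squares such as "e5" are left as written, and phrases like "Ra8 is hanging" (spoken as "Rook on a8") are not handled.

```python
import re

PIECES = {"K": "King", "Q": "Queen", "R": "Rook", "B": "Bishop", "N": "Knight"}

# A move number with one dot (White) or three dots (Black), directly before a move.
MOVE_NUMBER_RE = re.compile(r"\b(\d+)\.(?:\.\.)?(?=[KQRBNOa-h])")

# A piece move: piece letter, optional disambiguation, optional capture,
# target square, optional check or mate sign, not followed by more word characters.
PIECE_MOVE_RE = re.compile(r"\b([KQRBN])([a-h]?[1-8]?)(x?)([a-h][1-8])([+#]?)(?!\w)")


def _speak_piece_move(match: "re.Match[str]") -> str:
    piece, origin, capture, target, sign = match.groups()
    words = [PIECES[piece]]
    if origin:
        words.append(origin)
    if capture:
        words.append("takes")
    words.append(target)
    spoken = " ".join(words)
    if sign == "+":
        spoken += ", check"
    elif sign == "#":
        spoken += ", checkmate"
    return spoken


def expand_notation(comment: str) -> str:
    """Replace move-number dots with a comma, then spell out piece moves."""
    text = MOVE_NUMBER_RE.sub(r"\1, ", comment)
    return PIECE_MOVE_RE.sub(_speak_piece_move, text)
```

Handling the move number first means "6...Bf5" becomes "6, Bf5" before the piece move is expanded, so the dots never reach the speech engine.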

Periods create natural TTS pauses. Use them between distinct ideas:

{Doubled isolated e-pawns. The f-file is ripped open. The position is strategically won.}

[%cal ...] and [%csl ...] tags are stripped from audio automatically. Use them freely for visual annotations without affecting narration.

A note on redistribution for anyone building on En Parlant~:

  • ElevenLabs — You retain all rights to your generated audio (ElevenLabs Terms of Use, section c(ii)). You can redistribute it freely.
  • Google Cloud — You retain all IP rights to generated audio. No restrictions.
  • KittenTTS, OpenTTS, System TTS — No redistribution restrictions on generated audio.

En Croissant is an open-source chess study tool created by Francisco Salgueiro. Francisco built something genuinely special — a free, powerful, community-driven platform for studying chess — and released it under the GPL-3.0 license so that anyone can use it, improve it, and share it. This TTS feature exists because of that generosity. We’re grateful for the foundation he built, and we’re proud to contribute back to it.

The TTS plugin was developed by Darrell at Red Shed, with the help of Claude Code. Five providers, multi-language support, translated chess vocabulary across many languages, local AI inference, dependency management — built from source, tested by hand, and contributed with care.

That’s the beauty of open source. Someone builds something great. Someone else adds to it. Everyone benefits.

We’d love to hear how TTS is working for you. Comments, suggestions, and feedback are always welcome.

  • Want a language we don’t support yet? Let us know — we can add new languages quickly.
  • Found a bug? Tell us and we’ll fix it fast.
  • Have an idea for another TTS provider? We’re happy to add it.
  • Just want to say it’s working? That’s great to hear too.

Open an issue on GitHub, or reach out directly at darrell@redshed.ai.