POST /v1/ai/tts

import { JigsawStack } from "jigsawstack";
import fs from "fs";

const jigsawstack = JigsawStack({
  apiKey: "your-api-key",
});

// Basic usage with default voice
const response = await jigsawstack.audio.text_to_speech({
  text: "Hello, how are you doing? This is a demonstration of the text-to-speech API using the default voice.",
});

// Save the audio to a file
const buffer = await response.arrayBuffer();
fs.writeFileSync("output.mp3", Buffer.from(buffer));

// Using a custom accent/voice
const customVoice = await jigsawstack.audio.text_to_speech({
  text: "Hello, I'm speaking with a British accent now. Isn't that lovely?",
  accent: "en-GB-female-2",
});

// Voice cloning from a URL
const clonedVoice = await jigsawstack.audio.text_to_speech({
  text: "This is my cloned voice speaking through the text-to-speech API.",
  speaker_clone_url: "https://example.com/my-voice-sample.mp3",
});

// Get available voices
const voices = await jigsawstack.audio.speaker_voice_accents();
console.log(voices);

Overview

The Text to Speech API converts written text into lifelike speech with options for:

  • Multiple languages and accents (700+ different voices)
  • Voice cloning from audio samples
  • High-quality, natural-sounding output

Request Parameters

Body

text
string
required

The text to generate speech from. Character limits:

  • Standard TTS: 5-1,500 characters
  • Voice Cloning TTS: 5-500 characters
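Checking the length before sending a request avoids a wasted round trip. The following is a minimal sketch: `checkTtsText` and `TTS_LIMITS` are hypothetical names, not part of the SDK, and the check assumes the API counts characters the same way JavaScript's `String.length` does, which may not hold for text containing emoji or other characters outside the Basic Multilingual Plane.

```javascript
// Limits taken from the parameter description above.
const TTS_LIMITS = {
  standard: { min: 5, max: 1500 },
  cloning: { min: 5, max: 500 },
};

// Hypothetical helper: returns null when the text is acceptable,
// otherwise a short description of the problem.
function checkTtsText(text, { cloning = false } = {}) {
  if (typeof text !== "string") return "text must be a string";
  const { min, max } = cloning ? TTS_LIMITS.cloning : TTS_LIMITS.standard;
  if (text.length < min) return `text is shorter than ${min} characters`;
  if (text.length > max) return `text is longer than ${max} characters`;
  return null;
}
```

Pass `{ cloning: true }` whenever the request includes speaker_clone_url or speaker_clone_file_store_key, since the lower limit applies in that case.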

accent
string
default:"en-US-female-27"

Speaker voice accent. Not required if using voice cloning. Over 700 different voices across multiple languages are available. See the Speaker Voice Accents documentation for the complete list.

speaker_clone_url
string

URL to an audio file for voice cloning. The API analyzes this sample and generates speech that mimics its voice characteristics. Not required if speaker_clone_file_store_key is specified. With voice cloning, the text is limited to 500 characters.

speaker_clone_file_store_key
string

The key of an audio file stored in JigsawStack File Storage to use for voice cloning. Not required if speaker_clone_url is specified. With voice cloning, the text is limited to 500 characters.

Only one voice source is needed. You can either specify an accent for built-in voices, or provide a voice sample via speaker_clone_url or speaker_clone_file_store_key for voice cloning.
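One way to keep the voice source unambiguous is to assemble the request body in a single place. The sketch below uses a hypothetical helper, `buildTtsBody`, with made-up option names (`cloneUrl`, `cloneKey`). Two of its choices are made here and are not documented API behavior: it rejects input that sets both clone sources, and it lets a clone source take precedence over an accent.

```javascript
// Hypothetical helper: builds a request body with a single voice source.
function buildTtsBody({ text, accent, cloneUrl, cloneKey }) {
  if (cloneUrl && cloneKey) {
    throw new Error(
      "Provide speaker_clone_url or speaker_clone_file_store_key, not both"
    );
  }
  const body = { text };
  if (cloneUrl) {
    body.speaker_clone_url = cloneUrl;
  } else if (cloneKey) {
    body.speaker_clone_file_store_key = cloneKey;
  } else if (accent) {
    body.accent = accent;
  }
  // With no voice source set, the API falls back to its default accent.
  return body;
}
```

The returned object can be passed straight to `jigsawstack.audio.text_to_speech(...)`.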

Header

x-api-key
string
required

Your JigsawStack API key

Response

The API returns the generated audio file directly in the response body as binary data, typically in MP3 format.
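For callers not using the SDK, the same request can be made over plain HTTP. The sketch below rests on several assumptions: the base URL `https://api.jigsawstack.com` and the JSON content type are not stated on this page, the helper names `buildTtsRequest` and `synthesizeToFile` are hypothetical, and the global `fetch` requires Node.js 18 or later.

```javascript
// Assumption: the API is served from this base URL.
const TTS_URL = "https://api.jigsawstack.com/v1/ai/tts";

// Builds the fetch() arguments without sending anything.
function buildTtsRequest(apiKey, body) {
  return {
    url: TTS_URL,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json", // assumed request encoding
        "x-api-key": apiKey,
      },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request and writes the binary response body to disk.
async function synthesizeToFile(apiKey, body, outPath) {
  const { url, options } = buildTtsRequest(apiKey, body);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`TTS request failed with status ${res.status}`);
  }
  const { writeFile } = await import("node:fs/promises");
  await writeFile(outPath, Buffer.from(await res.arrayBuffer()));
  return outPath;
}
```

A call such as `await synthesizeToFile("your-api-key", { text: "Hello there" }, "output.mp3")` would save the returned audio. Because the format is only described as typically MP3, checking the response's Content-Type header before choosing a file extension is the safer option.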

Here’s a selection of commonly used voice accents:

English Voices

  • en-US-female-27 (Default) - American English, Female 27
  • en-US-male-24 - American English, Male 24
  • en-GB-female-2 - British English, Female 2
  • en-GB-male-2 - British English, Male 2
  • en-AU-female-2 - Australian English, Female 2
  • en-IN-female-3 - Indian English, Female 3

Other Languages

  • fr-FR-female-12 - French, Female 12
  • de-DE-female-1 - German, Female 1
  • es-ES-female-9 - Spanish (Spain), Female 9
  • es-MX-female-12 - Spanish (Mexico), Female 12
  • ja-JP-female-14 - Japanese, Female 14
  • zh-CN-female-15 - Chinese (Mandarin), Female 15
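The accent IDs listed above share a visible pattern: language code, region code, gender, and a number. A small parser makes it easy to group or filter voices by those parts. This is a sketch based only on the IDs shown here. `parseAccentId` is a hypothetical helper, and IDs in the complete list may not all follow the same pattern, in which case it returns null.

```javascript
// Splits an accent ID such as "en-GB-female-2" into its parts.
// Returns null when the ID does not match the pattern seen in the list above.
function parseAccentId(id) {
  const match = /^([a-z]{2,3})-([A-Z]{2})-(female|male)-(\d+)$/.exec(id);
  if (!match) return null;
  const [, language, region, gender, index] = match;
  return { language, region, gender, index: Number(index) };
}

// Example: keep only the British English voices from a list of IDs.
function filterByLocale(ids, language, region) {
  return ids.filter((id) => {
    const parsed = parseAccentId(id);
    return parsed !== null && parsed.language === language && parsed.region === region;
  });
}
```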

Voice Cloning

Voice cloning allows you to generate speech that mimics a specific voice from an audio sample. This is useful for creating custom voices for applications, personalized assistants, or consistent brand voices.

Using Voice Cloning

  1. Provide a clear audio sample using either:

    • speaker_clone_url - Direct URL to an audio file
    • speaker_clone_file_store_key - Key to a file in JigsawStack storage
  2. Provide the text you want to convert to speech (maximum 500 characters)

  3. The API will analyze the sample and generate speech with similar voice characteristics
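Text longer than the 500-character cloning limit has to be sent as several requests. The following sketch shows one possible way to split it. `splitForCloning` is a hypothetical helper and not an SDK feature. It prefers to break after sentence-ending punctuation and cuts at the limit only when a single sentence is too long.

```javascript
// Hypothetical helper: splits text into chunks of at most maxLen characters,
// preferring to break after sentence-ending punctuation (. ! ?).
function splitForCloning(text, maxLen = 500) {
  // Each match is one sentence with its trailing whitespace, or a final
  // fragment that has no closing punctuation.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    // Start a new chunk when this sentence would overflow the current one.
    if (current && (current + sentence).length > maxLen) {
      chunks.push(current.trim());
      current = "";
    }
    // A single sentence longer than maxLen is cut at the limit.
    let rest = sentence;
    while (rest.length > maxLen) {
      chunks.push(rest.slice(0, maxLen));
      rest = rest.slice(maxLen);
    }
    current += rest;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}
```

Each chunk would then go into its own text_to_speech call with the same voice sample. Note that a trailing chunk can end up shorter than the 5-character minimum, so a caller may need to merge it into the previous chunk where that still fits.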

Best Practices for Voice Cloning

  • Use high-quality audio samples with minimal background noise
  • Samples should be clear speech without music or other voices
  • Longer samples (30+ seconds) provide better voice modeling
  • For best results, the sample should contain speech similar in tone and style to your desired output

Getting Available Voices

To retrieve the complete list of available voices programmatically:

JavaScript:

const voices = await jigsawstack.audio.speaker_voice_accents();
console.log(voices);

Python:

voices = jigsawstack.audio.speaker_voice_accents()
print(voices)

Common Use Cases

  • Content Creation: Generate voiceovers for videos, podcasts, and presentations
  • Accessibility: Convert written content to audio for visually impaired users
  • Interactive Applications: Build voice assistants, language learning apps, and interactive experiences
  • Customer Service: Create automated voice responses for customer service systems
  • Localization: Translate and vocalize content in multiple languages
  • Personalization: Clone a specific voice for consistent brand messaging