Overview
The Speech to Text API converts audio and video files containing speech into accurate text transcriptions. Powered by the Whisper large V3 AI model, it provides high-quality transcriptions for various applications. The API supports multiple audio formats and offers features like speaker diarization and language translation.
- High-accuracy transcription with advanced AI model
- Multiple audio format support (MP3, WAV, FLAC, etc.)
- Speaker diarization capability (identifying different speakers)
- Translation to English from other languages
- Support for asynchronous processing via webhooks
- Timestamps for each segment of transcription
API Endpoint
Quick Start
import { JigsawStack } from 'jigsawstack';
const jigsaw = new JigsawStack('your-api-key');
// Basic transcription from URL
const response = await jigsaw.audio.speech_to_text({
url: "https://example.com/path/to/audio.mp3"
});
console.log(response.text);
// With speaker diarization
const responseWithSpeakers = await jigsaw.audio.speech_to_text({
url: "https://example.com/path/to/audio.mp3",
by_speaker: true
});
// Display speakers and their text
responseWithSpeakers.speakers.forEach(segment => {
console.log(`${segment.speaker}: ${segment.text}`);
});
Response
{
"success": true,
"text": "Welcome to JigsawStack's Speech to Text API demo. This powerful tool converts spoken language into written text with high accuracy.",
"chunks": [
{
"timestamp": [0.0, 2.2],
"text": "Welcome to JigsawStack's Speech to Text API demo."
},
{
"timestamp": [2.3, 5.24],
"text": "This powerful tool converts spoken language into written text with high accuracy."
}
],
"speakers": [
{
"speaker": "Speaker 1",
"timestamp": [0.0, 2.2],
"text": "Welcome to JigsawStack's Speech to Text API demo."
},
{
"speaker": "Speaker 1",
"timestamp": [2.3, 5.24],
"text": "This powerful tool converts spoken language into written text with high accuracy."
}
]
}
Examples
If you’ve already uploaded files to JigsawStack’s storage:
Processing Files from Storage
// Using a file from JigsawStack storage
const result = await jigsaw.audio.speech_to_text({
file_store_key: "uploads/meeting_recording.mp3",
by_speaker: true
});
To transcribe audio in one language and translate to English:
// Transcribe and translate to English
const result = await jigsaw.audio.speech_to_text({
url: "https://example.com/path/to/french_audio.mp3",
translate: true
});
console.log("Translated text:", result.text);