Speech to Text

Overview

The Speech to Text API converts audio and video files containing speech into accurate text transcriptions. Powered by the Whisper large V3 AI model, it provides high-quality transcriptions for various applications. The API supports multiple audio formats and offers features like speaker diarization and language translation.

High-accuracy transcription with advanced AI model
Multiple audio format support (MP3, WAV, FLAC, etc.)
Speaker diarization capability (identifying different speakers)
Translation to English from other languages
Support for asynchronous processing via webhooks
Timestamps for each segment of transcription

API Endpoint

POST /v1/ai/transcribe

Quick Start

JavaScript

import { JigsawStack } from 'jigsawstack';

const jigsaw = new JigsawStack('your-api-key');

// Basic transcription from URL
const response = await jigsaw.audio.speech_to_text({
  url: "https://example.com/path/to/audio.mp3"
});

console.log(response.text);

// With speaker diarization
const responseWithSpeakers = await jigsaw.audio.speech_to_text({
  url: "https://example.com/path/to/audio.mp3",
  by_speaker: true
});

// Display speakers and their text
responseWithSpeakers.speakers.forEach(segment => {
  console.log(`${segment.speaker}: ${segment.text}`);
});

Response

{
  "success": true,
  "text": "Welcome to JigsawStack's Speech to Text API demo. This powerful tool converts spoken language into written text with high accuracy.",
  "chunks": [
    {
      "timestamp": [0.0, 2.2],
      "text": "Welcome to JigsawStack's Speech to Text API demo."
    },
    {
      "timestamp": [2.3, 5.24],
      "text": "This powerful tool converts spoken language into written text with high accuracy."
    }
  ],
  "speakers": [
    {
      "speaker": "Speaker 1",
      "timestamp": [0.0, 2.2],
      "text": "Welcome to JigsawStack's Speech to Text API demo."
    },
    {
      "speaker": "Speaker 1",
      "timestamp": [2.3, 5.24],
      "text": "This powerful tool converts spoken language into written text with high accuracy."
    }
  ]
}

Examples

If you’ve already uploaded files to JigsawStack’s storage:

Processing Files from Storage

// Using a file from JigsawStack storage
const result = await jigsaw.audio.speech_to_text({
  file_store_key: "uploads/meeting_recording.mp3",
  by_speaker: true
});

To transcribe audio in one language and translate to English:

Translation Example

// Transcribe and translate to English
const result = await jigsaw.audio.speech_to_text({
  url: "https://example.com/path/to/french_audio.mp3",
  translate: true
});

console.log("Translated text:", result.text);

Get Started

Quick Start

Integration

Core AI

Web

Data

Translate

Vision

Audio

Content Validation

File Management

Webhooks

Resources

Overview

API Endpoint

Quick Start

Response

Examples

Get Started

Quick Start

Integration

Core AI

Web

Data

Translate

Vision

Audio

Content Validation

File Management

Webhooks

Resources

​Overview

​API Endpoint

​Quick Start

​Response

​Examples

Overview

API Endpoint

Quick Start

Response

Examples