> ## Documentation Index
> Fetch the complete documentation index at: https://jigsaw-13.mintlify.app/llms.txt
> Use this file to discover all available pages before exploring further.

# Speech to Text

> Learn how to use JigsawStack's Speech to Text API to transcribe audio and video files

## Overview

The Speech to Text API converts audio and video files containing speech into accurate text transcriptions. Powered by the Whisper large V3 AI model, it provides high-quality transcriptions for various applications. The API supports multiple audio formats and offers features like speaker diarization and language translation.

* High-accuracy transcription with advanced AI model
* Multiple audio format support (MP3, WAV, FLAC, etc.)
* Speaker diarization capability (identifying different speakers)
* Translation to English from other languages
* Support for asynchronous processing via webhooks
* Timestamps for each segment of transcription

## API Endpoint

```
POST /v1/ai/transcribe
```

## Quick Start

```javascript JavaScript theme={null}
import { JigsawStack } from 'jigsawstack';

const jigsaw = new JigsawStack('your-api-key');

// Basic transcription from URL
const response = await jigsaw.audio.speech_to_text({
  url: "https://jigsawstack.com/preview/stt-example.wav"
});

console.log(response.text);

// With speaker diarization
const responseWithSpeakers = await jigsaw.audio.speech_to_text({
  url: "https://jigsawstack.com/preview/stt-example.wav",
  by_speaker: true
});

// Display speakers and their text
responseWithSpeakers.speakers.forEach(segment => {
  console.log(`${segment.speaker}: ${segment.text}`);
});
```

## Response

```json theme={null}
{
  "success": true,
  "text": "Welcome to JigsawStack's Speech to Text API demo. This powerful tool converts spoken language into written text with high accuracy.",
  "chunks": [
    {
      "timestamp": [0.0, 2.2],
      "text": "Welcome to JigsawStack's Speech to Text API demo."
    },
    {
      "timestamp": [2.3, 5.24],
      "text": "This powerful tool converts spoken language into written text with high accuracy."
    }
  ],
  "speakers": [
    {
      "speaker": "Speaker 1",
      "timestamp": [0.0, 2.2],
      "text": "Welcome to JigsawStack's Speech to Text API demo."
    },
    {
      "speaker": "Speaker 1",
      "timestamp": [2.3, 5.24],
      "text": "This powerful tool converts spoken language into written text with high accuracy."
    }
  ]
}
```

## Auto Language Detection

By default, JigsawStack's Speech to Text API automatically detects the language of your audio content. This means you don't need to specify the language parameter - the API will intelligently identify what language is being spoken and transcribe accordingly.

However, if you know the language of your audio in advance, specifying it can significantly optimize performance and reduce processing time.

```javascript theme={null}
const result = await jigsaw.audio.speech_to_text({
  url: "https://jigsawstack.com/preview/stt-example.wav",
  language: "auto" // default behavior
});

console.log(result.text);
```

## Examples

If you've already uploaded files to JigsawStack's storage:

```javascript Processing Files from Storage theme={null}
// Using a file from JigsawStack storage
const result = await jigsaw.audio.speech_to_text({
  file_store_key: "uploads/meeting_recording.mp3",
  by_speaker: true
});
```

To transcribe audio in one language and translate to English:

```javascript Translation Example theme={null}
// Transcribe and translate to English
const result = await jigsaw.audio.speech_to_text({
  url: "https://example.com/path/to/french_audio.mp3",
  translate: true
});

console.log("Translated text:", result.text);
```

<Note>Find more information on Speech to Text API [here](/docs/api-reference/ai/speech-to-text)</Note>
