100+ languages
Speaker separation
Timestamp every word
Blazing fast speed with always-on GPUs
High accuracy with OpenAI Whisper large v3 model
Speech to Text Preview (Alpha)
{
  "success": true,
  "text": "the little tales they tell are false the door was barred locked and bolted as well ripe pears are fit for a queen's table a big wet stain was on the round carpet the kite dipped and swayed but stayed aloft the pleasant hours fly by much too soon the room was crowded with a mild wab the room was crowded with a wild mob this strong arm shall shield your honour she blushed when he gave her a white orchid The beetle droned in the hot June sun.",
  "chunks": [
    {
      "timestamp": [
        1.8,
        44.24
      ],
      "text": "the little tales they tell are false the door was barred locked and bolted as well ripe pears are fit for a queen's table a big wet stain was on the round carpet the kite dipped and swayed but stayed aloft the pleasant hours fly by much too soon the room was crowded with a mild wab the room was crowded with a wild mob this strong arm shall shield your honour she blushed when he gave her a white orchid"
    },
    {
      "timestamp": [
        46.16,
        51.74
      ],
      "text": "The beetle droned in the hot June sun."
    }
  ]
}
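The sample response above can be described with a couple of TypeScript types. This is a minimal sketch: the field names come from the sample only, while the names `TranscriptChunk`, `SpeechToTextResponse`, and `chunkDurations` are illustrative and not part of the JigsawStack SDK. The real API may return additional fields.

```typescript
// Shape inferred from the sample response above (assumption: no other fields are required).
interface TranscriptChunk {
  timestamp: [number, number]; // [start, end], in seconds
  text: string;
}

interface SpeechToTextResponse {
  success: boolean;
  text: string;
  chunks: TranscriptChunk[];
}

// Hypothetical helper: length of each chunk in seconds, rounded to 2 decimals.
function chunkDurations(response: SpeechToTextResponse): number[] {
  return response.chunks.map(({ timestamp: [start, end] }) =>
    Number((end - start).toFixed(2))
  );
}

// Abbreviated copy of the sample response; timestamps are unchanged.
const sample: SpeechToTextResponse = {
  success: true,
  text: "the little tales they tell are false The beetle droned in the hot June sun.",
  chunks: [
    { timestamp: [1.8, 44.24], text: "the little tales they tell are false" },
    { timestamp: [46.16, 51.74], text: "The beetle droned in the hot June sun." },
  ],
};

console.log(chunkDurations(sample));
```

Typing the response this way lets an editor auto-complete `chunks` and `timestamp` when post-processing a transcript.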
Don't pay for audio length, GPU cost, or infrastructure; pay only for the processing time needed. ~60 min of audio = ~20 s of processing time
Separate speakers in the audio and transcribe text for each speaker
Use best-in-class GPUs running an optimized version of the OpenAI Whisper large v3 model, with no setup required
Run async jobs with secure webhooks or get instant results with synchronous API calls, allowing you to scale easily
Translate any audio from 100+ languages to any other language while maintaining language context and meaning
Get the latest AI model updates and feature improvements without any API changes
Easy-to-use REST APIs that work out of the box in every language and framework, with fully managed caching, logging, and authentication
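The processing-time figure quoted above (about 60 minutes of audio in about 20 seconds) can be turned into a rough estimate for other audio lengths. The sketch below assumes processing time scales linearly with audio length, which the page does not state, and the function name `estimateProcessingSeconds` is purely illustrative.

```typescript
// Ratio taken from the page's own example: ~60 min of audio in ~20 s of processing.
const AUDIO_SECONDS_PER_PROCESSING_SECOND = (60 * 60) / 20; // 180

// Assumption: processing time grows linearly with audio length.
function estimateProcessingSeconds(audioMinutes: number): number {
  return (audioMinutes * 60) / AUDIO_SECONDS_PER_PROCESSING_SECOND;
}

console.log(estimateProcessingSeconds(60)); // 20
console.log(estimateProcessingSeconds(15)); // 5
```

Treat the result as a ballpark figure only; actual processing time will depend on the audio and the options used.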
import { JigsawStack } from "jigsawstack";
const jigsaw = JigsawStack({
  apiKey: "sk39wo393.....32ncsmw9339RNj3"
});
// Transcribe the audio at `url`, with speaker separation and translation enabled
const response = await jigsaw.audio.speech_to_text({
  url: "https://storage.com/audio.mp3",
  by_speaker: true, translate: true, language: "zh"
});
$ npm i jigsawstack
5 ways our customers use JigsawStack to build Speech to Text powered applications
Increase accessibility for your content by providing real-time transcriptions for your audio and video content
Automatically generate captions for your videos and podcasts to increase reach and engagement with your content
Translate your audio content to multiple languages to increase your reach and audience globally
Analyze your audio content to get insights on customer sentiment, feedback and more to improve your content
Build voice-enabled applications with real-time transcription for meetings, interviews, podcasts and more
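As an illustration of the caption use case above, timestamped chunks can be converted into a WebVTT caption file. This is a minimal sketch: the chunk shape is inferred from the sample response on this page, and `toVttTime` and `chunksToWebVtt` are hypothetical helpers, not SDK functions.

```typescript
// Chunk shape inferred from the sample response (assumption).
interface TranscriptChunk {
  timestamp: [number, number]; // [start, end], in seconds
  text: string;
}

// Format a number of seconds as a WebVTT timestamp: HH:MM:SS.mmm
function toVttTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)}.${pad(ms, 3)}`;
}

// Build a WebVTT document with one cue per chunk.
function chunksToWebVtt(chunks: TranscriptChunk[]): string {
  const cues = chunks.map(
    ({ timestamp: [start, end], text }) =>
      `${toVttTime(start)} --> ${toVttTime(end)}\n${text}`
  );
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}

const vtt = chunksToWebVtt([
  { timestamp: [46.16, 51.74], text: "The beetle droned in the hot June sun." },
]);
console.log(vtt);
```

Long chunks, such as the 42-second first chunk in the sample, would likely need to be split into shorter cues before being shown on screen.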
Striking the right balance between code and dashboard
Access real-time logs and analytics on all your APIs. Debug errors and track users, locations, sessions, countries, IPs, and 30+ other data points
Fine-grained control over API keys. Whitelist domains with flexible wildcard support, set expiration dates, and limit access to specific APIs with unlimited keys
The best docs are the kind that you don't need. Fully typed SDKs with auto-completion and self-explanatory params
Manage multiple projects and teams with access control. Invite unlimited team members and assign roles
JigsawStack APIs are built from the ground up on the edge network
99.5% uptime with API latency as low as 200 ms globally
Scale up and down as you need with usage-based pricing, without worrying about costs from abuse
Consistent request and response structure across all API services for predictable use
Consistent training for all JigsawStack models to ensure the latest technology is always available without breaking changes