Transcription API

Welcome to VideoSummary.io, where every word matters. Our transcription service goes beyond the words themselves to give you the context, clarity, and convenience you need. Powered by OpenAI's Whisper model, our platform adds speaker separation and diarization at a fraction of the cost.

Speaker Separation & Diarization: A Symphony of Voices
Don’t let your valuable insights get lost in translation. Our advanced speaker separation technology distinguishes between different speakers, ensuring that each voice is heard and every point is captured. With our speaker diarization feature, each transcript is a clear, easy-to-follow dialogue, no matter how many voices are in the room.

Affordable Accuracy
Get the precision of OpenAI's models without the hefty price tag. VideoSummary.io is committed to delivering affordability without compromising on quality. Experience the finesse of cutting-edge AI transcription that fits your budget.

Call the API

curl -X POST https://api.videosummary.io/v1/transcribe \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer xxx" \
  -d '{
    "url": "https://s3.us-east-1.wasabisys.com/videosummary/demo.mp4",
    "id": "your id",
    "callback": "https://your.callback.url/results"
  }'
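
The same request can be built from Python with only the standard library. This is a minimal sketch: the endpoint, headers, and body fields mirror the curl example above, while the function name build_transcribe_request and the placeholder token are illustrative assumptions, not part of the documented API.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.videosummary.io/v1/transcribe"


def build_transcribe_request(token, media_url, job_id, callback_url):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({
        "url": media_url,          # media file to transcribe
        "id": job_id,              # your own identifier for the job
        "callback": callback_url,  # where the result notification is sent
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


req = build_transcribe_request(
    token="xxx",  # placeholder, as in the curl example
    media_url="https://s3.us-east-1.wasabisys.com/videosummary/demo.mp4",
    job_id="your id",
    callback_url="https://your.callback.url/results",
)
# To actually submit the job: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.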

API Response (Callback)

{
  "transcript": "https://url.to.transcript",
  "id": "123"
}
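
A callback receiver only needs to read the two fields shown above. The sketch below assumes the callback arrives as a JSON request body with exactly that shape; the function name parse_callback is hypothetical, and how the body reaches it (web framework, HTTP method) is left open because the source does not specify it.

```python
import json


def parse_callback(raw_body):
    """Return (job_id, transcript_url) from a callback request body."""
    payload = json.loads(raw_body)
    # "id" echoes the id sent in the original request;
    # "transcript" is a URL pointing at the transcription result.
    return payload["id"], payload["transcript"]


job_id, transcript_url = parse_callback(
    b'{"transcript": "https://url.to.transcript", "id": "123"}'
)
```

Matching the returned id against the one you submitted lets you tie each callback to its job before downloading the transcript.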

Sample Transcription Result

This is a truncated example. The result has three keys: text, speakers, and chunks. Both speakers and chunks are lists of word-level entries with word-level timestamps; each speakers entry also carries a speaker label plus start and end fields.

{
  "text": " And then we were like, all right, fine, deal. This business is so freaking hard. Let's try this. And in the end, I don't think it ever ended up working. But you got the customers. And that's the point of the episode. Yeah, exactly. The subsequent failure is irrelevant, okay? Yeah. All right, we're live. Sean, we're going to do a Q&A session today. We got a bunch of questions. I have a feeling we're going to spend most of the time on one question. But the question is, how did you get your first 100 customers in your businesses? And what are the best ways that you've heard of?...",
  "speakers": [
    {
      "speaker": "SPEAKER_03",
      "text": " And",
      "timestamp": [
        0.0,
        0.24
      ],
      "start": 0.0,
      "end": 0.24
    },
    {
      "speaker": "SPEAKER_03",
      "text": " then",
      "timestamp": [
        0.24,
        0.4
      ],
      "start": 0.24,
      "end": 0.4
    },
    {
      "speaker": "SPEAKER_03",
      "text": " we",
      "timestamp": [
        0.4,
        0.52
      ],
      "start": 0.4,
      "end": 0.52
    },
    {
      "speaker": "SPEAKER_03",
      "text": " were",
      "timestamp": [
        0.52,
        0.56
      ],
      "start": 0.52,
      "end": 0.56
    },
    {
      "speaker": "SPEAKER_03",
      "text": " like,",
      "timestamp": [
        0.56,
        0.76
      ],
      "start": 0.56,
      "end": 0.76
    }
  ],
  "chunks": [
    {
      "text": " And",
      "timestamp": [
        0.0,
        0.24
      ]
    },
    {
      "text": " then",
      "timestamp": [
        0.24,
        0.4
      ]
    },
    {
      "text": " we",
      "timestamp": [
        0.4,
        0.52
      ]
    },
    {
      "text": " were",
      "timestamp": [
        0.52,
        0.56
      ]
    },
    {
      "text": " like,",
      "timestamp": [
        0.56,
        0.76
      ]
    },
    {
      "text": " all",
      "timestamp": [
        0.76,
        1.06
      ]
    },
    {
      "text": " right,",
      "timestamp": [
        1.06,
        1.26
      ]
    },
    {
      "text": " fine,",
      "timestamp": [
        1.26,
        1.52
      ]
    },
    {
      "text": " deal.",
      "timestamp": [
        1.52,
        1.78
      ]
    },
    {
      "text": " This",
      "timestamp": [
        1.78,
        2.0
      ]
    },
    {
      "text": " business",
      "timestamp": [
        2.0,
        2.92
      ]
    },
    {
      "text": " is",
      "timestamp": [
        2.92,
        3.06
      ]
    },
    {
      "text": " so",
      "timestamp": [
        3.06,
        3.2
      ]
    },
    {
      "text": " freaking",
      "timestamp": [
        3.2,
        3.44
      ]
    },
    {
      "text": " hard.",
      "timestamp": [
        3.44,
        3.82
      ]
    },
    {
      "text": " Let's",
      "timestamp": [
        3.82,
        3.92
      ]
    },
    {
      "text": " try",
      "timestamp": [
        3.92,
        4.06
      ]
    },
    {
      "text": " this.",
      "timestamp": [
        4.06,
        4.42
      ]
    },
    {
      "text": " And",
      "timestamp": [
        4.42,
        4.6
      ]
    },
    {
      "text": " in",
      "timestamp": [
        4.6,
        5.3
      ]
    },
    {
      "text": " the",
      "timestamp": [
        5.3,
        5.44
      ]
    },
    {
      "text": " end,",
      "timestamp": [
        5.44,
        5.66
      ]
    },
    {
      "text": " I",
      "timestamp": [
        5.66,
        5.76
      ]
    },
    {
      "text": " don't",
      "timestamp": [
        5.76,
        5.86
      ]
    },
    {
      "text": " think",
      "timestamp": [
        5.86,
        5.94
      ]
    },
    {
      "text": " it",
      "timestamp": [
        5.94,
        6.02
      ]
    },
    {
      "text": " ever",
      "timestamp": [
        6.02,
        6.16
      ]
    },
    {
      "text": " ended",
      "timestamp": [
        6.16,
        6.32
      ]
    },
    {
      "text": " up",
      "timestamp": [
        6.32,
        6.48
      ]
    },
    {
      "text": " working.",
      "timestamp": [
        6.48,
        6.96
      ]
    },
    {
      "text": " But",
      "timestamp": [
        6.96,
        7.06
      ]
    },
    {
      "text": " you",
      "timestamp": [
        7.06,
        7.14
      ]
    },
    {
      "text": " got",
      "timestamp": [
        7.14,
        7.24
      ]
    },
    {
      "text": " the",
      "timestamp": [
        7.24,
        7.34
      ]
    },
    {
      "text": " customers.",
      "timestamp": [
        7.34,
        7.82
      ]
    },
    {
      "text": " And",
      "timestamp": [
        7.82,
        8.06
      ]
    },
    {
      "text": " that's",
      "timestamp": [
        8.06,
        8.24
      ]
    },
    {
      "text": " the",
      "timestamp": [
        8.24,
        8.3
      ]
    },
    {
      "text": " point",
      "timestamp": [
        8.3,
        8.5
      ]
    },
    {
      "text": " of",
      "timestamp": [
        8.5,
        8.6
      ]
    },
    {
      "text": " the",
      "timestamp": [
        8.6,
        8.64
      ]
    },
    {
      "text": " episode.",
      "timestamp": [
        8.64,
        9.36
      ]
    },
    {
      "text": " Yeah,",
      "timestamp": [
        9.36,
        9.58
      ]
    },
    {
      "text": " exactly.",
      "timestamp": [
        9.58,
        10.72
      ]
    },
    {
      "text": " The",
      "timestamp": [
        10.72,
        11.8
      ]
    },
    {
      "text": " subsequent",
      "timestamp": [
        11.8,
        12.12
      ]
    },
    {
      "text": " failure",
      "timestamp": [
        12.12,
        12.6
      ]
    },
    {
      "text": " is",
      "timestamp": [
        12.6,
        12.9
      ]
    },
    {
      "text": " irrelevant,",
      "timestamp": [
        12.9,
        13.38
      ]
    },
    {
      "text": " okay?",
      "timestamp": [
        13.38,
        13.84
      ]
    },
    {
      "text": " Yeah.",
      "timestamp": [
        13.84,
        14.68
      ]
    },
    {
      "text": " All",
      "timestamp": [
        14.68,
        22.94
      ]
    },
    {
      "text": " right,",
      "timestamp": [
        22.94,
        23.5
      ]
    },
    {
      "text": " we're",
      "timestamp": [
        23.5,
        23.64
      ]
    },
    {
      "text": " live.",
      "timestamp": [
        23.64,
        24.04
      ]
    },
    {
      "text": " Sean,",
      "timestamp": [
        24.04,
        24.4
      ]
    },
    {
      "text": " we're",
      "timestamp": [
        24.4,
        24.54
      ]
    },
    {
      "text": " going",
      "timestamp": [
        24.54,
        24.58
      ]
    },
    {
      "text": " to",
      "timestamp": [
        24.58,
        24.6
      ]
    },
    {
      "text": " do",
      "timestamp": [
        24.6,
        24.74
      ]
    },
    {
      "text": " a",
      "timestamp": [
        24.74,
        24.8
      ]
    },
    {
      "text": " Q",
      "timestamp": [
        24.8,
        24.96
      ]
    },
    {
      "text": "&A",
      "timestamp": [
        24.96,
        25.12
      ]
    },
    {
      "text": " session",
      "timestamp": [
        25.12,
        25.36
      ]
    },
    {
      "text": " today.",
      "timestamp": [
        25.36,
        25.78
      ]
    },
    {
      "text": " We",
      "timestamp": [
        25.78,
        26.12
      ]
    },
    {
      "text": " got",
      "timestamp": [
        26.08,
        26.22
      ]
    },
    {
      "text": " a",
      "timestamp": [
        26.22,
        26.32
      ]
    },
    {
      "text": " bunch",
      "timestamp": [
        26.32,
        26.46
      ]
    },
    {
      "text": " of",
      "timestamp": [
        26.46,
        26.52
      ]
    },
    {
      "text": " questions.",
      "timestamp": [
        26.52,
        26.84
      ]
    },
    {
      "text": " I",
      "timestamp": [
        26.84,
        27.14
      ]
    },
    {
      "text": " have",
      "timestamp": [
        27.14,
        27.3
      ]
    },
    {
      "text": " a",
      "timestamp": [
        27.3,
        27.4
      ]
    },
    {
      "text": " feeling",
      "timestamp": [
        27.4,
        27.58
      ]
    },
    {
      "text": " we're",
      "timestamp": [
        27.58,
        27.76
      ]
    },
    {
      "text": " going",
      "timestamp": [
        27.76,
        27.82
      ]
    },
    {
      "text": " to",
      "timestamp": [
        27.82,
        27.84
      ]
    },
    {
      "text": " spend",
      "timestamp": [
        27.84,
        28.0
      ]
    },
    {
      "text": " most",
      "timestamp": [
        28.0,
        28.26
      ]
    },
    {
      "text": " of",
      "timestamp": [
        28.26,
        28.32
      ]
    },
    {
      "text": " the",
      "timestamp": [
        28.32,
        28.38
      ]
    },
    {
      "text": " time",
      "timestamp": [
        28.38,
        28.58
      ]
    },
    {
      "text": " on",
      "timestamp": [
        28.58,
        28.7
      ]
    },
    {
      "text": " one",
      "timestamp": [
        28.7,
        28.86
      ]
    },
    {
      "text": " question.",
      "timestamp": [
        28.86,
        29.36
      ]
    },
    {
      "text": " But",
      "timestamp": [
        29.36,
        29.82
      ]
    },
    {
      "text": " the",
      "timestamp": [
        29.82,
        30.08
      ]
    },
    {
      "text": " question",
      "timestamp": [
        30.08,
        30.32
      ]
    },
    {
      "text": " is,",
      "timestamp": [
        30.32,
        30.62
      ]
    },
    {
      "text": " how",
      "timestamp": [
        30.62,
        30.76
      ]
    },
    {
      "text": " did",
      "timestamp": [
        30.76,
        30.86
      ]
    }
  ]
}
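
Because the speakers list is word-level, a readable dialogue usually requires merging consecutive words from the same speaker. The following is a minimal sketch of one way to do that, assuming the entry shape shown in the sample (speaker, text, start, end); the function name speaker_turns is illustrative and not part of the API.

```python
from itertools import groupby


def speaker_turns(words):
    """Merge consecutive word entries by the same speaker into turns."""
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        group = list(group)
        turns.append({
            "speaker": speaker,
            # Each word already carries its own leading space.
            "text": "".join(w["text"] for w in group).strip(),
            "start": group[0]["start"],
            "end": group[-1]["end"],
        })
    return turns


# First three entries of the "speakers" list from the sample above.
sample = [
    {"speaker": "SPEAKER_03", "text": " And", "start": 0.0, "end": 0.24},
    {"speaker": "SPEAKER_03", "text": " then", "start": 0.24, "end": 0.4},
    {"speaker": "SPEAKER_03", "text": " we", "start": 0.4, "end": 0.52},
]
turns = speaker_turns(sample)
```

groupby only merges adjacent entries, so a speaker who talks, is interrupted, and talks again produces two separate turns, which preserves the order of the conversation.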