Speech-to-Text Transcription

This cluster covers discussions of automatic audio transcription with speech-to-text models and tools such as Whisper and Descript, including accuracy problems and transcription errors, available tools, accessibility benefits, and readers' preference for text over audio.

Stable 0.6x • AI & Machine Learning

Comments: 2,309

Topic ID: #2411

Activity Over Time

Comments per year (the chart spans 2007 to 2026; counts are labeled only for 2016 to 2025):

2016: 108
2017: 126
2018: 106
2019: 124
2020: 198
2021: 173
2022: 185
2023: 261
2024: 262
2025: 267

Top Contributors

ghaff (27), braindead_in (21), dylan604 (9), IanCal (7), simonw (7)

Keywords

Otter.ai callgraph.biz STT transcription transcript text speech audio whisper errors automated accuracy human

Sample Comments

pudiklubi • Jun 19, 2025

it was an audio recording, transcribed with speech to text models. there's definitely some errors and words lost. I also tried to emphasize this

richclominson • May 1, 2020

thanks for the feedback. we're trying to implement transcriptions.

radarsat1 • Mar 23, 2024

(Can't edit anymore, but I meant "automatic transcription" above..)

Applemoi • Sep 28, 2024

I sometimes feel like it's a rough transcript of the audio or similar. Because not only is the response apparently natively audio, but I've often seen the text adhere to the audio only as a best effort rather than a full-on accurate representation of the audio.

TheRealPomax • Jul 22, 2024

"[...] by transcript, not waveform".

oluoluoxenfree • Jul 21, 2022

It's a direct transcript that I made using Descript, hadn't considered how odd it would be to read on its own! I'll fix it ha ha

davidw • Mar 14, 2010

Interesting idea. I'd be curious to hear if they turn out a good transcript or whether their lack of understanding of the subject material makes it garbled. I guess with enough people doing it, you can check out the most common transcriptions and go with those...

debt • Feb 2, 2018

Rev works really well, but it's an actual human who transcribes what you say, and it takes a day.

lunixbochs • Jun 9, 2020

It's the transcription, I think this kind of site should allow you to fall back to listening to the audio. Even with high accuracy, if there's no human curation, there will be pretty awkward mistakes every once in a while in the text.

knieveltech • Dec 11, 2017

For those of us that struggle with his cadence of speech, are transcripts available online?