• RobotToaster@mander.xyz · 28 days ago

    How can it be that bad?

    I’ve used Zoom’s AI transcriptions for far less mission-critical stuff, and it’s generally fine (I still wouldn’t trust it for medical purposes).

    • huginn@feddit.it · 28 days ago

      Zoom AI transcriptions also make things up.

      That’s the point: they’re hallucination engines. They pattern-match and fill holes by design. It doesn’t matter if the match isn’t perfect; the model will just patch it over with nonsense instead.

    • Grimy@lemmy.world · 28 days ago (edited)

      Whisper is known to hallucinate during long stretches of silence, but most of the examples in the article are more likely due to poor audio quality.

      I use Whisper quite a bit, and it will fumble a word here or there, but never to the extent shown in the article.
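
      For reference, here’s a minimal sketch of how those silence-induced hallucinations can be filtered out with the openai-whisper Python package. The file name and the threshold values are illustrative assumptions, not anything taken from this thread or the article:

      ```python
      # Rough sketch: drop Whisper segments that look like hallucinations over silence.
      import whisper

      model = whisper.load_model("base")        # any openai-whisper model size works
      result = model.transcribe("meeting.wav")  # hypothetical input file

      kept = []
      for seg in result["segments"]:
          # Whisper reports a "no speech" probability and an average log-probability
          # per segment; text scored as probable silence with low confidence is a
          # common hallucination pattern.
          if seg["no_speech_prob"] > 0.6 and seg["avg_logprob"] < -1.0:
              continue  # likely filler invented over a silent stretch
          kept.append(seg["text"].strip())

      print(" ".join(kept))
      ```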

      • QuadratureSurfer@lemmy.world · 28 days ago

        Same here. I’d say it’s way better than most other transcription tools I’ve used, but it does need to be monitored to catch when it starts going off the rails.

    • ElPussyKangaroo@lemmy.world · 28 days ago

      It’s not the transcripts themselves that are the issue here; it’s that the model then interprets those transcripts to give out information.