A new study from the Columbia Journalism Review showed that AI search engines and chatbots, such as OpenAI’s ChatGPT Search, Perplexity, DeepSeek Search, Microsoft Copilot, Grok and Google’s Gemini, are just wrong, way too often.

  • Patch@feddit.uk
    21 hours ago

    It’s a real issue. A strong use case for LLM search engines is providing summaries which combine lots of facts that would take some time to compile through searching the old-fashioned way. But if it’s only 90% accurate and 10% hallucinated bullshit, it becomes very difficult to pick out the bullshit from the truth.

    The other day I asked Copilot to provide an overview of a particular industrial sector in my area. It produced a briefing that was 90% concise, accurate, readable and informative, and 10% complete nonsense. It hallucinated an industrial estate that doesn’t exist and a whole government programme that doesn’t exist, it talked about a scheme that went defunct 20 years ago as if it were still current, etc. If it weren’t for the fact that I was already very familiar with the subject, I might not have caught it. Anyone actually relying on that for useful work is in serious danger of making a complete tit of themselves.

    • OhVenus_Baby@lemmy.ml
      19 hours ago

      Copilot sucks and I totally understand the POV. I stick with GPT and Mixtral. I don’t think they’re going anywhere anytime soon, but they need significant actual refinement.