Over just a few months, ChatGPT went from accurately answering a simple math problem 98% of the time to just 2%, study finds::ChatGPT went from answering a simple math problem correctly 98% of the time to just 2% over the course of a few months.

  • xantoxis@lemmy.one · English · ↑ 64 · ↓ 8 · 1 year ago

    Why is “98%” supposed to sound good? We made a computer that can’t do math good

    • Dojan@lemmy.world · English · ↑ 47 · 1 year ago (edited)

      It’s a language model, text prediction. It doesn’t do any counting or reasoning about the preceding text, just completes it with what seems like the most logical conclusion.

      So if enough of the internet had said 1+1=12 it would repeat in kind.
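
      A minimal sketch of the claim above, assuming a deliberately crude "model" that only counts which answer most often follows a prompt in its training text. The corpus, `train`, and `complete` are all hypothetical names made up for illustration; a real LLM is a trained neural network, not a lookup table, so this shows the commenter's intuition rather than how ChatGPT actually works.

```python
from collections import Counter, defaultdict


def train(corpus):
    """Count which answer follows each prompt in the toy corpus."""
    counts = defaultdict(Counter)
    for line in corpus:
        prompt, answer = line.rsplit("=", 1)
        counts[prompt.strip()][answer.strip()] += 1
    return counts


def complete(counts, prompt):
    """Return the most frequently seen continuation, or None if unseen."""
    seen = counts.get(prompt.strip())
    if not seen:
        return None
    return seen.most_common(1)[0][0]


# Hypothetical "internet" where the wrong answer outnumbers the right one.
corpus = ["1+1=12"] * 3 + ["1+1=2"] * 2 + ["2+2=4"]
model = train(corpus)

print(complete(model, "1+1"))  # 12
print(complete(model, "2+2"))  # 4
print(complete(model, "3+3"))  # None
```

      The counter never adds anything; it repeats the majority answer and has nothing to say about a prompt it has not seen.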

      • tony@lemmy.hoyle.me.uk · English · ↑ 6 · 1 year ago

        Someone asked it to list the even prime numbers… it then went on a long rant about how to calculate even primes, listing hundreds of them…

        ChatGPT knows nothing about what it’s saying, only how to put likely sounding words together. I’d use it for a cover letter, or something like that… but for maths… no.
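
        For reference, the even primes can be checked directly. This is a small sketch using plain trial division; the helper `is_prime` and the search bound of 1000 are arbitrary choices for illustration.

```python
def is_prime(n):
    """Trial-division primality test for small non-negative integers."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2  # 2 is the only even number that can be prime
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


# Collect every even prime below an arbitrary bound.
even_primes = [n for n in range(1000) if n % 2 == 0 and is_prime(n)]
print(even_primes)  # [2]
```

        Any other even number is divisible by 2, so the list cannot grow no matter how large the bound is.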

      • kromem@lemmy.world · English · ↑ 2 · 1 year ago

        Not quite.

        Legal Othello board moves by themselves don’t say anything about the board size or rules.

        And yet when Harvard/MIT researchers fed them into a toy GPT model, they found that the network that best predicted legal moves had built an internal representation of the board state and rules.

        Too many people commenting on this topic as armchair experts are confusing training with what results from the training.

        Training on completing text doesn’t mean the end result can’t understand aspects that feed into the original generation of that text, and given a fair bit of research so far, the opposite is almost certainly the case to some degree.

    • themeatbridge@lemmy.world · English · ↑ 18 · 1 year ago

      Reminds me of that West Wing moment when the President and Leo are talking about literacy.

      President Josiah Bartlet: Sweden has a 100% literacy rate, Leo. 100%! How do they do that?

      Leo McGarry: Well, maybe they don’t and they also can’t count.

    • WackyTabbacy42069@reddthat.com · English · ↑ 11 · ↓ 17 · 1 year ago (edited)

      This program was designed to emulate the biological neural net of your brain. Oftentimes we’re nowhere near that good at math just off the top of our heads (we need tools like paper and simplifying formulas). Don’t judge it too harshly for being bad at math; that wasn’t its purpose.

      This lil robot was trained to know facts and communicate via natural language. As far as I’ve interacted with it, it has excelled at this intended task. I think it’s a good bot

      • Veraticus@lib.lgbt · English · ↑ 31 · ↓ 4 · 1 year ago (edited)

        LLMs act nothing like our brains and they aren’t trained on facts.

        LLMs are essentially complicated mathematical equations that ask “what makes the most sense as the next word following this one?” Think autosuggest on your phone taken to the extreme limit.

        They do not think in any sense and have no knowledge or facts internal to themselves. All they do is compose words together.

        And this is also why they’re garbage at math (and frequently lie, and why they can’t “remember” anything). They are simply stringing words together based on their model, not actually thinking. If their model shows that the next word after “one plus two equals” is more likely to be four than three, they will simply answer four.

        • Silinde@lemmy.world · English · ↑ 7 · 1 year ago (edited)

          “LLMs act nothing like our brains and are not neural networks”

          Err, yes they are. You don’t even need to read a paper on the subject, just go straight to the Wikipedia page and it’s right there in the first line. The ‘T’ in GPT is literally Transformer; you’re highly unlikely to find a Transformer model that doesn’t use an ANN at its core.

          Please don’t turn this place into Reddit by spreading misinformation.

        • cyd@lemmy.world · English · ↑ 4 · ↓ 2 · 1 year ago

          “Nothing like our brains” may be too strong. I strongly suspect that much of human reasoning is little different from stringing words together, albeit with more complicated criteria than current LLMs. For example, children learn maths in a rather similar way, based on language and repeated exposure; humans don’t have a built-in maths processor in our brains.

      • jocanib@lemmy.world · English · ↑ 6 · 1 year ago

        “This lil robot was trained to know facts and communicate via natural language.”

        Oh stop it. It does not know what a fact is. It does not understand the question you ask it nor the answer it gives you. It’s a very expensive Magic 8-Ball. It’s worse at maths than a 1980s calculator because it does not know what maths is, let alone how to do it, not because it’s somehow emulating how bad the average person is at maths. Get a grip.

      • xantoxis@lemmy.one · English · ↑ 2 · ↓ 17 · 1 year ago

        Bro I wasn’t looking for a technical explanation. I know how they work. We made computers worse. The thing isn’t even smart enough to say “I wasn’t designed to do math problems, perhaps we should focus on something where I can make up a bunch of research papers out of thin air?”