Lvxferre@lemmy.ml

    Machine text generation is such a cool concept that I’m often talking about it. However, I sound like a broken record stating the obvious yet again:

    LLMs can’t self-correct in reasoning tasks for the same reason that pigs can’t self-correct in flying tasks. Saying that an LLM “reasons” is at best metaphorical, and misleading; it’s likely able to handle primitive logic and not much else.

    This picture is especially facepalm-worthy:

    Focus on what the picture implies to be the medical conditions making the patient sick. It’s a bag of four completely different things:

    • Hallucinations: a natural part of the output of those models.
    • Unfaithful reasoning: unreasonable expectation that Mr. Piggy will fly.
    • Flawed “codes” [SIC]: actual issues with the software, which can be addressed by recoding parts of it.
    • “Toxic” [whatever this means] “contents” [SIC]: moral issues with the output.

    The rest of the cycle might as well have been generated by a lorem ipsum generator, an LLM bot (eh), or a kettle of buzzword-vomiting advertisers.

    LLM self-correction errors: “In many cases, intrinsic self-correction causes models to switch from the right answer to the wrong answer.”

    “Self-correction” is just feeding the bot its own output, with extra steps. Of course it’ll go rogue.
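
    To make the point concrete: the whole loop boils down to something like the sketch below. It’s my own rough illustration, not the paper’s code; generate() is a hypothetical stand-in for whatever completion API is being called.

        # Rough sketch of "intrinsic self-correction": the model's own output
        # is fed back to it with a critique prompt. No new information enters
        # the loop. NOTE: generate() is a hypothetical placeholder, not a real API.

        def generate(prompt: str) -> str:
            raise NotImplementedError("plug in your LLM call here")

        def self_correct(question: str, rounds: int = 2) -> str:
            answer = generate(question)
            for _ in range(rounds):
                # The "feedback" is just the model re-reading its own answer,
                # which is how it can talk itself out of a correct one.
                critique_prompt = (
                    f"Question: {question}\n"
                    f"Your previous answer: {answer}\n"
                    "Review your answer for mistakes and give a corrected final answer."
                )
                answer = generate(critique_prompt)
            return answer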

    “Rather than labeling the multi-agent debate as a form of ‘debate’ or ‘critique’, it is more appropriate to perceive it as a means to achieve ‘consistency’ across multiple model generations,” the researchers write.

    Or rather as “culling”. The bots are culling the atypical answers. Except that, given that there are plenty of ways for something to be wrong and only one way to be right, the likelihood that the correct answer is among the atypical ones is huge.
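
    In sketch form, that “consistency” is little more than a majority vote over samples. Again my own illustration under that assumption, not the researchers’ actual setup, with the same hypothetical generate() placeholder as above:

        from collections import Counter

        def generate(prompt: str) -> str:
            # Hypothetical stand-in for an LLM call, as in the previous sketch.
            raise NotImplementedError("plug in your LLM call here")

        def cull_by_consistency(question: str, n_samples: int = 5) -> str:
            # Sample several independent generations of the same question.
            answers = [generate(question) for _ in range(n_samples)]
            # Majority vote: keep the typical answer, cull the atypical ones.
            # If the correct answer happens to be the outlier, it is exactly
            # what gets thrown away.
            most_common_answer, _count = Counter(answers).most_common(1)[0]
            return most_common_answer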


    And, as usual, the comments on HN make me glad that this Lemmy comm exists. HN has great topics but inane comments.