Half of LLM users (49%) think the models they use are smarter than they are, including 26% who think their LLMs are “a lot smarter.” Another 18% think LLMs are as smart as they are. Here are some of the other attributes they see:

  • Confident: 57% say the main LLM they use seems to act in a confident way.
  • Reasoning: 39% say the main LLM they use shows the capacity to think and reason at least some of the time.
  • Sense of humor: 32% say their main LLM seems to have a sense of humor.
  • Morals: 25% say their main model acts like it makes moral judgments about right and wrong at least sometimes.
  • Sarcasm: 17% say their prime LLM seems to respond sarcastically.
  • Sad: 11% say the main model they use seems to express sadness, while 24% say that model also expresses hope.
mindbleach · 21 hours ago

    That kitchen-sink definition is degreeless. You’re drawing a line so distant and steep that every Philosophy 101 question gets a clear answer and that answer is “nope.” Brain in a jar? No awareness. Chinese room? Not self-directed. This may overdefine thought to such an extent that being wrong doesn’t count. Like if someone has to think twice, the first time was something else.

    The second time might not count either, depending on how hard we examine “reactive pattern recognition.” Only an explanation of consciousness in terms of unconscious events could possibly explain consciousness.

    Thinking is the ability to reason about things. Concrete tools, abstract concepts, whatever. It’s all the same process. It differs considerably from person to person. We flat-out do not understand it well enough to pin down how it happens. We have to infer that it has happened, from observed results, the same way using a calculator demonstrates that it’s doing math.

    LLMs occasionally demonstrate that they’re doing thought. The context by which they pick the next word can require reasoning. As a concrete example, they can be given an answer that is wrong, figure it must be a joke, and deliberately make it wrong-er. That’s evaluation and adaptation. The model, at runtime, spotted bullshit and inferred a reason for bullshit. Failure modes that merely satisfy grammar rules include trying to justify it anyway, or “Yes, by which I mean no.”