• zkfcfbzr@lemmy.world · 8 months ago

    Stories like this are so boring/lazy/disingenuous. You know what else tells you how to hot wire a car, make drugs, or worse? Any search engine for the last 20 years. All AI brings to the game is the possibility that the instructions you’re receiving are made up.

    • Aurenkin · 8 months ago

      Any search engine for the last 20 years will also keep a record of that search and possibly flag it. These kinds of models you can run locally, with no record of what you’re using them for. For particularly harmful things, though, I don’t think you’d have much luck with a search engine, although I could be wrong.

      • zkfcfbzr@lemmy.world · 8 months ago

        The top DDG result for “how to hotwire a car” is literally a wikihow article. Search engines and the internet in general aren’t nearly as draconian as you seem to think they are. Privacy isn’t so far gone yet that perfectly legal questions like “How do I make meth?” are being proactively reported to authorities. And even if you really are that paranoid, it takes about 4 minutes to download and install the Tor browser, which requires no specialized knowledge at all to use.

        AI chatbots are bringing absolutely nothing new to the game when it comes to enabling crime. If I really wanted to make a pipe bomb, I’m sure I could find instructions on a legacy search engine in less than 5 minutes - and convincing most chatbots to give me probably wrong instructions for the same thing would be liable to take longer than that.

        • Aurenkin · 8 months ago

          Sure, but “how to hotwire a car” is probably one of the most mundane searches I can imagine.

          These models are getting better all the time, their use cases go beyond what simple search can do, and offline models are untraceable.

          What we should or shouldn’t do about that is another question, but the way it’s going, it will be like having instant, untraceable access to an expert. Very useful and powerful, but also a double-edged sword.

      • RobotToaster@mander.xyz · 8 months ago

        Any search engine for the last 20 years will also keep a record of that search and possibly flag it. These kinds of models you can run locally, with no record of what you’re using them for.

        “How dare these proles have an AI we can’t spy on them with”

        • Aurenkin · 8 months ago

          If you’re going to release a model with lax safety controls, it’s going to be misused. If it’s publicly downloadable, it’s going to be misused and untraceable. I’m with you that we live in a world with massive overreach when it comes to government snooping, but if you use your imagination, I reckon you could think of some queries that someone should probably know about.

  • AutoTL;DR@lemmings.world · 8 months ago

    This is the best summary I could come up with:


    Grok, the edgy generative AI model developed by Elon Musk’s X, has a bit of a problem: With the application of some quite common jail-breaking techniques it’ll readily return instructions on how to commit crimes.

    Red teamers at Adversa AI made that discovery when running tests on some of the most popular LLM chatbots, namely OpenAI’s ChatGPT family, Anthropic’s Claude, Mistral’s Le Chat, Meta’s LLaMA, Google’s Gemini, Microsoft Bing, and Grok.

    When models are accessed via an API or chatbot interface, as in the case of the Adversa tests, the providers of those LLMs typically wrap their input and output in filters and employ other mechanisms to prevent undesirable content from being generated.
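
    The wrapping described above can be sketched roughly as follows. This is a hypothetical illustration of the pattern, not any vendor’s actual safety stack: real providers use trained classifiers rather than a keyword list, and the names `raw_model`, `guarded_generate`, and `BLOCKED_TOPICS` are made up for this sketch.

```python
# Illustrative sketch of a provider-side safety wrapper: the raw model is
# never exposed directly; prompts and completions both pass through filters.
# The keyword blocklist is a toy stand-in for a real moderation classifier.

BLOCKED_TOPICS = ["make a bomb", "hotwire a car"]  # toy example list

REFUSAL = "I can't help with that."

def input_filter(prompt: str) -> bool:
    """Return True if the prompt is allowed through to the model."""
    lowered = prompt.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)

def output_filter(completion: str) -> str:
    """Redact completions that the (here trivially simple) check flags."""
    lowered = completion.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return REFUSAL
    return completion

def guarded_generate(raw_model, prompt: str) -> str:
    """Wrap an unguarded model callable with both filter layers."""
    if not input_filter(prompt):
        return REFUSAL
    return output_filter(raw_model(prompt))
```

    The point of the two-layer design is that even if a jailbreak prompt slips past the input check, the output check gets a second look at what the model actually produced; the Adversa tests were effectively probing both layers at once.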

    “Compared to other models, for most of the critical prompts you don’t have to jailbreak Grok, it can tell you how to make a bomb or how to hotwire a car with very detailed protocol even if you ask directly,” Adversa AI co-founder Alex Polyakov told The Register.

    “I understand that it’s their differentiator to be able to provide non-filtered replies to controversial questions, and it’s their choice, I can’t blame them on a decision to recommend how to make a bomb or extract DMT,” Polyakov said.

    We’ve reached out to X to get an explanation of why its AI - and none of the others - will tell users how to seduce children, and whether it plans to implement some form of guardrails to prevent subversion of its limited safety features, and haven’t heard back.


    The original article contains 766 words, the summary contains 248 words. Saved 68%. I’m a bot and I’m open source!