The version is currently 0.1, so it is at a very early stage. Here is the project page.

  • rejoyce@infosec.pub
    1 year ago

    “Inadequate Alignment” reads like just another item on the list here, but to my knowledge the entire field of AI Alignment has been working on this problem for decades. And while they’ve made some really impressive progress, I believe the consensus is that they’re nowhere near solving it - it’s a very difficult problem.

    We can see this in how crafting prompts to get LLMs to do complex tasks is itself quite a complex task (even for tasks they are capable of doing), but at least for now the errors are fairly easy to catch, since you get your reply immediately.

    As LLMs become more integrated into people’s workflows, I wonder when we’ll start seeing more serious incidents due to misaligned behaviors not being caught. Hopefully projects like this will lead to the development of more safeguards before then, but I’m not holding my breath.

    • Capt. AIn@infosec.pubM
      1 year ago

      Good points, and I agree!

      The list currently exists mainly to spark interest and discussion, so it’ll likely change a lot. What you mentioned is also brought up on the Brainstorming page. It seems likely that “Inadequate Alignment” will be removed from the list.