Hey everyone, you may have noticed that some of us have been raising alarms about the number of spam accounts being created on insufficiently protected instances.

Since I wanted to get ahead of this before we’re shoulder-deep in spam, I developed a small service that parses the Lemmy Fediverse Observer and retrieves the instances which are suspicious enough to block.

The Overseer provides a fully documented REST API which you can use to retrieve the instances in 3 different formats: one with all the info, one with just the names, and one as a CSV you can copy-paste into your defederation setting. You can even adjust the level of suspicion you want to apply.
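
To make the "level of suspicion" a bit more concrete: the threshold quoted further down this thread (`ACTIVITY_SUSPICION = 20`) flags an instance when it has that many registered users per local post+comment. The snippet below is only a minimal sketch of that ratio check. The `InstanceStats` class, its field names, the `>=` comparison and the handling of zero activity are all assumptions for illustration, not the service's actual code.

```python
from dataclasses import dataclass

# Threshold quoted in this thread: registered users per local post+comment.
ACTIVITY_SUSPICION = 20


@dataclass
class InstanceStats:
    # Hypothetical field names, for illustration only.
    domain: str
    users: int
    posts: int
    comments: int


def is_suspicious(stats: InstanceStats, threshold: float = ACTIVITY_SUSPICION) -> bool:
    """Return True when users per local post+comment reaches the threshold."""
    activity = stats.posts + stats.comments
    # Assumption: an instance with no activity is treated as having 1 item,
    # which avoids dividing by zero.
    ratio = stats.users / max(activity, 1)
    return ratio >= threshold
```

Lowering `threshold` makes the check stricter (more instances get flagged), and raising it makes it more lenient.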

Not only that, I also developed a Python script which you can edit and run to automatically update your defederation list. You can set that baby to run on a daily schedule, and it will make sure that any new suspicious instances are caught and that any servers which have cleaned up their spam accounts are restored.
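
The daily update described above boils down to rebuilding the blocklist from two sources: the blocks you maintain by hand and whatever is currently reported as suspicious. Here is a rough sketch of that logic only. The function names and the idea of keeping a separate manual list are assumptions for illustration, and the sketch does not fetch anything or talk to Lemmy.

```python
def merge_blocklist(manual_blocks, suspicious):
    """Build the desired blocklist: manual blocks are always kept,
    suspicious instances are included only while they are still reported."""
    return sorted(set(manual_blocks) | set(suspicious))


def diff_blocklist(current, desired):
    """Return (to_block, to_unblock) when moving from current to desired."""
    current, desired = set(current), set(desired)
    to_block = sorted(desired - current)
    to_unblock = sorted(current - desired)
    return to_block, to_unblock
```

Because the list is rebuilt on every run, an instance that drops off the suspicious list is unblocked on the next run unless it is also in your manual list.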

I plan to improve this service further. Feel free to send me ideas and PRs.

  • afatparakeet@beehaw.org
    1 year ago

    This is awesome, would you be open to contributions outside the realm of spam prevention?

    I’ve been working on some scripts/APIs to pre-search communities from federated instances so they immediately show up in an instance’s search. I was also thinking about setting up a bot account to auto-subscribe.

    Seems like this could fit with the whole overseer/curation theme. Would you agree or nah?

    • db0@lemmy.dbzer0.com OP M
      1 year ago

      I’m always up for collaboration and PRs. I’m adding some cool new capabilities right now, so once those are in you’ll be able to build on it much more easily.

  • This sounds cool! It’s in a similar vein to what I have been trying to do on Lemmyverse: determine the trust of federated instances based on trust lists from other instances, along with instance stats over time. I’m starting to collect stats like instance user count over time, so I can potentially export lists of instances with a score for each one, which could be read automatically by your defederation scripts.

    I like your threshold:

    # If there's this many registered users per local post+comments, this site will be considered suspicious
    ACTIVITY_SUSPICION = 20
    

    I may steal some of your code to build out my scoring algo too :)

  • sinnerdotbin@lemmy.ca
    1 year ago

    Cool. I was wondering when someone would implement something like an email RBL for this.

    I’m not finding any info on how The Overseer deems instances suspicious, or on what mechanisms there are for reporting/disputing. How are instances scored?