• RightHandOfIkaros@lemmy.world
    English
    arrow-up
    41
    arrow-down
    2
    ·
    11 months ago

    We built a data set of 45 million comments on news articles on the Huffington Post website between January 2013 and February 2015.

    I am no expert but I feel like this is a really bad data set choice for this study.

    • KuroeNekoDemon
      English
      arrow-up
      12
      ·
      11 months ago

      It is. They should’ve used Reddit and Twitter posts/comments from their start to the present to get a more accurate database.

      • awwwyissss@lemm.ee
        English
        arrow-up
        2
        ·
        11 months ago

        Or from the start up until like 2016 when the shills and bots started showing up en masse.

    • MomoTimeToDie
      English
      arrow-up
      7
      arrow-down
      5
      ·
      11 months ago

      It’s just a bad data set for basically anything.

      • sugar_in_your_tea
        English
        arrow-up
        6
        ·
        edit-2
        11 months ago

        Yup, comments on news articles are pure cancer. Comments about news articles can be decent, but they need to be hosted elsewhere.