I’m not super familiar with the Archive Team - what distinguishes this specific archiving effort from the dataset that PushShift archived? Is this primarily focused on archiving media specifically (video, images), or comments/submissions from the period since PushShift closed, or everything from 2005 onward?
Bump. We have the PushShift archive—what is this project archiving that we don’t already have?
This is such an awesome project, and it addresses something I had been worried about with Reddit dying a slow death. Thank you for the post and for bringing awareness to the project!
Only 15 million items left out of over 11.5 billion!
I run it using Docker on my Unraid server. It is available in Community Applications, so it is easy to set up.
I’m now seeing reports over on [email protected] that people who deleted all the comments from their accounts - even those who did it years ago, not just in the past few weeks out of protest - are having all their comments reappear. This apparently also includes comments that were overwritten with edits.
Scummy behaviour from Reddit, but a potential boon for archivists. People who are running backups or maintaining archives of Reddit comments might want to take this opportunity to re-check historical deleted comments to see if they can be collected now, in this remaining window of API accessibility.
If people feel like helping, I am currently writing a Reddit-to-Lemmy mirroring app.