I didn’t believe that reddit would “undelete” comments. Well, now I know that it’s true.
I deleted everything (posts, comments) using Shreddit a few days ago (Shreddit edits each comment to Lorem ipsum text, then deletes it). Yesterday I noticed some were not deleted. Running Shreddit again didn’t work because it didn’t detect them, so I wrote it off as a small bug.
So I deleted every comment left manually, editing and deleting them. This was yesterday.
Now it’s the next day and some comments are back online. But this time Shreddit was able to delete them.
Feel free to check my profile today and come back in a few days to see if comments are coming back (I’ll monitor this closely):
https://www.reddit.com/user/Kraftingg/comments/
I’m glad we can have a thriving platform on the fediverse now, it feels more like home here!
Edit: some comments might be from subreddits that reopened after the blackout. So maybe I’m not totally right after all.
I like the solution someone gave: instead of deleting the comments, we rewrite everything with garbage, because garbage data is worse than no data.
Here’s the link: https://lemmy.world/comment/286942
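For anyone who wants to script the overwrite idea, here’s a minimal sketch in Python. It assumes each comment is an object with an `edit(text)` method (PRAW’s comment objects have one). `random_garbage` and `overwrite_all` are hypothetical helper names made up for this example, not part of Shreddit or any library.

```python
import random
import string


def random_garbage(length=80, rng=random):
    """Return a string of random lowercase letters and spaces."""
    alphabet = string.ascii_lowercase + " "
    return "".join(rng.choice(alphabet) for _ in range(length))


def overwrite_all(comments, length=80):
    """Overwrite each comment's body with garbage instead of deleting it.

    `comments` is any iterable of objects exposing edit(text)
    (an assumption of this sketch). Returns how many were overwritten.
    """
    count = 0
    for comment in comments:
        comment.edit(random_garbage(length))
        count += 1
    return count
```

With PRAW you would probably pass in a listing such as `reddit.user.me().comments.new(limit=None)`, but that only covers whatever the listing actually returns.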
Remember, though, that most automated solutions can’t overcome the 1000 index limit, even when overwriting. Even doing it manually may not do the job.
Is this a daily limit, or a request limit? Because we could just split the changes up if necessary.
Neither. It’s an indexing limit. Basically, at best you can only see the 1000 most recent posts, the 1000 most upvoted posts, and the 1000 most downvoted posts in a sub. (There may also be overlap, so the total number of unique posts through all three methods may be less than 3000.) So you could do part of the job on different days, have others help you by splitting the requests up, etc. None of it would help bypass that limit. It’s like a limit on what you can see in the table of contents, except that the book also has no page numbers, so you can’t get to a specific page unless you either find it in the TOC or have memorized its 19-digit access number.
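To make the overlap point concrete, here’s a toy model in Python. It is only a sketch of the idea described above, not reddit’s actual API or indexing code: posts are plain `(post_id, created_time, score)` tuples, and `reachable_ids` is a hypothetical name for this example.

```python
import random


def reachable_ids(posts, cap=1000):
    """Return the ids visible through any of three capped listings.

    posts: list of (post_id, created_time, score) tuples.
    cap:   how many items each listing exposes (1000 in the thread).
    """
    newest = sorted(posts, key=lambda p: p[1], reverse=True)[:cap]
    most_upvoted = sorted(posts, key=lambda p: p[2], reverse=True)[:cap]
    most_downvoted = sorted(posts, key=lambda p: p[2])[:cap]
    # A set drops duplicates, so overlap between listings shrinks the total.
    return {p[0] for p in newest + most_upvoted + most_downvoted}


# Made-up data: 5000 posts with random scores.
rng = random.Random(0)
posts = [(i, i, rng.randint(-50, 500)) for i in range(5000)]
visible = reachable_ids(posts)
print(len(visible), "of", len(posts), "posts are reachable")
```

However you split the work up, anything outside that set never shows up in a listing, so a script walking the listings never gets a chance to touch it.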
I wrote about how to overcome it (see https://kbin.social/m/RedditMigration/t/65260/PSA-Here-s-exactly-what-to-do-if-you-hit-the ) but this only works for the comments and posts of the past. Now that pushshift was shut down we won’t have access to such data going forward.
Editing comments doesn’t devalue the data for reddit, since they still have all the original data. It’s only a problem for people trying to scrape the data from the public UI, who are exactly the people reddit wants to charge big bucks for API access, so idk if this is hurting them in the way people seem to think.
The word is that reddit deletes are soft deletes, with a copy still saved in reddit’s database (but not accessible to anyone outside of reddit). However, the same word is that overwriting does in fact destroy reddit’s copy of the original data.
As for people who want to get my posts and comments - this is exactly why I saved a copy of everything before overwriting. They can still find it, they will just have to scrape lemmy/kbin - or use the lemmy or kbin API - to get at it.
Also, the internet archive (which is registered as an actual library iirc) has a copy of the pushshift torrents covering all reddit posts and comments from 2005 to March 2023. So the librarians and historians who want to research this stuff will be fine.