Hi folks! I just came up with something, no idea if it’s good, but you be the judge:

TL;DR: Can we make a browser extension or something that gives us a button to copy an entire comment chain with crucial, niche advice into a Lemmy post, or is that off the table since the API went down?

I often google things for work and hobbies: code snippets, log entries, ways to make my insane Docker setups work. For example, that’s how I got a Lemmy instance working in Docker.

Very often, I end up on Reddit. If the post or comment in question is helpful, I’d like to upvote it or ask a follow-up question. For that, I still need my Reddit account.

But the “praise the helpful comment” part I could also do here. Some people link to the comment/post, which also drives traffic back to Reddit (which I don’t like).

So I’d probably just copy the post (and/or the comment chain in question) into a new post on Lemmy.

It would go a little like this:

  • User1: “Post describing a topic”
  • User2: Helpful comment no. 1
  • User1: Follow-up question
  • User2: Helpful comment no. 2

For that, I’d need some kind of automation. Since the API is gone, I don’t know if that’s still possible, but an option to “copy the entire comment chain up to this comment” would definitely be awesome.
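From what I can tell, the data itself is still reachable without the official API: appending “.json” to a Reddit permalink returns the thread as JSON. So the “copy this chain” step might look roughly like the untested sketch below (the example link, the output layout and the rate-limit behaviour are my assumptions, not something I’ve verified):

```python
# Untested sketch: fetch a Reddit comment permalink as JSON and rebuild the
# chain of comments leading up to it as simple Markdown. Assumes the public
# ".json" endpoint still works without authentication; Reddit may rate-limit
# or block this.
import requests

HEADERS = {"User-Agent": "lemmy-import-sketch/0.1"}  # identify the script politely

def fetch_chain(comment_permalink: str, context: int = 8) -> str:
    url = comment_permalink.rstrip("/") + ".json"
    data = requests.get(url, headers=HEADERS,
                        params={"context": context}, timeout=30).json()

    # data[0] holds the post, data[1] the comment listing starting at an ancestor
    post = data[0]["data"]["children"][0]["data"]
    lines = [f"**{post['author']}**: {post['title']}", "", post.get("selftext", ""), ""]

    # Follow the first reply at each level; with ?context=N this is normally
    # the direct chain down to the linked comment.
    node = data[1]["data"]["children"][0]
    while node and node["kind"] == "t1":  # "t1" = comment
        c = node["data"]
        lines += [f"**{c['author']}**:", "", c["body"], ""]
        replies = c.get("replies")
        has_children = isinstance(replies, dict) and replies["data"]["children"]
        node = replies["data"]["children"][0] if has_children else None
    return "\n".join(lines)

# Hypothetical usage:
# print(fetch_chain("https://old.reddit.com/r/selfhosted/comments/abc123/some_title/def456"))
```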

Feel free to tell me otherwise.

Edit: In case this isn’t obvious: it would accumulate the most helpful stuff from Reddit over here without it being blindly crossposted by bots, it would push Google results (because it’s niche!), and it’s most likely not a copyright issue, because it’s so little content that it should fly under the radar.

  • RBG@discuss.tchncs.de · 1 year ago

    I sympathise with your disclaimer down there, but you and I both know that if someone automates this for users, someone else will add it to a bot and overdo it. It’s the way of the world.

  • LvxferreM · 1 year ago

    I’m no coder, but I think this could work. I picture something like this:

    1. Create a Lemmy comm. Let’s call it c/redditimports.
    2. The only user allowed to post in c/redditimports is a bot; let’s call it u/Ribbit (Reddit Import Bot, Busy Importing Threads). However, everyone can comment there.
    3. There’s always one pinned thread in c/redditimports. In that thread, humans can comment which Reddit threads + comment chains they want imported to Lemmy. They do it by pinging u/Ribbit and providing links to 2+ comments in the same thread.
    4. When u/Ribbit is pinged, it checks whether someone provided it Reddit comment links. If they’re valid, u/Ribbit creates a new post in c/redditimports and fills it with content from the Reddit post, the directly linked Reddit comments, their parents, grandparents, etc., recursively (see the rough sketch below).
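    To make step 4 a bit more concrete, a rough, untested sketch of just the link-checking part could look like this. Every name here is a placeholder; the actual fetching from Reddit and posting through the Lemmy API would come after it.

    ```python
    # Placeholder sketch of u/Ribbit's first job: given the body of the comment
    # that pinged it, pull out the Reddit comment links and check that they all
    # point into the same thread (step 3 asked for 2+ comments, same thread).
    import re
    from urllib.parse import urlparse

    # Matches permalinks like /r/<sub>/comments/<post_id>/<slug>/<comment_id>
    PERMALINK = re.compile(r"/r/([^/]+)/comments/([a-z0-9]+)/[^/]+/([a-z0-9]+)")

    def extract_links(comment_body: str) -> list[tuple[str, str, str]]:
        """Return (subreddit, post_id, comment_id) for every Reddit comment link found."""
        found = []
        for word in comment_body.split():
            url = urlparse(word)
            if "reddit.com" not in url.netloc:
                continue
            match = PERMALINK.search(url.path)
            if match:
                found.append(match.groups())
        return found

    def request_is_valid(links: list[tuple[str, str, str]]) -> bool:
        """Two or more comments, all belonging to the same Reddit post."""
        return len(links) >= 2 and len({post_id for _, post_id, _ in links}) == 1

    # If valid, the bot would fetch each linked comment plus its parents,
    # grandparents and so on, and create one post in c/redditimports with all of it.
    ```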

    Now here’s the hard part: someone actually coding the bot. Preferably in a way that Reddit can’t tell apart from an actual human being, as Reddit Inc. and Greedy Pigboy will certainly not like it.


    Also, don’t worry about lowering Reddit dependency. Bots there will do it for us; the leftovers of the migration are already complaining about it.

    • GregorGizeh@lemmy.zip · 1 year ago

      We already have something extremely similar: a bot instance that reposts content from subs that users request. In reality it’s incredibly useless and really spams the shit out of my feed. It was the first user I blocked.

      There is not much use in third-hand content that suggests a discussion while it isn’t actually being discussed here. Now don’t get me wrong, I’d love for Reddit to up and die, but realistically that isn’t happening yet, and it won’t happen by artificially creating zombie repost content on Lemmy.

      I’d rather we had more content creators providing valuable OC.

      • LvxferreM · 1 year ago

        An important difference is that the bot you’re talking about mirrors whole subreddits, regardless of whether people interact with it, while my suggestion is a bot that copies specific threads upon user request. As such, the latter would generate considerably less noise, little enough to keep it contained to a single comm.

        There is not much use in third-hand content that suggests a discussion while it isn’t actually being discussed here.

        OP proposed a use: building a knowledge base here that could attract other users.

        I’d rather we had more content creators providing valuable OC.

        I agree with you, but I don’t think we need to choose one or the other.

  • ninjan@lemmy.mildgrim.com · 1 year ago

    This could of course be done with a browser extension like “webscraper”; with some tweaking you could likely get the data you want without much issue. That could then be packaged into a new extension made specifically for scraping a Reddit post you find and pushing it to Lemmy. The main downside I see is that all the comments would come from the same user, so you’d likely want to set up a dedicated account for those posts, like “reddit-scraper” or some such.
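    For the “push it to Lemmy” half, whatever does the scraping just needs to talk to the instance’s HTTP API as that dedicated account. Something roughly like the sketch below; the instance, credentials and community are made up, and the exact endpoints and auth style can differ between Lemmy versions, so check your instance’s API docs.

    ```python
    # Rough sketch: log in as a dedicated import account (e.g. "reddit-scraper")
    # and create a post via the Lemmy HTTP API. Instance, credentials and
    # community are placeholders; endpoint details may differ between versions.
    import requests

    INSTANCE = "https://lemmy.example"            # hypothetical instance
    USERNAME, PASSWORD = "reddit-scraper", "***"  # the dedicated account

    def login() -> str:
        """Log in and return the account's JWT."""
        r = requests.post(f"{INSTANCE}/api/v3/user/login",
                          json={"username_or_email": USERNAME, "password": PASSWORD},
                          timeout=30)
        r.raise_for_status()
        return r.json()["jwt"]

    def create_post(jwt: str, community_id: int, title: str, body: str) -> None:
        """Create the imported post; the body should credit the original authors."""
        r = requests.post(f"{INSTANCE}/api/v3/post",
                          headers={"Authorization": f"Bearer {jwt}"},  # 0.19+ style auth
                          json={"community_id": community_id, "name": title, "body": body},
                          timeout=30)
        r.raise_for_status()

    # Hypothetical usage:
    # create_post(login(), community_id=1234, title="[Imported] …", body=markdown_from_reddit)
    ```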

    If I didn’t have two toddlers and a 7-year-old, I’d love to take a stab at it. But what little free time I have isn’t enough to make any proper progress on something like this.

    • Haui@discuss.tchncs.de (OP) · 1 year ago

      I agree completely. I would argue that we don’t need the comments at all; a post laying out the initial comment chain would be enough. Like I said, “user1” and so on, but with the real usernames.

      The reason I’d like to do it this way (without comments, but preserving the original posters’ names) is twofold: it’s not pretentious (the way reposting everything as comments from a single account would be), and it honors the original posters.

  • density@kbin.social · 1 year ago

    I haven’t checked on its status, but you should check out the Archive Team’s Reddit project.

    I made a post a few months ago with a description. Suffice to say, Archive Team is your best hope for mass scraping.

    For personal use, you can use a web clipper extension (if you want to convert to Markdown) or one that archives complete pages you visit as local files. If you’re also willing to install software on the computer, there are tools that will archive your complete history and bookmarks, so you can go back and get whatever URLs you’ve previously visited (assuming you saved them).

    As for cloning the discussion to the fediverse, there were a few projects trying to achieve this, but as far as I’m aware none really got off the ground. Have a search on github.com, gitlab.com and codeberg.org; most of them were based out of one of those sites.