tbh I’ve never seen a Lemmy link when searching for stuff. Is it too small to show up? Or do search engines not index Lemmy instances?
A lot of Fediverse admins are just normal people like you and me with a budget, and disallowing bots and spiders helps save bandwidth, and the budget.
Could it be possible to have one major global instance that aggregates everything so it can be indexed by search engines? Would that work? Or do I not fully understand how federation works?
That would defeat the purpose of federation.
It becomes a central choke point of moderation. Who gets to decide which instances are part of global and which ones aren’t? Because a free-for-all isn’t going to end well. And then you’re back at Reddit.
I wonder if you could have an instance federated to every other instance just for archival purposes, to save the data of every other instance’s posts and comments. Because copies of posts and comments are saved to federated instances, too, right? Or do I understand the tech wrong?
So it could have an admin team but no users, so nobody has to worry about spammers and bots joining that instance to get around defederation rules. Maybe it just has a bot that crawls Lemmy, looking for instances to federate to. Could that work?
You’re describing Meta’s plan but yes that could work.
Godamnit Meta… Lol
I prefer the Internet Archive plan to the Meta plan.
Right, but having a centralised search index thingy is better than none at all. Maybe there could be something where it’s a joint effort from admins from many of the biggest servers, idk if that would work.
Lemmy search already is quite excellent… at least here on lemm.ee, we don’t have many communities but tons of users subscribed to probably about everything on the lemmyverse so the servers have it all.
It might be interesting to team up with something like YaCy: Instances could operate as YaCy peers for everything they have. That is, integrate a p2p search protocol into ActivityPub itself so that also smaller instances can find everything. Ordinary YaCy instances, doing mostly web crawling, can in turn use posts here as interesting starting points.
I just wish lemmy search itself wasn’t broken…
Gotta keep some things that feel like reddit.
All a spider needs is an instance to download everything.
I was worrying about precisely this. I’d be ok with blocking search engines if there was a better way of searching but AFAICT there isn’t federated search of any kind?
Really? I thought they were free and didn’t affect bandwidth.
Any data transit costs money. Both in the data transit itself and in the increased server resources to respond to the web queries in the first place.
Ah, that makes sense. I’m not really familiar with this stuff so I didn’t think it was that intensive lol
bots take resources to serve just like any regular user
They usually only index text though
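For anyone wondering what "disallowing bots and spiders" looks like in practice: it is usually a robots.txt file, and well-behaved crawlers check it before fetching anything. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents and the `lemmy.example` domain are made up for illustration, not taken from any real instance.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt an admin might serve to keep all crawlers out.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical variant that only keeps crawlers away from the API.
BLOCK_API = """\
User-agent: *
Disallow: /api/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    post = "https://lemmy.example/post/1"
    print(is_crawlable(BLOCK_ALL, "Googlebot", post))
    print(is_crawlable(BLOCK_API, "Googlebot", post))
```

Note that robots.txt is only a request. It saves bandwidth with crawlers that honor it and does nothing against ones that don't.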
I’ve seen it a couple of times when searching on DDG.
I’ve seen some when I appended “Lemmy” just like “Reddit”. But it relies on lemmy being in the domain name.
Also I assume even when people click on those results, they don’t get ranked much higher because it’s so many different domains while reddit is just one.
Kagi has a button that lets you search fediverse forums. I haven’t tested it yet though.
Edit: yup, works like a charm!
Adding lemmy does nothing for me, it searches for Lemingrad or some shit.
Most of the originalish content on lemmy is linux related stuff, memes and porn. The latter two are mostly image/video based, so you don’t search for them very frequently or easily. I can see it becoming a very relevant source of info in linux admin and user circles in the future.
I go back to r*ddit sometimes for some local content which is nonexistent on lemmy. I see that the tech related subs are mostly dead there, or at least only shadows of their former selves. E.g. go to r/linux, sort by top all time. In the first 100 results you will barely find anything posted after the exodus.
Yeah, the notion that Lemmy is a Reddit replacement is misguided. It definitely doesn’t have the same Q&A balance Reddit does. It feels a lot more like 90s and early 2000s forums than the large-scale self-service link and customer service churn Reddit encourages.
Which I’m all for. I was never a Reddit guy and I do like it here. But in terms of how bad it is now that Reddit is not happy to host most of the actually useful online content for free… well, that’s a different conversation.
deleted by creator
deleted by creator
You can always add “site:lemmy.world” to your search (remove the quotes). I commonly do that, as well as the same for reddit or stack overflow.
The problem with that is, lemmy.world is only one of many different instances. Too bad there isn’t a way to add a modifier that searches the entire fediverse.
yea i’ve been doing “inurl:lemmy” for that reason
Off the top of my head, that won’t include lemm.ee, sopuli, beehaw, szmer.info, slrpnk.net, sh.itjust.works, or other threadiverse instances like kbin/mbin.
You’d miss instances that don’t use “lemmy” in the URL, but it’s at least a better solution than specifying a single instance.
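One middle ground between a single `site:` and `inurl:lemmy` is to OR several `site:` filters together. A rough sketch, assuming your search engine supports `OR` between `site:` operators (Google does; others may not). The helper name `multi_site_query` is made up, and the domain list is just the instances named in this thread with full domains, so it is nowhere near complete.

```python
# Instances mentioned in this thread with their full domains; extend as needed.
INSTANCES = [
    "lemmy.world",
    "lemm.ee",
    "slrpnk.net",
    "sh.itjust.works",
    "szmer.info",
]


def multi_site_query(terms: str, domains: list[str]) -> str:
    """Build a query that restricts results to any of the given domains."""
    site_filter = " OR ".join(f"site:{domain}" for domain in domains)
    return f"{terms} ({site_filter})"


if __name__ == "__main__":
    print(multi_site_query("pipewire crackling", INSTANCES))
```

The obvious downside is that you have to maintain the list yourself, and very long queries may get truncated or trip bot detection.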
Appending
(intext:“modlog” & “instances” & “docs” & “code” & “join lemmy”)
to your search query will search most instances. Works with Google, Startpage, SearXNG afaik.
Very nice, thanks!
Was able to find this thread:
https://www.google.com/search?q=google+reddit+(intext:"modlog"+%26+"instances"+%26+"docs"+%26+"code"+%26+"join+lemmy")
(Heh, when testing this sanitized URL from the thousand character monster it was before, Google asked me if I was a bot. I think parentheses and stuff make them suspicious.)
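If you want to build that kind of link without hand-escaping it, here is a small sketch using Python's standard `urllib.parse`. The footprint string is the one from the comment above, written with straight quotes; the function name `footprint_search_url` is made up. `urlencode` percent-encodes the parentheses, quotes and colon as well, so the result will look different from the hand-written URL even though it decodes to the same query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Footprint from the comment above: words that appear on most Lemmy pages.
FOOTPRINT = '(intext:"modlog" & "instances" & "docs" & "code" & "join lemmy")'


def footprint_search_url(terms: str, base: str = "https://www.google.com/search") -> str:
    """Return a search URL for terms with the Lemmy footprint appended."""
    query = f"{terms} {FOOTPRINT}"
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    url = footprint_search_url("google reddit")
    print(url)
    # Round trip: decoding the query string gives back the original text.
    print(parse_qs(urlsplit(url).query)["q"][0])
```

Whether other engines treat `&` and `intext:` the same way as Google is something you would have to test per engine.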
One of the major problems with Lemmy is that many posts get deleted and that nukes the comment section (which is where most of the answers will be).
I wish Lemmy handled deleted posts more like Reddit does: the post content should be deleted, but the comments left alone.
Searx will show Lemmy results, at least on some Searx instances.
Twice I have come across links to lemmy, definitely not the norm though.
I’m inclined to think that, due to the nature of the platform, the same content looks constantly duplicated to search engines, which hurts the authoritativeness of each instance and thereby its ranking.