Yes, I’m certain I could find answers to all these questions via research, but I’m coming here as part of the Reddit diaspora. My guess is that there’s a benefit to others like me to have this discussion.
I can vaguely understand the federation concept, the idea that my account is hosted at an individual Lemmy server and that other servers trust that one to validate my account. What’s the network flow like? I’m posting this to the lemmy.ml /asklemmy community, but I’m composing it on the sh.itjust.works interface. I’m assuming sh.itjust.works hands this over to lemmy.ml. How does my browsing work? Is all of my traffic routed through sh.itjust.works?
Assuming there’s a mass influx of redditors, what does it look like as things fail? I’m assuming some servers can keep up under the load and some can’t. If sh.itjust.works goes down under the load, can I still browse other servers? Or, do those servers think I should have some token from sh.itjust.works, because my cookies say I’m still logged in, and I can’t even do that?
Are there easy mechanisms to allow me to grab my post history?
I’m assuming most (all?) Lemmy servers are hosted in home labs? The idea of Lemmy excites me, but the growth pain that could be coming scares me. Anybody using a CDN in front of their servers? That could be good, but with unconstrained growth, that could be costly, which is very bad.
I can imagine lots of different worst-case scenarios, but I’m curious what those of you who run servers imagine for the best-case scenario? A manageable growth that just gets more vibrant communities, which can’t ever lead to the breadth and variety of Reddit?
Also, for those running servers, have any of you experienced issues during this growth? What scares you?
What’s the network flow like? I’m posting this to the lemmy.ml /asklemmy community, but I’m composing it on the sh.itjust.works interface. I’m assuming sh.itjust.works hands this over to lemmy.ml. How does my browsing work? Is all of my traffic routed through sh.itjust.works?
- You register your account on sh.itjust.works, and that’s where all the info you care about resides. Your list of subscribed communities resides there. When you read a post, it gets fetched out of the db on sh.itjust.works (irrespective of where the home instance for that post’s community is… when you read it, it comes out of the database on your home instance), and when you comment on a post, that gets written to the db on your home instance. Your home instance is a standalone, fully functioning thing.
- When you subscribe to a remote community like this one, you tell your home instance "keep up to date with posts and comments for this community and let me know about them". Your home instance asynchronously gets all those updates while you’re asleep or whatever so it can show them to you out of its local database when you come back. If more users on sh.itjust.works subscribe to the same community… there’s no incremental overhead. Y’all’s instance is ALREADY subscribed to that sub. So other users on your instance can sub to it for free, it’s already in the instance’s database.
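To make the two points above concrete, here is a minimal sketch in Python. It is not Lemmy’s actual code: the `Instance` class, its fields, and the user names are all hypothetical, and it only illustrates the idea that reads come from the local database and that a second local subscriber adds no extra federation work.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical model of one Lemmy instance and its local database."""
    domain: str
    local_posts: dict = field(default_factory=dict)  # community -> list of posts
    followed: set = field(default_factory=set)       # remote communities already followed
    follow_requests_sent: int = 0

    def subscribe(self, user: str, community: str) -> None:
        # Only the first local subscriber triggers a follow request to the
        # community's home instance; later subscribers reuse the same feed.
        if community not in self.followed:
            self.followed.add(community)
            self.follow_requests_sent += 1
            self.local_posts.setdefault(community, [])

    def receive(self, community: str, post: str) -> None:
        # Federation delivers new posts asynchronously into the local db.
        if community in self.followed:
            self.local_posts[community].append(post)

    def read(self, community: str) -> list:
        # Reads never leave the home instance.
        return list(self.local_posts.get(community, []))


home = Instance("sh.itjust.works")
home.subscribe("alice", "asklemmy@lemmy.ml")
home.subscribe("bob", "asklemmy@lemmy.ml")  # no second follow request
home.receive("asklemmy@lemmy.ml", "How does federation work?")
print(home.follow_requests_sent)
print(home.read("asklemmy@lemmy.ml"))
```

Both users read the same locally stored copy, and the instance sent only one follow request for the community.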
Assuming there’s a mass influx of redditors, what does it look like as things fail?
- If lemmy.ml (where this community is homed) falls over from being overloaded or is just broken for whatever reason, your instance is unaffected. You can still read posts and make comments. This community however… is affected. New posts and comments for this community might come through intermittently or not at all for you (and everyone in the lemmyverse) because the community’s home server isn’t working well enough to reliably deliver them over federated replication. You can still read older posts and comments that have already been synced to your home instance, but new ones might not arrive. You might also see weird stuff like being able to see new comments from other sh.itjust.works users on this community, since those get written to your db before getting federated back to the community’s home server. But mostly updates from other instances stop or get unreliable.
- If sh.itjust.works falls over for some reason… well… that sucks for you. You can’t log in or browse anything on it. You can still visit this sub at https://lemmy.ml/c/asklemmy/ as long as lemmy.ml is working and you’ll be able to see the posts and comments that other accounts make. But you’ll be an anonymous read-only browser, you won’t be able to post or comment until sh.itjust.works comes back online (or you make a new account elsewhere and lose all your comment history and subscription list).
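The two failure cases above can be summarized as a small lookup. This is only a sketch of the reasoning in this comment, not anything Lemmy exposes: the function name and the keys of the returned dictionary are made up for illustration.

```python
def what_can_i_do(home_up: bool, community_home_up: bool) -> dict:
    """Hypothetical summary of the failure cases described above."""
    return {
        # Logging in, posting and commenting all go through the home instance.
        "post_and_comment": home_up,
        # Already-synced posts are served from the home instance's database.
        "read_synced_posts_at_home": home_up,
        # New posts from other instances need the community's home instance
        # to be healthy enough to deliver them.
        "receive_new_remote_posts": home_up and community_home_up,
        # Anyone can browse the community's own site without an account.
        "browse_anonymously_at_community_home": community_home_up,
    }


# sh.itjust.works is down, lemmy.ml is up: read-only browsing on lemmy.ml.
print(what_can_i_do(home_up=False, community_home_up=True))
# lemmy.ml is down, sh.itjust.works is up: old posts still readable at home.
print(what_can_i_do(home_up=True, community_home_up=False))
```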
Are there easy mechanisms to allow me to grab my post history?
There’s a github issue for this, but it’s not done yet: https://github.com/LemmyNet/lemmy/issues/506.
I’m assuming most (all?) Lemmy servers are hosted in home labs?
I don’t think that’s a good assumption. lemmy.ml is hosted on OVH, a cloud provider. My home instance on lemmy.world is hosted by admins that run something like a 32 CPU mastodon instance. Most instances with over 100 users are running on some kind of probably modest but “real” cloud instance. The admins are volunteers, but often smart technical folks paying for small but real compute infrastructure.
The idea of Lemmy excites me, but the growth pain that could be coming scares me. Anybody using a CDN in front of their servers? That could be good, but with unconstrained growth, that could be costly, which is very bad.
Anticipating growing pains isn’t wrong, it’s probably gonna happen. But the devs are gonna find and work on the biggest performance problems so that people can viably run bigger instances, and instance admins are gonna run bigger hardware and ask for donations or run patreons to cover the cost. In my opinion, the bigger worry is that Lemmy will fizzle… not that it will spectacularly explode. As long as people join and contribute and are interested, we’ll find a way to improve scalability and performance. The death knell would be if people get bored and leave, but compute capacity won’t be the problem in that scenario.
Thanks. That was an incredibly detailed response that answers the questions I was asking.
Doesn’t the fact that every Lemmy server has a copy of every federated post mean that if Lemmy takes off, only a few people with strong donation feeds can afford to survive?
If there’s an active forum (sub-lemmy?) on a server that has to spin down, the history stays on the remaining active ones, but I assume the only option is forking?
Moderation can only happen on the server hosting a forum, or each server can moderate posts in that server’s db?
Doesn’t the fact that every Lemmy server has a copy of every federated post mean that if Lemmy takes off, only a few people with strong donation feeds can afford to survive?
It’s not precisely true that every Lemmy instance stores every post. A given Lemmy instance will store a given post if and only if:
- A user on the storing instance is “interested” in that post. Being interested has a somewhat complicated definition that I myself don’t fully understand, but two examples of being interested include being subscribed to the community BEFORE the post was made (importantly, the storing instance doesn’t fetch the forever backlog of posts just because someone subscribes… rather… it asks to receive FUTURE posts), or a user searches a post by url.
- Subject to some caching policy. I just learned this today from another comment, and maybe it doesn’t apply to Lemmy, or doesn’t apply yet… but we already know that the storing instance doesn’t fetch historical posts going back forever. It could also decide to forget posts older than a certain time.
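As a rough sketch of the two conditions above, the storage decision could be written as a predicate. This is an assumption-heavy illustration: the real definition of “interested” is more complicated than these two examples, the expiry policy may not exist in Lemmy at all, and every name here (`should_store`, `max_age`, and so on) is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_store(
    post_time: datetime,
    subscribed_since: Optional[datetime],
    fetched_by_url: bool,
    now: datetime,
    max_age: Optional[timedelta] = None,
) -> bool:
    """Return True if this instance would keep a copy of the post."""
    # Condition 1: someone local is "interested". Two examples from above:
    # subscribed BEFORE the post was made, or a user looked it up by URL.
    subscribed_before = subscribed_since is not None and subscribed_since <= post_time
    if not (subscribed_before or fetched_by_url):
        return False
    # Condition 2: an optional (assumed) cache policy that forgets old posts.
    if max_age is not None and now - post_time > max_age:
        return False
    return True


post = datetime(2023, 6, 10)
now = datetime(2023, 6, 15)
print(should_store(post, datetime(2023, 6, 1), False, now))   # subscribed earlier
print(should_store(post, datetime(2023, 6, 12), False, now))  # subscribed too late
print(should_store(post, None, True, now, max_age=timedelta(hours=24)))  # expired
```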
The first of these is most important though, because it means that posts and comments that no one is interested in don’t get shipped around the federated network. And this leads to the property that the size/cost of a Lemmy instance is going to depend on the size of the “active” usage. A single user Lemmy instance subscribing to a handful of communities will always be small and cheap, because it doesn’t subscribe to much content. A bigger Lemmy instance need not scale to the entirety of content in the lemmyverse, but rather to the “active set” of posts and comments its users interact with this month. That could get big, but what the Lemmy devs are saying (sorry no link, I’ve read too many posts lately to remember all my sources) is that user-traffic browsing the local DB of the Lemmy instance is dwarfing the replication load, which is great news because user browsing is much easier to optimize than federated replication.
If there’s an active forum (sub-lemmy?) on a server that has to spin down, the history stays on the remaining active ones, but I assume the only option is forking?
(FYI, the thing you subscribe to is called a community in Lemmy. Some folks say sublemmy, but this is a redditism that isn’t used in the code or official docs. It’s a “community”, which is why the url for a community is hxxp://my.lemmy.social/c/mycommunity. The “c” in the middle stands for community.)
Well, we’ve already talked about caching and expiry. It’s not clear to me that any Lemmy instance other than the one that hosts the community is required to keep the ENTIRE post/comment history (though yeah the active/recent ones will be all over the federated network).
I haven’t lived through a major instance shutdown, maybe an old-timer can weigh in here. Speculating, I’d think there would be 2 options:
- BEFORE the old instance shuts down (or using a third big community like !lemmy@lemmy.ml to coordinate), make a new community on a different server and the mods post telling everyone to subscribe there. The new community would be… well… new. It wouldn’t have the old posts, it would be made from scratch. The only things that would bind it to the old community are the mods that come over, the users that follow them, and the culture.
- Optionally, using database tricks, or a migration tool (that I don’t think currently exists, but could almost certainly be created)… after doing all of the above… someone with direct db access on both instances (aka an admin) could copy the old posts to the new community. This might be a terrible idea for federation reasons, or it might be prohibitively complicated because of the db schema… but it feels to me like it COULD work. I’m not aware of it having been done before, or of any tooling that makes it “easier” for someone who isn’t pretty strong at doing complex data migrations in Postgres.
But the crux of your question is a real concern. If the admins of a major Lemmy instance hosting a bunch of big communities get bored and shut down, or get pissed and take their toys and go home… I believe that event would be fairly disruptive but not an existential threat to the lemmyverse unless it happens frequently and nobody responds by making tools to do community migrations. But it’s my belief that if this happened once or twice people would improve the capabilities around community migration.
Doesn’t the fact that every Lemmy server has a copy of every federated post mean that if Lemmy takes off, only a few people with strong donation feeds can afford to survive?
Yes I’ve seen something like that on mastodon already. Though the caching is scaled by time so you can just say to cache only last 24 hours (or less) which will scale down storage requirements.
Also, didn’t know Ruud was running a .world lemmy instance. Cool!
Hero response right here!
Hi, sorry I’m late to responding, but could you clarify what you mean by
My home instance on lemmy.world is hosted by admins that run something like a 32 CPU mastodon instance.
What do you mean by “my home instance on lemmy.world”?
You are correct, the real problem is if Lemmy will fizzle out and die. I have faced some problems such as screens loading infinitely, sign up mechanisms not responding etc. The best way would be for everyone to host their instance (a small one should be fine for individual/small groups) alongside the bigger servers. I would like to do it myself when I can.
Thanks!
What do you mean by “my home instance on lemmy.world”?
I mean that my account, with the username PriorProject, is hosted on the instance called lemmy.world. So that instance is “home” for my account. When commenting on a community like !asklemmy@lemmy.ml, the home instance for that community is lemmy.ml, and I’m relying on federation between those two instances to shuttle posts/comments back and forth so I can see them and people on other instances can see my comments.
If you were confused because we were discussing “home labs”, I confusingly used “home” in a different way. I don’t mean that lemmy.world is hosted out of my house (it’s hosted on a cloud provider, and I have nothing to do with running/hosting lemmy world… I just happened to sign up for an account there), I mean that instance is the home for my account. I probably could have found a clearer way to say that initially, but the ambiguity didn’t occur to me till you just responded.
The best way would be for everyone to host their instance (a small one should be fine for individual/small groups) alongside the bigger servers.
It’s worth noting that it’s possible to overtune a federated network on single-user instances as well. The advantage of a shared instance is that servers hosting communities can replicate posts/comments from that community to a server, and that server can handle browse traffic for hundreds or thousands of users. If each one of those users instead ran their own single-user instance, the server hosting the community would have to work much harder to deliver hundreds or thousands of copies of those posts/comments… and if some users went inactive but left their lemmy server running… maybe nobody is even reading those posts.
It definitely IS possible to overtune a federated network both for too few servers that are each too big, and too many servers that are each too small. There is a goldilocks zone of a medium number of servers that are medium in size. But Lemmy is a long way from too many single-user instances (the devs have commented that replication traffic is not a bottleneck), and there’s also a lot of tuning they can do to make large instances run smoothly. So Lemmy has room to both host bigger servers, and more of them.
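A back-of-the-envelope sketch of that trade-off, under two simplifying assumptions that are mine rather than anything measured: the community’s home server sends one copy of each post per subscribing instance, and subscribers are spread evenly across instances. The function name is hypothetical.

```python
def deliveries_per_post(subscribers: int, users_per_instance: int) -> int:
    """Copies of one post the community's home server must send out,
    assuming one delivery per subscribing instance."""
    # Ceiling division: a partly filled instance still needs its own copy.
    return -(-subscribers // users_per_instance)


# 1000 subscribers all on one shared instance: a single delivery.
print(deliveries_per_post(1000, 1000))
# The same 1000 subscribers each running a single-user instance.
print(deliveries_per_post(1000, 1))
# Something in between: instances of about 300 users each.
print(deliveries_per_post(1000, 300))
```

The other side of the trade-off, browse load on each receiving instance, grows with users per instance instead, which is why neither extreme is ideal.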
Thanks for the comment! That cleared it up. Great to know that running a single-user instance won’t hurt Lemmy. I kind of wish the developers would have gone for k3s instead of Docker to self-host Lemmy but I suppose that’s fine for now.
I’m wondering if Lemmy (and maybe the fediverse generally) has the potential to offer incoming Reddit mods something Reddit can’t: compensation.
Obviously it won’t be huge, but I feel like there’s a greater chance that a Lemmy user will make at least small one-off or even regular donations to help keep their communities and/or instances running.
Like if I’m running a server, voluntarily paying $50/month as, say, the mod of a 20M+ user subreddit might, even if 20 users contribute $2.5, that’s a full month’s worth of server time paid for.
As I say, the scale would be quite low, but wonder if it could be an interesting idea to try out, even if just as a proof-of-concept.
There’s definitely scope for this.
I know I would never pay for facebook / twitter / reddit. Not 1 cent, but I’m happily contributing to fosstodon on patreon each month - and I don’t think I’m alone.
I think there’s a not-insignificant segment of the population that are weary enough of the advertising revenue model that we’re happy to pay to avoid it. The expectation that everything on the net should be free has ruined the net IMO.
In a maybe ironic twist, I’m more than happy to pay for sites that are truly free but completely unwilling to pay for “free” sites backed by ads, tracking, and corporate bullshit. I subscribed to the Lemmy dev’s Patreon and I sponsor a few open source projects on GitHub but I would never pay for Twitter Blue, Reddit Gold, Discord Nitro, etc. I want to reward good behavior, not support bad. I also supported Mastodon for a while on Patreon but when they changed the Mastodon onboarding process to make it more centralized I pulled back on that. I don’t want to reward restriction of the openness the Fediverse provides, even if some subset of users can’t figure it out.
the idea that my account is hosted at an individual Lemmy server and that other servers trust that one to validate my account
I can’t stress highly enough how much this isn’t how it works.
You basically never directly interact with other servers. Instead, when someone on your host site first subscribes to a community hosted on another site, your instance pulls in some recent posts from that remote site and then requests that all future content from that group be forwarded along to it. Then, people on your local site interact with that mirrored content, and your local site sends local additions back to the original host for syncing.
Your account only exists locally. You’re always reading locally, and you’re always acting locally. Everything else is servers mirroring and forwarding content.
Thanks. Based on some of the other answers, particularly in https://sh.itjust.works/comment/12511, I now understand better.
I appreciate everyone helping to explain some pretty basic questions in such detail.
We’re all learning. And I want to stress – and maybe should have in my original reply – that I wasn’t really trying to criticize. My reply was just going to be so far down, I figured it’d only be noticed by a handful of people.
The way you intuited things is the way a lot of people have intuited things on the Fediverse. There were big blow-ups on Mastodon about this 6 months ago, as users spoke out about instance admins defederating from other Mastodon instances who hosted accounts that chronically broke local rules.
It became clear that people interpreted these defederations as denying new local users’ access to “parts of Mastodon”, as if there was some central object called “Mastodon” and local admins were being nannies about letting some of their users wander over to some parts of it, when in reality it was a case of the local admins refusing to host content from other websites that gave shelter to individuals who were abusive or otherwise breaking rules that existed on the local site.
If that makes sense.
If you can grok that we’re basically all on different, independent web forums, and there’s just an implicit agreement between the forums to cross-post and share content, you can better grok why some of the things that happen here happen.
If you can grok that we’re basically all on different, independent web forums, and there’s just an implicit agreement between the forums to cross-post and share content, you can better grok why some of the things that happen here happen.
I’m old enough to have participated in Usenet before web servers existed, with the idea that different Usenet servers carried different feeds. Now that I better understand it, that model is closer than my original understanding. I also realize it’s not a completely accurate model, since there’s no central hierarchy to the fediverse like Usenet had, but at least it works to get me to understand the idea that all interactions are going through the server I’m pointed at and that posts originating from other servers across the fediverse are being replicated to my server so I can interact.
Noice. It’s really nice seeing the Reddit migration bring people to the fediverse who experienced the Internet prior to, like, 2012, rather than shunned it, like the Twitter migration brought.
The type of user that is alienated enough by the API changes and capable of actually creating a Lemmy account surely has significant overlap with internet veterans and generally intelligent people. Which is fantastic news, I remember trying voat.co and it was just overrun by nazis and complete idiots. So far on Lemmy my experience has been the opposite. I’m not old enough to be a Usenet veteran, but I did experience the age of the BBS. I remember spending a lot of time on Totse back in the day.
That’s true. The Twitter exodus was triggered by people hating a guy, and it’s easy for me, in the excitement of watching this space grow, to forget that this is a different situation.
Oh, I didn’t take your comment as critical. I was asking because there’s lots I don’t understand. You clarified a basic misunderstanding I had. I appreciate it.
What’s the network flow like? I’m posting this to the lemmy.ml /asklemmy community, but I’m composing it on the sh.itjust.works interface. I’m assuming sh.itjust.works hands this over to lemmy.ml. How does my browsing work? Is all of my traffic routed through sh.itjust.works?
So, you post and it federates from your instance to lemmy.ml where the group you’re posting to resides.
Then the group basically “boosts” your post, and anyone that follows the group (ie, anyone that is subscribed to it) sees the “boost”, which the lemmy interface then displays as a post in the group. But if you follow the group from mastodon/calckey etc, instead of a threadiverse app, then it would just appear as a regular boost by the group you posted to.
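For the curious, a “boost” in ActivityPub terms is an `Announce` activity. The sketch below builds one as a plain Python dictionary to show the shape of the idea. It is illustrative only: the URLs, the `/followers` path, and the helper name are assumptions, and a real Lemmy activity carries more fields than this.

```python
import json


def announce(community_actor: str, original_activity: dict) -> dict:
    """Wrap an activity in an Announce sent by the community (group) actor."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": community_actor,
        "to": ["https://www.w3.org/ns/activitystreams#Public"],
        # Hypothetical followers collection for the community.
        "cc": [f"{community_actor}/followers"],
        "object": original_activity,
    }


# Hypothetical post created by a user on sh.itjust.works.
create = {
    "type": "Create",
    "actor": "https://sh.itjust.works/u/example_user",
    "object": {"type": "Page", "name": "How does federation work?"},
}

boost = announce("https://lemmy.ml/c/asklemmy", create)
print(json.dumps(boost, indent=2))
```

A Lemmy-style interface would render the wrapped object as a post in the community, while a microblogging client would show it as a boost by the group.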
Backfilling of data isn’t really a thing as such. Basically, your instance is only aware of content that has federated to it, and it only starts federating to your instance when someone first subscribes to a particular group.
That being said, the devs have mentioned that when lemmy federates the group actor, the API also sends the last 20 posts to the group. I don’t know how often groups are updated, and whether this applies every time it’s refreshed or not though…
Also coming from the sh.itjust.works instance.
My question is how the broader/larger communities are going to be handled. Which server has the responsibility of hosting large general purpose discussion communities? Because it seems like servers are actively incentivized to limit their size and you end up with many smaller mirrored communities but no realistic path to a reddit style overarching open forum about movies, or soccer, for example.
… it seems like servers are actively incentivized to limit their size and you end up with many smaller mirrored communities but no realistic path to a reddit style overarching open forum about movies, or soccer, for example.
This is not really true. You’re not limited to participating in communities (aka subreddits) on the server where your account is. My account is on lemmy.world, yours is on sh.itjust.works, we’re both commenting in a community that’s homed to lemmy.ml. The community can be much larger than the server, you don’t need to be an instance with lots of registered users in order to host a large community.
It’ll be interesting to see how similar communities from different instances will work out. I’ve noticed a lot of Technology instances from beehaw.org, lemmy.ml, lemmy.ca, and others.
How will the fediverse as a whole determine which Technology instance is the “main” one? I assume it’ll be whichever one has the largest subscriber count accumulated over time.
/r/Tech, /r/Technology, /r/TechNews and others all exist. Which one is the “main” one?
In theory they should balance out kind of evenly. In practice, I can see some servers becoming significantly larger, which probably won’t be horrible, just not taking full advantage of the federation concept
Thank you for the response. EDIT: I read another comment (also from you) about the way federation actually works and I was totally wrong in my assumptions so I deleted them. I suppose I should be leaving these questions up to the people building the platform, rather than a layperson such as myself.
Can’t help but ask though, is the federation somewhat comparable to a blockchain? In a blockchain each member has a full copy of the chain, and in the federation each instance has a copy of all parts of the chain it cares about. If one peer goes down, the other peers can continue to function. Cool.
This comment should be stickied or included in some sort of tutorial though.
https://lemmy.world/comment/20357
I found another thread discussing this topic and most people seem to expect that there will be natural consolidation and fragmentation of communities over time. There was also the suggestion that individual communities could federate such that posts from two different meme communities in different instances could be merged into the same feed on the user end, which sounds like a pretty good compromise.
Can’t help but ask though, is the federation somewhat comparable to a blockchain?
Both ecosystems involve a large number of independent peers that must coordinate without a leader and both were designed in the same decade. There are some similarities. There are also some big differences. I wouldn’t say that either needs to be more like the other than it is, they address different use-cases.
… individual communities could federate such that posts from two different meme communities in different instances could be merged into the same feed on the user end…
I’m not real convinced this is all that useful for the steady state. technology@beehaw.org and technology@lemmy.ml are not meaningfully different than /r/technology and /r/tech (both of which are real subreddits, despite what many are saying… Reddit is FULL of duplicative subreddits. It’s just that one usually is much bigger and dominates). The main case I could see for what you’re describing would be community transfer or shutdown. If admins are killing a server, or mods of the smaller community give up… it might be cool to migrate subs/posts en-masse to a new community and shut down the old one. But there’s just no need to have two local communities that both remain active but are somehow “merged” in federation. That is indistinguishable from the scenario where everyone subs to one of the two communities without regard for where the community is hosted.
I largely agree with your assessment. It’s not necessarily a problem in the long run to have multiple local communities. I guess I’m still trying to wrap my head around how this is all going to work.
I still think that it’s too confusing to newcomers at the moment. For instance, I want to talk about soccer, but I have no idea where to find such a community. I’m not going to start a community because I’m lazy and I assume it will fail. Or perhaps I want to look at memes, and I find a meme community but it’s relatively small and inactive. There are 10 other meme communities in different instances, and the combined content would have satiated me, but there is no easy way for me to view or find them, so I simply give up.
It’s crucial to direct new users to active communities before they abandon the platform entirely. Within the context of the current reddit exodus, we should be trying to harness the seeds of as many communities as possible, so that they might take root and grow on this platform. The current structure of the fediverse is not very conducive to doing so.
The solution might be as simple as a good tutorial with a list of recommended communities. Btw you’re a legend, I really appreciate you taking the time to explain and discuss this. I also wrote an entire response longer than this and it was lost to the ether (my spotty wifi), so I had to rewrite this as best as I could.
I still think that it’s too confusing to newcomers at the moment. For instance, I want to talk about soccer, but I have no idea where to find such a community.
Yeah, agree. I don’t think there are easy answers though. There is a /communities/ url at each instance, which would be a lot more useful if it was populated with the list of all communities the instance federated with. I think the devs were nervous about that list being too expensive to federate properly, but it doesn’t feel to me like it should be a major problem.
Have you found https://browse.feddit.de/ yet? That is the Big List of communities. It’s still quite obtuse how you subscribe to them if you’re the first on your server to do so… but it’s at least a place to look.
There’s also (just as of today) lemmy.directory. This is an instance where somebody is attempting to subscribe to every lemmy community in order to create an /r/all equivalent. So browsing https://lemmy.directory/home/data_type/Post/listing_type/All/sort/Active/page/1 will show you the firehose of posts from which you might pick out some communities that interest you. And https://lemmy.directory/communities/listing_type/All/page/1 should actually be a complete list of all communities as well, via the native lemmy community browser.
These are all suggestions to help you personally, but they don’t invalidate your critique that community discovery is like… way too hard. It’s true.
Edit: Here’s a baller tutorial on community setup and discovery that includes how to subscribe to a remote community that no one else on your instance has found yet: https://sh.itjust.works/post/9162
Btw you’re a legend, I really appreciate you taking the time to explain and discuss this.
No sweat, I’m just trying to help people find ways to stick around by making their first days a little less confusing. Because the system overall isn’t easy, it helps to have a buddy while you’re getting your sea legs. Just trying to be that buddy in the hopes that it makes for a more lively place for me too once people get settled.
Cheers, mate.