I already get rate-limited like crazy on Lemmy and there are only like 60,000 users on my instance. Is each instance really just one server or are there multiple containers running across several hosts? I’m concerned that federation will mean an inconsistent user experience. Some instances may be beefy, others will be under-resourced… so the average person might think Lemmy overall is slow or error-prone.
Reddit has millions of users. How the hell is this going to scale? Does anyone have any information about Lemmy’s DB and architecture?
I found this post about Reddit’s DB from 2012. Not sure if Lemmy has a similar approach to ensure speed and reliability as the user base and traffic grow.
https://kevin.burke.dev/kevin/reddits-database-has-two-tables/
The database isn’t really the problem in the current state of things. The server is.
Tl;dr: It’s trying to do everything, and not doing it that well. So users suffer because they have to share resources with non-UI-related tasks.
The database suffers because it has to do 50 separate single-object inserts in a second when it could insert all 50 items in one go.
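To make the batching point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in (Lemmy itself runs on PostgreSQL), and the `vote` table and both function names are made up for illustration, not taken from Lemmy's code.

```python
import sqlite3


def make_db():
    # In-memory database with a hypothetical "vote" table (not Lemmy's real schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vote (post_id INTEGER, user_id INTEGER, score INTEGER)")
    return conn


def insert_one_at_a_time(conn, votes):
    # One statement and one commit per vote: 50 votes means 50 transactions.
    for vote in votes:
        conn.execute(
            "INSERT INTO vote (post_id, user_id, score) VALUES (?, ?, ?)", vote
        )
        conn.commit()


def insert_batched(conn, votes):
    # All votes go in inside a single transaction, committed once on exit.
    with conn:
        conn.executemany(
            "INSERT INTO vote (post_id, user_id, score) VALUES (?, ?, ?)", votes
        )
```

Both functions leave the same rows in the table. The difference is how many round trips and commits the database has to absorb, which is the part that hurts under load.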
Federation suffers because you can’t offload it to a separate machine farm whose only job is to receive and send ActivityPub requests, reading from and writing to the correct queues to do so.
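Roughly what I mean by offloading, as a toy sketch: the request handler only drops the activity on a queue and returns, and a separate worker drains it. Here the "separate machine farm" is just a thread and the queue is in-process; in a real setup it would be an external message broker. All names (`handle_vote`, `federation_worker`, `outbox`) are hypothetical.

```python
import queue
import threading

outbox = queue.Queue()  # stand-in for an external message queue
delivered = []          # stand-in for activities actually sent to remote peers


def handle_vote(post_id, user_id):
    # The UI-facing handler only enqueues; it never waits on federation.
    outbox.put({"type": "Like", "post_id": post_id, "user_id": user_id})
    return "ok"


def federation_worker():
    # Runs apart from the request path and drains the queue.
    while True:
        activity = outbox.get()
        if activity is None:  # sentinel value: shut the worker down
            outbox.task_done()
            break
        delivered.append(activity)  # a real worker would POST to a remote inbox here
        outbox.task_done()


def run_demo():
    worker = threading.Thread(target=federation_worker, daemon=True)
    worker.start()
    for user_id in range(3):
        handle_vote(post_id=1, user_id=user_id)
    outbox.put(None)
    outbox.join()
    worker.join()
    return len(delivered)
```

The point of the split is that a slow or unreachable peer only backs up the queue; it doesn't eat the resources that serve logged-in users.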
Federation also makes a lot of live HTTP connections to other peers. It looks up users for messages. The whole design is very resource-intensive: one single vote, comment, or post at a time. There is also a lot of boilerplate JSON overhead in sending something as simple as a single vote.
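To give a feel for the JSON overhead, here is a rough comparison. The activity below is a simplified ActivityStreams `Like` with made-up example.com URLs; what Lemmy actually sends carries more fields than this, so treat it as a lower bound on the overhead, not a capture of real traffic. The packed binary form is purely hypothetical and only there for scale.

```python
import json
import struct

# Simplified ActivityStreams "Like"; all URLs are invented for illustration.
like_activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.com/activities/like/1234",
    "type": "Like",
    "actor": "https://example.com/u/alice",
    "object": "https://example.org/post/5678",
}

json_payload = json.dumps(like_activity).encode("utf-8")

# Hypothetical compact form: user id, post id, and a +1/-1 score.
# "<qqb" is two little-endian 64-bit integers plus one signed byte.
compact_payload = struct.pack("<qqb", 1234, 5678, 1)


def overhead_ratio():
    # How many times larger the JSON body is than the packed form.
    return len(json_payload) / len(compact_payload)
```

And that's only the body. Every vote is also its own HTTP request with its own headers.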
Thanks for the breakdown. I was aware of most of the issues with Lemmy. I noticed the weird layout of the database, so yeah…
My database question, however, was about Reddit, not Lemmy. Why use a relational database (“tables”) if you’re going to disregard most of the things a relational database does anyway… The “everything is a thing” line is what made me wonder why Reddit wouldn’t use something like MongoDB instead.
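For anyone who hasn't read the link, this is how I understand the “everything is a thing” layout, sketched with Python's sqlite3. The table names, columns, and helper functions are my own guesses for illustration, not Reddit's actual schema.

```python
import sqlite3


def make_store():
    conn = sqlite3.connect(":memory:")
    # One table says what exists, the other holds arbitrary key/value attributes.
    conn.execute("CREATE TABLE thing (id INTEGER PRIMARY KEY, kind TEXT)")
    conn.execute("CREATE TABLE data (thing_id INTEGER, key TEXT, value TEXT)")
    return conn


def add_thing(conn, kind, **attrs):
    # A link, comment, or user is just a row in "thing" plus rows in "data".
    cur = conn.execute("INSERT INTO thing (kind) VALUES (?)", (kind,))
    thing_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO data (thing_id, key, value) VALUES (?, ?, ?)",
        [(thing_id, key, str(value)) for key, value in attrs.items()],
    )
    conn.commit()
    return thing_id


def load_thing(conn, thing_id):
    # Rebuild the object from its key/value rows; every value comes back as text.
    rows = conn.execute(
        "SELECT key, value FROM data WHERE thing_id = ?", (thing_id,)
    )
    return dict(rows.fetchall())
```

No typed columns, no foreign keys, no joins worth the name. That is basically a document store built on top of a relational one, hence my question.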