I already get rate-limited like crazy on Lemmy and there are only like 60,000 users on my instance. Is each instance really just one server, or are there multiple containers running across several hosts? I’m concerned that federation will mean an inconsistent user experience. Some instances may be beefy, others will be under-resourced… so the average person might think Lemmy overall is slow or error-prone.

Reddit has millions of users. How the hell is this going to scale? Does anyone have any information about Lemmy’s DB and architecture?

I found this post about Reddit’s DB from 2012. Not sure if Lemmy has a similar approach to ensure speed and reliability as the user base and traffic grow.

https://kevin.burke.dev/kevin/reddits-database-has-two-tables/

  • Irisos@lemmy.umainfo.live
    1 year ago

    The database isn’t really the problem in the current state of things. The server is because:

    • Until 0.18 there was no caching (for the UI), and the websockets were poorly implemented
    • The developers have admitted that they aren’t proficient in SQL, in which case, why not use an ORM instead? Sure, ORMs aren’t perfect, but they will do better than the average developer at scale.
    • There is no queue system for ActivityPub requests
    • Because there is no queue, user requests and federation have the same priority when they shouldn’t, and one can bottleneck the other
    • Live inserts are used, meaning that regardless of the DB used, performance is going to be killed, since inserting data one row at a time several times a second is a major waste of resources

    Tl;dr: It’s trying to do everything and not that well. So users suffer because they have to share resources with non-UI related tasks.

    The database suffers because it has to insert 1 object 50 separate times in a second when it could insert all 50 items in one go.
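
    To make the batching point concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` purely as a stand-in, and the `vote` table and its columns are made up for illustration; this is not Lemmy's schema or code. The first function commits once per row, the second sends the whole batch in one transaction.

```python
import sqlite3


def make_db():
    # In-memory database with a hypothetical vote table (not Lemmy's schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vote (user_id INTEGER, post_id INTEGER, score INTEGER)"
    )
    return conn


def insert_one_by_one(conn, votes):
    # "Live insert" style: one statement and one commit per row.
    for vote in votes:
        conn.execute(
            "INSERT INTO vote (user_id, post_id, score) VALUES (?, ?, ?)", vote
        )
        conn.commit()


def insert_batched(conn, votes):
    # Batched style: all rows go in under a single transaction,
    # which is committed when the `with` block exits.
    with conn:
        conn.executemany(
            "INSERT INTO vote (user_id, post_id, score) VALUES (?, ?, ?)", votes
        )


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM vote").fetchone()[0]


votes = [(user_id, 1, 1) for user_id in range(50)]

live = make_db()
insert_one_by_one(live, votes)

batched = make_db()
insert_batched(batched, votes)
```

    Both end up with the same 50 rows; the difference is 50 commits versus 1. How much that matters in practice depends on the database and the disk, so treat it as an illustration of the idea rather than a benchmark.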

    Federation suffers because you can’t offload it to a separate machine farm whose only job would be to receive and send ActivityPub requests, reading from and writing to the correct queues to do so.
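
    A rough sketch of what "different priorities" could look like, assuming a single in-process priority queue. The `WorkQueue` class, the `USER`/`FEDERATION` constants and the job strings are all hypothetical names for illustration; a real deployment would more likely use an external queue with separate workers.

```python
import heapq
import itertools

# Lower number = served first. These two levels are an assumption for the sketch.
USER, FEDERATION = 0, 1


class WorkQueue:
    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker so jobs with the
        # same priority come out in the order they were added.
        self._seq = itertools.count()

    def put(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._seq), job))

    def get(self):
        _priority, _seq, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)


queue = WorkQueue()
queue.put(FEDERATION, "deliver vote to peer A")
queue.put(FEDERATION, "deliver comment to peer B")
queue.put(USER, "render front page")

# Drain the queue: the user-facing job jumps ahead of the federation backlog.
order = [queue.get() for _ in range(len(queue))]
```

    Strict priority like this can starve federation if user traffic never lets up, which is one more reason to hand federation its own queue and its own workers.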

    • BitOneZero @ .world@lemmy.world
      1 year ago

      Federation also does a lot of live HTTP connects to other peers. It looks up users for messages. The whole design is very resource intensive, one single vote, comment, post at a time. There is also a lot of boilerplate JSON overhead in sending something as simple as a single vote.
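
      To give a feel for the overhead, here is a sketch that builds an ActivityStreams-style `Like` activity and compares its size to the bare information a vote carries. All URLs and IDs are placeholders, and the exact fields Lemmy sends may differ, so this only illustrates the general shape.

```python
import json

# Hypothetical Like activity; every URL below is a placeholder.
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.social/activities/like/123",
    "type": "Like",
    "actor": "https://example.social/u/alice",
    "object": "https://other.example/post/456",
}
payload = json.dumps(activity).encode("utf-8")

# The same vote reduced to what it actually says: who, on what, which direction.
essential = json.dumps(["alice", 456, 1]).encode("utf-8")

# How many times larger the federated message is than the bare vote.
overhead_ratio = len(payload) / len(essential)
```

      And that is just the body: each delivery is also its own HTTP request with headers, typically one per receiving instance.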

    • Netto Hikari@social.fossware.space
      1 year ago

      Thanks for the breakdown. I was aware of most of the issues with Lemmy. I noticed the weird layout of the database, so yeah…

      My database question, however, was about Reddit, not Lemmy. Why use a relational database (“tables”) if you’re going to disregard most of what a relational database does anyway… The “everything is a thing” line is what made me wonder why Reddit wouldn’t use something like MongoDB instead.
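
      For anyone who hasn't read the linked post, this is roughly how I picture the "thing" plus key/value "data" layout. It's a sketch using Python's built-in `sqlite3`; the table and column names are my own guesses for illustration, not Reddit's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE thing (id INTEGER PRIMARY KEY, kind TEXT NOT NULL);
    CREATE TABLE data (
        thing_id INTEGER REFERENCES thing(id),
        key TEXT,
        value TEXT
    );
    """
)


def create_thing(kind, **attrs):
    # One row in `thing`, then one row in `data` per attribute.
    cur = conn.execute("INSERT INTO thing (kind) VALUES (?)", (kind,))
    thing_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO data (thing_id, key, value) VALUES (?, ?, ?)",
        [(thing_id, key, str(value)) for key, value in attrs.items()],
    )
    conn.commit()
    return thing_id


def load_thing(thing_id):
    # Reassemble the attributes into a dict; every value comes back as text.
    rows = conn.execute(
        "SELECT key, value FROM data WHERE thing_id = ?", (thing_id,)
    ).fetchall()
    return dict(rows)


link_id = create_thing("link", title="Hello", url="https://example.com")
comment_id = create_thing("comment", body="First!", parent=link_id)
```

      Adding a new attribute needs no schema change, but the database no longer enforces types or relations between things, which is pretty much the trade-off a document store makes out of the box.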