Ugh.

  • RoundSparrow
    11 months ago

    No. Care to explain please?

    On Saturday July 22, 2023… the SysOp of Lemmy.ca got so frustrated with constant overload crashes that they cloned their PostgreSQL database and ran AUTO_EXPLAIN on it. They found 1675 rows being written to disk (massive I/O, PostgreSQL WAL activity) for every single SQL UPDATE to a comment/post. They shared details on GitHub, and the PostgreSQL TRIGGER that Lemmy 0.18.2 and earlier had was scrutinized.

    • sabreW4K3@lemmy.tf
      11 months ago

      You’ve become fixated on this issue, but if you look at the original bug, phiresky says it’s fixed in 0.18.3.

      • RoundSparrow
        11 months ago

        The issue isn’t who fixed it; the issue is the lack of testing to find these bugs. It was there for years before anyone noticed it was hammering PostgreSQL on every new comment and post to update data that the code never read back.

        There have been multiple data overrun situations, wasting server resources.

        • sabreW4K3@lemmy.tf
          11 months ago

          But now Lemmy has you and phiresky looking over the database and optimizing things, so things like this should be found a lot quicker. I think you probably underestimate your value and the gratitude people feel for your insight and input.

    • favrionOP
      11 months ago

      In layman’s terms please?

      • fiat_lux@kbin.social
        11 months ago

        Every time you perform an action like commenting, you expect it to maybe update a few things. The post will increase the number of comments so it updates that, your comment is added to the list so those links are created, your comment is written to the database itself, etc. Each action has a cost; let’s say every update costs a dollar. Then each comment would cost $3, $1 for each action.

        What if, instead of doing 3 things each time you posted a comment, it did 1300 things? And it did the same for everyone else posting a comment. Each comment now costs $1300. You would run out of cash pretty quickly unless you were a billionaire. Using computing power is like spending cash, and lemmy.world are not billionaires.

      • RoundSparrow
        11 months ago

        What are you asking for? lemmy.ml is the official developers’ server, and it crashes constantly; it errors out every 10 minutes, and has for 65 days in a row.

    • r00ty@kbin.life
      11 months ago

      I don’t know that it’s a DB design flaw if we’re talking about federation messages to other instances’ inboxes (creating rows of that magnitude for updates does sound like outbound federation messages to me). Those need to be added somewhere. On kbin, if installed using the instructions as-is, we’re using rabbitmq (but there is an option to write to db). But failures still end up hitting sql, and rabbit is still storing this on the drive. So unless you have a dedicated separate rabbitmq server, it makes little difference in terms of hits to storage.

      It’s hard to avoid storing them somewhere: you need to know when they’ve been sent, and if there are temporary errors you need to store them until they can be sent. There needs to be a way to recover from a crash/reboot/restart of services and handle other instances being offline for a short time.
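
      A rough sketch of what "storing them somewhere" can look like, using Python's built-in sqlite3 as a stand-in for whatever store an instance really uses. The outbox table, its columns, and the open_outbox / enqueue / deliver_pending helpers are all made up for illustration; this is not how kbin or Lemmy actually implement delivery.

```python
import sqlite3
from typing import Callable


def open_outbox(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical outbox table for outbound messages."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS outbox (
               id        INTEGER PRIMARY KEY,
               inbox_url TEXT    NOT NULL,
               payload   TEXT    NOT NULL,
               attempts  INTEGER NOT NULL DEFAULT 0,
               sent      INTEGER NOT NULL DEFAULT 0
           )"""
    )
    db.commit()
    return db


def enqueue(db: sqlite3.Connection, inbox_url: str, payload: str) -> None:
    """Persist the message before any delivery attempt is made."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO outbox (inbox_url, payload) VALUES (?, ?)",
            (inbox_url, payload),
        )


def deliver_pending(db: sqlite3.Connection, send: Callable[[str, str], bool]) -> int:
    """Try every unsent message once; return how many were delivered."""
    pending = db.execute(
        "SELECT id, inbox_url, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for row_id, inbox_url, payload in pending:
        try:
            ok = send(inbox_url, payload)
        except OSError:
            ok = False  # treat network errors as temporary; row stays queued
        with db:
            db.execute(
                "UPDATE outbox SET attempts = attempts + 1, sent = ? WHERE id = ?",
                (int(ok), row_id),
            )
        delivered += int(ok)
    return delivered
```

      Because each message is written before it is sent and only marked as sent afterwards, a restart or an offline remote instance just leaves rows with sent = 0 for the next pass. Pointing open_outbox at a file path instead of ":memory:" is what would make it survive a reboot.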

      EDIT: Just read the issue (it’s linked a few comments down). It actually looks like a weird pgsql reaction to a trigger, not based on the number of connected instances like I thought.

        • r00ty@kbin.life
          11 months ago

          Yep, I read through it in the end. Looks like they were applying changes to all rows in a table instead of just one on a trigger. The first part of my comment was based on reading comments here. I’d not seen the link to the issue at that stage. Hence the edit I made.
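
          For anyone who wants to see that pattern in isolation, here is a minimal sketch using Python's built-in sqlite3 rather than PostgreSQL. The site_stats and comment tables and both triggers are hypothetical and are not Lemmy's actual schema; the default of 1675 rows just reuses the number quoted earlier in the thread. The only difference between the two triggers is the WHERE clause.

```python
import sqlite3

# Hypothetical schema: one stats row per known site, plus a comment table.
SCHEMA = """
CREATE TABLE site_stats (
    site_id  INTEGER PRIMARY KEY,
    comments INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE comment (
    id      INTEGER PRIMARY KEY,
    site_id INTEGER NOT NULL,
    body    TEXT    NOT NULL
);
"""

# No WHERE clause: every row in site_stats is rewritten on each new comment.
BROAD_TRIGGER = """
CREATE TRIGGER comment_count AFTER INSERT ON comment
BEGIN
    UPDATE site_stats SET comments = comments + 1;
END;
"""

# Scoped to the one site the new comment belongs to.
SCOPED_TRIGGER = """
CREATE TRIGGER comment_count AFTER INSERT ON comment
BEGIN
    UPDATE site_stats SET comments = comments + 1
    WHERE site_id = NEW.site_id;
END;
"""


def rows_rewritten(trigger_sql: str, n_sites: int = 1675) -> int:
    """Insert one comment and count how many stats rows the trigger changed."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA + trigger_sql)
    db.executemany(
        "INSERT INTO site_stats (site_id) VALUES (?)",
        [(i,) for i in range(1, n_sites + 1)],
    )
    db.execute("INSERT INTO comment (site_id, body) VALUES (1, 'hello')")
    (touched,) = db.execute(
        "SELECT COUNT(*) FROM site_stats WHERE comments > 0"
    ).fetchone()
    db.close()
    return touched


if __name__ == "__main__":
    print("broad trigger: ", rows_rewritten(BROAD_TRIGGER))
    print("scoped trigger:", rows_rewritten(SCOPED_TRIGGER))
```

          With the broad trigger, a single inserted comment changes every row in the stats table; with the scoped one it changes exactly one. The write cost of the broad version grows with the size of the table, not with the amount of activity.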