Over the past 48 hours I have been glued to my screen trying to figure out how to make Beehaw more robust during this Reddit exodus.

My eyes are burning but I am thankful for so much financial support as well as the work of two sysadmins that have given us all a breath of fresh air.

One of the sysadmins was up until 2:30 am helping us out as a volunteer. I am so very grateful for persons such as this.

Thank you all for your continued support and patience.

  • GuyDudeman
    English
    arrow-up
    26
    ·
    1 year ago

    You guys are amazing and are building something great here. Thank you for all you’re doing. Where can I make a donation?

  • PenguinCoder@beehaw.org
    English
    arrow-up
    19
    ·
    1 year ago

    Thanks for sticking around during this influx and working your sysass off to improve the performance.

    Sleep is definitely well earned especially for the sysad who stayed so late.

  • anji@lemmy.anji.nl
    arrow-up
    18
    ·
    1 year ago

    Thanks Beehaw admins & mods & sysadmins. Even before this week it felt like such a welcoming community with good principles. Beehaw made me want to try Lemmy again. And it’s been fun. We appreciate you.

    Get ready to call on more help with scaling, because if Reddit shuts down their API in July, this week may look like a small event in comparison.

  • Pigeon@beehaw.org
    English
    arrow-up
    15
    ·
    1 year ago

    Thank you for all your hard work and for being so welcoming even though we are causing you problems!

    I gotta say, it’s so nice to think this place is run by people and not by megalomaniacal megacorp data miners out to exploit everyone.

  • argv_minus_one@beehaw.org
    English
    arrow-up
    14
    ·
    1 year ago

    It seems to be working a lot better now than it was earlier today, so I’d say you and your sysadmins did a great job. Thank you!

  • Mindless_Enigma@beehaw.org
    English
    arrow-up
    12
    ·
    1 year ago

    Thank you all for the work you’ve put in to keep things up and running! I can’t imagine how many fires can pop up when a user base quadruples nearly overnight. Massive kudos to everyone that puts effort in to keep this place going. Go get that well deserved rest.

  • veroxii
    English
    arrow-up
    12
    ·
    1 year ago

    Where are the bottlenecks? Frontend servers or on the db? I have a lot of experience running postgres at scale.

      • argv_minus_one@beehaw.org
        English
        arrow-up
        10
        ·
        edit-2
        1 year ago

        This must be the one. Those are some monster queries.

        I’m no database expert, but I wonder if it would be wise to break those up into multiple queries instead of joins. Joining post with person and community would result in a ton of duplicate data, wouldn’t it?

        I’m actually interested in what people have to say about this, because I have a project that’s kind of sensitive to database query performance, and I’m worried that I’ll find out about some performance bottleneck the hard way like Beehaw just did. The more I learn about the subject, before my project goes to production, the better!

        • veroxii
          English
          arrow-up
          10
          ·
          1 year ago

          No, joins are always faster. If you ultimately need to combine the data for the app, the database will be faster than your code can do it, since that’s what it was built to do.

          • argv_minus_one@beehaw.org
            English
            arrow-up
            3
            ·
            1 year ago

            Any idea why those queries are slow, then, if not because of all the duplicate data? Missing indices or something?

            • veroxii
              English
              arrow-up
              8
              ·
              1 year ago

              Looking at the query, I think it only returns a single row per post, so there isn’t really any duplicate data. It all looks very straightforward, and you’d think all the “_id” and “id” columns are indexed.

              I asked for an EXPLAIN ANALYZE plan to see what really happens and where the most time is spent.

              If it’s indexes, we’ll see quickly. Strangely, it might be in the WHERE clause. I’m not sure what hot_rank()'s implementation is, but we’ll find that out too if we can get the plan timings. Without looking at the numbers it’s all just guessing.

              And I can’t run them myself since I don’t have access to a busy instance with their amount of production data. It’s the thing about databases - what runs fast in dev, doesn’t always translate to real workloads.
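To illustrate what a query plan reveals, here is a minimal sketch. It is an assumption-laden stand-in: it uses SQLite's `EXPLAIN QUERY PLAN` from Python's standard library rather than Postgres's `EXPLAIN ANALYZE` (which also reports real timings), and the two-table schema, index name, and query are hypothetical simplifications, not Lemmy's actual ones.

```python
import sqlite3

# Hypothetical, heavily simplified schema; Lemmy's real tables differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE community (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        community_id INTEGER REFERENCES community(id),
        score INTEGER
    );
""")
conn.executemany("INSERT INTO community (id, name) VALUES (?, ?)",
                 [(i, f"c{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO post (community_id, score) VALUES (?, ?)",
                 [(i % 50 + 1, i) for i in range(5000)])

QUERY = """
    SELECT post.id, community.name
    FROM post JOIN community ON community.id = post.community_id
    WHERE post.community_id = ?
"""

def plan(connection, sql, params):
    """Return the human-readable 'detail' column of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

# Compare the plan before and after adding an index on the filtered column.
plan_before = plan(conn, QUERY, (7,))
conn.execute("CREATE INDEX idx_post_community ON post (community_id)")
plan_after = plan(conn, QUERY, (7,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan text varies by SQLite version, so it is printed rather than stated here. The point is the workflow: read the plan, change one thing, read the plan again. On Postgres the plan also carries measured timings, which is what makes it useful on production-sized data.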

              • argv_minus_one@beehaw.org
                English
                arrow-up
                9
                ·
                1 year ago

                “It’s the thing about databases - what runs fast in dev, doesn’t always translate to real workloads.”

                Yeah, that’s what really scares me about database programming. I can have something work perfectly on my dev machine, but I’ll never find out how well it works under a real-world workload, and my employer really doesn’t like it when stuff blows up in a customer-visible way.

                I decided to write a stress-test tool for my project that generates a bunch of test data and then hits the server with far more concurrent requests than I expect to see in production any time soon. Sure enough, the first time I ran it, my application crashed and burned just like Beehaw did. Biggest problem: I was using serializable transactions everywhere, and with lots of concurrent requests, they’d keep failing and retrying over and over, never making progress.

                That’s a lesson I’m glad I didn’t learn in production…but it makes me wonder what lessons I will learn in production.
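The comment above does not say how the retry storm was fixed. One common mitigation, sketched here under stated assumptions, is to cap the number of retries and wait a randomized, exponentially growing delay between them so that conflicting transactions stop colliding in lockstep. `SerializationFailure`, `run_with_retry`, and `flaky_txn` are hypothetical names; a real driver raises its own error type for SQLSTATE 40001.

```python
import random
import time

class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver's serialization error (SQLSTATE 40001)."""

def run_with_retry(txn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run txn(); on SerializationFailure, back off with jitter and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up so the caller sees the failure instead of spinning
            # Random delay in [0, base_delay * 2**(attempt - 1)] spreads retries out.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))

# Usage: a fake transaction that conflicts twice before succeeding.
calls = {"n": 0}

def flaky_txn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise SerializationFailure()
    return "committed"

# The sleep function is injected so the example runs without real waiting.
result = run_with_retry(flaky_txn, sleep=lambda _: None)
print(result, "after", calls["n"], "attempts")
```

Using a weaker isolation level for transactions that do not need serializability is another option, but whether that is safe depends on the application.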

        • darkfoe@lemmy.serverfail.party
          English
          arrow-up
          9
          ·
          1 year ago

          I’m no dev on the project myself, and I haven’t studied that query enough to know, but yeah they are some monster queries. I’d have to fire up pgadmin and try them out on my personal instance to understand them better.

          But as for your curiosity: I had an issue with a microservice at my job that is very sensitive to database latency (it makes one call, at roughly 600 requests per second on average and up to 1200 in spikes). We solved an issue with some of the joins by making a materialised view for the data we knew didn’t change more than once per day, which we then scheduled with pg_cron to refresh concurrently (concurrently being key, so we don’t lock out reads). That reduced our query times significantly, i.e. down to milliseconds from up to 20 seconds.

          Really boils down to how often some data needs to change, so you can make some sort of way of caching it.
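The same idea can be sketched on the application side. This is not the materialised view described above (that lives in Postgres and is refreshed with `REFRESH MATERIALIZED VIEW CONCURRENTLY`, which requires a unique index on the view); it is a minimal time-to-live cache that recomputes an expensive result only after it expires. `TTLCache` and `expensive_report` are hypothetical names.

```python
import time

class TTLCache:
    """Cache one expensive result and recompute it only after ttl seconds."""

    def __init__(self, compute, ttl, clock=time.monotonic):
        self._compute = compute
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._expires_at = None  # None means "never computed"

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._compute()
            self._expires_at = now + self._ttl
        return self._value

# Usage with a fake clock so the example runs instantly.
fake_now = [0.0]
runs = []

def expensive_report():
    runs.append(1)  # count how often the slow path actually runs
    return f"report #{len(runs)}"

cache = TTLCache(expensive_report, ttl=86400, clock=lambda: fake_now[0])
first = cache.get()    # computes
second = cache.get()   # served from cache
fake_now[0] += 86400   # a day passes
third = cache.get()    # recomputes
print(first, second, third)
```

The trade-off is the same as with the view: readers may see data that is up to one refresh interval old, in exchange for not paying for the joins on every request.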

      • veroxii
        English
        arrow-up
        7
        ·
        1 year ago

        Thanks. I’ve asked in there for some EXPLAIN ANALYZE plans, so we can see which parts are slow and why. I also think running pgbouncer or a similar connection pooler would be a good idea. Running a production public-facing service with only 5 db connections is asking for trouble.
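To show why a small, fixed number of connections becomes a bottleneck, here is a minimal in-process pool sketch. It is an illustration only: pgbouncer is an external proxy with its own configuration, and `ConnectionPool` below is a hypothetical class that uses in-memory SQLite connections as placeholders for real database connections.

```python
import queue
import sqlite3
from contextlib import contextmanager

class ConnectionPool:
    """Fixed-size pool: callers wait for a free connection or time out."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=1.0):
        # Blocks until a connection is free; raises queue.Empty after `timeout` seconds.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back

pool = ConnectionPool(
    size=2,
    connect=lambda: sqlite3.connect(":memory:", check_same_thread=False),
)

with pool.connection() as a, pool.connection() as b:
    value = a.execute("SELECT 1").fetchone()[0]
    # Both connections are checked out, so a third caller has to wait and then fails.
    try:
        with pool.connection(timeout=0.05):
            exhausted = False
    except queue.Empty:
        exhausted = True

print("query result:", value, "| pool exhausted for third caller:", exhausted)
```

With only a handful of connections, every slow query holds one of them for its full duration, so a few slow queries are enough to make every other request queue behind them.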

    • Helix 🧬@feddit.de
      English
      arrow-up
      9
      ·
      edit-2
      1 year ago

      I had a first look last night and my preliminary findings are that the DB is actually fine, but the frontend (server.js from Lemmy-ui) and the Pict-rs service are the culprits.

      Pictrs doesn’t have lots of configuration options and often calls on exiftool to do… whatever. Pictrs also seems to have to deal with a few large pictures, so we reduced the upload size.

      Sadly the error users get is some JSON parse error, so this doesn’t really tell them the issue and they might try to upload again.

      Found a few things which we might want to open a bug report for, but we first have to set up a test server to check my suspicions.

      • darkfoe@lemmy.serverfail.party
        English
        arrow-up
        5
        ·
        1 year ago

        I’ve got my personal instance (replying from it now) set up on a $6/mo droplet, if we want to try some simple things.

        I think you and I are in different timezones, keep missing ya on Discord!

  • animist@lemmy.one
    English
    arrow-up
    11
    ·
    1 year ago

    Seriously, as a convert, thank you for all of this. I left digg for reddit when it went to shit and now I’ve left reddit for lemmy and mastodon. Happy to be in these communities.

  • followthewhiterabbit@beehaw.org
    English
    arrow-up
    10
    ·
    1 year ago

    Really appreciate the hard work.

    I think you’ve built a great community here, and the hard work isn’t going unnoticed. Thanks all!

  • lunasloth@beehaw.org
    English
    arrow-up
    9
    ·
    1 year ago

    Echoing other comments in saying thanks for all your (collective) hard work! I’m one of the new joiners, and grateful for the welcoming vibe here despite the scramble we’re causing.