Lemmy.world had to shut down the front page and put up a message about the load, along with a graph. They seem to chalk it up to social media sites naturally attracting attacks.

I’d hack up the Rust code so that it tracks its own concurrency against PostgreSQL and returns a new busy error.

Federation connections, the RSS feed, the API, and any other path that hits the database need a concurrency count in the Rust code and an error message system for busy.

I’d probably build a class to help with this. Once concurrency for an API goes over 5, mark the high-water point with a timestamp and start doing logic based on elapsed time. If concurrency is still > 5 and the elapsed time exceeds a threshold (say 1 minute), return the busy error.
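A rough sketch of what that helper could look like, using only the standard library. Every name here (`ConcurrencyGate`, `Permit`, `BusyError`, `try_enter_at`) is made up for illustration and none of it is from the Lemmy codebase. It assumes the count is kept per process, and it takes the current time as a parameter so the elapsed-time logic can be tested without sleeping.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical error handed back to the caller when the gate is busy.
#[derive(Debug, PartialEq)]
pub struct BusyError;

struct State {
    in_flight: usize,
    /// When concurrency first went over the limit (the high-water timestamp).
    over_since: Option<Instant>,
}

/// Hypothetical per-API gate: counts in-flight database work and starts
/// refusing once the count has stayed over `limit` for longer than `max_over`.
pub struct ConcurrencyGate {
    limit: usize,
    max_over: Duration,
    state: Mutex<State>,
}

/// Held for the duration of one request; dropping it decrements the count.
pub struct Permit<'a> {
    gate: &'a ConcurrencyGate,
}

impl ConcurrencyGate {
    pub fn new(limit: usize, max_over: Duration) -> Self {
        let state = Mutex::new(State { in_flight: 0, over_since: None });
        Self { limit, max_over, state }
    }

    pub fn try_enter(&self) -> Result<Permit<'_>, BusyError> {
        self.try_enter_at(Instant::now())
    }

    /// Same as `try_enter`, with the clock passed in for testing.
    pub fn try_enter_at(&self, now: Instant) -> Result<Permit<'_>, BusyError> {
        let mut state = self.state.lock().unwrap();
        let would_be = state.in_flight + 1;
        if would_be > self.limit {
            // Record the timestamp the first time we go over the limit.
            let since = *state.over_since.get_or_insert(now);
            if now.saturating_duration_since(since) > self.max_over {
                return Err(BusyError);
            }
        }
        state.in_flight = would_be;
        Ok(Permit { gate: self })
    }

    pub fn in_flight(&self) -> usize {
        self.state.lock().unwrap().in_flight
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        let mut state = self.gate.state.lock().unwrap();
        state.in_flight -= 1;
        // Back at or under the limit: forget the high-water timestamp.
        if state.in_flight <= self.gate.limit {
            state.over_since = None;
        }
    }
}
```

The idea would be one gate per entry point (federation, RSS, API), created with something like `ConcurrencyGate::new(5, Duration::from_secs(60))`, with `try_enter()` called before touching the database. A single mutex keeps the count and the timestamp consistent with each other; whether that lock is cheap enough under real load is something I have not measured.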

Is Prometheus the right way to expose these numbers for operators who want to know about the thresholds? I’d probably also add a dedicated log file to track concurrency thresholds and busy errors.

The front-end apps also need to be caching “Trending communities”. I think lemmy-ui is still pulling that live from PostgreSQL on every refresh of the page. I need to check if anyone has added that.
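To show the kind of caching I mean, here is a minimal time-to-live cache sketch. It is written in Rust only to match the rest of this post; lemmy-ui itself is a separate front-end, and `TtlCache` and `get_or_refresh_at` are hypothetical names, not anything that exists in Lemmy. The fetch closure stands in for whatever query produces the trending list.

```rust
use std::time::{Duration, Instant};

/// Hypothetical single-value cache: keeps one result and reuses it
/// until it is older than `ttl`.
pub struct TtlCache<T> {
    ttl: Duration,
    entry: Option<(Instant, T)>,
}

impl<T: Clone> TtlCache<T> {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entry: None }
    }

    /// Returns the cached value if it is still fresh at `now`;
    /// otherwise calls `fetch`, stores the result, and returns it.
    pub fn get_or_refresh_at(&mut self, now: Instant, fetch: impl FnOnce() -> T) -> T {
        if let Some((stored_at, value)) = &self.entry {
            if now.saturating_duration_since(*stored_at) < self.ttl {
                return value.clone();
            }
        }
        let value = fetch();
        self.entry = Some((now, value.clone()));
        value
    }
}
```

With a TTL of a minute or so, a burst of page refreshes would cost one database query instead of one per refresh, at the price of the list being slightly stale.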

  • RoundSparrowOP
    1 year ago

    So, some work to do:

    1. Rework the testing scripts so that they don’t actually delete data on each run.
    2. Can I use a bash script to capture pg_stat_statements between individual tests?