I have been working with pg_stat_statements extension to PG and it give us a way to see the actual SQL statements being executed by lemmy_server and the number of times they are being called.

This has less overhead than cranking up logging and several cloud computing services enable it by default (example) - so I don’t believe it will have a significant slow down of the server.

A DATABASE RESTART WILL BE REQUIRED

It does require that PostgreSQL be restarted. Which can take 10 or 15 seconds, typically.

Debian / Ubuntu install steps

https://pganalyze.com/docs/install/self_managed/02_enable_pg_stat_statements_deb

Following the conventions of “Lemmy from Scratch” server install commands:

sudo -iu postgres psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';"

Followed by a restart of the PostgreSQL service.

  • RoundSparrowOPM
    link
    fedilink
    arrow-up
    4
    ·
    edit-2
    2 years ago

    Once the extension is installed and enabled, you can reset the stats whenever you want and it will start to build a new list of queries and I assume reset the average execution time.

    I found a curious query that seems to take 10 or 11 seconds on my server:

    SELECT "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "person_aggregates"."id", "person_aggregates"."person_id", "person_aggregates"."post_count", "person_aggregates"."post_score", "person_aggregates"."comment_count", "person_aggregates"."comment_score" FROM ("person" INNER JOIN "person_aggregates" ON ("person_aggregates"."person_id" = "person"."id"))
    WHERE (("person"."admin" = $1) AND ("person"."deleted" = $2)) ORDER BY "person"."published"
    

    This query is being called over 1000 times an hour with my instance only really doing incoming Federation content. Should this be optimized by doing a subSELECT on the WHERE clause before doing the INNER JOIN?

    • RoundSparrowOPM
      link
      fedilink
      arrow-up
      1
      ·
      1 year ago

      IMPORTANT NOTE: I am rusty on my PostgreSQL, it’s been over 10 years since I designed and ran a major time-critical website with it. That is not “SECONDS”, it is “MILLISECONDS”, but it is still a slow query being called frequently.