Lemmy.world shared a slow query, in detail! Yey!
SELECT post . id, post . name, post . url, post . body, post . creator_id, post . community_id, post . removed, post . locked, post . published, post . updated, post . deleted, post . nsfw, post . embed_title, post . embed_description, post . thumbnail_url, post . ap_id, post . local, post . embed_video_url, post . language_id, post . featured_community, post . featured_local, person . id, person . name, person . display_name, person . avatar, person . banned, person . published, person . updated, person . actor_id, person . bio, person . local, person . private_key, person . public_key, person . last_refreshed_at, person . banner, person . deleted, person . inbox_url, person . shared_inbox_url, person . matrix_user_id, person . admin, person . bot_account, person . ban_expires, person . instance_id, community . id, community . name, community . title, community . description, community . removed, community . published, community . updated, community . deleted, community . nsfw, community . actor_id, community . local, community . private_key, community . public_key, community . last_refreshed_at, community . icon, community . banner, community . followers_url, community . inbox_url, community . shared_inbox_url, community . hidden, community . posting_restricted_to_mods, community . instance_id, community . moderators_url, community . featured_url, community_person_ban . id, community_person_ban . community_id, community_person_ban . person_id, community_person_ban . published, community_person_ban . expires, post_aggregates . id, post_aggregates . post_id, post_aggregates . comments, post_aggregates . score, post_aggregates . upvotes, post_aggregates . downvotes, post_aggregates . published, post_aggregates . newest_comment_time_necro, post_aggregates . newest_comment_time, post_aggregates . featured_community, post_aggregates . featured_local, post_aggregates . hot_rank, post_aggregates . hot_rank_active, post_aggregates . community_id, post_aggregates . creator_id, community_follower . id, community_follower . community_id, community_follower . person_id, community_follower . published, community_follower . pending, post_saved . id, post_saved . post_id, post_saved . person_id, post_saved . published, post_read . id, post_read . post_id, post_read . person_id, post_read . published, person_block . id, person_block . person_id, person_block . target_id, person_block . published, post_like . score, coalesce ( ( post_aggregates . comments - person_post_aggregates . read_comments ), post_aggregates . comments ) FROM ( ( ( ( ( ( ( ( ( ( ( ( post_aggregates INNER JOIN person ON ( post_aggregates . creator_id = ? . id ) ) INNER JOIN post ON ( post_aggregates . post_id = ? . id ) ) INNER JOIN community ON ( post_aggregates . community_id = ? . id ) ) LEFT OUTER JOIN community_person_ban ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_person_ban . person_id = ? . creator_id ) ) ) LEFT OUTER JOIN community_follower ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_follower . person_id = ? ) ) ) LEFT OUTER JOIN post_saved ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_saved . person_id = ? ) ) ) LEFT OUTER JOIN post_read ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_read . person_id = ? ) ) ) LEFT OUTER JOIN person_block ON ( ( post_aggregates . creator_id = ? . target_id ) AND ( person_block . person_id = ? ) ) ) LEFT OUTER JOIN community_block ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_block . person_id = ? ) ) ) LEFT OUTER JOIN post_like ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_like . person_id = ? ) ) ) LEFT OUTER JOIN person_post_aggregates ON ( ( post_aggregates . post_id = ? . post_id ) AND ( person_post_aggregates . person_id = ? ) ) ) LEFT OUTER JOIN local_user_language ON ( ( post . language_id = ? . language_id ) AND ( local_user_language . local_user_id = ? ) ) ) WHERE ( ( ( ( ( ( ( ( community . removed = ? ) AND ( community . deleted = ? ) ) AND ( post . removed = ? ) ) AND ( post . deleted = ? ) ) AND ( community . local = ? ) ) AND ( ( community . hidden = ? ) OR ( community_follower . person_id = ? ) ) ) AND ( post . nsfw = ? ) ) AND ( community . nsfw = ? ) ) ORDER BY post_aggregates . featured_local DESC, post_aggregates . hot_rank DESC, post_aggregates . published DESC LIMIT ? OFFSET ?
!lemmyscale@lemmy.world discovered today, but no posts…
SELECT post . id, post . name, post . url, post . body, post . creator_id, post . community_id, post . removed, post . locked, post . published, post . updated, post . deleted, post . nsfw, post . embed_title, post . embed_description, post . thumbnail_url, post . ap_id, post . local, post . embed_video_url, post . language_id, post . featured_community, post . featured_local, person . id, person . name, person . display_name, person . avatar, person . banned, person . published, person . updated, person . actor_id, person . bio, person . local, person . private_key, person . public_key, person . last_refreshed_at, person . banner, person . deleted, person . inbox_url, person . shared_inbox_url, person . matrix_user_id, person . admin, person . bot_account, person . ban_expires, person . instance_id, community . id, community . name, community . title, community . description, community . removed, community . published, community . updated, community . deleted, community . nsfw, community . actor_id, community . local, community . private_key, community . public_key, community . last_refreshed_at, community . icon, community . banner, community . followers_url, community . inbox_url, community . shared_inbox_url, community . hidden, community . posting_restricted_to_mods, community . instance_id, community . moderators_url, community . featured_url, community_person_ban . id, community_person_ban . community_id, community_person_ban . person_id, community_person_ban . published, community_person_ban . expires, post_aggregates . id, post_aggregates . post_id, post_aggregates . comments, post_aggregates . score, post_aggregates . upvotes, post_aggregates . downvotes, post_aggregates . published, post_aggregates . newest_comment_time_necro, post_aggregates . newest_comment_time, post_aggregates . featured_community, post_aggregates . featured_local, post_aggregates . hot_rank, post_aggregates . hot_rank_active, post_aggregates . community_id, post_aggregates . creator_id, community_follower . id, community_follower . community_id, community_follower . person_id, community_follower . published, community_follower . pending, post_saved . id, post_saved . post_id, post_saved . person_id, post_saved . published, post_read . id, post_read . post_id, post_read . person_id, post_read . published, person_block . id, person_block . person_id, person_block . target_id, person_block . published, post_like . score, coalesce ( ( post_aggregates . comments - person_post_aggregates . read_comments ), post_aggregates . comments ) FROM ( ( ( ( ( ( ( ( ( ( ( ( post_aggregates INNER JOIN person ON ( post_aggregates . creator_id = ? . id ) ) INNER JOIN post ON ( post_aggregates . post_id = ? . id ) ) INNER JOIN community ON ( post_aggregates . community_id = ? . id ) ) LEFT OUTER JOIN community_person_ban ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_person_ban . person_id = ? . creator_id ) ) ) LEFT OUTER JOIN community_follower ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_follower . person_id = ? ) ) ) LEFT OUTER JOIN post_saved ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_saved . person_id = ? ) ) ) LEFT OUTER JOIN post_read ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_read . person_id = ? ) ) ) LEFT OUTER JOIN person_block ON ( ( post_aggregates . creator_id = ? . target_id ) AND ( person_block . person_id = ? ) ) ) LEFT OUTER JOIN community_block ON ( ( post_aggregates . community_id = ? . community_id ) AND ( community_block . person_id = ? ) ) ) LEFT OUTER JOIN post_like ON ( ( post_aggregates . post_id = ? . post_id ) AND ( post_like . person_id = ? ) ) ) LEFT OUTER JOIN person_post_aggregates ON ( ( post_aggregates . post_id = ? . post_id ) AND ( person_post_aggregates . person_id = ? ) ) ) LEFT OUTER JOIN local_user_language ON ( ( post . language_id = ? . language_id ) AND ( local_user_language . local_user_id = ? ) ) ) WHERE ( ( ( ( ( ( ( ( community . removed = ? ) AND ( community . deleted = ? ) ) AND ( post . removed = ? ) ) AND ( post . deleted = ? ) ) AND ( community . local = ? ) ) AND ( ( community . hidden = ? ) OR ( community_follower . person_id = ? ) ) ) AND ( post . nsfw = ? ) ) AND ( community . nsfw = ? ) ) ORDER BY post_aggregates . featured_local DESC, post_aggregates . hot_rank DESC, post_aggregates . published DESC LIMIT ? OFFSET ?
I put a new post in !mediascale@lemmy.ml
Oblivious unpleasant answer to scaling
I’m surprised lemmy.world hasn’t brought up one of the issues…
Maybe they just keep overlooking. I don’t think it is the number of users that is as high as the amount of data.
Archiving off PostgreSQL older data could help. Lemmy was designed with live-websockets and instant notification of every new comment… which worked fine with very little data in PostgreSQL.
Lemmy.world database keeps growing. A quick-fix is to devise a way to move votes/posts/comments out of the tables say 45 days older. Even archive out person records once those posts associated with that person are moved out.
The “dirty little secret” is that smaller instances people started 30 days ago run faster because THEY HAVE NO DATA in PostgreSQL!
I’d rather we get specific AUTO_EXPLAIN details out of Lemmy.world, but it is maybe EASIER than it is getting people to fix Rust SQL ORM statements.
People seem to have been given the impression that a full Lemmy instance is kind of like a “super client” to read content from.
What is really needed is an API caching layer… off-loading from lemmy.world and other overloaded servers would be best if API clients didn’t download the same content over and over.
I don’t mention it as often in public, but lemmy.world is unable to keep up with database size. Errors are common.
Posting users have made about less content on Lemmy platform due to lemmy.world overloads: https://lemmy.world/post/2856311
better shot where I didn’t cover up 18 second response time:
Also, a person can block a community and also subscribe to a community, and the query looks at these independent. ToDo: add testing code to confirm API behavior when a user blocks a community they are subscribed to and vise-versa.
A denormalized table for person community_id to match multiple states could be of benefit. Especially in edge cases where a particular login account has a lot of blocks or subscribes to filter out posts and comments from.
API returning deleted comments
For anonymous users, has lemmy_server main branch diverged in behavior from 0.18 branch? Need to review testing scripts and run against both.
I don’t like the JOIN approach lemmy uses. They may as well denormalize it all into one table with 200 fields for a post
Back when I did SQL with DB2 and PostgreSQL and hardware was far more limited… I always thought defensively.
If you have 11,000 posts in a community and you are requesting New/Hot/Active - do an IN clause that whacks them off at the pass.
especially since most people are focused on hitting page 1 and maybe page 2 and 3. You gotta slice that data down. The problem exists on New, so there is no aging out with Hot/Active.
Open-ended queries with LIMIT and JOIN can tangle up. Especially since post-aggregates id isn’t post_id?
What exactly is the relationship of post_aggregates and post?
aggregates id: 503771 to post id 2133833. Both tables have 502204 records in them, equal. But why is post ID incrementing so far ahead? Is federation pounding on the index with failed INSERT statements?
CouldntCreatePost can be returned even on UPDATE of a created post. Is this wrapped in transaction?
What exactly is going on with federated code and CouldntCreatePost?