Some learning about JOIN filled queries like the SELECT that is used to list post_aggregates table when browsing postings on Lemmy.
https://dba.stackexchange.com/questions/155972/postgres-join-conditions-vs-where-conditions
“Logically, it makes no difference at all whether you place conditions in the join clause of an INNER JOIN or the WHERE clause of the same SELECT. The effect is the same. (Not the case for OUTER JOIN!)”
Learning about join_collapse_limit
Joins after Picking post list
Research:
More on LATERAL, including LIMIT before JOIN:
https://stackoverflow.com/questions/28550679/what-is-the-difference-between-a-lateral-join-and-a-subquery-in-postgresql
https://dba.stackexchange.com/questions/229214/how-to-apply-outer-limit-offset-and-filters-in-the-subquery-to-avoid-grouping-ovThe order of JOIN matters, and I don’t think Lemmy’s Rust code evolved at all with that in mind…
And the difference between logged-in JOIN vs. anonymous logged-out JOIN count when the Rust code builds the query…
Rust Diesel at least has a concept of session parameters, but I haven’t found how you can set your own…
https://docs.diesel.rs/master/diesel/result/enum.ConnectionError.html
ok, I created post_view_anonymous.rs and a new API path for listing posts anonymous. I stripped out every bit of code that referenced the logged-in user ID.
HTTP request{http.method=GET http.scheme=“http” http.host=lemmy-alpha:8541 http.target=/api/v3/post/listanon otel.kind=“server” request_id=0ab40fcb-81e1-4839-96c5-5a60b2f57176}:list_posts_anonymous{data=
Query(GetPosts { type_: None, sort: None, page: None, limit: None, community_id: None, community_name: None, saved_only: None, liked_only: None, disliked_only: None, moderator_view: None, auth: None })}: lemmy_db_views::post_view_anonymous: Post View Anon Query:
Query { sql: “SELECT "post"."id", "post"."name", "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."thumbnail_url", "post"."ap_id", "post"."local", "post"."embed_video_url", "post"."language_id", "post"."featured_community", "post"."featured_local", "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."private_key", "person"."public_key", "person"."last_refreshed_at", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "community"."id", "community"."name", "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted", "community"."nsfw", "community"."actor_id", "community"."local", "community"."private_key", "community"."public_key", "community"."last_refreshed_at", "community"."icon", "community"."banner", "community"."followers_url", "community"."inbox_url", "community"."shared_inbox_url", "community"."hidden", "community"."posting_restricted_to_mods", "community"."instance_id", "community"."moderators_url", "community"."featured_url", ("community_person_ban"."id" IS NOT NULL), "post_aggregates"."id", "post_aggregates"."post_id", "post_aggregates"."comments", "post_aggregates"."score", "post_aggregates"."upvotes", "post_aggregates"."downvotes", "post_aggregates"."published", "post_aggregates"."newest_comment_time_necro", "post_aggregates"."newest_comment_time", "post_aggregates"."featured_community", "post_aggregates"."featured_local", "post_aggregates"."hot_rank", "post_aggregates"."hot_rank_active", "post_aggregates"."community_id", "post_aggregates"."creator_id", "post_aggregates"."controversy_rank" FROM (((("post_aggregates" INNER JOIN "person" ON ("post_aggregates"."creator_id" = "person"."id")) INNER JOIN "community" ON ("post_aggregates"."community_id" = "community"."id")) LEFT OUTER JOIN "community_person_ban" ON (("post_aggregates"."community_id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "post_aggregates"."creator_id"))) INNER JOIN "post" ON ("post_aggregates"."post_id" = "post"."id")) WHERE (((((("community"."deleted" = $1) AND ("post"."deleted" = $2)) AND ("community"."removed" = $3)) AND ("post"."removed" = $4)) AND ("post"."nsfw" = $5)) AND ("community"."nsfw" = $6)) ORDER BY "post_aggregates"."featured_local" DESC , "post_aggregates"."hot_rank" DESC , "post_aggregates"."published" DESC LIMIT $7 OFFSET $8”,
binds: [false, false, false, false, false, false, 10, 0] }
I’m studying: LEFT OUTER JOIN “community_person_ban” ON ((“post_aggregates”.“community_id” = “community_person_ban”.“community_id”) AND (“community_person_ban”.“person_id” = “post_aggregates”.“creator_id”))) … and the SELECT field has (“community_person_ban”.“id” IS NOT NULL)
I’m concerned a ban-happy community ends up having thousands of rows of people they have banned, and every SELECT to list posts has to process that.
Revised Rust code…
HTTP request{http.method=GET http.scheme=“http” http.host=lemmy-alpha:8541 http.target=/api/v3/post/listanon otel.kind=“server” request_id=65f254ad-0569-4126-823a-620477fa73bc}:list_posts_anonymous{data=Query(GetPosts { type_: None, sort: None, page: None, limit: None, community_id: None, community_name: None, saved_only: None, liked_only: None, disliked_only: None, moderator_view: None, auth: None })}:
lemmy_db_views::post_view_anonymous: Post View Anon Query:
Query { sql: “SELECT "post"."id", "post"."name", "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."thumbnail_url", "post"."ap_id", "post"."local", "post"."embed_video_url", "post"."language_id", "post"."featured_community", "post"."featured_local", "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."private_key", "person"."public_key", "person"."last_refreshed_at", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "community"."id", "community"."name", "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted", "community"."nsfw", "community"."actor_id", "community"."local", "community"."private_key", "community"."public_key", "community"."last_refreshed_at", "community"."icon", "community"."banner", "community"."followers_url", "community"."inbox_url", "community"."shared_inbox_url", "community"."hidden", "community"."posting_restricted_to_mods", "community"."instance_id", "community"."moderators_url", "community"."featured_url", "post_aggregates"."id", "post_aggregates"."post_id", "post_aggregates"."comments", "post_aggregates"."score", "post_aggregates"."upvotes", "post_aggregates"."downvotes", "post_aggregates"."published", "post_aggregates"."newest_comment_time_necro", "post_aggregates"."newest_comment_time", "post_aggregates"."featured_community", "post_aggregates"."featured_local", "post_aggregates"."hot_rank", "post_aggregates"."hot_rank_active", "post_aggregates"."community_id", "post_aggregates"."creator_id", "post_aggregates"."controversy_rank" FROM ((("post_aggregates" INNER JOIN "person" ON ("post_aggregates"."creator_id" = "person"."id")) INNER JOIN "community" ON ("post_aggregates"."community_id" = "community"."id")) INNER JOIN "post" ON ("post_aggregates"."post_id" = "post"."id")) WHERE (((((("community"."deleted" = $1) AND ("post"."deleted" = $2)) AND ("community"."removed" = $3)) AND ("post"."removed" = $4)) AND ("post"."nsfw" = $5)) AND ("community"."nsfw" = $6)) ORDER BY "post_aggregates"."featured_local" DESC , "post_aggregates"."hot_rank" DESC , "post_aggregates"."published" DESC LIMIT $7 OFFSET $8”
, binds: [false, false, false, false, false, false, 10, 0] }
“I just discovered join_collapse_limit has been preventing the PostgreSQL planner from finding a much better join order. In my case, increasing the limit to 10 (from the default of 8) allowed the planner to improve search time from ~30 secs to ~1 ms, which is much more acceptable.”
Docs “By default, this variable is set the same as from_collapse_limit, which is appropriate for most uses.”
Could very well apply to Lemmy!!
I think 8 is a concern for Lemmy!
FROM (((((((((((((“post_aggregates”
INNER JOIN “person” ON (“post_aggregates”.“creator_id” = “person”.“id”))
INNER JOIN “community” ON (“post_aggregates”.“community_id” = “community”.“id”))
LEFT OUTER JOIN “community_person_ban” ON ((“post_aggregates”.“community_id” = “community_person_ban”.“community_id”) AND (“community_person_ban”.“person_id” = “post_aggregates”.“creator_id”))) INNER JOIN “post” ON (“post_aggregates”.“post_id” = “post”.“id”))
LEFT OUTER JOIN “community_follower” ON ((“post_aggregates”.“community_id” = “community_follower”.“community_id”) AND (“community_follower”.“person_id” = $1)))
LEFT OUTER JOIN “community_moderator” ON ((“post”.“community_id” = “community_moderator”.“community_id”) AND (“community_moderator”.“person_id” = $2)))
LEFT OUTER JOIN “post_saved” ON ((“post_aggregates”.“post_id” = “post_saved”.“post_id”) AND (“post_saved”.“person_id” = $3)))
LEFT OUTER JOIN “post_read” ON ((“post_aggregates”.“post_id” = “post_read”.“post_id”) AND (“post_read”.“person_id” = $4)))
LEFT OUTER JOIN “person_block” ON ((“post_aggregates”.“creator_id” = “person_block”.“target_id”) AND (“person_block”.“person_id” = $5)))
LEFT OUTER JOIN “post_like” ON ((“post_aggregates”.“post_id” = “post_like”.“post_id”) AND (“post_like”.“person_id” = $6)))
LEFT OUTER JOIN “person_post_aggregates” ON ((“post_aggregates”.“post_id” = “person_post_aggregates”.“post_id”) AND (“person_post_aggregates”.“person_id” = $7)))
LEFT OUTER JOIN “community_block” ON ((“post_aggregates”.“community_id” = “community_block”.“community_id”) AND (“community_block”.“person_id” = $8)))
LEFT OUTER JOIN “local_user_language” ON ((“post”.“language_id” = “local_user_language”.“language_id”) AND (“local_user_language”.“local_user_id” = $9)))
WHEREI’m thinking for Lemmy…
SET geqo = on; SET geqo_threshold = 12; SET from_collapse_limit = 15; SET join_collapse_limit = 15;