From personal experience, I know that Lemmy.ml, Beehaw.org, and Lemmy.world are performing very badly. So far, I have not been able to convince any of these big server operators to share their lemmy_server logs in bulk so we can see what is going on.
Tuning and testing is difficult for two reasons: 1) the less data you have, the faster Lemmy becomes, and the big servers have accumulated far more data than anyone else; 2) the less federation activity you have, the less likely you are to run into resource limits and timeout values, and these big servers have large numbers of peer servers subscribing to their communities.
Nevertheless, we need to do everything we can to try to help the project as a whole.
HTTP and Database Parameters
const FETCH_LIMIT_DEFAULT: i64 = 10;
pub const FETCH_LIMIT_MAX: i64 = 50;
const POOL_TIMEOUT: Option<Duration> = Some(Duration::from_secs(5));
https://github.com/LemmyNet/lemmy/blob/0f91759e4d1f7092ae23302ccb6426250a07dab2/src/lib.rs#L39
/// Max timeout for http requests
pub(crate) const REQWEST_TIMEOUT: Duration = Duration::from_secs(10);
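A minimal sketch of how a constant like this is typically wired into a reqwest client (the endpoint URL below is just an example, and this is not a claim about Lemmy's actual client setup); reqwest's is_timeout() is what lets us tell slow peers apart from other failures:

use std::time::Duration;

// Mirrors Lemmy's constant, for illustration.
const REQWEST_TIMEOUT: Duration = Duration::from_secs(10);

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::builder()
        .timeout(REQWEST_TIMEOUT) // covers the whole request, connect included
        .build()?;

    match client.get("https://lemmy.ml/api/v3/site").send().await {
        Ok(resp) => println!("status: {}", resp.status()),
        // On a slow or overloaded peer, this is the error we would want surfaced in logs.
        Err(e) if e.is_timeout() => eprintln!("request timed out after {REQWEST_TIMEOUT:?}: {e}"),
        Err(e) => eprintln!("request failed: {e}"),
    }
    Ok(())
}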
Note also that the Lemmy Rust code has a 5-second default PostgreSQL connection timeout for pooling and a default pool size of 5 connections. https://github.com/LemmyNet/lemmy/issues/3394
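To make those two knobs concrete, here is a generic diesel + r2d2 pool sketch using the same numbers. This is an illustration only, not Lemmy's actual pool code (which, at the referenced commit, I believe uses diesel-async pooling instead):

use std::time::Duration;
use diesel::pg::PgConnection;
use diesel::r2d2::{ConnectionManager, Pool};

/// Build a pool with the two parameters discussed above:
/// at most 5 connections, 5-second checkout timeout.
fn build_pool(db_url: &str) -> Pool<ConnectionManager<PgConnection>> {
    let manager = ConnectionManager::<PgConnection>::new(db_url);
    Pool::builder()
        .max_size(5)                                // default pool size
        .connection_timeout(Duration::from_secs(5)) // POOL_TIMEOUT equivalent
        .build(manager)
        .expect("failed to build connection pool")
}

Under load, every request beyond 5 concurrent database users waits in line; if its wait exceeds 5 seconds, the checkout returns an error rather than blocking forever, and that error is exactly the event we need logged.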
lemmy_server behavior
What exactly gets logged by the Rust code when these values are too low? Could we run a less important (testing) server with these values set to just 1 and watch what gets logged, so we can tell server operators what to grep their logs for?
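One way to answer this without guessing at the existing log messages is to add our own grep-able instrumentation around pool checkout. A sketch, assuming a diesel/r2d2-style pool; the DB_POOL_CHECKOUT_FAILED token is made up, chosen precisely so operators would know what to grep for:

use diesel::pg::PgConnection;
use diesel::r2d2::{ConnectionManager, Pool, PooledConnection};
use tracing::warn;

type PgPool = Pool<ConnectionManager<PgConnection>>;
type PgPooled = PooledConnection<ConnectionManager<PgConnection>>;

/// Checkout wrapper that emits a fixed, grep-able token when the pool
/// is exhausted or the checkout timeout fires.
fn get_conn_logged(pool: &PgPool) -> Option<PgPooled> {
    match pool.get() {
        Ok(conn) => Some(conn),
        Err(e) => {
            // r2d2's checkout timeout displays as "timed out waiting for connection"
            warn!("DB_POOL_CHECKOUT_FAILED: {e}");
            None
        }
    }
}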
What are the symptoms?
What can we do to notify server operators that this is happening? Obviously, when the database itself is the constrained resource, using a database table to increment an error count could run into the very overload problems we are trying to detect. Could we open a connection to the database server with higher timeouts, outside the connection pool, writing to a dedicated table (with no locks), and have the error-handling logic record a timestamp and count whenever these resource limits are hit in production?
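A sketch of what that could look like; the resource_limit_hits table is hypothetical (not part of Lemmy's schema), and an UNLOGGED, insert-only table avoids both WAL overhead and the row-lock contention that an UPDATE-based counter would create:

use diesel::pg::PgConnection;
use diesel::prelude::*;
use diesel::sql_query;
use diesel::sql_types::Text;

// Hypothetical table, created once by hand:
//   CREATE UNLOGGED TABLE resource_limit_hits (
//       hit_at timestamptz NOT NULL DEFAULT now(),
//       kind   text NOT NULL
//   );

/// Record one resource-limit hit. Append-only inserts mean no row locks
/// to contend on; the timestamp plus kind give us count-over-time for free.
fn record_limit_hit(conn: &mut PgConnection, kind: &str) -> QueryResult<usize> {
    sql_query("INSERT INTO resource_limit_hits (kind) VALUES ($1)")
        .bind::<Text, _>(kind)
        .execute(conn)
}

fn main() {
    // A dedicated connection established outside the pool, so reporting
    // still works even when the pool itself is exhausted.
    let mut conn = PgConnection::establish("postgres://lemmy@localhost/lemmy")
        .expect("dedicated reporting connection");
    record_limit_hit(&mut conn, "pool_checkout_timeout").unwrap();
}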
Another code value we need tuning/behavior references on:
pub const FEDERATION_HTTP_FETCH_LIMIT: u32 = 50;
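My reading, not confirmed against the code, is that this constant acts as a per-incoming-activity budget on recursive ActivityPub object fetches. If so, the pattern would be roughly:

use std::sync::atomic::{AtomicU32, Ordering};

const FEDERATION_HTTP_FETCH_LIMIT: u32 = 50;

/// Per-activity counter: every outbound fetch spends one unit of the
/// budget; exceeding it aborts processing of that activity.
struct FetchBudget(AtomicU32);

impl FetchBudget {
    fn new() -> Self {
        Self(AtomicU32::new(0))
    }

    fn spend(&self) -> Result<(), &'static str> {
        if self.0.fetch_add(1, Ordering::SeqCst) >= FEDERATION_HTTP_FETCH_LIMIT {
            Err("federation HTTP fetch limit reached")
        } else {
            Ok(())
        }
    }
}

fn main() {
    let budget = FetchBudget::new();
    for i in 0..52 {
        if budget.spend().is_err() {
            eprintln!("aborting at fetch #{i}");
            break;
        }
    }
}

If big servers routinely hit this ceiling during normal federation, that would be another place where behavior references from production logs would help.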