cross-posted from: https://beehaw.org/post/282116

We’ve posted a number of times about our increasing storage issues. We’re currently on the cusp of using 80% of the 25 GB available in our current tier of the hosting service this instance runs on. This has caused some server crashes in recent days.

We’ve been monitoring and reporting on this progress occasionally, including support requests and comments on the main lemmy instance. Of particular note, it seems that pictures tend to be the culprit when it comes to storage issues.
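
To confirm that pictures really are the culprit, the space used by each directory can be measured directly. The following is a minimal sketch, not part of the original post; the function names are made up for illustration, and it assumes all of the instance's data directories (pict-rs, database, and so on) sit under one parent folder.

```python
import os

def dir_size_bytes(path):
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total

def largest_subdirs(parent, top=5):
    """Return (name, size_in_bytes) for the biggest immediate subdirectories."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, dir_size_bytes(entry.path)))
    # Largest first, so the main storage consumer is at the top.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]
```

Calling `largest_subdirs()` on the folder that holds the instance's data (the exact path depends on how Lemmy was installed) should show whether the pict-rs directory dominates.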

The last time a discussion around pict-rs came up, the following comment stuck out to me as a potential solution:

“Storage requirements depend entirely on the amount of images that users upload. In case of slrpnk.net, there are currently 1.6 GB of pictrs data. You can also use s3 storage, or something like sshfs to mount remote storage.”

Is there anyone around who is technically proficient enough to help guide us through potential solutions using “something like sshfs” to mount remote storage? As it stands, our only feasible option seems to be upgrading from $6/month to $12/month to double our storage capacity (25 GB -> 50 GB), which seems like an undesirable solution.

  • nutomicA
    ↑ 7 · 2 years ago

    If you search for sshfs you can find lots of different guides, like the one below. Basically, you need a normal ssh login on another server, and you use that with the sshfs command to mount a remote folder onto the pictrs folder on your Lemmy server.

    https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh
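
As a rough illustration of the mount step described above, here is a minimal sketch that only builds the sshfs command line. The user name, host, and both paths are placeholders invented for the example, not values from this thread; the real pictrs folder location depends on the Lemmy installation.

```python
import shlex

def sshfs_mount_command(user, host, remote_dir, mount_point, reconnect=True):
    """Build the argv list for mounting remote_dir at mount_point via sshfs."""
    argv = ["sshfs", f"{user}@{host}:{remote_dir}", mount_point]
    if reconnect:
        # The "reconnect" option asks sshfs to re-establish a dropped connection.
        argv += ["-o", "reconnect"]
    return argv

# Hypothetical values: replace with the real storage server and pictrs folder.
cmd = sshfs_mount_command("pictrs", "storage.example.org", "/srv/pictrs", "/mnt/pictrs")
print(shlex.join(cmd))
```

The printed command can be reviewed and run by hand, or the list can be passed to `subprocess.run(cmd, check=True)` once the ssh login is known to work.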

    If this setup is slow, you can set up caching in nginx for image files on your fast server SSD. That way only a fixed amount of local storage is used, holding the frequently loaded images. Only images which are older or viewed less frequently will be slower to load, as they need to be fetched from the remote server.

    https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/
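
The caching idea above (a fixed local budget, with the least recently used images evicted first) can be illustrated in a few lines. This is only a conceptual sketch with invented names; in practice nginx does this through its cache configuration rather than through code like this.

```python
from collections import OrderedDict

class SizeBoundedCache:
    """Keep recently used images locally, up to a fixed number of bytes."""

    def __init__(self, max_bytes, fetch):
        self.max_bytes = max_bytes
        self.fetch = fetch          # slow lookup, e.g. a read from remote storage
        self.used = 0
        self.items = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        if key in self.items:
            # Cache hit: mark the entry as most recently used.
            self.items.move_to_end(key)
            return self.items[key]
        data = self.fetch(key)      # cache miss: go to the slow storage
        if len(data) <= self.max_bytes:
            self.items[key] = data
            self.used += len(data)
            # Evict the least recently used entries until the budget fits again.
            while self.used > self.max_bytes:
                _old_key, old = self.items.popitem(last=False)
                self.used -= len(old)
        return data
```

Frequently requested images stay in the fast local store, while rarely viewed ones fall out and are fetched from the remote server again on the next request.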

  • poVoq@slrpnk.net
    ↑ 4 · edited · 2 years ago

    Has Pict-rs implemented those changes to reduce image size already? My guess would be that maybe it is sufficient to just prune older large images with a script?

    Edit: looks like Pict-rs 4.0 is still in beta, but it should probably fix this issue to some extent, so only older images would need to be pruned or down-scaled.
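
A pruning script along the lines suggested above could start by just listing candidates. The sketch below is an assumption-laden illustration: the function name and thresholds are invented, and it deliberately deletes nothing, since removing files behind pict-rs's back may leave posts pointing at images that no longer exist.

```python
import os
import time

def prune_candidates(root, min_age_days, min_size_bytes, now=None):
    """List (path, size) of files that are both old and large, biggest first."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # files modified before this are "old"
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            if info.st_mtime <= cutoff and info.st_size >= min_size_bytes:
                found.append((path, info.st_size))
    found.sort(key=lambda item: item[1], reverse=True)
    return found
```

For example, `prune_candidates(pictrs_dir, 365, 2 * 1024 * 1024)` would list files older than a year and larger than 2 MiB, where `pictrs_dir` stands for wherever the instance keeps its image files.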

  • wintermute@feddit.de
    Deutsch · ↑ 3 · 2 years ago

    I saved a few gigabytes by capping the system log (journal) files, setting the following in /etc/systemd/journald.conf:

    SystemMaxUse=500M
    SystemMaxFileSize=50M

  • poVoq@slrpnk.net
    ↑ 3 · edited · 2 years ago

    Besides my practical comment below: this is not something I can code myself, but I have been thinking that if pict-rs implemented some sort of image deduplication system, it could be interesting for some Lemmy instances to form a data-storage collective and run a combined S3 backend, for example via Garage:

    https://garagehq.deuxfleurs.fr/documentation/connect/apps/#lemmy

    I am willing to contribute storage (I have several TB), but I am somewhat bandwidth limited, so I need to be a bit careful with hosting too many images to not impact the other services that I run on the same connection.

    • Gaywallet (they/it)@beehaw.org
      ↑ 3 · 2 years ago

      “I am willing to contribute storage (I have several TB), but I am somewhat bandwidth limited, so I need to be a bit careful with hosting too many images to not impact the other services that I run on the same connection.”

      How would you accomplish this? I have plenty of bandwidth and plenty of storage that I could section off as a possible solution (hell, even buying a Raspberry Pi and an old hard drive wouldn’t be all that expensive, and potentially a fun project), but I really don’t have any idea how to connect this to the Lemmy instance.

      • poVoq@slrpnk.net
        ↑ 2 · edited · 2 years ago

        See the link above. You can configure Pict-rs 4.0 (beta, unreleased) to use an S3-compatible storage backend. The Garage project is an S3-compatible storage system aimed especially at distributed self-hosters, but latency caveats aside, something like Minio would probably also work.

        S3 storage makes it possible to redirect users directly to a storage location (with a public IP), instead of the main server loading images from that storage location and serving them to the user itself. This is similar to how a CDN works.
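
The redirect idea described above can be sketched as follows. This only illustrates the concept and is not how pict-rs is confirmed to behave; the function name, base URL, and object key are all made up for the example.

```python
from urllib.parse import quote

def redirect_to_storage(storage_base_url, object_key):
    """Return an HTTP status and headers that send the client to the storage host."""
    # Percent-encode the key so spaces and similar characters survive in the URL.
    location = storage_base_url.rstrip("/") + "/" + quote(object_key)
    return 302, {"Location": location}

# Hypothetical public storage endpoint and image key.
status, headers = redirect_to_storage("https://images.example.org/pictrs/", "ab/cd/photo 1.jpg")
print(status, headers["Location"])
```

With a response like this, the main server only answers with a small redirect, and the image bytes travel from the storage host straight to the user.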

        • Gaywallet (they/it)@beehaw.org
          ↑ 1 · 2 years ago

          Is there any way to do this and avoid having to use S3? I don’t want a surprise bill from Amazon because we exceeded some thresholds they have on the free tier (nor do I want to have to make new free tiers every 12 months).

            • Gaywallet (they/it)@beehaw.org
              English · ↑ 1 · 2 years ago

              Okay, so I need to be sure I have something that can make sense of S3 calls to storage. I feel like we’re getting closer; it’s just still way out of my own technological depth.

              • poVoq@slrpnk.net
                ↑ 3 · edited · 2 years ago

                Pict-rs, which Lemmy uses for image storage, will be able to make S3-compatible storage API calls in the upcoming 4.0 version.

                There are also many self-hosted options that you can install on your server to provide an S3-compatible storage API. Probably the best-known open-source software for that is Minio. It is, however, meant more for data centers with fast, low-latency connections, or for use on a local network only.

                The Garage software mentioned above is unique in that it is specifically designed to work in the less-than-ideal networking conditions typical of self-hosted servers.

    • Gaywallet (they/it)@beehaw.org
      English · ↑ 1 · 2 years ago

      If it’s only used for images I’m not all that concerned… images not loading when the rest of the page loads really only matters when the focus of the post is a meme, and I’m not too concerned about those not loading.

      • Chris Remington@beehaw.org (OP)
        English · ↑ 3 · 2 years ago

        pict-rs, more than likely, handles all images on the site. That would include our personal icons, the community icons, community banners, our site logo, etc. Having all of that missing, or not loading properly, would make Beehaw look like shit. With decades of web development experience under my belt, I know for certain that people eat with their eyes.