Lemmy Pict-rs storage requirements?

I might take over one of these Lemmy instances that are hosted free for a year and run it on my own server infrastructure, but I have now read several times that Lemmy’s image hosting system, pict-rs, uses up a lot of storage quickly.

The server I could run this on is limited to 32 GB of SSD storage, with no easy way to expand it.

Is there some way to limit the image storage use and automatically prune old images that are not user or community icons or such?

Aode (He/They)
7 points, 3 months ago

pict-rs doesn’t keep track of how often it serves different images, so there’s not a good metric for pruning old images. That said, 0.4 will introduce functionality for cleaning up processed images (e.g. resizes/thumbnails), removing their files & metadata. If they are viewed again, they will be re-generated.

0.4 will also include the ability to scale down images on upload, rather than storing the original resolution. This is not yet implemented, but it’s on my roadmap.

All this said, it is already possible to use pict-rs with object storage (S3-compatible), rather than block storage. That’s a good option if your hosting provider offers it.

poVoq
4 points, 3 months ago

Actually, an S3-compatible interface might be interesting for linking pict-rs to Garage.

Aode (He/They)
2 points, 2 months ago

I am aware of garage, but haven’t tested it yet with pict-rs. It’s a cool project for sure

poVoq
creator
3 points, 3 months ago (edited)

That sounds promising. Any idea when 0.4 will be released?

Object-storage on large cloud providers is not an option for me for various reasons (privacy, legal etc.).

Aode (He/They)
4 points, 3 months ago

I can only say “when it’s ready.” I think most of what I want to include in 0.4 is there, but I don’t have a ton of time to work on it currently. I might see if I can get my last feature changes in this weekend; then it will be a matter of ensuring that the 0.3 -> 0.4 upgrade is smooth and that storage migration is solid.

Aode (He/They)
1 point, 2 months ago

Update on this: I got the feature work done this weekend, so now I’ll be testing it a bunch for upgrades and storage migrations

@Echedenyan
admin
1 point, 3 months ago

Is deduplication supported, i.e. are images already in storage re-used when newly uploaded images share the same hash?

Aode (He/They)
1 point, 2 months ago

Yes. It uses sha256 rather than perceptual hashing, but that’s Good Enough™️
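To make the idea concrete, here is a minimal sketch of hash-based deduplication in Python. It is not pict-rs's actual code (pict-rs is written in Rust); the `DedupStore` class, its `aliases` mapping, and the file layout are all made up for illustration. It only shows the general technique: name each stored blob after its SHA-256 digest, so a second upload of identical bytes finds the blob already present. A re-encoded or resized copy has different bytes and therefore a different digest, which is the case perceptual hashing would catch and this does not.

```python
import hashlib
import tempfile
from pathlib import Path


class DedupStore:
    """Toy content-addressed store: identical bytes are written only once."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        # Hypothetical bookkeeping: upload name -> SHA-256 hex digest.
        self.aliases = {}

    def put(self, name, data):
        """Record an upload; return True only if new bytes hit the disk."""
        digest = hashlib.sha256(data).hexdigest()
        self.aliases[name] = digest
        blob = self.root / digest
        if blob.exists():
            return False  # same content already stored, just add the alias
        blob.write_bytes(data)
        return True

    def get(self, name):
        """Read the bytes behind an upload name."""
        return (self.root / self.aliases[name]).read_bytes()


def demo():
    """Upload the same bytes twice and count the blobs on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        store = DedupStore(Path(tmp) / "blobs")
        first = store.put("cat.png", b"same bytes")
        second = store.put("cat-copy.png", b"same bytes")
        blob_count = len(list(store.root.iterdir()))
        return first, second, blob_count
```

In this sketch, `demo()` writes one blob for two uploads; both names resolve to the same file.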

@Echedenyan
admin
1 point, 2 months ago

Why not SHA-512 or SHA3?

Aode (He/They)
1 point, 2 months ago

I chose it at the start of the project 🤷

@Echedenyan
admin
1 point, 2 months ago

Maybe it would be worth making a smooth transition to one of these in the future? https://en.wikipedia.org/wiki/SHA-3#Comparison_of_SHA_functions

@nutomic
admin
4 points, 3 months ago

Storage requirements depend entirely on the number of images that users upload. In the case of slrpnk.net, there is currently 1.6 GB of pictrs data. You can also use S3 storage, or something like sshfs to mount remote storage.

poVoq
creator
3 points, 3 months ago (edited)

How much of that is cached from federated instances though? I can hardly imagine a low-traffic community like that uploaded 1.6GB of their own images already. If it is mostly cached then that can increase very quickly as new users subscribe to additional communities on other servers.

@nutomic
admin
4 points, 3 months ago

There is no caching, images from other instances are loaded directly from the remote server by your browser.

poVoq
creator
4 points, 3 months ago (edited)

I see, well, that is one risk less then. I guess with automatic down-scaling in pict-rs 0.4 it will be mostly solved, as there will not be a bunch of 5 MB direct uploads.

Edit: well, thumbnails at least are definitely cached, and larger images too; I just tested it on slrpnk.net.

Edit 2: odd, but not all of them. Something is strange… Ah, I think I know what is happening: actual user uploads do not get cached, but images from linked websites do, even if the origin is a federated instance. But those website images are usually quite well optimized.

Dessalines
mod
admin
2 points, 3 months ago

Now you know why there needs to be decentralized picture hosting that works for the web, in the same way torrents do for even larger data like video.

You have tons of servers hosting the exact same pictures needlessly while sharing none of the hosting costs.

poVoq
creator
4 points, 3 months ago

That was the original idea of IPFS, no? Just that they pivoted now to trying to sell you filecoins :(

Aode (He/They)
3 points, 3 months ago

I’m not against including an ipfs layer in pict-rs, but the complexity would go way up. Federating an image between lemmy servers would require sending the ipfs uri between servers via activitypub, and then each receiving server sending that uri to pict-rs. pict-rs would then need to decide, on each server, if the ipfs-stored image matches that server’s image requirements (max filesize, max dimensions, etc.), and if it does, then that pict-rs server would request to pin the image. I don’t know exactly how ipfs pinning works, but ideally it would only be stored locally if it isn’t already stored in at least N other locations. If the remote image doesn’t match the local server’s configuration, it could either be rejected or downloaded & processed (resized etc.).

Serving ipfs-stored images that aren’t replicated locally might also be slow, but I won’t know for sure unless I actually try building this out.
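The per-server decision described above could be sketched roughly like this in Python. To be clear, none of this exists in pict-rs, and nothing here talks to IPFS: `Limits`, `RemoteImage`, `Action`, and `decide` are hypothetical names, and the sketch simply assumes that the image's size, dimensions, and replica count are already known, which may not be something IPFS can actually report.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PIN = "pin"              # store a local copy and pin it
    REFERENCE = "reference"  # keep only the uri, enough replicas exist
    PROCESS = "process"      # download and resize to fit local limits
    REJECT = "reject"        # refuse the image


@dataclass(frozen=True)
class Limits:
    """Hypothetical per-server configuration."""
    max_bytes: int
    max_width: int
    max_height: int
    min_replicas: int        # the "N other locations" threshold
    process_oversized: bool  # resize instead of rejecting


@dataclass(frozen=True)
class RemoteImage:
    """Assumed-known facts about an ipfs-stored image."""
    size_bytes: int
    width: int
    height: int
    replicas: int


def decide(image: RemoteImage, limits: Limits) -> Action:
    """Pick what a receiving server does with a federated image uri."""
    fits = (
        image.size_bytes <= limits.max_bytes
        and image.width <= limits.max_width
        and image.height <= limits.max_height
    )
    if not fits:
        return Action.PROCESS if limits.process_oversized else Action.REJECT
    if image.replicas < limits.min_replicas:
        return Action.PIN
    return Action.REFERENCE
```

Even in this toy form, every server needs its own limits and its own view of replication, which is where the extra complexity comes from.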

Cold Hotman
2 points, 3 months ago

How would federated posts look if the original server went down? Just a 404 not found on the picture and the discussion left intact?

poVoq
creator
3 points, 3 months ago

This seems to be the case right now, yes.

@Echedenyan
admin
1 point, 2 months ago

The great thing is that deduplication would also be built in with the IPFS layer, apart from the other obvious advantages.

@Echedenyan
admin
1 point, 3 months ago

Oh, you said it too.

Dessalines
mod
admin
1 point, 3 months ago

Yeah I think so, but I have no idea how “trust” works in IPFS.

In torrents, you have to explicitly be seeding that torrent: if you don’t want to seed the file(s), you remove the torrent. With IPFS I think people can just throw whatever in there.

@Echedenyan
admin
2 points, 3 months ago

Pictrs over IPFS could be a good start.

@Gaywallet@beehaw.org
4 points, 3 months ago

We would be very interested in a better way to limit this as well. Some kind of age and size limits or automatic pruning would be wonderful.
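As an illustration of what age and size limits could mean, here is a rough Python sketch that only plans which files to remove and deletes nothing. It is not a pict-rs feature: `plan_prune` and its parameters are hypothetical, and because pict-rs does not track how often images are served, file modification time stands in for "old". Since pict-rs also keeps metadata about its files, deleting files behind its back would probably leave stale entries, so treat this purely as a sketch of the policy.

```python
import time
from pathlib import Path


def plan_prune(root, max_total_bytes, max_age_days, protected=frozenset(), now=None):
    """Return files to delete, oldest first; nothing is deleted here.

    A file is selected if it is older than max_age_days, or if the
    directory is still over max_total_bytes after removing older files.
    File names in `protected` (e.g. icons) are never selected.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    total = 0
    candidates = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        stat = path.stat()
        total += stat.st_size  # protected files still count toward the budget
        if path.name not in protected:
            candidates.append((stat.st_mtime, stat.st_size, path))

    candidates.sort(key=lambda item: item[0])  # oldest modification time first
    doomed = []
    for mtime, size, path in candidates:
        too_old = mtime < cutoff
        over_budget = total > max_total_bytes
        if not (too_old or over_budget):
            break  # every remaining file is newer and the budget is met
        doomed.append(path)
        total -= size
    return doomed
```

A caller would review the returned list and only then delete the files, for example with `path.unlink()`.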

Support / questions about Lemmy.
