I recently moved my files to a new ZFS pool and took the opportunity to properly configure my datasets.

This led me to discover ZFS deduplication.

Most of my storage is used by my Jellyfin library (~7-8 TB), which is mostly uncompressed Blu-ray rips, so I thought I might be able to save some space by using deduplication in addition to compression.
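
For context, what I have in mind is simply setting the dataset properties, roughly like this (the dataset name is a placeholder, and I have not settled on a compression algorithm yet):

    # Enable compression and dedup on the media dataset (names are made up)
    zfs set compression=zstd tank/media
    zfs set dedup=on tank/media

As far as I understand, both properties only affect data written after they are set, so existing files would need to be rewritten to benefit.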

Has anyone here used that for similar files before? What was your experience with it?

I am not too worried about performance. The dataset in question rarely changes, basically only when I add more media every couple of months. I also overshot my CPU target when originally configuring my server, so there is a lot of headroom there. I have 32 GB of RAM, which is not fully utilized either (and I would not mind upgrading to 64 GB too much).

My main concern is that I am unsure whether it is actually useful. I suspect that, just because of the sheer amount of data and the similarity in file type, there would statistically be a fair amount of block-level duplication, but I could not find any real-world data or experiences on that.
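
From what I have read, zdb can simulate the dedup table on an existing pool without actually enabling dedup, which should give a rough answer for my data. Something like this (pool and dataset names are placeholders, and I have not run it against the full pool yet):

    # Build a simulated dedup table and print an estimated dedup ratio
    # (this walks every block, so it can take a long time on ~8 TB)
    zdb -S tank

    # For comparison: current compression savings on the media dataset
    zfs get compressratio,used,logicalused tank/media

If the simulated ratio comes back close to 1.00x, I will take that as a sign that dedup is not worth it here.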

  • friend_of_satan@lemmy.world · 16 hours ago

    I was also going to link this. I started using zfs 10-ish years ago and used dedup when it came out, and it was really not worth it except for archiving a bunch of stuff I knew had gigs of duplicate data. Performance was so poor.