What's with all these hip filesystems and how are they different?

rutrum@lm.paradisus.day · edit-2 1 year ago

What's with all these hip filesystems and how are they different?

teawrecks@sopuli.xyz · 1 year ago

So let me give an example, and you tell me if I understand. If you change 1MB in the middle of a 1GB file, the filesystem is smart enough to only allocate a new 1MB chunk and update its tables to say “the first 0.5GB lives in the same old place, then 1MB is over here at this new location, and the remaining 0.5GB is still at the old location”?

If that’s how it works, would this over time result in a single file being spread out in different physical blocks all over the place? I assume sequential reads of a file stored contiguously would usually have better performance than random reads of a file stored all over the place, right? Maybe not for modern SSDs…but also fragmentation could become a problem, because now you have a bunch of random 1MB chunks that are free.

I know ZFS encourages regular “scrubs” that I thought just checked for data integrity, but maybe it also takes the opportunity to defrag and re-serialize? I also don’t know if the other filesystems have a similar operation.

d3Xt3r@lemmy.nz · edit-2 1 year ago

Not OP, but yes, that’s pretty much how it works. (ZFS scrubs do not defrgment data however).

Fragmentation isn’t really a problem for several reasons.

Some (most?) COW filesystems have mechanisms to mitigate fragmentation. ZFS, for instance, uses a special allocation strategy to minimize fragmentation and can reallocate data during certain operations like resilvering or rebalancing.
ZFS doesn’t even have a traditional defrag command. Because of its design and the way it handles file storage, a typical defrag process is not applicable or even necessary in the same way it is with other traditional filesystems
Btrfs too handles chunk allocation effeciently and generally doesn’t require defragmentation, and although it does have a defrag command, it’s almost never used by anyone, unless you have a special reason to (eg: maybe you have a program that is reading raw sectors of a file, and needs the data to be contiguous).
Fragmentation is only really an issue for spinning disks, however, that is no longer a concern for most spinning disk users because:
- Most home users who still have spinning disks use it for archival/long term storage/media that rarely changes (eg: photos, movies, other infrequently accessed data), so fragmentation rarely occurs here and even if it does, it’s not a concern.
- Power users typically have a DAS or NAS setup where spinning disks are in a RAID config with striping, so the spread of data across multiple sectors actually has an advantage for averaging out read times (so no file is completely stuck in the slow regions of a disk), but also, any performance loss is also generally negated because a single file can typically be read from two or more drives simultaneously, depending on the redundancy config.
Enterprise users also almost always use a RAID (or similar) setup, so the same as above applies. They also use filesystems like ZFS which employs heavy caching mechanisms, typically backed by SSDs/NVMes, so again, fragmentation isn’t really an issue.

teawrecks@sopuli.xyz · 1 year ago

Cool, good to know. I’d be interested to learn how they mitigate fragmentation, though. It’s not clear to me how COW could mitigate the copy cost without fragmentation, but I’m certain people smarter than me have been thinking about the problem for my whole life. I know spinning disks have their own set of limitations, but even SSDs perform better on sequential reads over random reads, so it seems like the preference would still be to not split a file up too much.

Atemu · 1 year ago

Because of its design and the way it handles file storage, a typical defrag process is not applicable or even necessary in the same way it is with other traditional filesystems

It absolutely is and it’s one of the biggest missing “features” in ZFS. If you use ZFS for performance-critical stuff, you have to make sure there’s always like >30% free space remaining because otherwise performance is likely to tank due to fragmentation.

When fragmentation happens to a significant degree (and it can happen even with a ton of free space), you’re fucked because data is sorta written in stone in ZFS. You have to re-create the entire dataset if it fragments too much. Pretty insane.

Btrfs too handles chunk allocation effeciently and generally doesn’t require defragmentation

Hahaha no. Fragmentation is the #1 performance issue in btrfs next to transaction syncs. You’re recommended to do frequent free-space defragmentation on btrfs using filtered balances to prevent it.

Additionally, in-file random writes cause a ton of fragmentation and kill performance in tasks which require it (VMs, DBs etc.).

it does have a defrag command, it’s almost never used by anyone

That’s absolutely false. As elaborated above, fragmentation is a big issue in btrfs. If you run a DB on CoW btrfs, you must manually defragment it to have any decent performance. (Autodefrag exists but it’s borderline broken.)

Additionally, changing transparent compression is also done via defragmentation.

Fragmentation is only really an issue for spinning disks

This is false; fragmentation slows down SSDs aswell. The only difference is that, with SSDs, random IO is usually just one order of magnitude slower than sequential rather than HDDs’ two or three. It doesn’t affect SSDs nearly as much as HDDs but it still affects them; significantly so.