WebDAV has been around a lot longer than object storage and does many of the same things. It also supports random-access reads and writes, whereas object storage requires you to download, edit, and re-upload the whole file. It seems like a no-brainer if you want to offer cloud storage to customers.

I thought maybe supporting large uploads was the draw, but WebDAV can support chunking, so you don’t need to allocate extra server resources to accommodate large files.

I use both daily, and WebDAV just seems like it does everything better: object storage feels like throwing files in a junk drawer and WebDAV more like an organized filing cabinet.

Aside from Nextcloud and a few FOSS applications, the only big thing I recall that adopted WebDAV was FrontPage back in the day.

So, what am I missing? What makes object storage so compelling that it became ubiquitous while WebDAV is practically a legacy spec?

  • hperrin@lemmy.world
    8 months ago

    To give you a real answer, from someone who loves WebDAV and has written a WebDAV server with an S3 backend, object storage is easier/possible to run at scale and serves a different purpose.

    Object storage is and always has been based on a key-value model. You put a key and value in, and later you can request that key to get that value. It technically has no concept of hierarchy. WebDAV supports so much more than that. WebDAV has collections (hierarchy), live and dead properties (S3 has something similar to these), methods like MOVE and PROPFIND, and a system of hierarchical locking (depth 1 locking on a collection and depth infinity locking on an entire namespace).

    This means that in order to build a WebDAV server, you need to know a lot of information about what exists in the data storage. S3 is a lot “dumber” in that regard. The funny thing is S3 has added functionality that essentially rewrites most of WebDAV in a more convoluted form. Whereas on WebDAV you can just send a PROPFIND to a collection with depth 1, on S3 you need to list keys with a prefix and delimiter, then make additional requests for any other props you may need.
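
    A minimal sketch of what “list keys with a prefix and delimiter” amounts to, using a plain Python dict as a stand-in for a bucket. This is not the real S3 API; `list_keys` and the sample keys are made up for illustration. The point is that the “folders” are only computed from the key strings at listing time.

```python
def list_keys(bucket, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) found directly under `prefix`.

    `bucket` is any mapping of key -> value; a hypothetical stand-in
    for a flat key-value store, not a real S3 client.
    """
    objects = []
    common_prefixes = set()
    for key in sorted(bucket):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Roll up everything past the first delimiter into one "folder" entry.
            folder = prefix + rest.split(delimiter, 1)[0] + delimiter
            common_prefixes.add(folder)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


bucket = {
    "photos/2023/a.jpg": b"...",
    "photos/2023/b.jpg": b"...",
    "photos/readme.txt": b"...",
    "notes.txt": b"...",
}
objects, folders = list_keys(bucket, prefix="photos/")
print(objects)  # keys directly under photos/
print(folders)  # rolled-up "subdirectories"
```

    Anything beyond the key names (content type, custom metadata and so on) would still need extra per-object requests, which is the part a depth 1 PROPFIND returns in one response.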

    Unfortunately, the one thing WebDAV is missing that users of S3 often need is the concept of partial listing. In S3, when you list keys, you tell it how many keys you want back, then it will only give you that many keys max. If there are more keys that it didn’t give you, it will tell you the results are truncated and give you a continuation token. You can use this token in your next request to continue listing keys.
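
    The truncation-and-token loop described above can be sketched like this. It is an in-memory illustration, not the real S3 API: `list_page` and `list_all` are hypothetical helpers, and the token here is simply the last key returned, whereas a real continuation token is an opaque string.

```python
import bisect


def list_page(keys, max_keys, token=None):
    """Return (page, next_token). next_token is None once the listing is complete."""
    if max_keys < 1:
        raise ValueError("max_keys must be at least 1")
    ordered = sorted(keys)
    # Assumption for this sketch: the token is the last key already returned.
    start = 0 if token is None else bisect.bisect_right(ordered, token)
    page = ordered[start:start + max_keys]
    truncated = start + max_keys < len(ordered)
    return page, (page[-1] if truncated else None)


def list_all(keys, max_keys=1000):
    """Walk every page, passing each token into the next request."""
    token = None
    while True:
        page, token = list_page(keys, max_keys, token)
        yield from page
        if token is None:
            break


keys = [f"obj-{i:03d}" for i in range(7)]
first_page, token = list_page(keys, max_keys=3)
print(first_page, token)
print(list(list_all(keys, max_keys=3)))
```

    The caller never holds more than one page at a time, which is what keeps a listing over hundreds of millions of keys manageable for both sides.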

    This is where the “at scale” thing comes in. If you have hundreds of millions of keys in a bucket, getting them all back at once would certainly break your system, and probably would tax the server unnecessarily. So basically the answer is S3 is designed for scale.

    That being said, S3 is not really designed for humans to interact with. This is where the “different purpose” thing comes in. It doesn’t have a real concept of hierarchy, just common prefixes and delimiters. So something like renaming a directory would require copying every object with that prefix to a new key, then deleting the originals (which is what my S3 adapter does for my WebDAV server). S3 is more meant to be used with something like UUIDs or hashes for keys. Keys that don’t change. WebDAV is designed more like a file system.
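
    To make the rename cost concrete, here is a rough sketch of a “directory rename” on a flat key space, again with a dict standing in for the bucket. `rename_prefix` is a hypothetical helper, not the adapter mentioned above; against a real store, each assignment would be a copy request and each `del` a delete request.

```python
def rename_prefix(bucket, old_prefix, new_prefix):
    """Emulate renaming a directory; returns the number of objects moved."""
    if new_prefix.startswith(old_prefix) or old_prefix.startswith(new_prefix):
        # Overlapping prefixes need more care than this sketch takes.
        raise ValueError("overlapping prefixes are not handled here")
    matching = [key for key in bucket if key.startswith(old_prefix)]
    for old_key in matching:
        new_key = new_prefix + old_key[len(old_prefix):]
        bucket[new_key] = bucket[old_key]  # stands in for one copy per object
    for old_key in matching:
        del bucket[old_key]  # stands in for one delete per object
    return len(matching)


bucket = {
    "docs/a.txt": b"A",
    "docs/sub/b.txt": b"B",
    "other.txt": b"C",
}
moved = rename_prefix(bucket, "docs/", "archive/")
print(moved, sorted(bucket))
```

    The work grows with the number of objects under the prefix, and nothing makes the whole operation atomic, whereas a WebDAV MOVE on a collection is a single request.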

    I hope that explains it well.

    PS: Two minor corrections. WebDAV itself does not support random writes; that comes from a separate RFC that’s not part of WebDAV but is perfectly compatible with it, and many WebDAV servers offer that functionality. Also, S3 does support random reads via the Range header.
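
    For the Range part, a small sketch of how a byte-range read works at the HTTP level. Both helpers are hypothetical and only handle the simple `bytes=start-end` form; `serve_range` imitates what a server would send back, it does not talk to S3.

```python
def range_header(start, end):
    """Build the request header for bytes start..end (both inclusive)."""
    return {"Range": f"bytes={start}-{end}"}


def serve_range(body, header_value):
    """Return the bytes a server would send for a simple 'bytes=start-end' range."""
    unit, _, spec = header_value.partition("=")
    if unit != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")
    first, _, last = spec.partition("-")
    start = int(first)
    # HTTP ranges are inclusive, and an end past the object is clamped to its last byte.
    end = min(int(last), len(body) - 1)
    return body[start:end + 1]


body = b"0123456789"
headers = range_header(2, 5)
print(headers)
print(serve_range(body, headers["Range"]))
```

    Reading a slice this way avoids downloading the whole object; writing a slice back is the part that needs either a full re-upload or a server that implements a partial-write extension.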

  • key@lemmy.keychat.org
    8 months ago

    S3 succeeded due to its scaling capabilities and its ability to abstract completely away from a server or disk. The straightforward key/value nature of the S3 API was a big help in achieving that scale and adoption.

    Comparing it to WebDAV seems like comparing apples and… an orange smoothie.

      • blakemiller@lemmy.world
        8 months ago

        Couldn’t say for sure, but WebDAV would probably be clunky if fronted by a distributed database. The beauty of S3 is you add more servers, add more disks, and bam, you’ve got more S3. That happens most easily when the metadata system sitting in front can expand easily. I don’t know how easy that would be to plumb up with WebDAV. Whether or not one was better here, S3 ultimately won because it’s a primitive API that was essentially impossible to fuck up.