• LalSalaamComrade (OP)
      19 days ago

      No, I am only interested in implementations. I’ve come across a few, for example:

      • the Pascal string (a length-prefixed string; classic implementations use a 1-byte length)
      • the alt-byte terminated (used by CDC and ZX80)
      • the bit method (in pre-60s era mainframes)
      • the record method (aka the struct method you were talking about, which is probably the default C++/Rust implementation), etc.

      Are there any other custom data structures that are both faster and safer than the default? What about ropes?
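
For comparison, here is a minimal Python sketch of two of the layouts listed above: a length-prefixed string and a byte-terminated string. The function names are hypothetical, and the 1-byte prefix width and the default zero terminator are assumptions chosen for illustration, not a description of any particular historical system.

```python
def encode_length_prefixed(data: bytes) -> bytes:
    """Length-prefixed layout: one length byte, then the payload."""
    if len(data) > 255:
        # Assumption: a 1-byte prefix, so the payload is capped at 255 bytes.
        raise ValueError("payload too long for a 1-byte length prefix")
    return bytes([len(data)]) + data


def decode_length_prefixed(buf: bytes) -> bytes:
    """Read the length byte, then slice exactly that many payload bytes."""
    if not buf:
        raise ValueError("empty buffer")
    length = buf[0]
    if len(buf) < 1 + length:
        raise ValueError("buffer shorter than its declared length")
    return buf[1:1 + length]


def encode_terminated(data: bytes, terminator: int = 0) -> bytes:
    """Terminated layout: payload followed by a single sentinel byte."""
    if terminator in data:
        # The sentinel cannot appear inside the payload.
        raise ValueError("payload contains the terminator byte")
    return data + bytes([terminator])


def decode_terminated(buf: bytes, terminator: int = 0) -> bytes:
    """Scan for the sentinel; this is O(n), unlike the length-prefixed read."""
    end = buf.find(terminator)
    if end < 0:
        raise ValueError("terminator not found")
    return buf[:end]
```

The length-prefixed decoder can reject a truncated buffer up front, while the terminated decoder has to scan until it finds the sentinel, which is the usual speed and safety argument between the two.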

      • Arthur Besse
        18 days ago

        CBOR uses variable-sized length prefixes. Strings of 0 to 23 bytes need just one byte of overhead; that becomes two bytes for strings up to 255 bytes, and three bytes for strings up to 65535 bytes. Above that it takes five bytes of overhead, which by the CBOR encoding rules covers lengths up to 2^32 − 1 bytes (about 4 GiB), with a nine-byte form for anything longer, though I didn’t test that far.

        How I empirically determined those numbers:

        $ python -c 'import cbor; overhead=0; print({ length:overhead for length in range(65537) if overhead < (overhead:=len(cbor.dumps("a"*length))-length) })'

        {0: 1, 24: 2, 256: 3, 65536: 5}
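
Those thresholds fall out of how CBOR builds the header for a text string. Below is a rough sketch using only the standard library; `cbor_text_header` and `cbor_encode_text` are hypothetical helper names, not part of the `cbor` package used above, and the sketch only covers definite-length text strings (major type 3).

```python
import struct


def cbor_text_header(length: int) -> bytes:
    """Build the header for a definite-length CBOR text string of `length` bytes."""
    if length < 0:
        raise ValueError("length must be non-negative")
    major = 3 << 5  # major type 3 (text string) lives in the top three bits
    if length < 24:
        # Lengths 0..23 fit directly in the low five bits: 1 byte of overhead.
        return bytes([major | length])
    if length < 1 << 8:
        # Additional info 24: a 1-byte length follows (2 bytes of overhead).
        return bytes([major | 24, length])
    if length < 1 << 16:
        # Additional info 25: a 2-byte big-endian length follows (3 bytes).
        return bytes([major | 25]) + struct.pack(">H", length)
    if length < 1 << 32:
        # Additional info 26: a 4-byte big-endian length follows (5 bytes).
        return bytes([major | 26]) + struct.pack(">I", length)
    if length < 1 << 64:
        # Additional info 27: an 8-byte big-endian length follows (9 bytes).
        return bytes([major | 27]) + struct.pack(">Q", length)
    raise ValueError("length does not fit in 64 bits")


def cbor_encode_text(text: str) -> bytes:
    """Encode a str as header plus UTF-8 payload; the length counts bytes, not characters."""
    payload = text.encode("utf-8")
    return cbor_text_header(len(payload)) + payload
```

Calling `len(cbor_text_header(n))` for n = 0, 24, 256 and 65536 gives the same overheads as the measured output above, without having to allocate the strings.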