• LalSalaamComrade (OP)
    19 days ago

    No, I'm only interested in implementations. I've come across a few, for example:

    • the Pascal string (length-prefixed; the classic form uses a 1-byte length, so at most 255 bytes)
    • the alt-byte terminated (used by CDC and ZX80)
    • the bit method (in pre-60s era mainframes)
    • the record method (aka the struct method you were talking about, which is probably the default C++/Rust implementation), etc.
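
To make the first three layouts concrete, here is a rough sketch in Python. Everything in it is illustrative: the function names are made up, the Pascal-style case assumes a 1-byte length prefix, and the "bit" case assumes the variant that sets the high bit on the last 7-bit character, which may not be exactly what the old mainframes did.

```python
def pascal_encode(s: bytes) -> bytes:
    """Length-prefixed: one length byte, then the payload (assumed 1-byte prefix)."""
    if len(s) > 255:
        raise ValueError("a 1-byte length prefix caps the string at 255 bytes")
    return bytes([len(s)]) + s


def pascal_decode(buf: bytes) -> bytes:
    """Read the length byte and slice out exactly that many payload bytes."""
    return buf[1:1 + buf[0]]


def terminated_encode(s: bytes, terminator: bytes = b"\x00") -> bytes:
    """Byte-terminated: payload followed by a sentinel byte that must not occur in it."""
    if terminator in s:
        raise ValueError("payload contains the terminator")
    return s + terminator


def terminated_decode(buf: bytes, terminator: bytes = b"\x00") -> bytes:
    """Scan for the sentinel; everything before it is the string."""
    return buf[:buf.index(terminator)]


def high_bit_encode(s: bytes) -> bytes:
    """Bit-terminated (assumed variant): 7-bit characters, high bit set on the last one."""
    if not s or any(b > 0x7F for b in s):
        raise ValueError("needs a non-empty string of 7-bit characters")
    return s[:-1] + bytes([s[-1] | 0x80])


def high_bit_decode(buf: bytes) -> bytes:
    """Copy characters until one arrives with the high bit set, then stop."""
    out = bytearray()
    for b in buf:
        out.append(b & 0x7F)
        if b & 0x80:
            break
    return bytes(out)
```

The length-prefixed form finds the end in constant time, while both terminated forms have to scan, and the byte-terminated form cannot carry its own terminator in the payload.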

    Are there any other custom data structures that are both faster and safer than the default? What about ropes?

    • Arthur BesseA
      18 days ago

      CBOR uses variable-sized length prefixes. Strings of 0 to 23 bytes need just one byte of overhead; that becomes two bytes for lengths up to 255 and three bytes for lengths up to 65535. Above that it takes 5 bytes of overhead, which going by the spec covers lengths up to 2^32 − 1 (about 4 GiB), with a 9-byte header for anything longer, though I didn't test that far.

      How I empirically determined those numbers:

      $ python -c 'import cbor; overhead=0; print({ length:overhead for length in range(65537) if overhead < (overhead:=len(cbor.dumps("a"*length))-length) })'

      {0: 1, 24: 2, 256: 3, 65536: 5}
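
For comparison, the same thresholds can be derived without the cbor package by building the header by hand. This is a minimal sketch based on my reading of the CBOR spec for definite-length text strings (major type 3); the function name `cbor_text_header` is made up for illustration and is not part of any library.

```python
def cbor_text_header(length: int) -> bytes:
    """Build the header CBOR puts in front of a definite-length text string."""
    major = 3 << 5  # major type 3 (text string) lives in the top three bits
    if length < 0:
        raise ValueError("length must be non-negative")
    if length < 24:
        # Short lengths are packed directly into the low five bits.
        return bytes([major | length])
    # Otherwise the low bits say how many big-endian length bytes follow.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if length < 1 << (8 * size):
            return bytes([major | info]) + length.to_bytes(size, "big")
    raise ValueError("length does not fit in 64 bits")


# Overhead at each boundary, including the ones too large to test by encoding.
for n in (0, 23, 24, 255, 256, 65535, 65536, 2**32 - 1, 2**32):
    print(n, len(cbor_text_header(n)))
```

Since this only builds the header, it can check the 4 GiB boundary without allocating a string that large.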