• KillingTimeItself@lemmy.dbzer0.com
    5 months ago

    For you to pirate, there was already an archival copy.

    is it not the case that the more archival copies there are of something the more likely it is to survive?

    There is a rather simple paradox in the world of online and digital archival: unless you archive something yourself, you can’t count on anyone else having archived it. I could simply not archive any of the stuff I have archived, under the pretense that someone else probably already did, but that’s just a guess, and I have no idea whether or not it’s the case.

    Once I archive something, it’s possible someone else has already archived it, but my being a known archiver of that material (or not, since most archives are private) feeds that same paradox.

    And besides, let’s say I am archiving: how am I supposed to verify the integrity of my archival copy? Am I not supposed to consume it? That’s the most effective and reliable way to determine the integrity of an archive. Sure, I could use hashes or checksums, but those are only as reliable as the original creation of the hash/checksum.
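
A minimal sketch of the hash/checksum approach mentioned above, assuming the archive is a plain directory of files. The function names (`sha256_of`, `build_manifest`, `changed_files`) are made up for illustration. As the comment points out, a manifest like this only shows that files have not changed since the manifest was made; it says nothing about whether the copy was good to begin with.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose digest no longer matches."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

The manifest would be built once, right after archiving, and `changed_files` run periodically afterwards; anything it returns has been lost or altered since then.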

    • AndrasKrigare@beehaw.org
      5 months ago

      is it not the case that the more archival copies there are of something the more likely it is to survive?

      No, it is not. Compare 10,000,000 copies of something that live only on random people’s phones with 1 copy in the Library of Congress, where it is someone’s job to manage and preserve it. 50 years from now, I think it’s way more likely that the Library of Congress copy is still around than the random ones.

      Am I not supposed to consume it? That’s the most effective and reliable way to determine the integrity of an archive. Sure, I could use hashes or checksums, but those are only as reliable as the original creation of the hash/checksum.

      No. Consuming it is neither efficient nor reliable. How would you even know when you consume it that it is the original?

      And none of this justifies the piracy itself, as opposed to buying it and archiving it. Or, if you don’t have the capabilities or means, buying a copy and then pirating that same title as the archive.

      • KillingTimeItself@lemmy.dbzer0.com
        5 months ago

        No, it is not. Compare 10,000,000 copies of something that live only on random people’s phones with 1 copy in the Library of Congress, where it is someone’s job to manage and preserve it. 50 years from now, I think it’s way more likely that the Library of Congress copy is still around than the random ones.

        Now compare 10 billion copies of something that people have archived across the world, all over the internet, in various states, to the exactly zero copies the Library of Congress has, because it’s a random fucking video game and the Library of Congress doesn’t generally archive those. Also, most of their shit is physical, i.e. difficult to access.

        No. Consuming it is neither efficient nor reliable. How would you even know when you consume it that it is the original?

        You’re not wrong, but it’s also important to remember that you should test backups. That means you should do some amount of consumption of your archived content to make sure it’s functional and working properly.
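
One way to "test the backup" without sitting through every video is to have ffmpeg decode the whole file and throw the result away. This is only a sketch, assuming `ffmpeg` is installed and on the PATH; the helper names are hypothetical. It catches truncated or corrupted files, but it cannot tell you whether the content is the right content, which is where actually watching some of it still matters.

```python
import subprocess
from pathlib import Path


def decode_check_command(path: Path) -> list[str]:
    """Build an ffmpeg command that decodes the file and discards the output."""
    # "-v error" limits logging to errors; "-f null -" decodes without writing a file.
    return ["ffmpeg", "-v", "error", "-i", str(path), "-f", "null", "-"]


def decodes_cleanly(path: Path) -> bool:
    """True if ffmpeg exits with code 0 and prints nothing to stderr."""
    result = subprocess.run(
        decode_check_command(path), capture_output=True, text=True
    )
    return result.returncode == 0 and result.stderr.strip() == ""
```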

        How do I know it’s original? Simple: thanks to general internet consensus and the archival work of other people, it’s easy to cross-reference. For example, there is a known unreleased Boards of Canada album, “Play by Numbers”; only snippets of songs ever came out. At some point someone compiled a fake “Play by Numbers” album and released it, and it’s commonly known among BoC fans who look into archived material that the fake exists and is out there. There was also a recent hoax by binasty, who faked Hooper Bay; people thought it was legit until he revealed it. Again, it’s community consensus. These things are much easier to sort out now than they will be in the future.

        A lot of archivists have strict standards around how they archive things as well. Generally it’s more about the content itself than its relevance to any one particular thing.
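
For digital files, cross-referencing can be as simple as comparing the checksum of your copy against what several independent archives report. The sketch below is an illustration only: `consensus_digest` is a made-up name, and where the reported digests come from is left open. A majority can still be wrong if everyone copied from the same bad source, so this supports community consensus rather than replacing it.

```python
from collections import Counter
from typing import Optional


def consensus_digest(reports: dict[str, str], min_share: float = 0.5) -> Optional[str]:
    """Return the digest reported by more than min_share of sources, else None.

    reports maps a source name (hypothetical, e.g. an archive or a forum post)
    to the hex digest that source lists for the file.
    """
    if not reports:
        return None
    digest, count = Counter(reports.values()).most_common(1)[0]
    return digest if count / len(reports) > min_share else None
```

If the function returns a digest and it matches the digest of your own copy, your copy agrees with the majority; if it returns `None`, there is no clear agreement to lean on.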

        And none of this justifies the piracy itself, as opposed to buying it and archiving it. Or, if you don’t have the capabilities or means, buying a copy and then pirating that same title as the archive.

        You must be relatively privileged if you think that’s trivially accessible. Why do you think lost media is a thing? How does one archive that? What about unreleased media? That’s literally impossible to buy.

        Sure, you could buy a copy and then pirate it; it’s a valid strategy that a lot of people use. But in my case my primary target for archival work is YT content. It’s mostly what I watch, and I find it an interesting space to work in. I’ve considered archiving Blu-rays, but it just doesn’t seem feasible for me. For one thing, I’d need a Blu-ray drive, and those are upwards of 100 USD. I’d have to stuff that in one of my machines, which would be rather tedious and time-consuming. I’d need ripping software. MakeMKV exists (the discs are encrypted, and it’s more or less the only software out there that decrypts them), but it’s a single point of failure, and if it ever fucking explodes we’re dead in the water for a bit. It’s technically paid software, so the license is another 60 USD, though it’s in “free access beta” right now, so there’s that. I’d also need physical media to archive, and that gets expensive really quickly. Shorter shows are often about 50 USD for a box set, which adds up fast if the show is any good; larger box sets are easily 100 USD, and hard-to-find box sets are going to be hundreds of USD. Movies are quite a bit cheaper.

        Oh, but we’re not done yet. Not only is it a rather expensive endeavor, you also have to invest time and hardware into untangling the fuckery they engage in with these releases. It’s not uncommon for movies to have random bullshit files that aren’t real content, names that don’t make any fucking sense, and broken metadata. Same for shows, although it’s worse, because a disc rips as one big giant chunk of video, which you then have to split up manually. You would think metadata makes that easy, but no, it’s broken too. I’ve seen timestamps in metadata that regularly send you to a scene with a power pole in it. Almost like they paid some poor fuck to make bullshit timestamps throughout it just to piss us off or something.
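
The manual splitting step can at least be scripted once the real episode boundaries are known. The sketch below assumes `ffmpeg` is installed and that the start/end times (in seconds) were found by hand, since the disc metadata cannot be trusted; `split_commands` and the file names are hypothetical. It only builds the commands, so they can be inspected before anything is run.

```python
from pathlib import Path


def split_commands(
    source: Path, cuts: list[tuple[float, float]], stem: str
) -> list[list[str]]:
    """Build one ffmpeg stream-copy command per (start, end) pair, in seconds."""
    commands = []
    for index, (start, end) in enumerate(cuts, start=1):
        if end <= start:
            raise ValueError(f"segment {index} ends before it starts")
        output = source.with_name(f"{stem}_e{index:02d}.mkv")
        commands.append([
            "ffmpeg", "-i", str(source),
            "-ss", str(start), "-to", str(end),  # segment boundaries
            "-map", "0", "-c", "copy",            # keep all streams, no re-encode
            str(output),
        ])
    return commands
```

Each command can then be passed to `subprocess.run(command, check=True)`. Because the streams are copied rather than re-encoded, cuts land on keyframes, so boundaries may be off by a second or so.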

        And once you’re done segmenting the content, you also have to transcode it, unless you want to store the raw full-size rips. That usually means hardware encoding, and doing that to a modern standard is going to require at least an Arc A380 or whatever that card is, which retails for 150 USD (though I hear it was going used for 90 bucks a while ago; unsure if that’s still true), or an NVIDIA GPU, which are famously really cheap and easy to get hold of. There are probably dedicated hardware accelerators out there, but those are usually for professional work, so good luck with that. You could do software encoding, but if you want reasonable file sizes at reasonable quality levels, in HEVC or similar, you’re going to be waiting for weeks minimum.

        Granted, a few of those are just the name of the game, but it should come as no surprise why people don’t fucking like doing this shit. I’d be more open to spending the money on it if it weren’t such a fucking disaster and they didn’t try to fight us every fucking step of the way.
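
As a rough illustration of the software versus hardware choice, the sketch below builds an ffmpeg HEVC transcode command. It assumes an ffmpeg build that includes the named encoders (`libx265`, `hevc_qsv`, `hevc_nvenc`); the function name, the backend labels, and the default CRF and preset are illustrative choices, not recommendations. Quality options are only set for the software path, because the hardware encoders use different, encoder-specific settings.

```python
ENCODERS = {
    "software": "libx265",   # CPU encoding: slow, but needs no GPU
    "intel": "hevc_qsv",     # Intel Quick Sync, e.g. on Arc cards
    "nvidia": "hevc_nvenc",  # NVIDIA NVENC
}


def transcode_command(
    source: str, output: str, backend: str = "software",
    crf: int = 22, preset: str = "slow",
) -> list[str]:
    """Build an ffmpeg command that re-encodes video to HEVC and copies the rest."""
    encoder = ENCODERS[backend]  # KeyError for an unknown backend
    command = ["ffmpeg", "-i", source, "-map", "0", "-c:v", encoder]
    if backend == "software":
        # Lower CRF means higher quality and bigger files; slower presets compress better.
        command += ["-crf", str(crf), "-preset", preset]
    # Audio and subtitle streams are copied untouched.
    command += ["-c:a", "copy", "-c:s", "copy", output]
    return command
```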

        Oh, btw, YT archival is rather trivial: you either paste a link into yt-dlp and wait, or you stuff it into something like Tube Archivist and let it do its own thing. It’s really just that simple.
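
For anyone who wants slightly more than pasting a link, here is a hedged sketch of a yt-dlp invocation wrapped in Python. It assumes `yt-dlp` is installed and on the PATH; the function name, directory layout, and output template are illustrative choices. The download-archive file records what has already been fetched, so re-running the same command on a channel or playlist skips videos that are already stored.

```python
from pathlib import Path


def ytdlp_command(url: str, archive_dir: str, archive_file: str = "downloaded.txt") -> list[str]:
    """Build a yt-dlp command that downloads url into archive_dir with metadata."""
    root = Path(archive_dir)
    template = root / "%(uploader)s" / "%(upload_date)s - %(title)s [%(id)s].%(ext)s"
    return [
        "yt-dlp",
        "--download-archive", str(root / archive_file),  # skip already-downloaded IDs
        "--write-info-json",   # keep the video's metadata next to the file
        "--embed-metadata",    # also embed basic metadata in the media file
        "-o", str(template),
        url,
    ]
```

The resulting list can be run with `subprocess.run(ytdlp_command(url, "yt-archive"), check=True)`, where `url` is whatever video, playlist, or channel link you would otherwise paste in.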