• empireOfLove2@lemmy.dbzer0.com
      8 months ago

      I know the last time this came up there was a lot of user resistance to the torrent scheme. I’d be willing to seed 200-500 GB, but minimum torrent archive sizes of 1.5 TB and larger really limit the number of people willing to give up that storage, and they defeat a lot of the resiliency of torrents given how bloody long it takes to get a complete copy. I know that 1.5 TB takes a massive chunk out of my already pretty full NAS, and I passed on seeding the first time for that reason.

      It feels like they didn’t really subdivide the database as much as they should have…

      • maxprime
        8 months ago

        There are plenty of small torrents. Use the torrent generator and tell the script how much space you have and it will give you the “best” (least seeded) torrents whose sum is the size you give it. It doesn’t have to be big, even a few GB is suitable for some smaller torrents.
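
A rough sketch of how that kind of selection could work. It assumes a simple greedy pick, fewest seeders first, skipping anything that no longer fits in the space you offer. The real generator's logic isn't described in this thread, and the `Torrent` class, the `pick_torrents` function, and the catalogue entries below are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Torrent:
    name: str        # hypothetical label, not a real torrent name
    size_gb: float
    seeders: int


def pick_torrents(torrents, budget_gb):
    """Greedily pick the least-seeded torrents that fit within budget_gb."""
    chosen = []
    remaining = budget_gb
    # Fewest seeders first; among ties, prefer the smaller torrent.
    for t in sorted(torrents, key=lambda t: (t.seeders, t.size_gb)):
        if t.size_gb <= remaining:
            chosen.append(t)
            remaining -= t.size_gb
    return chosen


# Made-up catalogue, for illustration only.
catalogue = [
    Torrent("old-archive-a", 1500.0, 2),
    Torrent("old-archive-b", 900.0, 3),
    Torrent("recent-batch-c", 300.0, 4),
    Torrent("recent-batch-d", 120.0, 10),
    Torrent("small-set-e", 5.0, 12),
]

selection = pick_torrents(catalogue, budget_gb=500)
for t in selection:
    print(f"{t.name}: {t.size_gb:g} GB, {t.seeders} seeders")
print(f"total: {sum(t.size_gb for t in selection):g} GB")
```

With a 500 GB budget the two oldest, least-seeded archives in this made-up list get skipped because neither fits, so a greedy pick like this can't help the torrents that need seeders most unless the budget is large enough to hold them.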

        • empireOfLove2@lemmy.dbzer0.com
          8 months ago

          Almost all the small torrents that I see pop up are already relatively well seeded (~10 seeders) though, which reinforces the fact that A. the torrents most desperately needing seeders are the older, largest ones and B. large torrents don’t attract seeders because of unreasonable space requirements.

          Admittedly, newer torrents seem to be split into pieces of 300 GB or less, which is good, but there are still a lot of monster torrents in that list.

    • GravitySpoiled
      8 months ago

      Thx.

      Do you know how useful it is to host such a torrent? Who is accessing the content via that torrent?

      • maxprime
        8 months ago

        Anyone who wants to. I think a lot of LLM trainers access them.

        • GravitySpoiled
          8 months ago

          Doesn’t sound like I should host any of it. I’d be more down to host it for end users.

  • umbrella
    8 months ago

    how big is the database?

    books can’t be that big, but i’m guessing the selection is simply huge?

    • xrtxn@lemmy.sdf.org
      8 months ago

      The selection is literally all books that can be found on the internet.

          • tsonfeir@lemm.ee
            8 months ago

            Shit, my Synology has more than that… alas, it is full of movie “archives”

              • tsonfeir@lemm.ee
                8 months ago

                Well, it’s not just a single Synology, it’s got a bunch of expansion units, and there are multiple host machines.

            • umbrella
              8 months ago

              wait what? how expensive is it to buy and run? is it practical at all, what are the common snags? always wanted to get into doing some archiving.

              • tsonfeir@lemm.ee
                8 months ago

                It’s an investment. It’s like the price of a small car. But it was built over time, so not like one lump sum.

                Originally, it was to have easier access to my already insane Blu-ray collection. But I started getting discs from Redbox, rental stores, libraries, etc. They are full rips, not that compressed PB stuff. Now there are like 3000 movies and fuck knows how many TV shows.

                A lot of my effort was to have the best release available. Or, have things that got canceled. Like the Simpsons episode with MJ, which is unavailable to stream.

                Snags… well, Synology is sooo easy. Once you figure out how you want your drives set up, there’s nothing to it.

                Whatever you do, always have redundant drives. Yes, you lose space, but eventually one of them is gonna die and you don’t want to lose data.

                • redcalcium@lemmy.institute
                  8 months ago

                  You should write a will instructing your family to send those disks to the Internet Archive for preservation if something happens to you.

          • AmbiguousProps@lemmy.today
            8 months ago

            Correct me if I’m wrong, but they only index shadow libraries and do not host any files themselves (unless you count the torrents). So, you don’t need 900+ TB of storage to create a mirror.

        • Pussista@sh.itjust.works
          8 months ago

          I imagine a couple of terabytes at the very least, though I could be underestimating how many books have been de-DRMed so far.

      • umbrella
        8 months ago

        oh i actually thought it was way more! there wasn’t a single book i wanted (or even thought to look up) that i didn’t actually find in there.

  • HeartyOfGlass@lemm.ee
    8 months ago

    Could anyone broad-stroke the security requirements for something like this? Looks like they’ll pay for hosting up to a certain amount, and between that and a pipeline to keep the mirror updated I’d think it wouldn’t be tough to get one up and running.

    Just looking for theory - what are the logistics behind keeping a mirror like this secure?