Hi everybody!! I have the archive of everything ever posted on r/GenZedong, and I would like to get it online for when the subreddit eventually gets banned (inevitable lol).
would people use it? do people want it? what does everybody think?
uwu
(actual photo of me at the genzedong archive)
As someone who only found r/GenZedong when the war started because it was one of the few subr*ddits that had a decent take on the war, I would love to see some of the epic moments and content from the glory days. I missed out because I wasn’t based enough at the time 😔
Don’t you mean this machine, comrade, not that IBM nonsense?
epically based! yes indeed
I love that it has an archive, there was a lot of information and whatnot stored on it that I wanted to eventually read all the way through. Thank you for your hard work comrade! o7
of course!
Hey, thanks for doing this! It was so long ago when I asked you to do it during the chaotic times.
If you want to host it as a file to download, you could compress the whole thing as a 7Z file, using 7-Zip or PeaZip, to make it compact. FreeARC helps even more with shaving down total file size.
Or do you want to reimage it as a Lemmygrad archive community? For that I would suggest performing bulk compression on images and videos to save bandwidth.
probably going to do the lemmy archive community idea. i could provide it as a download, but it would be massive lol, it’s a lot of content.
I for one would be interested in a full copy. I could throw it up as a torrent on a seedbox for a while if anyone else wants it as well.
Same as @knfrmity@lemmygrad.ml, I would like to know the 7Z compressed sizes for text-only posts, images and videos. Might want to grab text only because there is a lot of nice content from various points in time.
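A rough way to get a first answer to the size question is to tally the raw on-disk sizes per category before compressing anything. This is only a sketch: it assumes the archive is a directory of files, the function names `category_of` and `sizes_by_category` are made up for illustration, and it reports uncompressed bytes, not 7Z sizes.

```python
import mimetypes
import os
from collections import Counter


def category_of(filename: str) -> str:
    """Classify a file as 'image', 'video', or 'text/other' from its extension."""
    mime, _ = mimetypes.guess_type(filename)
    if mime and mime.startswith("image/"):
        return "image"
    if mime and mime.startswith("video/"):
        return "video"
    return "text/other"


def sizes_by_category(root: str) -> Counter:
    """Sum on-disk file sizes (in bytes) under root, grouped by category."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            totals[category_of(name)] += os.path.getsize(path)
    return totals


if __name__ == "__main__":
    # "archive" is a hypothetical directory name; point it at the real dump.
    for category, size in sizes_by_category("archive").items():
        print(f"{category}: {size} bytes")
```

Images and videos are usually already compressed, so the text category is where an archiver would be expected to save the most.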
> you could compress the whole thing as a 7Z file, using 7-Zip or PeaZip, to make it compact.
Zstandard for speed or Brotli for compression ratio would probably work better.
Do Zstandard and Brotli have higher compression than 7Z LZMA2, or FreeARC’s ARC format? The latter 2 top efficiency charts, from my archival compression knowledge of the past 10-12 years. Once you encode/package anything, the bandwidth and storage savings are harvested forever.
I did a few tests. I tried compressing a config file with a bunch of algorithms at their highest compression levels. This is what I got:
2880 traefik.nomad
1088 traefik.nomad.gz
1078 traefik.nomad.zst
1100 traefik.nomad.xz
1219 traefik.nomad.7z
918 traefik.nomad.brotli
traefik.nomad is the original. As you can see, Zstandard and Brotli have the best compression ratios. Zstandard is also insanely fast, capable of around 500 MB/sec/core.
This is not the only time I’ve tested this. I’ve done it with videos, images, random text files, documents, etc., and Brotli always wins in compression ratio, while Zstandard always wins in speed.
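For anyone who wants to repeat this kind of test, here is a minimal sketch of a size comparison. It is not the exact procedure used above: it only covers the codecs in Python's standard library (gzip, bzip2, xz/LZMA), since Zstandard and Brotli need third-party packages, and the function name `compressed_sizes` and the sample input are assumptions for illustration.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the output size in bytes for each stdlib codec at its highest level."""
    return {
        "original": len(data),
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, compresslevel=9)),
        "xz": len(lzma.compress(data, preset=9 | lzma.PRESET_EXTREME)),
    }


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample; swap in the bytes of a real file.
    sample = b'[http.routers.example]\nrule = "Host(`example.org`)"\n' * 200
    results = compressed_sizes(sample)
    for name, size in sorted(results.items(), key=lambda kv: kv[1]):
        print(f"{size:>8}  {name}")
```

Results on one small file say little about a mixed archive, so it is worth running this over a representative sample of posts, images and videos.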
Configuration text files are not the only type of files. You could use PPMd in 7-Zip for it. You need to use a variety of files to benchmark. Which and what were the versions of the compression binaries you used? Did you try FreeARC on it?
I saw this: https://peazip.github.io/fast-compression-benchmark-brotli-zstandard.html
That benchmark does not include FreeARC, which compresses even further at little extra time cost. After this, we have PAQ and ZPAQ at various levels, which are impractical for both compression and decompression time efficiency.
> Configuration text files are not the only type of files.
I know. As I’ve mentioned, I have used many types of files in the past, and the general trend is that Brotli has the best compression ratios while Zstandard has the best speeds. I just used a config file as it is very compressible.
> Which and what were the versions of the compression binaries you used?
- gzip 1.12
- zstd 1.5.2
- xz 5.2.6 (liblzma 5.2.6)
- p7zip 17.04
- brotli 1.0.9
Running on Arch Linux
> Did you try FreeARC on it?
I did not try FreeARC as it is abandoned and I can’t seem to figure out how to use its command-line utility.
Also worth mentioning: 7-Zip and FreeARC are both archiving formats that use modified versions of existing compression algorithms, so I wouldn’t really compare them to the other algorithms on the list.
FreeARC has a GUI for Windows that is runnable under Wine with no performance penalty. You can learn the command line if you wish to avoid using Wine.
p7zip 17.04 is very old now. 7-Zip 22.01 has been available for 3 months. Although it should not affect the benchmarking too much, it is something to note. The binaries of 7-Zip 22.01 should be available in PeaZip’s newest version as well.
7Z and ARC are not merely modified versions of existing algorithms, but traditional compression formats geared to be more flexible than the TAR-based counterparts created by Google and Facebook. That is not a valid reason to dismiss the comparison when some are superior in application to others.
Non-traditional compressors are made as an attempt to fine tune and allow Tar-Gzip or Tar-Bzip2 as distribution formats, and to monopolise the archival compression space. They fail at it because of immaturity and worse optimisations.
You can see in the PeaZip benchmark article above how 7Z is far superior even with just default compression settings. The ratio simply becomes unbeatable at higher compression levels like Maximum or Ultra.
Also, abandonment of FreeARC does not mean it is unsuitable for benchmarking or real world usage.
holy shit that’s awesome
Thank you for your hard work 🙌
Didn’t you (or maybe someone else) already upload it somewhere? Either way, it would be nice to have instead of having to try Reveddit
I used to but I stopped hosting it because it was clunky (i wrote it in PHP lmfao)
I think a lemmy instance would be much nicer looking. I think either a GZD archive community here or a dedicated archive instance that is federated would be nice.
an archive community/instance would be pretty sweet
ya it would but i’m a terrible web admin so we’ll see if it happens lol
Does it include the images and videos as well? That seems like it’d take up quite a lot of disk space
You’re saying the sub hasn’t been banned yet?
It hasn’t been banned, it’s been quarantined. That means that people with Reddit accounts that have verified emails can still browse, comment, etc., but it has limited features and people who aren’t logged in cannot access it at all.
Can anyone please share the link of the archive with me? I remember coming across it a while ago but I didn’t save the URL anywhere unfortunately.
alright, here is what I will do. for the time being, I will make the archive available in an sqlite format. it will have text and comments + URLs for videos and images
in a second archive I will include images by submission ID
in a third archive I will place videos
I should have time to get it together and up on my site during the last week of this month.
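For anyone wondering what the text archive might look like, here is a minimal sketch of an SQLite layout holding submissions, comments and media URLs. It is a guess, not the real schema: every table name, column name and helper function below is an assumption made for illustration.

```python
import sqlite3

# Hypothetical schema: the actual archive may use different tables and columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS submissions (
    id        TEXT PRIMARY KEY,   -- submission ID
    title     TEXT NOT NULL,
    body      TEXT,               -- text of the post, if any
    media_url TEXT                -- URL of a linked image or video, if any
);
CREATE TABLE IF NOT EXISTS comments (
    id            TEXT PRIMARY KEY,
    submission_id TEXT NOT NULL REFERENCES submissions(id),
    body          TEXT NOT NULL
);
"""


def open_archive(path: str) -> sqlite3.Connection:
    """Open (or create) the archive database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def comments_for(conn: sqlite3.Connection, submission_id: str) -> list[str]:
    """Return the comment bodies for one submission, ordered by comment ID."""
    rows = conn.execute(
        "SELECT body FROM comments WHERE submission_id = ? ORDER BY id",
        (submission_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_archive(":memory:")  # in-memory database, for demonstration only
    conn.execute(
        "INSERT INTO submissions VALUES (?, ?, ?, ?)",
        ("abc123", "Example post", "Example body", None),
    )
    conn.execute("INSERT INTO comments VALUES (?, ?, ?)", ("c1", "abc123", "First!"))
    print(comments_for(conn, "abc123"))
```

Keying everything on the submission ID would let the separate image and video archives be matched back to their posts later.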