Codeberg was asking about this. The linked toot by a commenter points to:

SEqlite

These are CC-BY-SA 4.0 remixes of the Stack Exchange Creative Commons Data Dumps. 100% Unendorsed by Stack Exchange, Inc.

They are minimal. They provide the data you probably care about and the data you need to comply with the original license in SQLite format.

  • AJ Sadauskas
    1710 days ago

    @lemmyreader Here’s a starting point for a fediverse StackExchange: Make sure it’s interoperable with Lemmy.

    Now, you may not get the full feature set on Lemmy, but you should be able to interact with it from Lemmy as if it’s a group on there.

    #StackExchange #Fediverse #Coding

  • @yo_scottie_oh
    English
    1510 days ago

    Honest question: Why?

    IMHO stack exchange is basically reddit/lemmy with handcuffs because there are no threaded discussions and every other question is closed as off topic. I don’t understand what another stack exchange would buy anybody.

    I guess one thing stack exchange does well is “related questions” and tagging, but… I dunno. (shrugs)

    • DessalinesA
      1910 days ago

      With a few more additions, Lemmy could serve as a good replacement. We already have a Forum / NewComments sort, which is perfect for question / answer type communities. We could add a feature to set default sorts for specific communities, so they would feel less fast, or possibly a sort that brings zero-comment posts (i.e., unanswered ones) to the top.

      The reputation and “accepted answer” features from SO are a lot less important than threaded comments can be, especially since questions often need new answers every year, making the “accepted answer” pointless.

      • @Die4Ever@programming.dev
        4
        9 days ago

        Especially with Lemmy getting support for plugins soon, I don’t see the need for making a new platform

        A new sorting method for “unanswered” is a cool idea. I’m not sure if it’s quite as simple as just finding posts with 0 comments, because people can put additional questions in the comments while the post is still unanswered. Also, how do you sort posts with the same number of comments/answers? But this is definitely something that a plugin could handle.

        I saw someone else suggested we could just put “[unanswered]” in the title and then edit the title to “[answered]”
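
A rough sketch of the “unanswered first” ordering discussed above, assuming comment count is an acceptable proxy for “unanswered” (which, as noted, it may not be) and breaking ties by newest post first. The `Post` class and its fields are hypothetical and are not Lemmy’s actual schema or plugin API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    # Hypothetical fields for illustration; not Lemmy's real data model.
    title: str
    comment_count: int
    published: datetime


def unanswered_first(posts: list[Post]) -> list[Post]:
    """Order posts by fewest comments; ties go to the newest post."""
    # Python's sort is stable, so sorting by date first and by comment
    # count second keeps newest-first order within each comment count.
    newest_first = sorted(posts, key=lambda p: p.published, reverse=True)
    return sorted(newest_first, key=lambda p: p.comment_count)


posts = [
    Post("How do I pin a post?", 3, datetime(2024, 5, 1)),
    Post("Why does my build fail?", 0, datetime(2024, 5, 2)),
    Post("Best way to parse JSON?", 0, datetime(2024, 5, 3)),
]

ordered = unanswered_first(posts)
for post in ordered:
    print(post.comment_count, post.title)
```

A different tie-break (oldest first, so long-waiting questions surface) only requires dropping `reverse=True`.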

      • @sabreW4K3@lazysoci.al
        210 days ago

        Default sort would be great, especially for sports events. But I don’t want Lemmy to become an answer repository; keep it as a link aggregator.

      • @ramble81@lemm.ee
        710 days ago

        And guess what, it can be done just as easily, if not more easily, on a federated instance. You don’t gain any real additional control over your data (and no, putting “covered under license X” is about as realistic as those Facebook posts saying “I don’t give anyone access to my posts”).

        I’ve said this before and I’ll say it again, realistically the only way to control your data from AI is a DRM type solution which everyone fundamentally hates.

    • @LostWon@lemmy.ca
      210 days ago

      Useful constraints would focus discussion to keep questions/replies brief, relevant, and hopefully helpful, wouldn’t they? I just wonder how up and downvoting would work since that would go very differently from Lemmy.

      • BlueÆther
        110 days ago

        I just wonder how up and downvoting would work since that would go very differently from Lemmy.

        how so?

        • @LostWon@lemmy.ca
          19 days ago

          I’m sure this has been solved already but I’m just wondering how you ensure people are voting based on the helpfulness and/or merit of the response. That’s the ideal on Lemmy but it’s obviously not always the case here. Presumably, you’d have to be logged in on the other platform to vote but you can just see the discussion from Lemmy, I guess?

  • @maegul
    English
    910 days ago

    Oohhh. Seeding the alternative with all the old data, if possible, could be an awesome move here!

      • @DaseinPickle@leminal.space
        210 days ago

        It seems to matter for the users at Stack Overflow. And why should anybody give anything for free to the crooks in Silicon Valley. All they do is create technology designed to extract value out of people and give as little as possible back.

        • Because that’s the nature of FOSS. The good news is, if they trained on your data that’s licensed CC BY-SA (as all SO content is), then you can request their source code, and they legally must provide it.

          This is a good thing.

    • @velox_vulnus
      English
      210 days ago
      1. In the Fediverse, everyone gets access to the data. However, if privacy is what’s bugging you, then you’re free to use a forum, which is going to be archived by someone on the internet anyway. In that sense, the stuff you post on the internet is not going to be private, and there’s nothing that can be done about it except posting under a pseudonym. However, the same cannot be said for Stack Exchange. Will they let you parse their site for free, when Reddit and other private platforms are charging money for the same? They’re using 16 years’ worth of free volunteer work to make lots of bucks.

      2. In their quest for integrating AI, now the new site will vomit verbal diarrhea. Humans don’t do that. These language models are absolutely terrible in their tasks. They can’t replace humans, at least for now, we know it.

      3. Earlier, the site was free, and their means of earning was through some sort of enterprise solution, but now that they’re going to add AI, it is going to be very resource-intensive. Who is paying for all of that? We have to, from our own pockets, for low quality answers, with no respect to the question asked by the user? Yeah, welcome to paywall 2.0!

      4. Their lofty model will be trained on answers from the 2010s, most of which aren’t applicable today. Will you be using X11 configs for Wayland on Linux? Or GTK+ solutions for GTK4?

      • @DaseinPickle@leminal.space
        310 days ago

        It’s not about privacy. It’s about AI companies stealing other people’s work and knowledge and profiting from it, like what they did with artists. And I think that’s bothering a lot of people. It’s kind of sad that we cannot exchange information with each other for free without some Silicon Valley crooks taking advantage and trying to convert other people’s goodwill into profit.

        These LLMs are also polluting the web with AI junk and slop. The web is absolutely tainted with shitty ChatGPT text and images, making it harder and harder to find authentic information. I think a lot of people don’t want to contribute to that.

      • @velox_vulnus
        English
        210 days ago

        No, it can’t be. I may be using robots.txt on, say, lemmy.ml, but those posts will still be broadcast on lemmy.world or hexbear.net.