• uberstar · ↑6 · 7 hours ago

    I tried DeepSeek, and immediately fell in love… My only nitpick is that images have to have text on them (otherwise it complains), but for the price of free, I’m basically just asking for too much. Contemporaries be damned.

  • I Cast Fist@programming.dev · ↑10 · 10 hours ago

    Come on, OP, Altman is still a billionaire. If he got out of the game right now, with OpenAI still unprofitable, he’d still have enough wealth for a dozen generations.

  • Sabre363@sh.itjust.works · ↑8 ↓6 · 11 hours ago

    We doing paid promotions or something on Lemmy now? You sure seem to be pushing this DeepSeek thing pretty hard, OP.

      • Sabre363@sh.itjust.works · ↑7 ↓6 · 10 hours ago

        None of this has anything to do with the model being open source or not, plenty of other people have already disputed that claim.

        • Grapho · ↑11 · 8 hours ago

          It’s a model that outperforms the other ones in a bunch of areas with a smaller footprint and which was trained for less than a twentieth of the price, and then it was released as open source.

          If it were European or US-made, nobody would deem it suspicious if somebody talked about it all month, but it’s a Chinese breakthrough and god forbid you talk about it for three days.

        • ☆ Yσɠƚԋσʂ ☆OP · ↑9 ↓3 · 9 hours ago

          It has everything to do with the tech being open. You can dispute it all you like, but the fact is that all the code and research behind it is open. Anybody could build a new model from scratch using open data if they wanted to. That’s what matters.

          • Sabre363@sh.itjust.works · ↑5 ↓8 · 8 hours ago

            I’m commenting on the odd nature of the post and your behavior in the comments, pointing out that it comes across as more a shallow advertisement than a sincere endorsement, that is all. I don’t know enough about DeepSeek to discuss it meaningfully, nor do I have enough evidence to decide upon its open source status.

              • Sabre363@sh.itjust.works · ↑1 ↓2 · 3 hours ago

                You might have a far more positive interaction with the community if you learned to listen first before jumping on the defensive.

                • ☆ Yσɠƚԋσʂ ☆OP · ↑2 · 2 hours ago

                  Pretty much all my interactions with the community here have been positive, aside from a few toxic trolls such as yourself. Maybe take your own advice there champ.

  • glimmer_twin [he/him]@hexbear.net · ↑26 ↓1 · edited · 1 day ago

    Altman didn’t really make his money from tech. He’s basically a magic bean seller. He’ll be fine no matter what happens to AI. He’ll find a new grift and new suckers (there’s famously one born every minute, after all).

  • Sem · ↑8 ↓28 · 1 day ago

    DeepSeek collects and processes all the data you send to their LLM, even from API calls. That’s a no-go for most business applications. For example, OpenAI and Anthropic do not collect or process data sent via the API in any way, and there’s an opt-out button in their settings that lets you exclude data sent via the UI from processing.

    • ☆ Yσɠƚԋσʂ ☆OP · ↑25 · 1 day ago

      DeepSeek is an open source project that anybody can run, and it’s performant enough that even running the full model is affordable for any company.

      • shawn1122@lemm.ee · ↑2 ↓6 · edited · 3 hours ago

        Since it’s open source, is there a way for companies to adjust it so it doesn’t intentionally avoid saying anything bad about China?

          • Ajen@sh.itjust.works · ↑3 ↓5 · 8 hours ago

            That doesn’t mean it’s straightforward, or even possible, to entirely remove the censorship that’s baked into the model.

            • ☆ Yσɠƚԋσʂ ☆OP · ↑3 · 6 hours ago

              It doesn’t mean it’s easy, but it is certainly possible if somebody was dedicated enough. At the end of the day you could even use the open source code DeepSeek published and your own training data to train a whole new model with whatever biases you like.

              • Ajen@sh.itjust.works · ↑1 ↓2 · 5 hours ago

                “It’s possible, you just have to train your own model.”

                Which is almost as much work as you would have to do if you were to start from scratch.

                • ☆ Yσɠƚԋσʂ ☆OP · ↑2 · 4 hours ago

                  It’s obviously not, since the whole reason DeepSeek is interesting is the new mixture-of-experts algorithm it introduces. If you don’t understand the subject, then maybe spend a bit of time learning about it instead of adding noise to the discussion?

            • Grapho · ↑6 · 7 hours ago

              People saying truisms that confirm their biases about shit they clearly know nothing about? I thought I’d left reddit.

      • blarth@thelemmy.club · ↑1 ↓13 · edited · 21 hours ago

        It should be repeated: no American corporation is going to let their employees put data into DeepSeek.

        Accept this truth. The LLM you can download and run locally is not the same as what you’re getting on their site. If it is, it’s shit, because I’ve been testing r1 in ollama and it’s trash.

        • ☆ Yσɠƚԋσʂ ☆OP · ↑12 · 13 hours ago

          It should be repeated: anybody can run DeepSeek themselves on-premises. You have absolutely no clue what you’re talking about. Keep on coping there though, it’s pretty adorable.

    • fl42v · ↑38 · 1 day ago

      You can run ’em locally, tho, if their gh page is to be believed. And this way you can make sure nothing even gets sent to their servers, and not just believe nothing is processed.
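
      For anyone curious what running it locally actually looks like, here’s a minimal sketch using the ollama Python client (ollama also comes up elsewhere in the thread) against a locally pulled R1 distill. The model tag and client library are illustrative assumptions, not anything from DeepSeek’s own docs, and nothing in it touches their servers:

      ```python
      # Minimal sketch: chat with a locally hosted DeepSeek-R1 distill through the
      # ollama Python client. Assumes `ollama serve` is running and the model has
      # already been pulled locally (e.g. `ollama pull deepseek-r1:7b`).
      import ollama  # pip install ollama

      response = ollama.chat(
          model="deepseek-r1:7b",  # example distill tag; pick one that fits your hardware
          messages=[{"role": "user", "content": "Summarise the MIT license in one sentence."}],
      )
      print(response["message"]["content"])  # the prompt and the reply never leave your machine
      ```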

    • hungrybread [comrade/them]@hexbear.net · ↑29 · edited · 1 day ago

      I’m too lazy to look for any of their documentation about this, but it would be pretty bold to believe privacy or processing claims from OpenAI or similar AI orgs, given their history of flouting copyright.

      Silicon Valley more generally just breaks laws and regulations to “disrupt”. Why wouldn’t an org like OpenAI at least leave a backdoor for themselves to process API requests down the road as a policy change? Not that they would need to, but it’s not uncommon for a co to leave an escape hatch in their policies.

    • ☆ Yσɠƚԋσʂ ☆OP · ↑32 ↓2 · 1 day ago

      Because it’s an open source project that’s destroying the whole closed source subscription AI model.

      • The Octonaut@mander.xyz · ↑8 ↓6 · 1 day ago

        I don’t think you or that Medium writer understand what “open source” means. Being able to run a local stripped-down version for free puts it on par with Llama, a Meta product. Privacy-first indeed. Unless you can train your own from scratch, it’s not open source.

        Here’s the OSI’s helpful definition for your reference https://opensource.org/ai/open-source-ai-definition

        • ☆ Yσɠƚԋσʂ ☆OP · ↑11 ↓7 · 1 day ago

          You can run the full version if you have the hardware, the weights are published, and importantly the research behind it is published as well. Go troll somewhere else.

          • The Octonaut@mander.xyz · ↑5 ↓7 · 1 day ago

            All that is true of Meta’s products too. It doesn’t make them open source.

            Do you disagree with the OSI?

            • Grapho · ↑7 ↓1 · 7 hours ago

              What makes it open source is that the source code is open.

              My grandma is as old as my great aunts; that doesn’t transitively make her my great aunt.

              • The Octonaut@mander.xyz · ↑1 ↓4 · 7 hours ago

                A model isn’t an application. It doesn’t have source code, any more than an image or a movie has source code to be “open”. That’s why OSI’s definition of an “open source” model is controversial in itself.

                • Grapho · ↑3 ↓1 · 6 hours ago

                  It’s clear you’re being disingenuous. A model is its dataset and its weights too, but the weights are also open, and if the source code were as irrelevant as you say it is, DeepSeek wouldn’t be this much more performant and “Open” AI would have published it instead of closing the whole release.

              • The Octonaut@mander.xyz · ↑9 ↓3 · edited · 1 day ago

                The data part, i.e. the very first part of the OSI’s definition.

                It’s not available from their articles https://arxiv.org/html/2501.12948v1 https://arxiv.org/html/2401.02954v1

                Nor on their GitHub https://github.com/deepseek-ai/DeepSeek-LLM

                Note that the OSI only asks for transparency about what the dataset was (a name and the fee paid will do), not that full access to it be free and Free.

                It’s worth mentioning too that they’ve used the MIT license for the “code” included with the model (a few YAML files to feed it to software), but they have created their own unrecognised, non-free license for the model itself. Why they put this misleading label on their GitHub page would only be speculation.

                Without making the dataset available, nobody can accurately recreate, modify or learn from the model they’ve released. This is the only sane definition of open source available for an LLM, since the model is not in itself code with a “source”.

      • Hnery@feddit.org · ↑2 ↓6 · 21 hours ago

        So… as far as I understand from this thread, it’s basically a finished model (Llama or Qwen) which is then fine-tuned using an unknown dataset? That’d explain the claimed $6M training cost, hiding the fact that the heavy lifting was done by others (US of A’s Meta in this case). Nothing revolutionary to see here, I guess. Small improvements are nice to have, though. I wonder how their smallest models perform; are they any better than llama3.2:8b?

        • ☆ Yσɠƚԋσʂ ☆OP · ↑4 ↓2 · 12 hours ago

          What’s revolutionary here is the use of a mixture-of-experts approach to get far better performance. While it has 671 billion parameters overall, only 37 billion are active at a time, making it very efficient. For comparison, Meta’s Llama 3.1 uses all 405 billion of its parameters at once. It does as well as GPT-4o in the benchmarks, and excels at advanced mathematics and code generation. It also has a 128K-token context window, which means it can process and understand very long documents, and it processes text at 60 tokens per second, twice as fast as GPT-4o.
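
          To make the mixture-of-experts idea concrete, here’s a toy numpy sketch of top-k routing (illustrative only, with made-up sizes, not DeepSeek’s actual code): each token is routed to just a couple of the experts, so only a small slice of the layer’s total parameters does any work per token, which is where the 37B-active-out-of-671B efficiency comes from.

          ```python
          # Toy sketch of mixture-of-experts routing (tiny, made-up sizes).
          # Each token is routed to K of the E experts, so only a fraction of the
          # layer's parameters is used per token.
          import numpy as np

          E, K, D_MODEL, D_FF = 8, 2, 16, 32
          rng = np.random.default_rng(0)

          # each "expert" is a small two-layer feed-forward block
          experts = [(rng.standard_normal((D_MODEL, D_FF)) * 0.02,
                      rng.standard_normal((D_FF, D_MODEL)) * 0.02) for _ in range(E)]
          router = rng.standard_normal((D_MODEL, E)) * 0.02  # gating/router weights


          def moe_layer(x):
              """x: (tokens, D_MODEL) -> (tokens, D_MODEL), K experts active per token."""
              logits = x @ router                           # routing scores, shape (tokens, E)
              chosen = np.argsort(logits, axis=-1)[:, -K:]  # indices of the top-K experts
              out = np.zeros_like(x)
              for t in range(x.shape[0]):
                  scores = logits[t, chosen[t]]
                  gates = np.exp(scores - scores.max())
                  gates /= gates.sum()                      # softmax over the chosen experts only
                  for gate, e in zip(gates, chosen[t]):
                      w1, w2 = experts[e]
                      out[t] += gate * (np.maximum(x[t] @ w1, 0.0) @ w2)  # gated expert output
              return out


          tokens = rng.standard_normal((4, D_MODEL))
          print(moe_layer(tokens).shape)  # (4, 16): same output shape, but only 2 of 8 experts ran per token
          ```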