• NutWrench · 10 points · 22 hours ago

    The “1 trillion” never existed in the first place. It was all hype by a bunch of Tech-Bros, huffing each other’s farts.

        • Phoenicianpirate@lemm.ee · 2 points · 1 day ago

          Thank you very much. I did ask ChatGPT some technical questions about some… subjects… but having something that is private AND can give me all the information I want/need is a godsend.

          Goodbye, ChatGPT! I barely used you, but that is a good thing.

      • Mongostein@lemmy.ca · +9 / -6 · 2 days ago

        Yeah, but you have to run a different model if you want accurate info about China.

        • Alsephina · 2 points · 41 minutes ago

          Unfortunately it’s trained on the same US-propaganda-filled English data as any other LLM and spits out the same talking points. The censors are easy to bypass, too.

        • Phoenicianpirate@lemm.ee · 5 points · 1 day ago

          Yeah, but China isn’t my main concern right now. I’ve got plenty of questions to ask and knowledge to seek, and I’d rather not broadcast that stuff to a bunch of busybody jackasses.

          • Mongostein@lemmy.ca · +1 / -2 · 1 day ago

            I agree. I don’t know enough about all the different models, but surely there’s a model that’s not going to tell you “<whoever’s> government is so awesome” when you ask about rainfall or some shit.

        • boomzilla@programming.dev · 5 points · 2 days ago (edited)

          I watched one video and read 2 pages of text, so take this with a mountain of salt. From that I gathered that DeepSeek R1 is the model you interact with when you use the app. The complexity of a model is expressed as its number of parameters (though I don’t know yet what those are), which dictates its hardware requirements. R1 contains 670 bn parameters and requires very, very beefy server hardware; a video said it would take tens of GPUs. And it seems you want a lot of VRAM on your GPU(s), because that’s what AI craves. I’ve also read that 1 bn parameters require about 2 GB of VRAM.
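
          (A back-of-the-envelope sketch of that rule of thumb, assuming plain FP16 weights at ~2 bytes per parameter and ignoring context/activation overhead; quantized models need correspondingly less:)

          ```python
          # Rough VRAM estimate: parameter count times bytes per parameter.
          # Assumes FP16 (2 bytes/param); 4-bit quantization needs ~1/4 of this.
          def vram_gb(params_bn: float, bytes_per_param: float = 2.0) -> float:
              return params_bn * bytes_per_param  # 1 bn params * 2 bytes ~= 2 GB

          print(vram_gb(670))  # full R1: ~1340 GB -> a rack of server GPUs
          print(vram_gb(3))    # a 3 bn model: ~6 GB -> consumer-GPU territory
          ```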

          Got a 6-core Intel, a 1060 with 6 GB VRAM, 16 GB RAM, and EndeavourOS as a home server.

          I just installed Ollama in about half an hour, using Docker on the above machine, with no previous experience with neural nets or LLMs apart from chatting with ChatGPT. The installation includes Open WebUI, which seems better than the default UI you get with ChatGPT. I downloaded the qwen2.5:3b model (see https://ollama.com/search), which contains 3 bn parameters. I was blown away by the result. It speaks multiple languages (including displaying e.g. hiragana), knows how many fingers a human has, can calculate, can write valid Rust code and explain it, and it is much faster than what I get from free ChatGPT.

          The WebUI offers a nice feedback form for every answer, where you can give hints to the AI via text, a 10-point rating, and thumbs up/down. I don’t know how it incorporates that feedback, though. The WebUI seems to support speech-to-text and vice versa. I’m eager to see if this Docker setup even offers APIs.
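
          (For what it’s worth, Ollama itself exposes a plain HTTP API, by default on port 11434, separate from the WebUI. A minimal sketch, assuming the qwen2.5:3b model from above is already pulled:)

          ```python
          # Query a local Ollama instance over its built-in REST API.
          # Assumes the default port (11434) and that qwen2.5:3b is pulled.
          import json
          import urllib.request

          payload = json.dumps({
              "model": "qwen2.5:3b",
              "prompt": "How many fingers does a human have?",
              "stream": False,  # one complete JSON reply instead of a stream
          }).encode()

          req = urllib.request.Request(
              "http://localhost:11434/api/generate",
              data=payload,
              headers={"Content-Type": "application/json"},
          )
          with urllib.request.urlopen(req) as resp:
              print(json.loads(resp.read())["response"])
          ```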

          I probably won’t use the proprietary stuff anytime soon.

          • tooclose104@lemmy.ca · 5 points · 2 days ago

            Apparently it runs on phones too! Like 3 cards down was another post linking to instructions on how to run it locally on a phone, in a container app or Termux. Really interesting. I may try it out in a VM on my server.

    • CeeBee_Eh@lemmy.world · +14 / -6 · 2 days ago

      I asked it about Tiananmen Square; it told me it can’t answer that because it can only give “harmless” responses.

      • vga@sopuli.xyz · 3 points · 1 day ago (edited)

        They’d need to do some pretty fucking advanced hackery to be able to do surveillance on you just via the model. Everything’s possible, I guess, but… yeah, perhaps not.

        If they could do that, essentially nothing you do on your computer would be safe.

  • SocialMediaRefugee · 59 points · 2 days ago

    This just shows how speculative the whole AI obsession has been. Wildly unstable and subject to huge shifts since its value isn’t based on anything solid.

    • ByteJunk@lemmy.world · +9 / -2 · 2 days ago

      It’s based on guessing what the actual worth of AI is going to be, so yeah, wildly speculative at this point because breakthroughs seem to be happening fairly quickly, and everyone is still figuring out what they can use it for.

      There are many clear use cases that are solid, so AI is here to stay, that’s for certain. But how far it can go, and what it will require, is what the market is gambling on.

      If a new model comes out of the blue that delivers similar results on a fraction of the hardware, that chops valuations down by a lot.

      If someone finds another use case, for example a model with new capabilities, boom, value goes up.

      It’s a rollercoaster…

      • WoodScientist@sh.itjust.works · +14 / -4 · 2 days ago

        There are many clear use cases that are solid, so AI is here to stay, that’s for certain. But how far it can go, and what it will require, is what the market is gambling on.

        I would disagree on that. There are a few niche uses, but OpenAI can’t even make a profit charging $200/month.

        The uses seem pretty minimal as far as I’ve seen. Sure, AI has a lot of applications in terms of data processing, but the big generic LLMs propping up companies like OpenAI? Those seem to have no utility beyond slop generation.

        Ultimately the market value of any work produced by a generic LLM is going to be zero.

        • UndercoverUlrikHD@programming.dev · +3 / -11 · 2 days ago

          It’s difficult to take your comment seriously when it’s clear that what you’re saying seems to be based on ideological reasons rather than real ones.

          Besides that, a lot of the value is derived from the market trying to figure out if/which company will develop AGI. Whatever company manages to achieve it will easily become the most valuable company in the world, so people fomo into any AI company that seems promising.

          • Jhex@lemmy.world · 14 points · 2 days ago

            Besides that, a lot of the value is derived from the market trying to figure out if/which company will develop AGI. Whatever company manages to achieve it will easily become the most valuable company in the world, so people fomo into any AI company that seems promising.

            There is zero reason to think the current slop-generating technoparrots will ever lead to AGI. That premise is entirely made up to fuel the current “AI” bubble.

            • Leg@sh.itjust.works · 1 point · 1 day ago

              They may well lead to the thing that leads to the thing that leads to the thing that leads to AGI though. Where there’s a will

              • Jhex@lemmy.world · +1 / -1 · 1 day ago

                Sure, but that can be said of literally anything. It would be interesting if LLMs were at least new, but they have been around forever; we just now have better hardware to run them.

                • NιƙƙιDιɱҽʂ@lemmy.world · +1 / -1 · 23 hours ago (edited)

                  That’s not even true. LLMs in their modern iteration are significantly enabled by transformers, something that was only proposed in 2017.

                  The conceptual foundations of LLMs stretch back to the 50s, but neither the physical hardware nor the software architecture were there until more recently.

        • NιƙƙιDιɱҽʂ@lemmy.world · +3 / -5 · 2 days ago

          Language learning, code generation, brainstorming, summarizing. AI has a lot of uses. You’re just either not paying attention or biased against it.

          It’s not perfect, but it’s also a very new technology that’s constantly improving.

          • Toofpic@feddit.dk · +2 / -1 · 1 day ago

            I decided to close the post now. There is a place for any opinion, but I can see people writing things which are completely false however you look at them: you can dislike Sam Altman (I do), and you can worry about China’s interest in entering the competition now, and the like (I do), but the comments about LLMs being useless, while millions of people use them daily for multiple purposes, sound just like lobbying.

    • Cowbee [he/they] · 25 points · 2 days ago

      On the bright side, the clear fragility and lack of direct connection to real productive forces shows the instability of the present system.

      • leftytighty@slrpnk.net · +10 / -1 · 2 days ago

        And no matter how many protectionist measures the US implements, we’re seeing that it’s losing the global competition. I guess protectionism and oligarchy aren’t the best ways to accomplish the stated goals of a capitalist economy. How soon before China is leading in every industry?

        • Cowbee [he/they] · +12 / -1 · 2 days ago

          This conclusion was foregone when China began to focus on developing the Productive Forces and the US took that for granted. Without a hard pivot, the US can’t even hope to catch up to China’s productive trajectory, and even if it does pivot hard, that doesn’t mean it has a chance to catch up in the first place.

          In fact, protectionism has frequently backfired, driving other nations to seek inclusion in BRICS or more favorable relations with BRICS nations.

    • Eatspancakes84@lemmy.world · +6 / -1 · 2 days ago

      That’s the thing: if the cost of AI goes down and AI is a valuable input to businesses, that should be a good thing for the economy. Not for the tech sector that sells these models, to be sure, but for all of the companies buying these services it should be great.

  • Doomsider@lemmy.world · +79 / -1 · 2 days ago

    Wow, China just fucked up the Techbros more than the Democratic or Republican party ever has or ever will. Well played.

    • kshade@lemmy.world · 10 points · 2 days ago

      It’s kinda funny. Their magical bullshitting machine scored higher on made-up tests than our magical bullshitting machine, and now the economy is in shambles! It’s like someone losing a year’s wages in sports betting.

      • Naia@lemmy.blahaj.zone · 4 points · 2 days ago

        Just because people are misusing tech they know nothing about does not mean this isn’t an impressive feat.

        If you know what you are doing, and know enough to tell when it gives you garbage, LLMs are really useful, but part of using them correctly is giving them grounding context instead of just blindly asking questions.

        • kshade@lemmy.world · 5 points · 2 days ago

          It is impressive, but the marketing around it has really, really gone off the deep end.

    • UnderpantsWeevil@lemmy.world · 8 points · 2 days ago

      Democrats and Republicans have been shoveling truckload after truckload of cash into a Potemkin Village of a technology stack for the last five years. A Chinese tech company just came in with a dirt cheap open-sourced alternative and I guarantee you the American firms will pile on to crib off the work.

      Far from fucking them over, China just did the Americans’ homework for them. They just did it in a way that undercuts all the “Sam Altman is the Tech Messiah! He will bring about AI God!” holy roller nonsense that was propping up a handful of mega-firm inflated stock valuations.

      Small and mid-cap tech firms will flourish with these innovations. Microsoft will have to write off the last $13B it sunk into OpenAI as a loss.

    • Valmond@lemmy.world · 1 point · 1 day ago

      Didn’t Donald add like $500B for AI? Seems it’s almost enough to pay for the $600B Nvidia lost…

  • PlutoniumAcid@lemmy.world · 20 points · 2 days ago

    So if the Chinese version is so efficient, and is open source, then couldn’t OpenAI and Anthropic run the same thing on their huge hardware and get enormous capacity out of it?

    • AdrianTheFrog@lemmy.world · 9 points · 2 days ago

      OpenAI could use less hardware to get similar performance if they used the Chinese version, but they already have enough hardware to run their model.

      Theoretically the best move for them would be to train their own, larger model using the same technique (so as to still fully utilize their hardware), but this is easier said than done.

    • Jhex@lemmy.world · +12 / -2 · 2 days ago

      Not necessarily… if I gave you my “faster car” to run on your private 7-lane highway, you could definitely squeeze out every last bit of speed the car gives, but no more.

      DeepSeek works as intended on 1% of the hardware the others allegedly “require” (allegedly; remember, this is all a super-hyped bubble)… if you run it on super-powerful machines, it will perform better, but only to a certain extent… it will not suddenly develop more/better qualities just because the hardware it runs on is better.

      • merari42@lemmy.world · 2 points · 1 day ago

        Didn’t DeepSeek solve some of the data-wall problems by creating good chain-of-thought data with an intermediate RL model? That approach should work with the tried-and-tested scaling laws, just using much more compute.

      • PlutoniumAcid@lemmy.world · 4 points · 2 days ago

        This makes sense, but it would still allow a hundred times more people to use the model without running into limits, no?

    • Yggnar@lemmy.world · 3 points · 2 days ago

      It’s not multimodal, so I’d have to imagine it wouldn’t be worth pursuing in that regard.

  • Clent@lemmy.dbzer0.com · 26 points · 2 days ago

    No surprise. American companies are chasing fantasies of general intelligence rather than optimizing for today’s reality.

    • Naia@lemmy.blahaj.zone · 23 points · 2 days ago

      That, and they are just brute-forcing the problem. Neural nets have been around forever, but it’s only in the last 5 or so years that they could do anything. There’s been little to no real breakthrough innovation; they just keep throwing more processing power at it with more inputs, more layers, more nodes, more links, more CUDA.

      And their chasing of general AI is just short-sighted: they want to replace workers with something they don’t have to pay and that won’t argue about its rights.

      • supersquirrel@sopuli.xyz · 2 points · 2 days ago (edited)

        Also, all of these technologies must forever and inescapably rely on a foundation of trust with users and with the people who are the sources of quality training data; “trust” being something US tech companies seem hell-bent on lighting on fire and pissing off the decks of their CEOs’ yachts.

  • JOMusic · +53 / -2 · 2 days ago

    and it’s open-source!

  • Arehandoro · +58 / -4 · 2 days ago

    Nvidia’s most advanced chips, H100s, have been banned from export to China since September 2022 by US sanctions. Nvidia then developed the less powerful H800 chips for the Chinese market, although they were also banned from export to China last October.

    I love how in the US they talk about meritocracy, competition being good, blablabla… but they rig the game from the beginning. And even so, people find a way to be better. Fascinating.

    • shawn1122@lemm.ee · 28 points · 2 days ago

      You’re watching an empire in decline. Its words stopped matching its actions decades ago.

    • Breve@pawb.social · +16 / -1 · 2 days ago (edited)

      Don’t forget about the tariffs too! The US economy is actually a joke that can’t compete on the world stage anymore, except by wielding the enormous capital of a handful of tech billionaires.

    • Joe Dyrt · 12 points · 2 days ago

      Joke’s on them: I never started a subscription!

    • Zink@programming.dev · 2 points · 2 days ago

      I don’t have one to cancel, but I might celebrate today by formatting the old windows SSD in my system and using it for some fast download cache space or something.

  • toothbrush@lemmy.blahaj.zone · 128 points · 3 days ago (edited)

    One of those rare lucid moments by the stock market? Is this the market correction that everyone knew was coming, or is some famous techbro going to technobabble some more about AI overlords and send valuations back to their fantasy levels?

    • themoonisacheese@sh.itjust.works · +102 / -1 · 3 days ago

      It’s quite lucid. The new thing uses a fraction of the compute of the old thing for the same results, so Nvidia cards, for example, are going to be in way less demand. That being said, Nvidia stock was way too high, surfing on the AI hype for the last two years or so, and despite the plunge it’s not even back to normal.

      • jacksilver@lemmy.world · +32 / -6 · 3 days ago

        My understanding is that it’s just an LLM (not multimodal) and the training time/cost looks about the same as for most of these.

        I feel like the world’s gone crazy, but OpenAI (and others) are pursuing more complex, multimodal model designs. Those are going to be more expensive due to image/video/audio processing. Unless I’m missing something, that would probably account for the cost difference between current and previous iterations.

        • will_a113 · 39 points · 3 days ago

          The thing is that R1 is being compared to GPT-4, or in some cases GPT-4o. That model cost OpenAI something like $80M to train, so saying R1 has roughly equivalent performance at an order of magnitude less cost is not for nothing. DeepSeek also says the model is much cheaper to run for inference as well, though I can’t find any figures on that.

          • jacksilver@lemmy.world · +5 / -3 · 3 days ago

            My main point is that GPT-4o and the other models it’s being compared to are multimodal, while R1 is only an LLM, from what I can find.

            Something trained on audio/pictures/videos/text is probably going to cost more than just text.

            But maybe I’m missing something.

            • will_a113 · 24 points · 3 days ago

              The original GPT-4 is just an LLM though, not multimodal, and its training cost is still estimated to be over 10x R1’s, if you believe the numbers. I think where R1 is compared to GPT-4o is in so-called reasoning, where you can see the chain of thought or internal prompt paths that the model uses to (expensively) produce an output.

              • jacksilver@lemmy.world · +5 / -2 · 3 days ago (edited)

                I’m not sure how good a source it is, but Wikipedia says it was multimodal and came out about two years ago: https://en.m.wikipedia.org/wiki/GPT-4

                That being said, the comparisons are against GPT-4o’s LLM benchmarks, so maybe that’s a valid argument for the LLM capabilities.

                However, I think a lot of the more recent models are pursuing architectures with the ability to act on their own, like Claude’s computer use (https://docs.anthropic.com/en/docs/build-with-claude/computer-use), which DeepSeek R1 is not attempting.

                Edit: and I think the real money will be in the more complex models focused on workflow automation.

              • veroxii@aussie.zone · 4 points · 2 days ago

                Holy smoke balls. I wonder what else they have ready to release over the next few weeks. They might have a whole suite of things just waiting to be strategically deployed.

          • Zaktor@sopuli.xyz · +9 / -1 · 3 days ago

            And the data is not available. Knowing the weights of a model doesn’t really tell us much about its training costs.

      • davelA · 6 points · 3 days ago
        3 days ago

        If AI is cheaper, then we may use even more of it, and that would soak up at least some of the slack, though I have no idea how much.

  • protist@mander.xyz · 106 points · 3 days ago

    Emergence of DeepSeek raises doubts about sustainability of western artificial intelligence boom

    Is the “emergence of DeepSeek” really what raised doubts? Are we really sure there haven’t been lots of doubts raised previous to this? Doubts raised by intelligent people who know what they’re talking about?

    • floofloof@lemmy.ca · 31 points · 3 days ago (edited)

      Ah, but those “intelligent” people cannot be very intelligent if they are not billionaires. After all, the AI companies know exactly how to assess intelligence:

      Microsoft and OpenAI have a very specific, internal definition of artificial general intelligence (AGI) based on the startup’s profits, according to a new report from The Information. … The two companies reportedly signed an agreement last year stating OpenAI has only achieved AGI when it develops AI systems that can generate at least $100 billion in profits. That’s far from the rigorous technical and philosophical definition of AGI many expect. (Source)