• brucethemoose@lemmy.world
      6 days ago

      For RAG data? It works.

      But it's too slow for the weights. What generative models fundamentally do is run a full pass through the multi-gigabyte weights for every 'word' (token) or diffusion step, so even 128-bit DDR5, like you find on desktop CPUs, is too slow.
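To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes generation is purely memory-bound and that every weight is read once per token, so it ignores caches, batching and compute time. The DDR5-5600 speed and the 40 GB model size are illustrative assumptions, not figures from any specific setup, and the function names are made up for this example.

```python
def peak_bandwidth_gb_s(transfers_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_s * 1e6 * bytes_per_transfer / 1e9


def max_tokens_per_second(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/s if each token needs one full read of the weights."""
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Assumed: dual-channel (128-bit) DDR5-5600, a common desktop configuration.
    ddr5 = peak_bandwidth_gb_s(transfers_mt_s=5600, bus_width_bits=128)
    # Assumed: a hypothetical model whose weights occupy 40 GB.
    weights_gb = 40.0
    print(f"peak bandwidth: {ddr5:.1f} GB/s")
    print(f"upper bound:    {max_tokens_per_second(ddr5, weights_gb):.1f} tokens/s")
```

That works out to a ceiling of only a couple of tokens per second for a model that size, and real-world throughput lands below the theoretical peak. Anything slower than DDR5 scales the ceiling down proportionally.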