• bassomitron@lemmy.world
    11 months ago

    Do you know if there are any plans to quantize it? I’d love to test it, but unfortunately my 3090 can’t handle 70B models without quantization.

    • midnight@kbin.social
      11 months ago

      There are quantized versions on Hugging Face. There’s a q2 version, but idk how well that performs.

    • ffhein@lemmy.world
      11 months ago

      Only quantized versions of the model were leaked. If you see an unquantized version, it was recreated from the quantized files and is not the original model. People have also requantized it from GGUF to EXL2, and probably to other formats too.
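
For a rough sense of why a 70B model needs quantization on a 3090, here is a back-of-the-envelope sketch in Python. It assumes the card has about 24 GB of VRAM and counts only the weights at a nominal bit width; real quantized files store extra scale data, and the KV cache and activations need memory on top of that, so treat the numbers as lower bounds. The function name `weight_memory_gb` is made up for this illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 70e9   # 70B parameters
VRAM_GB = 24.0    # assumed VRAM of an RTX 3090

# Nominal bit widths only; actual quantization formats use somewhat more
# bits per weight than their name suggests.
for bits in (16, 8, 4, 2):
    need = weight_memory_gb(N_PARAMS, bits)
    verdict = "fits" if need <= VRAM_GB else "does not fit"
    print(f"{bits:>2}-bit: {need:6.1f} GB -> {verdict} in {VRAM_GB:.0f} GB")
```

By this estimate only the roughly 2-bit variant comes in under 24 GB for the weights alone, which is why the q2 version mentioned above is the one that matters for a single 3090, unless part of the model is offloaded to system RAM.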