• Arthur Besse
    1 year ago

    Unless I’m missing something, this person is just using https://github.com/tatsu-lab/stanford_alpaca

    It’s nice of the researchers at Stanford to make it easy for people to use offline, but this is not in any way “the full ChatGPT knowledge set” as this finance bro on Twitter (who plans to make investment decisions with it, lmao) claims it is.

    It is actually Facebook’s 7B-parameter LLaMA model (the small one; they also have a 65B-parameter version, which takes tens of thousands of dollars’ worth of GPUs to run locally), fine-tuned on just 52K example instruction/response pairs generated with OpenAI’s text-davinci-003 model (which predates ChatGPT). Perhaps surprisingly, that small fine-tuning set was enough to make it follow instruction prompts.
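
    For a concrete sense of what that looks like in practice, here is a rough sketch of running an Alpaca-style fine-tune locally with Hugging Face transformers. The weights path is a placeholder (LLaMA weights aren’t freely redistributable, so you have to obtain and merge them yourself); the prompt template is the one the stanford_alpaca repo trains with.

    ```python
    # Rough sketch: local inference with an Alpaca-style fine-tune of LLaMA-7B.
    # "path/to/alpaca-7b" is a placeholder, not a real model ID.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_PATH = "path/to/alpaca-7b"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH,
        torch_dtype=torch.float16,  # ~14 GB of weights for 7B params in fp16
        device_map="auto",          # needs `accelerate`; offloads to CPU if the GPU is small
    )

    # Prompt template used by the stanford_alpaca training data (no-input variant).
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\nExplain what instruction fine-tuning is in one sentence.\n\n"
        "### Response:\n"
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))
    ```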

    You can try it here: https://crfm.stanford.edu/alpaca/ and you will see it is nowhere close to ChatGPT.

    • ☆ Yσɠƚԋσʂ ☆ (OP)
      1 year ago

      Yeah, this has to be pretty limited compared to actual ChatGPT. The RAM required for models at the GPT-3/4 scale is well beyond consumer hardware: a 13B-parameter model takes about 8 GB once quantized, and GPT-3 has around 175B parameters as I recall.
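
      The back-of-the-envelope math is just parameter count times bytes per weight, plus some headroom for activations and the KV cache. A quick sketch (the 20% overhead factor is a rough guess, not a measurement):

      ```python
      # Rough model-memory estimate: parameters * bytes per weight, plus overhead.

      def model_ram_gb(n_params: float, bits_per_weight: float, overhead: float = 1.2) -> float:
          """Approximate RAM needed to hold a model, in GB."""
          return n_params * (bits_per_weight / 8) * overhead / 1e9

      for name, n in [("LLaMA-7B", 7e9), ("LLaMA-13B", 13e9),
                      ("LLaMA-65B", 65e9), ("GPT-3 (175B)", 175e9)]:
          print(f"{name:>12}: fp16 ~{model_ram_gb(n, 16):6.1f} GB, "
                f"4-bit ~{model_ram_gb(n, 4):5.1f} GB")
      ```

      A 4-bit 13B model lands right around 8 GB, while 175B needs hundreds of GB in fp16, which is why that scale only runs on racks of datacenter GPUs.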

      I think the way to get something comparable would be an approach like the one used for BLOOM, where a torrent-style distributed system effectively creates a crowd-sourced supercomputer.
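
      That is roughly what the Petals project does for BLOOM: each peer hosts a slice of the model’s transformer blocks and requests get pipelined across the swarm. A rough sketch of the client side, assuming the petals package; the class and checkpoint names below follow its README and may differ between versions:

      ```python
      # Rough sketch of swarm-style distributed inference, assuming the petals
      # library (https://github.com/bigscience-workshop/petals). The class and
      # checkpoint names follow its README and may vary by version.
      from transformers import AutoTokenizer
      from petals import AutoDistributedModelForCausalLM

      model_name = "bigscience/bloom"  # blocks are served by volunteer peers, not downloaded locally

      tokenizer = AutoTokenizer.from_pretrained(model_name)
      model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

      inputs = tokenizer("A swarm of consumer GPUs can", return_tensors="pt")["input_ids"]
      outputs = model.generate(inputs, max_new_tokens=32)
      print(tokenizer.decode(outputs[0]))
      ```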