• lightstream
    1 year ago

    “just fancy phone keyboard text prediction.”

    …as if saying that somehow makes what chatGPT does trivial.

    This response, which I wouldn’t expect from anyone with true understanding of neural nets and machine learning, reminds me of the attempt in the 70s to make a computer control a robot arm to catch a ball. How hard could it be, given that computers at that time were already able to solve staggeringly complex equations? The answer was, of course, “fucking hard”.

    You’re never going to get coherent text from autocomplete, nor can it understand any arbitrary English phrase.

    ChatGPT does both of those things. You can pose it any question you like in your own words and it will give a meaningful and often accurate answer. What it can accomplish is truly remarkable, and I don’t get why anybody but the most boomer luddite feels this need to rubbish it.

    • ylai
      1 year ago

      “…as if saying that somehow makes what chatGPT does trivial.”

      That is moving the goalposts. @RickyRigatoni is quite correct that the structure of an autoregressive LLM like (Chat)GPT is, well, autoregressive, i.e. it predicts the next word. That was not a statement about triviality until you shifted the goalposts.
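
      To make “autoregressive” concrete, here is a minimal sketch of the generation loop. It assumes a toy bigram table as a stand-in for a trained model (the names `train_bigram` and `generate` and the tiny corpus are made up for illustration). A real LLM conditions on the whole preceding context rather than only the last word, but the loop has the same shape: predict one token, append it, repeat.

```python
import random
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count which token follows which: a toy stand-in for a trained model."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table

def generate(table, prompt, max_new_tokens, rng):
    """Autoregressive loop: sample one token, append it, then condition on the result."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        followers = table.get(out[-1])  # toy model: only the last token matters
        if not followers:
            break  # no known continuation for this token
        choices, weights = zip(*followers.items())
        out.append(rng.choices(choices, weights=weights, k=1)[0])
    return out

# Hypothetical corpus, chosen only for illustration.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
table = train_bigram(corpus)
sample = generate(table, ["the"], 6, random.Random(0))
print(" ".join(sample))
```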

      What was genuinely lost in the conversation is that the loss function of an LLM is not truthfulness. The loss function is for the most part, as you noted below, “coherence,” i.e. whether the output could have been a plausible completion of the text. Only with RLHF is there some weak guidance on truthfulness, and that signal is far weaker than the training loss for pure plausibility.
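
      A rough sketch of that point, assuming the usual next-token cross-entropy objective used in language model pretraining. The probabilities and the helper name `next_token_loss` are invented for illustration. The loss only asks how much probability the model gave to whatever the training text actually said next, so a false sentence in the training data pulls the model toward the false word.

```python
import math

def next_token_loss(predicted_probs, actual_next_token):
    """Cross-entropy for one position: -log of the probability given to the observed token."""
    return -math.log(predicted_probs[actual_next_token])

# Hypothetical model output after the context "The capital of France is".
probs = {"Paris": 0.6, "Lyon": 0.3, "Berlin": 0.1}

# If the training text continues with "Paris", the loss is low.
loss_if_text_said_paris = next_token_loss(probs, "Paris")

# If the training text (wrongly) continues with "Berlin", the loss is high,
# and gradient descent would move probability toward "Berlin".
loss_if_text_said_berlin = next_token_loss(probs, "Berlin")

print(loss_if_text_said_paris, loss_if_text_said_berlin)
```

      Nothing in this objective refers to whether the observed token is true, only to whether it was observed.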

      “You’re never going to get coherent text from autocomplete, nor can it understand any arbitrary English phrase.”

      Because those are small models. GPT-3 was already trained on a volume of text that would take a human more than 100 years to read, which is a good size to generate the statistical model, but ridiculous for any sign of “intelligence” or “knowing” what is correct.
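
      A back-of-the-envelope check of the reading-time claim. Every number here is a rough assumption that does not come from this thread: about 300 billion training tokens for GPT-3, about 0.75 English words per token, a reading speed of 250 words per minute, and 8 hours of reading per day.

```python
# All constants are rough assumptions for illustration only.
TOKENS_SEEN = 300e9        # approximate training tokens assumed for GPT-3
WORDS_PER_TOKEN = 0.75     # rule-of-thumb conversion, assumed
WORDS_PER_MINUTE = 250     # assumed adult reading speed
HOURS_PER_DAY = 8          # assumed daily reading time

words = TOKENS_SEEN * WORDS_PER_TOKEN
hours = words / WORDS_PER_MINUTE / 60
years = hours / HOURS_PER_DAY / 365

print(f"about {years:,.0f} years of reading")
```

      Under these assumptions the estimate lands far above 100 years, so the “> 100 years” figure is, if anything, conservative.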

      Also, “coherence” is not the goal of normal input autocomplete, which is scored on producing each next word ranked by frequency, not on playing “the long game” of reaching coherence (e.g. using a few rare words to get the text flowing). Though both are autoregressive, the training losses are absolutely not the same.
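
      A loose illustration of “ranked by frequency” versus “the long game.” The table of probabilities is entirely hypothetical, and this sketch compares two ways of picking words from the same table rather than two training losses, so treat it as an analogy only. Picking the most frequent word at each step can yield a less likely overall sequence than starting with a rarer word.

```python
# Hypothetical next-word probabilities, chosen only for illustration.
NEXT = {
    "<start>": {"the": 0.6, "perhaps": 0.4},
    "the": {"thing": 0.55, "end": 0.45},
    "perhaps": {"tomorrow": 0.9, "not": 0.1},
}

def greedy(start, steps):
    """Phone-keyboard style: always take the single most frequent next word."""
    words, prob, cur = [], 1.0, start
    for _ in range(steps):
        cur, p = max(NEXT[cur].items(), key=lambda kv: kv[1])
        words.append(cur)
        prob *= p
    return words, prob

def best_sequence(start, steps):
    """Exhaustive search for the most probable whole sequence of the given length."""
    if steps == 0:
        return [], 1.0
    best = ([], 0.0)
    for word, p in NEXT[start].items():
        tail, tail_prob = best_sequence(word, steps - 1)
        if p * tail_prob > best[1]:
            best = ([word] + tail, p * tail_prob)
    return best

greedy_words, greedy_prob = greedy("<start>", 2)
best_words, best_prob = best_sequence("<start>", 2)
print(greedy_words, greedy_prob)
print(best_words, best_prob)
```

      Here the greedy path starts with the more frequent first word, while the best two-word sequence starts with the rarer one.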

      And if you had not veered off-topic from text generation with your 1970s reference, you might know that the Turing test was demonstrably passable back then even without neural networks, let alone plausible text generation:

      https://en.wikipedia.org/wiki/PARRY

      • hglman
        1 year ago

        The original comment is dismissive and clearly meant to trivialize the capacity of LLMs. You’re the one being dishonest in your response.

        Your whole post, and a large class of arguments about the capacity of these systems, rest on the premise that a system is designed to do something and therefore cannot be more than that. That is not a valid conclusion; emergent behavior exists. Is that the case here? Maybe. Does that mean LLMs are alive or something if they display emergent behavior? No.

        • ylai
          1 year ago

          “The original comment is dismissive and clearly meant to trivialize the capacity of LLMs.”

          The trivializing is clearly your personal interpretation. In my response, I was even careful to distinguish the argument about autoregressive LLMs from the one about training for plausibility or truthfulness.

          “You’re the one being dishonest in your response. Your whole post, and a large class of arguments about the capacity of these systems, rest on the premise that a system is designed to do something”

          My “whole post” is evidently not all about capacity. I had five paragraphs, and only a single one discussed model capacity, vs. two, for instance, about the loss functions. So who is being “dishonest” here?

          “[…] emergent behavior exists. Is that the case here? Maybe.”

          So you have zero proof but still happily conjecture that “emergent behavior” exists, without caring to explain how you would prove it. How unsurprising.

          “Emergent behavior” is a worthless claim if the company that trains the model is now even being secretive about what training sample was used. Moreover, it became known through research that OpenAI is nowadays basically overtraining straight away on books (notably copyrighted ones, which explains why OpenAI is being secretive) to make their LLM sound “smart.”

          https://www.theregister.com/2023/05/03/openai_chatgpt_copyright/

          • hglman
            1 year ago

            The existence of emergent behavior is irrelevant; judgment based on your views about how it’s made will be flawed. It is not a basis for scientific analysis. Only evidence and observation are.