• Ulu-Mulu-no-die@lemmy.world
    1 year ago

    I don’t think being a customer would work either. Language models are still being trained, and no one knows exactly how users’ queries are used. That’s a big no-no for any company that has to protect its secrets.

    A self-hosted instance is a much better solution, if not the only “safe” one from that point of view. We’ll get there.

    • MentalEdge@sopuli.xyz
      1 year ago

      Interaction data does not become training data, unless you want it to.

      I know that exactly how a piece of software created using machine learning works is unknowable, but training data and interaction data are not the same thing. ChatGPT in particular is designed to be restored to a known good start state, and it only uses query data for context awareness within a given session, not to train itself.

      Each query simply includes all previous queries, for context. That’s part of why it becomes increasingly erratic the longer a session goes on.

      And unless you do train with a given piece of data, that data is not entered into the LLM in any way. Not even in the undefined, unknowable way.
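
      To make the "each query includes all previous queries" point concrete, here is a rough sketch of how a chat session can work on top of a model that never changes. Everything in it is made up for illustration: `frozen_model` is a hypothetical stand-in for an LLM with fixed weights, and `ChatSession` is not any vendor's real API. It only assumes the model is a pure function of the prompt it is handed.

```python
# Hypothetical stand-in for an LLM with fixed weights: the reply depends
# only on the prompt text passed in, and nothing is stored between calls.
def frozen_model(prompt: str) -> str:
    return f"(reply based on {len(prompt.splitlines())} line(s) of context)"


class ChatSession:
    """Keeps the conversation on the client side and resends it every turn."""

    def __init__(self, model):
        self.model = model
        self.history = []  # lives in the session, not in the model

    def ask(self, query: str) -> str:
        self.history.append(f"User: {query}")
        # The whole transcript so far becomes the prompt for this turn.
        prompt = "\n".join(self.history)
        reply = self.model(prompt)
        self.history.append(f"Assistant: {reply}")
        return reply


session = ChatSession(frozen_model)
first = session.ask("What is a vector database?")
second = session.ask("Can I host one myself?")

# A brand-new session starts from the same known state: no earlier context.
fresh = ChatSession(frozen_model).ask("Can I host one myself?")

print(first)   # (reply based on 1 line(s) of context)
print(second)  # (reply based on 3 line(s) of context)
print(fresh)   # (reply based on 1 line(s) of context)
```

      The prompt grows with every turn, which fits with long sessions getting erratic, and throwing the session away resets everything because the model itself was never modified. Whether a provider separately logs queries for later training is a policy question this sketch says nothing about.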