• Communist
      11 months ago

      It’s not. This method of analysis is terrible: they’re just asking GPT-4 to grade the responses, not actually testing anything beyond that.

      • @aponigricon@beehaw.org
        English
        11 months ago

        That doesn’t necessarily invalidate the point they’re making. Strikingly, other forms of analysis give pretty much equivalent results.

    • @fruitywelsh
      11 months ago

      I know I personally like Open Assistant more, just because you get fewer blocks.