• Doombot1@lemmy.one
    11 months ago

    Interesting that the article ends with “The new ChatGPT catcher even performed well with introductions from journals it wasn’t trained on”. Isn’t that the whole point? If you only judge a model on what it was trained on, you get a biased picture of how good it is. I can’t remember the exact word for it (overfitting, I think), but it’s essentially over-relying on your own dataset. So of course it will get near-100% accuracy on what it was trained on. I’d be curious to see what the accuracy on other papers is.
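
A minimal sketch of why accuracy on the training data says little, assuming a deliberately silly stand-in "model" that only memorizes what it has seen. The data, the labels, and the `fit_memorizer` helper are all made up for illustration and have nothing to do with the paper's actual classifier:

```python
import random

def accuracy(model, examples):
    """Fraction of (text, label) pairs the model labels correctly."""
    correct = sum(1 for text, label in examples if model(text) == label)
    return correct / len(examples)

def fit_memorizer(train):
    """Hypothetical worst case: memorize training pairs, guess 'human' for anything unseen."""
    table = dict(train)
    return lambda text: table.get(text, "human")

rng = random.Random(0)
# Synthetic data with random labels, so there is nothing real to learn.
data = [(f"intro-{i}", rng.choice(["human", "ai"])) for i in range(1000)]
train, held_out = data[:800], data[800:]

model = fit_memorizer(train)
train_acc = accuracy(model, train)
held_out_acc = accuracy(model, held_out)
print(f"train accuracy:    {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

The memorizer is perfect on the examples it was fitted on and roughly a coin flip on the held-out ones, which is why the held-out number is the only one worth reporting.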

  • Aurenkin@sh.itjust.works
    11 months ago

    Smells like bullshit. The graphs they showed in the source paper with their accuracy at like 100% for every test seem even more like bullshit. Did they run the model over the training data or what?

    Maybe I’m wrong, but text is just way too high a signal-to-noise medium to be able to tell whether it was written by an AI. The false positives would be high enough that it’s effectively useless. Does anyone have another perspective on this? If I’m missing some nuance here I’d love to understand more.

    • Communist
      11 months ago

      It is very easy to get to those numbers if you don’t include the rate of false positives. That is all there is to this, really.
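
To put toy numbers on that (a sketch with an invented evaluation set, not the paper's data or method): a detector that flags every text as AI-written catches 100% of the AI texts, and that looks great until you also count how many human texts it flags.

```python
def rates(predictions, labels):
    """Return (true_positive_rate, false_positive_rate), treating 'ai' as the positive class."""
    tp = sum(p == "ai" and y == "ai" for p, y in zip(predictions, labels))
    fp = sum(p == "ai" and y == "human" for p, y in zip(predictions, labels))
    n_ai = labels.count("ai")
    n_human = labels.count("human")
    return tp / n_ai, fp / n_human

# Made-up evaluation set: 10 AI-written texts, 90 human-written ones.
labels = ["ai"] * 10 + ["human"] * 90
# Degenerate detector that flags everything as AI-written.
flag_everything = ["ai"] * len(labels)

tpr, fpr = rates(flag_everything, labels)
print(f"AI texts caught:        {tpr:.0%}")
print(f"human texts misflagged: {fpr:.0%}")
```

Reporting only the first number hides the fact that every human author in this toy set gets accused too.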