Claude was being judgy, so I called it out. It immediately caved. Is verbal abuse a valid method of circumventing LLM censorship??

  • froztbyte@awful.systems
    5 hours ago

    is when the blighted thing pretends it’s anything at all like a person and …

    one of the helpful things that I think is worth reminding people of is that the thing described here is entirely the result of overt intent and choice on the part of humans shaping the fucking thing

    it isn’t an emergent property

    from the curation of data going into training to shaping “preferred output” (by tuning whatever properties may apply in xyz llm execution pipeline), this shit is all Super Polite by fucking design

    exact same shit applies for the prompts responding “oh yeah I’m totes alive! look, I feel doubted that you even asked! :sad:” - that shit is curated and engineered

    but I do hear where you’re coming from on the anger. 100%.