Hotznplotzn@lemmy.sdf.org to Technology@midwest.socialEnglish · 1 day ago

Researchers say they had a ‘100% attack success rate’ on jailbreak attempts against Chinese AI DeepSeek

blogs.cisco.com

4

11

Researchers say they had a ‘100% attack success rate’ on jailbreak attempts against Chinese AI DeepSeek

blogs.cisco.com

Hotznplotzn@lemmy.sdf.org to Technology@midwest.socialEnglish · 1 day ago

4

Evaluating Security Risk in DeepSeek and Other Frontier Reasoning Models

blogs.cisco.com

The performance of DeepSeek models has made a clear impact, but are these models safe and secure? We use algorithmic AI vulnerability testing to find out.

cross-posted from: https://lemmy.sdf.org/post/28910537

Archived

Researchers claim they had a ‘100% attack success rate’ on jailbreak attempts against Chinese AI DeepSeek

“DeepSeek R1 was purportedly trained with a fraction of the budgets that other frontier model providers spend on developing their models. However, it comes at a different cost: safety and security,” researchers say.

A research team at Cisco managed to jailbreak DeepSeek R1 with a 100% attack success rate. This means that there was not a single prompt from the HarmBench set that did not obtain an affirmative answer from DeepSeek R1. This is in contrast to other frontier models, such as o1, which blocks a majority of adversarial attacks with its model guardrails.

…

In other related news, experts are cited by CNBC that DeepSeek’s privacy policy “isn’t worth the paper it is written on."

…

Chat

Hotznplotzn@lemmy.sdf.orgOP
link
fedilink
English
arrow-up
4
arrow-down
1·
21 hours ago
@Onno

No, it’s not entirely open source as the datasets and code used to train the model are not.