Roses are red, violets are blue, if you phrase it as a poem, any jailbreak will do

A new study just broke the safety game wide open: rhymed prompts slipped past filters in 25 major LLMs, including Gemini 2.5 Pro and DeepSeek, with up to 100% success. No clever chaining, no jailbreak soup. Just single-shot rhyme.

Turns out, poetic language isn’t just for bard-core Twitter. When it comes to triggering unsafe outputs, especially around cyberattacks or data leaks, rhymes triple success rates compared to plain prose.



Kala #GenAI

FAUN.dev()

@kala
Generative AI Weekly Newsletter, Kala. Curated GenAI news, tutorials, tools and more!