Activity - I think if the 2nd LLM has ever seen the actual prompt, then no, you could just...

Welcome to Incremental Social! Learn more about this project here!
Check out lemmyverse to find more communities to join from here!

teawrecks , 2 months ago

I think if the 2nd LLM has ever seen the actual prompt, then no, you could just jailbreak the 2nd LLM too. But you may be able to create a bot that is really good at spotting jailbreak-type prompts in general, and then prevent it from going through to the primary one. I also assume I'm not the first to come up with this and OpenAI knows exactly how well this fares.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...