• A_A@lemmy.world
    6 hours ago

    One of the six described methods:
    The model is prompted to explain its refusal and then rewrite the prompt iteratively until it complies.

  • Cornpop@lemmy.world
    9 hours ago

    This is so stupid. You shouldn’t have to “jailbreak” these systems. The information is already out there with a Google search.

  • meyotch@slrpnk.net
    11 hours ago

    My own research has made a similar finding. When I am taking the piss and being a random jerk to a chatbot, the bot much more frequently violates its own terms of service. Introducing non-sequitur topics after a few rounds really seems to ‘confuse’ it.