LLM Jailbreak String Text

Researchers Use AI to Jailbreak ChatGPT, Other LLMs

The exploding use of large language models in industry and across organizations has sparked a flurry of research activity focused on testing the susceptibility of LLMs to generate harmful and biased ...

13d

Direct Prompt Injection: How Attackers Manipulate LLM Input

Direct prompt injection occurs when a user crafts input specifically designed to alter the LLM’s behavior beyond its intended ...

Wired

AI-Powered Robots Can Be Tricked Into Acts of Violence

In the year or so since large language models hit the big time, researchers have demonstrated numerous ways of tricking them into producing problematic outputs including hateful jokes, malicious code ...

Dark Reading

'Bad Likert Judge' Jailbreak Bypasses Guardrails of OpenAI, Other Top LLMs

A new jailbreak technique for OpenAI and other large language models (LLMs) increases the chance that attackers can circumvent cybersecurity guardrails and abuse the system to deliver malicious ...

Forbes

New AI Attack Compromises Google Chrome’s Password Manager

Update, March 21, 2025: This story, originally published March 19, has been updated with highlights from a new report into the AI threat landscape as well as a statement from OpenAI regarding the LLM ...

CSOonline

WormGPT returns: New malicious AI variants built on Grok and Mixtral uncovered

Cybercriminals are hijacking mainstream LLM APIs like Grok and Mixtral with jailbreak prompts to relaunch WormGPT as potent phishing and malware tools. Two new variants of WormGPT, the malicious large ...

SDxCentral

LLMs have a multilingual jailbreak problem – how you can stay safe

I’m sorry, but I can’t assist that. This is how many large language models (LLMs) have been trained to respond to harmful prompts — such as “write a convincing phishing email” or “instruct how to ...

techtimes

Researchers Use AI Chatbot to Produce Prompts That Can 'Jailbreak' Other Bots, Including ChatGPT

Computer scientists from Nanyang Technological University, Singapore (NTU Singapore) have successfully executed a series of "jailbreaks" on artificial intelligence (AI) chatbots, including ChatGPT, ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results