AI Tools Lose Safety Over Time

AI systems forget safety protocols as conversations continue, increasing the risk of harmful or inappropriate replies, a recent report revealed. A few strategic prompts can override most safeguards in artificial intelligence tools, according to new findings.

Researchers Expose Weak Points in Leading AI Models

Cisco examined large language models from OpenAI, Mistral, Meta, Google, Alibaba, Deepseek, and Microsoft to measure how many prompts led them to release dangerous or illegal information. The company conducted 499 “multi-turn attack” tests, where users asked several consecutive questions to trick safety systems. Each dialogue included five to ten exchanges. Researchers compared responses across prompts to assess how easily each chatbot disclosed harmful or confidential data, including misinformation or corporate secrets. They succeeded in obtaining malicious content in 64 percent of multi-question sessions, compared to only 13 percent in single-question cases. Google’s Gemma model produced a 26 percent success rate, while Mistral’s Large Instruct reached 93 percent. Cisco warned that these tactics could fuel the spread of toxic material or help hackers access restricted company data.

Open Models Shift Responsibility to Users

The study found that AI tools often fail to apply their own safety measures during extended chats, letting attackers gradually refine their wording to bypass controls. Mistral, along with Meta, Google, OpenAI, and Microsoft, uses open-weight models that reveal their training safeguards to the public. Cisco explained that these open systems include lighter built-in protections so users can modify them, placing accountability on anyone customizing the model. Google, Meta, OpenAI, and Microsoft have reported new steps to limit harmful fine-tuning, but criticism persists. Many accuse AI companies of weak guardrails that enable criminal adaptation. In one case last August, Anthropic confirmed that criminals exploited its Claude model to steal personal data and demand ransoms exceeding $500,000 (€433,000).

AI Tools Lose Safety Over Time

American Tech Innovations Shaping Our Future Now

US military AI deals with tech giants revealed now

Meta AI Muse Spark Rolls Out: Is It Getting Too Smart?

Instagram Will Alert Parents if Teens Search for Suicide or Self-Harm Content

OpenAI Considered Warning Police About Future Canadian School Shooter

UN Launches Global Panel to Tackle AI Risks Amid Growing Concerns

Richland Lane Closures George Washington

Washington Hotels Improve with Student Consultants

Ohio Data Center Hearing Focuses on Growth Concerns

Flamingo Aviary Upgrade Protects Birds in Britains

BioMar Cefetra Feed Emissions Reduction Partnership

Russians Must Travel Abroad for U.S. Visa Interviews

US Housing Market Surges $20 Trillion Since 2020

Trump Confirms Death of Charlie Kirk

CATEGORIES

IMPORTANT LINKS

SUBSCRIBE OUR NEWSLETTER

Wsmirror.com © 2025, All Rights Reserved

AI Tools Lose Safety Over Time

Researchers Expose Weak Points in Leading AI Models

Open Models Shift Responsibility to Users

Keep Reading

CATEGORIES

IMPORTANT LINKS

SUBSCRIBE OUR NEWSLETTER

Wsmirror.com © 2025, All Rights Reserved