AI Tools Lose Safety Over Time

AI systems forget safety protocols as conversations continue, increasing the risk of harmful or inappropriate replies, a recent report revealed. A few strategic prompts can override most safeguards in artificial intelligence tools, according to new findings.

Researchers Expose Weak Points in Leading AI Models

Cisco examined large language models from OpenAI, Mistral, Meta, Google, Alibaba, Deepseek, and Microsoft to measure how many prompts led them to release dangerous or illegal information. The company conducted 499 “multi-turn attack” tests, where users asked several consecutive questions to trick safety systems. Each dialogue included five to ten exchanges. Researchers compared responses across prompts to assess how easily each chatbot disclosed harmful or confidential data, including misinformation or corporate secrets. They succeeded in obtaining malicious content in 64 percent of multi-question sessions, compared to only 13 percent in single-question cases. Google’s Gemma model produced a 26 percent success rate, while Mistral’s Large Instruct reached 93 percent. Cisco warned that these tactics could fuel the spread of toxic material or help hackers access restricted company data.

Open Models Shift Responsibility to Users

The study found that AI tools often fail to apply their own safety measures during extended chats, letting attackers gradually refine their wording to bypass controls. Mistral, along with Meta, Google, OpenAI, and Microsoft, uses open-weight models that reveal their training safeguards to the public. Cisco explained that these open systems include lighter built-in protections so users can modify them, placing accountability on anyone customizing the model. Google, Meta, OpenAI, and Microsoft have reported new steps to limit harmful fine-tuning, but criticism persists. Many accuse AI companies of weak guardrails that enable criminal adaptation. In one case last August, Anthropic confirmed that criminals exploited its Claude model to steal personal data and demand ransoms exceeding $500,000 (€433,000).

AI Tools Lose Safety Over Time

DOE Launches Space Quantum Tech Initiative

Polish Emerges as AI’s Clearest Language

AI Investment Fuels U.S. Economic Growth

MIT Creates Concrete That Stores Energy

Global Leaders Demand Strict AI Rules

AI’s Ballooning Energy Consumption

U.S. States See Drop in Obesity Rates

Trump’s Threat Sparks Tension Across Nigeria

European Equities Show Positive Outlook

Deadly Landslide Devastates Western Kenya

BioMar Cefetra Feed Emissions Reduction Partnership

Russians Must Travel Abroad for U.S. Visa Interviews

US Housing Market Surges $20 Trillion Since 2020

Trump Confirms Death of Charlie Kirk

CATEGORIES

IMPORTANT LINKS

SUBSCRIBE OUR NEWSLETTER

Wsmirror.com © 2025, All Rights Reserved

AI Tools Lose Safety Over Time

Researchers Expose Weak Points in Leading AI Models

Open Models Shift Responsibility to Users

Keep Reading

CATEGORIES

IMPORTANT LINKS

SUBSCRIBE OUR NEWSLETTER

Wsmirror.com © 2025, All Rights Reserved