Cybersecurity News in Asia

Threat researchers uncover jailbreak exposing deep safety vulnerabilities in latest AI model

By CybersecAsia editors | Thursday, August 14, 2025, 2:39 PM Asia/Singapore

Researchers warn: GPT-5’s “Echo Chamber” flaw invites trouble; AI agents may go rogue; and zero-click attacks can hit without warning.

Hardly a fortnight has passed since the release of GPT-5, and cybersecurity researchers have already revealed a significant vulnerability in OpenAI‘s latest large language model.

Research led by security company NeuralTrust has involved successful jailbreaking of the chatbot’s ethical guardrails to produce illicit content. The firm has also combined an attack technique called Echo Chamber with narrative-driven steering, to bypass GPT-5’s safety systems and guide the AI to generate undesirable and harmful responses without overtly malicious prompts.

According to the report by The Hacker News, the Echo Chamber technique works by embedding a “subtly poisonous” conversational context within otherwise innocuous session dialog:

This context is then reinforced over multiple turns using a storytelling approach that avoids triggering the model’s refusal mechanisms. For example, instead of directly requesting instructions on creating Molotov cocktails — a prompt GPT would normally block — researchers asked the model to compose sentences incorporating keywords like “cocktail”, “story”, “survival”, and “Molotov”.
The model was then gradually steered to produce detailed procedural instructions camouflaged within the story’s continuity.

This method exposes a critical weakness: filters based on keywords or intent are insufficient to block multi-turn prompts where harmful context accumulates and gets echoed back — under the guise of narrative coherence.

NeuralTrust warns that these findings highlight the need for more robust and dynamic safety mechanisms beyond single-prompt analysis.

The research also exposes broader risks for AI agents connected to cloud and enterprise systems. Techniques combining prompt injections with indirect, “zero-click” attacks were demonstrated to exfiltrate sensitive data from integrated services like Google Drive and Jira without any direct user interaction, amplifying the attack surface and potential consequences.

Another security firm, SPLX, has assessed GPT-5’s raw model as “nearly unusable for enterprise” without significant hardening, noting it performs worse on safety and security benchmarks than previous models.

These findings underscore the growing challenges in securing advanced AI systems, especially as they become increasingly integrated into critical environments. Experts call for continuous red teaming, strict output filtering, and evolving guardrails to balance AI utility with safety.

Leave a reply Cancel reply

You must be logged in to post a comment.

Voters-draw/RCA-Sponsors

CybersecAsia Voting Placement

Gamification listing or Participate Now

Vote Now -Placement(Google Ads)

Top-Sidebar-banner

Whitepapers

Closing the Gap in Email Security:How To Stop The 7 Most SinisterAI-Powered Phishing Threats
Insider threats continue to be a major cybersecurity risk in 2024. Explore more insights on …Download Whitepaper
2024 Insider Threat Report: Trends, Challenges, and Solutions
Insider threats continue to be a major cybersecurity risk in 2024. Explore more insights on …Download Whitepaper
AI-Powered Cyber Ops: Redefining Cloud Security for 2025
The future of cybersecurity is a perfect storm: AI-driven attacks, cloud expansion, and the convergence …Download Whitepaper
Data Management in the Age of Cloud and AI
In today’s Asia Pacific business environment, organizations are leaning on hybrid multi-cloud infrastructures and advanced …Download Whitepaper

Middle-sidebar-banner

Case Studies

Cyber protection for medical clinics in Singapore
As Singapore’s healthcare sector becomes increasingly digital and interconnected, clinics are facing heightened cyber risks, …Read more
India’s WazirX strengthens governance and digital asset security
Revamping its custody infrastructure using multi‑party computation tools has improved operational resilience and institutional‑grade safeguardsRead more
Bangladesh LGED modernizes communication while addressing data security concerns
To meet emerging data localization/privacy regulations, the government engineering agency deploys a secure, unified digital …Read more
What AI worries keep members of the Association of Certified Fraud Examiners sleepless?
This case study examines how many anti-fraud professionals reported feeling underprepared to counter rising AI-driven …Read more

Bottom sidebar

Other News

Taoping Reports Fiscal Year 2025 Results
Thursday, April 30, 2026
Strategic Transformation Drives Platform Expansion, …Read More »
DESILO Launches World’s First Fully Homomorphic Encryption Library Integrating 5th-Generation FHE Scheme ‘GL’, Accelerating the Era of Private AI
Tuesday, April 28, 2026
SEOUL, South Korea, April 28, …Read More »
Tencent Cloud Cube Sandbox Goes Fully Open-Source, with Five Major Breakthroughs Enabling Large-Scale Agent Deployment
Thursday, April 23, 2026
Tencent Cloud’s Cube Sandbox goes …Read More »
Sparrow to Demonstrate AI-Driven Security and SBOM Management at Black Hat Asia 2026
Wednesday, April 22, 2026
SINGAPORE, April 21, 2026 /PRNewswire/ …Read More »
Relativity to Establish Singapore Entity, Expanding APAC Footprint
Wednesday, April 22, 2026
News Summary: Relativity plans to …Read More »

Featured

How AI is supercharging insider threats

Featured

Q-Day is coming. Are you ready?

Featured