OpenAI has been informed about seven proof-of-concept flaws/bugs in their generative AI chatbot, and here are the ways to mitigate them.
Recent research by one cybersecurity firm into the architecture of ChatGPT (versions 4o to 5) has uncovered seven Proof-of-Concept (PoC) risks for users relying on AI tools for communication, research, and business.
These primarily involve potentially exploitable vulnerabilities in how the AI models handle external web content, stored conversation memories, and safety checks designed to prevent misuse.
At the core of these issues is a type of attack called indirect prompt injection. This technique involves embedding hidden instructions inside external sources such as online articles, blog comments, or search results. When the chatbot accesses these sources during its browsing or answering processes, it may unknowingly execute unauthorized commands. Attackers can trigger these compromises in several PoC scenarios:
- through “0-click” attacks, where simply asking a question causes the AI to consume injected prompts from indexed web content without any user interaction beyond the query
- “1-click” attacks that leverage malicious URLs that, when clicked, prompt the AI to carry out unintended behaviors immediately
- persistent injection attacks where harmful instructions are stored in the chatbot’s long-term memory feature, causing ongoing unauthorized activity across multiple sessions until the memory is cleared
Three other risks
Another proof-of-concept vulnerability involves the possibility of threat actors bypassing the platform’s safety validation for URLs. Attackers can exploit trusted link wrappers, such as links from well-known search engines, to conceal malicious destinations, circumventing built-in filtering mechanisms.
Next is a conversation-injection bug that allows potential attackers to input ‘conversational instructions’ through the chatbot’s dual-system structure, where one system handles web browsing and the other conversation. Malicious users can covertly influence responses without direct user input.
Finally, attackers may also exploit bugs that hide malicious content inside code blocks or markdown formatting, concealing harmful commands from users while being executed by the AI.
Mitigation tips
The disclosure of the discovery of these seven flaws/bugs were made recently by Tenable security specialists. OpenAI has acknowledged the findings, and the firm is working on fixes. According to their spokesperson: “Individually, these flaws seem small — but together they form a complete attack chain… It shows that AI systems aren’t just potential targets; they can be turned into attack tools that silently harvest information from everyday chats or browsing.”
While some of the disclosed PoC risks have already been addressed at this point, others remain at the research and testing stage awaiting preemptive resolution. In the meantime, here are some tips for mitigate the risks:
- Treat AI tools as active attack surfaces requiring continuous security assessment
- Monitor AI-generated outputs for abnormal or suspicious behavior that could potentially indicate prompt injection or manipulation
- Audit any AI integration points such as browsing features, memory storage, and external link resolutions to ensure safety mechanisms are effective
- Implement governance and data usage policies to control what information is fed into AI systems, minimizing exposure of sensitive data
- Regularly review and clear AI memory features where possible, to remove persistent injected instructions
- Test AI systems rigorously against known injection and evasion techniques to identify vulnerabilities before attackers do
- Educate users about risks of clicking unknown URLs or feeding sensitive information to AI without safeguards
Understanding these emerging threats and following proactive security practices is essential for both organizations and individuals to safeguard privacy and ensure AI tools operate as intended, without becoming vectors for data leakage or manipulation.
Users of other GenAI models should also consider applying these mitigation strategies, as indirect prompt injection and memory exploitation risks are common challenges in AI systems with browsing and memory capabilities.


