Identify evolving security threats to AI models

Artificial intelligence (AI) has quickly become the cornerstone of technological and business innovation, permeating every industry and fundamentally transforming the way we interact with the world. AI tools now streamline decision-making, optimize operations and enable new personalized experiences.

However, this rapid expansion is accompanied by a complex and growing threat landscape, which combines traditional cybersecurity risks with unique AI-specific vulnerabilities. These emerging risks may include data manipulation, adversarial attacks, and exploitation of machine learning models, each of which has serious potential impacts on privacy, security, and trust.

As AI continues to become deeply integrated into critical infrastructure, from healthcare to finance to national security, it is crucial for organizations to adopt a proactive, multi-layered defense strategy. By remaining vigilant and continually identifying and addressing these vulnerabilities, businesses can protect not only their AI systems, but also the integrity and resilience of their broader digital environments.

Kasimir Schulz

Senior Security Researcher at HiddenLayer.

New threats facing AI models and users

As the use of AI grows, so does the complexity of the threats it faces. Some of the most pressing threats concern trust in digital content, backdoors intentionally or unintentionally built into templates, traditional security vulnerabilities exploited by attackers, and new techniques that cleverly circumvent existing protections. Additionally, the rise of deepfakes and synthetic media further complicates the landscape, creating challenges around verifying the authenticity and integrity of AI-generated content.

Trust in digital content: As AI-generated content gradually becomes indistinguishable from real images, companies are implementing safeguards to stop the spread of misinformation. What happens if a vulnerability is discovered in one of these protections? Manipulating watermarks, for example, allows adversaries to falsify the authenticity of images generated by AI models. This technique can add or remove invisible watermarks that mark content as AI-generated, thereby undermining trust in the content and promoting misinformation, a scenario that can lead to serious social consequences.

Backdoors in templates: Due to the open source nature of AI models through sites like Hugging Face, a frequently reused model containing a backdoor could have serious supply chain implications. A cutting-edge method developed by our Synaptic Adversarial Intelligence (SAI) team, called “ShadowLogic,” allows adversaries to implant hidden, codeless backdoors into neural network models in any modality. By manipulating the model’s computer graph, attackers can compromise its integrity without detection, retaining the backdoor even as a model is refined.

Integrating AI into high-impact technologies: AI models like Google’s Gemini have proven susceptible to indirect rapid injection attacks. Under certain conditions, attackers can manipulate these models to produce misleading or harmful responses, and even trick them into calling APIs, underscoring the continued need for vigilant defense mechanisms.

Traditional security vulnerabilities: Common vulnerabilities and exposures (CVEs) in AI infrastructure continue to plague organizations. Attackers often exploit weaknesses in open source frameworks, making it essential to proactively identify and address these vulnerabilities.

New attack techniques: While traditional security vulnerabilities still pose a significant threat to the AI ecosystem, new attack techniques are emerging almost daily. Techniques such as Knowledge Return Oriented Prompting (KROP), developed by the SAI team at HiddenLayer, present a significant challenge to AI security. These new methods allow adversaries to bypass conventional security measures built into large language models (LLMs), opening the door to unintended consequences.

Identify vulnerabilities before adversaries do

To combat these threats, researchers must stay ahead of the curve, anticipating the techniques that bad actors may employ, often before these adversaries even recognize potential opportunities for impact. By combining proactive research with innovative, automated tools designed to expose hidden vulnerabilities in AI frameworks, researchers can discover and disclose new common vulnerabilities and exposures (CVEs). This responsible approach to vulnerability disclosure not only strengthens individual AI systems, but also fortifies the entire industry by raising awareness and establishing baseline protections to combat known and emerging threats.

Identifying vulnerabilities is only the first step. Equally critical is translating academic research into practical, deployable solutions that work effectively in real-world production environments. This bridge between theory and application is illustrated in projects in which HiddenLayer’s SAI team adapted academic knowledge to address real-world security risks, emphasizing the importance of making research actionable and ensuring that Defenses are robust, scalable and adaptable to evolving threats. By transforming basic research into operational defenses, the industry not only protects AI systems, but also builds resilience and trust in AI-driven innovation, protecting users and organizations against a landscape of rapidly evolving threats. This proactive, multi-layered approach is essential to enabling secure and reliable AI applications that can withstand current and future adversarial techniques.

Innovating for safer AI systems

Security around AI systems can no longer be an afterthought; it must be woven into the fabric of AI innovation. As AI technologies advance, so do the methods and motivations of attackers. Threat actors are increasingly working to exploit weaknesses specific to AI models, from adversarial attacks that manipulate model output to data poisoning techniques that degrade model accuracy. To address these risks, the industry is moving toward integrating security directly into the AI development and deployment phases, making it an integral part of the AI lifecycle. This proactive approach promotes safer environments for AI and mitigates risks before they manifest, reducing the likelihood of unexpected disruptions.

Researchers and industry leaders are accelerating efforts to identify and thwart evolving vulnerabilities. As AI research moves from theoretical exploration to practical application, new attack methods are rapidly moving from academic discourse to real-world implementation. Adopting “security by design” principles is essential to establishing a security-first mindset that, while not foolproof, elevates the baseline protection of AI systems and industries who depend on it. As AI revolutionizes industries from healthcare to finance, integrating robust security measures is essential to support sustainable growth and foster trust in these transformative technologies. Embracing security not as a barrier but as an enabler for responsible progress will ensure that AI systems are resilient, reliable and equipped to withstand the dynamic and sophisticated threats they face, paving the way for future advancements to the both innovative and secure.

We’ve compiled a list of the best identity management software.

This article was produced as part of TechRadarPro’s Expert Insights channel, where we feature the best and brightest minds in today’s technology industry. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you’re interested in contributing, find out more here:

Must Read

Leave a Comment Cancel Reply