With the rapid expansion of AI technologies, agentic AI is moving from experimentation to deployment on a scale larger than ever before. As a result, these systems have been given far greater autonomy to perform tasks with limited human input, much to the delight of enterprise magnates.
Companies such as Microsoft, Google, Anthropic, and OpenAI are increasingly developing agentic AI systems capable of automating vulnerability detection, incident response, code analysis, and other security tasks traditionally handled by human teams.
The appeal of using agentic AI as a first line of defence is palpable, as cybersecurity teams face mounting pressure from the growing volume of attacks. According to the Microsoft Digital Defense Report 2025, the company now detects more than 600 million cyberattacks daily, ranging from ransomware and phishing campaigns to identity attacks. The International Monetary Fund has also warned that cyber incidents have more than doubled since the COVID-19 pandemic, potentially triggering institutional failures and causing enormous financial losses.
To make matters worse, threat actors such as Conti, LockBit, and Salt Typhoon have shown increased activity from 2024 through early 2026, targeting critical infrastructure and global communications and exploiting the narrow window before stronger defences take hold.
In such circumstances, fully embracing agentic AI may seem like an ideal answer to the cybersecurity challenges looming on the horizon. Systems capable of autonomously detecting threats, analysing vulnerabilities, and accelerating response times could significantly strengthen cyber resilience.
Yet the same autonomy that makes these systems attractive to defenders could also be exploited by malicious actors. If agentic AI becomes a defining feature of cyber defence, policymakers and companies may soon face a more difficult question: how can they maximise its benefits without creating an entirely new layer of cyber risk?
Why cybersecurity is turning to agentic AI
The growing interest in agentic AI is not simply driven by the rise in cyber threats. It is also a response to the operational limitations of modern security teams, which are often overwhelmed by repetitive tasks that consume time and resources.
Security analysts routinely handle phishing alerts, identity verification requests, vulnerability assessments, patch management, and incident prioritisation — processes that can become difficult to manage at scale. Many of these tasks require speed rather than strategic decision-making, creating a natural opening for AI systems to operate with greater autonomy.
Microsoft has moved aggressively into this space. In March 2025, the company introduced Security Copilot agents designed to autonomously handle phishing triage, data security investigations, and identity management. Microsoft positioned the tools not as a replacement for human analysts but as a way to reduce repetitive workloads and let security teams focus on more complex threats.
Google has approached the issue from the angle of vulnerability research. With Project Naptime, the company demonstrated how AI systems could replicate parts of the workflow traditionally handled by human security researchers, identifying vulnerabilities, testing hypotheses, and reproducing findings.
Anthropic introduced another layer of complexity through Claude Mythos, a model built for high-risk cybersecurity tasks. While the company presented the model as a controlled release for defensive purposes, the announcement also highlighted how advanced cyber capabilities are becoming increasingly embedded in frontier AI systems.
Meanwhile, OpenAI has expanded partnerships with cybersecurity organisations and broadened access to specialised tools for defenders, signalling that major AI firms increasingly view cybersecurity as one of the most commercially viable applications for autonomous systems.
Together, these developments show that agentic AI is gradually becoming embedded in the cybersecurity infrastructure. For many companies, the question is no longer whether autonomous systems can support cyber defence, but how much responsibility they should be given.
When agentic AI tools become offensive weapons
The same capabilities that make agentic AI valuable to defenders also make it attractive to malicious actors. Systems designed to identify vulnerabilities, analyse code, automate workflows, and accelerate decision-making can be repurposed for offensive cyber operations.
Anthropic offered one of the clearest examples of that risk when it disclosed that malicious actors had used Claude in cyber campaigns. The company said attackers were not simply using the model for basic assistance, but were integrating it into broader operational workflows. The incident showed how agentic AI can move cyber misuse beyond advice and into execution.
The risk extends beyond large-scale cyber operations. Agentic AI systems could make phishing campaigns more scalable, automate reconnaissance, accelerate vulnerability discovery, and reduce the technical expertise needed to launch certain attacks. Tasks that once required specialist teams could become easier to coordinate through autonomous systems.
Security researchers have repeatedly warned that generative AI is already making social engineering more convincing through realistic phishing emails, cloned voices, and synthetic identities. More autonomous systems could compound those risks by combining content generation with independent action.
The concern is not that agentic AI will replace human hackers, but that cybercrime could become faster, cheaper, and more scalable, mirroring the same efficiencies that organisations hope to achieve through AI-powered defence.
The agentic AI governance gap
The governance challenge surrounding agentic AI is no longer theoretical. As autonomous systems gain access to internal networks, cloud infrastructure, code repositories, and sensitive datasets, companies and regulators are being forced to confront risks that existing cybersecurity frameworks were not designed to manage.
Policymakers are starting to respond. In February 2026, the US National Institute of Standards and Technology (NIST) launched its AI Agent Standards Initiative, focused on identity verification and authentication frameworks for AI agents operating across digital environments. The aim is simple but important: organisations need to know which agents can be trusted, what they are allowed to do, and how their actions can be traced.
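The technical details of those frameworks are still taking shape, but the underlying idea can be illustrated simply. The sketch below is a hypothetical Python example, not the NIST framework itself: each registered agent signs its requests with a key issued at onboarding, so requests from unknown agents, or with altered contents, can be rejected and traced. All names (AGENT_KEYS, sign, verify_agent) are invented for illustration.

```python
# Illustrative sketch only: one way an organisation might check that a request
# really comes from a registered AI agent before trusting it. The token format
# and helper names are hypothetical, not drawn from any published standard.
import hmac
import hashlib

# Shared secrets issued to registered agents at onboarding (hypothetical registry).
AGENT_KEYS = {"sec-agent-01": b"example-secret-issued-at-registration"}

def sign(agent_id: str, payload: str, key: bytes) -> str:
    """The agent signs each request so its identity and intent can be verified."""
    return hmac.new(key, f"{agent_id}:{payload}".encode(), hashlib.sha256).hexdigest()

def verify_agent(agent_id: str, payload: str, signature: str) -> bool:
    """Reject requests from unknown agents or with tampered payloads."""
    key = AGENT_KEYS.get(agent_id)
    if key is None:
        return False
    expected = sign(agent_id, payload, key)
    return hmac.compare_digest(expected, signature)

# Example: a registered agent asks to read an alert queue; an unknown agent is refused.
token = sign("sec-agent-01", "read:alert_queue", AGENT_KEYS["sec-agent-01"])
print(verify_agent("sec-agent-01", "read:alert_queue", token))   # True
print(verify_agent("rogue-agent", "read:alert_queue", token))    # False
```

In practice this role would likely be played by existing identity and key-management infrastructure rather than shared secrets, but the principle of verifiable agent identity is the same.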
Governments are also becoming more cautious about deployment risks. In May 2026, the Cybersecurity and Infrastructure Security Agency (CISA) joined cybersecurity agencies from Australia, Canada, New Zealand, and the United Kingdom in issuing guidance on the secure adoption of agentic AI services. The warning was clear: autonomous systems become more dangerous when they are connected to sensitive infrastructure and external tools and granted broad internal permissions.
The private sector is adjusting as well. Companies are increasingly discussing safeguards such as restricted permissions, audit logs, human approval checkpoints, and sandboxed environments to limit the degree of autonomy granted to AI agents.
The questions facing businesses are becoming practical. Should an AI agent be allowed to patch vulnerabilities without approval? Can it disable accounts, quarantine systems, or modify infrastructure independently? Who is held accountable when an autonomous system makes the wrong decision?
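What such guardrails could look like in code can be sketched briefly. The example below is purely illustrative and uses hypothetical names (AgentAction, POLICY, execute); it shows a default-deny policy gate in which routine actions run autonomously, high-risk actions are held for human approval, and every decision is appended to an audit log.

```python
# Illustrative sketch only: a minimal policy gate for an AI agent's proposed
# actions, combining role-based permissions, a human approval checkpoint, and
# an audit log. All names are hypothetical, not taken from any vendor's product.
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class AgentAction:
    agent_id: str
    action: str    # e.g. "patch_vulnerability", "disable_account"
    target: str    # system or account the action would affect

# Actions the agent may run on its own versus those needing human sign-off.
POLICY = {
    "autonomous": {"enrich_alert", "open_ticket"},
    "needs_approval": {"patch_vulnerability", "disable_account", "quarantine_host"},
}

def audit(entry: dict) -> None:
    """Append every decision to an audit log so actions stay traceable."""
    with open("agent_audit.log", "a") as log:
        log.write(json.dumps({"ts": time.time(), **entry}) + "\n")

def execute(action: AgentAction, human_approved: bool = False) -> str:
    if action.action in POLICY["autonomous"]:
        decision = "executed"
    elif action.action in POLICY["needs_approval"] and human_approved:
        decision = "executed_with_approval"
    elif action.action in POLICY["needs_approval"]:
        decision = "held_for_review"   # human approval checkpoint
    else:
        decision = "denied"            # default-deny for unknown actions
    audit({**asdict(action), "decision": decision})
    return decision

# Example: the agent wants to quarantine a host; without approval it is held.
print(execute(AgentAction("sec-agent-01", "quarantine_host", "srv-db-07")))
```

The point of the pattern is accountability: whatever the agent decides, the organisation retains a record of what was attempted, what was allowed, and who approved it.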
Agentic AI may become one of cybersecurity’s most effective defensive tools. Its success, however, will depend on whether governance frameworks evolve quickly enough to keep pace with the technology itself.
How companies are building guardrails around agentic AI
As concerns around autonomous cyber systems grow, companies are increasingly experimenting with safeguards designed to prevent agentic AI from becoming an uncontrolled risk. Rather than granting unrestricted access, many organisations are limiting what AI agents can see, what systems they can interact with, and what actions they can execute without human approval.
Anthropic has restricted access to Claude Mythos over concerns about offensive misuse, while OpenAI has recently expanded its Trusted Access for Cyber programme to provide vetted defenders with broader access to advanced cyber tools. Both approaches reflect a growing consensus that powerful cyber capabilities may require tiered access rather than unrestricted deployment.
The broader industry is moving in a similar direction. CrowdStrike has increasingly integrated AI-driven automation into threat intelligence and incident response workflows while maintaining human oversight for critical decisions. Palo Alto Networks has also expanded its AI-powered security automation tools designed to reduce response times without fully removing human analysts from the decision-making process.
Cloud providers are also becoming more cautious about autonomous access. Amazon Web Services, Google Cloud, and Microsoft Azure have increasingly emphasised zero-trust security models, role-based permissions, and segmented access controls as enterprises deploy more automated tools across sensitive infrastructure.
Meanwhile, sectors such as finance, healthcare, and critical infrastructure remain particularly cautious about fully autonomous deployment due to the potential consequences of false positives, accidental shutdowns, or disruptions to essential services.
As a result, security teams are increasingly discussing safeguards such as audit logs, sandboxed environments, role-based permissions, staged deployments, and human approval checkpoints to balance speed with accountability. For now, many companies seem ready to embrace agentic AI, though not without keeping one hand on the emergency brake.
The future of cybersecurity may be agentic
Agentic AI is unlikely to remain a niche experiment for long. The scale of modern cyber threats, combined with the mounting pressure on security teams, means organisations will continue to look for faster and more scalable defensive tools.
That shift could significantly improve cybersecurity resilience. Autonomous systems may help organisations detect threats earlier, reduce response times, address workforce shortages, and manage the growing volume of attacks that human teams increasingly struggle to handle alone.
At the same time, the technology’s long-term success will depend as much on restraint as on innovation. Without clear governance frameworks, operational safeguards, and human oversight, the same tools designed to strengthen cyber defence could introduce entirely new vulnerabilities.
The future of cybersecurity may increasingly belong to agentic AI. Whether that future becomes safer or more volatile may depend on how responsibly governments, companies, and security teams manage the transition.