AI in Critical Infrastructure: What Good Actually Looks Like

We know what can go wrong. Here’s what getting it right could look like.

[AI-generated image (ChatGPT): a city skyline at sunrise with power lines, wind turbines, and water infrastructure connected by glowing digital network overlays]

We’ve spent two articles describing what can go wrong when AI agents are deployed in critical infrastructure. The technical risks are real, but the human ones are sharper. Systems don’t only fail in isolation; they can also fail through the people who trust them.

If the risk in AI-driven critical infrastructure isn’t purely technical, then the solution can’t be purely technical either.

That sounds obvious.

In practice, almost every security control being discussed today still assumes the system itself is the primary problem.

Well, it isn’t.

Parts 1 and 2 of this series mapped the real attack surface: prompt injection, induced model drift, engineered automation bias, privilege creep, and the quiet erosion of operator situational awareness. The pattern across all of them was the same: adversaries don’t need to break AI systems to cause harm. They only need to understand how humans interact with them.

This is the part where we talk about what getting it right actually looks like: what security in the age of agentic AI requires, and why, with the right choices, we have a genuine opportunity to get it right.

The Frameworks We Have Were Built for a Different World

The dominant standard in operational technology is IEC 62443. Rigorous, comprehensive, and widely adopted. It was also designed for deterministic systems: systems that behave consistently, produce predictable outputs, and fail in ways that can be traced.

AI agents are none of these things. They adapt. Their outputs shift with their inputs. Their failure modes are probabilistic, not mechanical. And when they go wrong, whether through drift, injection, or manipulation, the logs may show nothing unusual.

The NIST AI Risk Management Framework is newer and more attuned to AI-specific risks. But it was developed primarily for IT contexts and doesn’t exactly translate to OT environments, where legacy hardware, network isolation requirements, and real-time operational demands create a different risk landscape.

The gap between these two frameworks is where the most serious risks currently live. Bridging it is not a technical problem. It’s a design and governance problem, and that’s solvable.

So what does “good” actually look like in a critical infrastructure environment with AI agents deployed?

Not perfection. Not full autonomy. But systems designed with failure in mind.

1. Design for Interruption, Not Just Automation

AI deployments in critical systems tend to be optimized for continuity: fewer interruptions, smoother performance, and less human involvement.

This could be the wrong goal.

In high-risk environments, I would argue, the ability to interrupt a system safely matters more than the ability to run it efficiently.

This means:

  • Operators must be able to override AI decisions quickly and confidently.
  • The systems themselves should invite intervention where necessary, rather than discourage it.
  • “No alert” should never be treated as “no problem”.

A well-designed system shouldn’t just automate decisions. It should make those decisions easy to challenge.
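One way to make interruption a first-class concern is to gate high-impact agent actions behind a short veto window that an operator can trip at any time. A minimal sketch in Python; the `AgentAction` shape, the risk threshold, and the 30-second window are illustrative assumptions, not a reference design:

```python
import threading
from dataclasses import dataclass

@dataclass
class AgentAction:
    """Illustrative stand-in for an action an AI agent wants to take."""
    name: str
    risk: float  # 0.0 (routine) .. 1.0 (high-impact)

class InterruptibleGate:
    """Holds high-risk actions open for a veto window so an operator
    can cancel them before execution; routine actions pass through."""

    def __init__(self, risk_threshold: float = 0.5, window_s: float = 30.0):
        self.risk_threshold = risk_threshold
        self.window_s = window_s
        self._veto = threading.Event()

    def operator_override(self) -> None:
        """Called from the operator console: vetoes the pending action."""
        self._veto.set()

    def submit(self, action: AgentAction) -> bool:
        """Return True if the action may execute, False if vetoed."""
        if action.risk < self.risk_threshold:
            return True  # routine action: no veto window needed
        self._veto.clear()
        # Block for the veto window; a real system would surface the
        # pending action on the operator console during this time.
        vetoed = self._veto.wait(timeout=self.window_s)
        return not vetoed
```

The point of the design is that the safe default for a high-risk action is “pause and ask”, not “proceed and log”.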

2. Make AI Legible Under Pressure

Legibility, the ability to explain a decision quickly, should be an operational requirement in critical infrastructure.

During an incident:

  • Operators don’t need a full model breakdown.
  • They probably only need to answer one real question:
“Do I trust this agent’s decision right now?”

Answering that question requires:

  • Context-aware explanations: what changed, and why.
  • Clear indicators of uncertainty or confidence.
  • Visibility into what the system is not seeing.

Much current work on AI explainability focuses on making models interpretable to data scientists and engineers. That is necessary. But operators, the people running the infrastructure, need explanations that let them evaluate whether to trust a decision in real time, not explanations buried in a technical log.

If operators cannot interpret the reasoning behind an AI system’s behaviour, they lose the ability to confidently verify its decisions, intervene when necessary, or distinguish normal operation from emerging failure.
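To make that concrete, here is a hedged sketch of an operator-facing explanation payload: not a full model breakdown, just what changed, how confident the agent is, and what it is not seeing. The schema and field names are assumptions for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class DecisionExplanation:
    """Operator-facing summary of one agent decision (illustrative schema)."""
    action: str                  # what the agent wants to do
    what_changed: list[str]      # the inputs that drove the decision
    confidence: float            # calibrated confidence, 0.0 .. 1.0
    blind_spots: list[str] = field(default_factory=list)  # unavailable or stale inputs

    def operator_summary(self) -> str:
        """A one-screen answer to 'do I trust this decision right now?'"""
        lines = [
            f"Proposed action: {self.action}",
            f"Confidence: {self.confidence:.0%}",
            "Driven by: " + "; ".join(self.what_changed),
        ]
        if self.blind_spots:
            lines.append("NOT seeing: " + "; ".join(self.blind_spots))
        return "\n".join(lines)

# The kind of summary an operator might see during an incident:
print(DecisionExplanation(
    action="Reduce feeder 12 load by 15%",
    what_changed=["transformer T4 temperature trending up",
                  "forecast demand spike at 18:00"],
    confidence=0.72,
    blind_spots=["substation 9 sensors offline since 14:02"],
).operator_summary())
```

Note the “NOT seeing” line: surfacing blind spots is what lets an operator distinguish a confident decision from a well-informed one.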

3. Monitor the Human-AI Interface, Not Just the System

Most monitoring today focuses on:

  • System health
  • Network activity
  • Model performance

Very little focuses on the interaction layer, and that is where many failures begin.

What should also be monitored:

  • How often operators override AI decisions
  • How long it takes to intervene after anomalies appear
  • Patterns of over-reliance or under-reliance

In other words, not just:

“Is the system working?”

But:

“Are humans interacting with it safely?”
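Here is a sketch of what interaction-layer monitoring could compute from an audit trail of agent decisions and operator responses; the event schema below is an assumption for illustration, not a standard:

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class InteractionEvent:
    """One agent decision plus the operator response (illustrative schema)."""
    overridden: bool               # did the operator override the agent?
    anomaly_present: bool          # was an anomaly active at decision time?
    response_time_s: float | None  # seconds until the operator acted, if ever

def interface_metrics(events: list[InteractionEvent]) -> dict[str, float]:
    """Metrics about the human-AI interface, not the model itself."""
    # Near-zero override rates can signal over-reliance; very high
    # rates can signal distrust. Both are worth an alert.
    override_rate = mean(e.overridden for e in events)
    times = [e.response_time_s for e in events
             if e.anomaly_present and e.response_time_s is not None]
    return {
        "override_rate": override_rate,
        "mean_time_to_intervene_s": mean(times) if times else float("nan"),
    }
```

Neither metric says anything about model accuracy. Both say a great deal about whether the human-AI relationship is drifting.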

Just as flight crews conduct manual flying exercises to prevent skill atrophy in an era of autopilot, critical infrastructure operators need regular practice maintaining the independent situational awareness that is the last line of defence when an AI agent is potentially compromised.

Regular override drills, structured exercises in which operators manage systems without AI assistance, should be formalised and treated as non-negotiable.

4. Design Accountability Before You Need It, Don’t Assume

One of the most dangerous gaps in AI deployment is the assumption that responsibility will “sort itself out”.

Well, it definitely won’t.

In traditional systems, accountability is clearer:

  • A misconfiguration → Engineer
  • A missed alert → Operator

In AI systems, those boundaries blur.

So explicit role definitions need to be put in place:

  • Who is responsible for model behavior in production?
  • Who has authority to override decisions?
  • Who is accountable when human and AI judgment conflict?

Without predefined answers, incident response becomes a negotiation, and negotiation costs time.
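One way to force those answers in advance is to encode them as configuration that tooling can check, rather than leaving them as tribal knowledge. A minimal sketch; the role names and map structure are illustrative assumptions:

```python
# Defined at deployment time, not negotiated during an incident.
# Role names are illustrative placeholders.
ACCOUNTABILITY = {
    "model_behavior_in_production": "ml-platform-team",
    "override_authority": {"shift-supervisor", "control-room-operator"},
    "human_ai_conflict_arbiter": "shift-supervisor",  # human judgment wins
}

def can_override(role: str) -> bool:
    """Override authority is a lookup, not a negotiation."""
    return role in ACCOUNTABILITY["override_authority"]

assert can_override("shift-supervisor")
assert not can_override("data-scientist")
```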

5. Contain Failure Before It Cascades

As AI systems become more interconnected, failure is no longer isolated.

One agent’s decision can cascade across power, water, and communications systems within seconds.

This makes containment a design requirement, not an afterthought.

Key principles:

  • Strict boundaries on agent permissions.
  • Independent verification for high-impact actions.

The goal is simple:

No single decision, human or AI, should be able to cause irreversible, system-wide failure, though I think many organizations already have this principle at least partly in place.
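A hedged sketch of both principles together: a hard per-agent permission boundary, plus an independent second check before any high-impact action executes. The agent names, scopes, action names, and verifier interface are illustrative assumptions:

```python
from typing import Callable

class ContainmentError(Exception):
    """Raised when an agent action violates a containment rule."""

# Hard boundary: each agent may only touch its own slice of the system.
AGENT_SCOPES = {
    "load-balancer-agent": ("power.feeders",),
    "water-quality-agent": ("water.chlorination",),
}

HIGH_IMPACT = {"power.feeders.shed_load", "water.chlorination.adjust_dose"}

def execute(agent: str, resource: str, action: str,
            independent_check: Callable[[str, str], bool]) -> None:
    """Refuse out-of-scope actions, and require a second, independent
    verifier to approve high-impact actions before anything runs."""
    if not any(resource.startswith(scope)
               for scope in AGENT_SCOPES.get(agent, ())):
        raise ContainmentError(f"{agent} has no authority over {resource}")
    qualified = f"{resource}.{action}"
    if qualified in HIGH_IMPACT and not independent_check(agent, qualified):
        raise ContainmentError(f"independent verification failed: {qualified}")
    # ... dispatch to the actual control interface here ...
```

The verifier can be another system, a policy engine, or a human, as long as it is independent of the agent proposing the action.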

This is also where IEC 62443 may need to evolve most urgently.

Its existing supply chain security requirements need to extend to the integrity of data pipelines feeding AI systems, because an agent trained on compromised data is a compromised agent, regardless of how secure its deployment environment is.
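At the implementation level, that could start with something as simple as requiring every data batch entering the training or retraining pipeline to carry a verifiable signature from its source. A minimal sketch using Python’s standard library; the signing arrangement with the upstream source is an assumption:

```python
import hashlib
import hmac

def verify_batch(data: bytes, expected_sig: str, shared_key: bytes) -> bool:
    """Verify an HMAC-SHA256 signature over a data batch before it
    enters the training pipeline; reject anything the upstream
    source did not sign with the shared key."""
    sig = hmac.new(shared_key, data, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected_sig)

# Unsigned or tampered batches are dropped before they can shape the model.
```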

The Real Shift

Across all of this, one thing becomes clear.

We are no longer just securing systems. We are securing relationships among humans, machines, and the environments they operate in. This is an entirely different problem. And it requires an entirely different mindset.

The good news is that the field is already moving.

The NIST AI RMF is being actively extended. IEC 62443 was not originally designed with adaptive systems in mind, though working groups are now beginning to address that gap. Researchers in human factors, OT security, and adversarial ML are now working in the same rooms. The conceptual building blocks for a genuine sociotechnical security framework, one that treats the human-AI interface as a first-class security concern, already exist.

What has been missing is the urgency to put them together.

A Different Kind of Optimism

[AI-generated image (ChatGPT): a control room operator facing glowing dashboards, overlaid with blueprint-style lines representing AI decision pathways]

On April 28, 2025, 60 million people across Spain and Portugal lost power for nearly an entire day. The cause was technical: overvoltage cascades in a high-renewable grid. But the deeper failure was one of coordination, governance, and response structures that could not keep pace with the speed of events.

No AI agents were involved. But that event showed exactly what is at stake when complex, interconnected infrastructure fails faster than human institutions can respond.

The answer is not to slow down AI deployment in critical infrastructure. Well designed, these systems can genuinely make infrastructure more resilient: faster anomaly detection, better load forecasting, more responsive failover. The potential is real.

The failures we’re worried about won’t always look like failures.

They’ll look like normal operations, until suddenly, they aren’t.

The dashboards will be green.
The logs will be clean.
The system will appear stable.

And that’s exactly what makes the situation dangerous.

Getting this right means building systems where dashboard appearance can be trusted, because the humans and institutions behind them have earned that trust deliberately, not assumed it by default.

That is a harder goal than building a better model. It is also a more important one. And for the first time, we have the tools, the research, and, slowly, the regulatory will to pursue it seriously.

The question is whether we move fast enough to matter. Because in critical infrastructure, incomplete is not good enough.

This is the final part of a three-part series on AI agents and critical infrastructure. Part 1 was published in Towards Agentic AI and Part 2 in MeetCyber on Medium.

