The factories behind your favorite apps.

Every time you ask ChatGPT a question, stream a movie on Netflix, or scroll through Instagram, your request travels to a building you’ll probably never see. Inside that building are rows upon rows of machines — humming, blinking, and generating enough heat to warm a small town.
That building is a data center. And in 2026, data centers aren’t just warehouses full of servers anymore. They’re becoming AI factories — purpose-built industrial facilities designed to produce intelligence at scale.
I’m a computer engineer, and I think data centers are one of the most underappreciated engineering marvels of our time. Most people imagine a room full of computers. The reality is far more complex — and far more interesting.
So let’s go inside. From the power that flows in, to the heat that flows out, to the GPU clusters that are reshaping these buildings from the ground up — here’s how data centers actually work.
What a Data Center Actually Is

At its most basic, a data center is a facility designed to house, power, cool, and connect large numbers of computers. But calling it “a room full of computers” is like calling a hospital “a building with beds.” The infrastructure surrounding those machines is what makes everything work.
A modern data center has five critical layers:
- Power systems — bringing electricity in and keeping it reliable.
- Cooling systems — removing the enormous heat generated by thousands of processors.
- Compute hardware — the servers, GPUs, CPUs, and storage that do the actual work.
- Networking — connecting machines to each other and to the outside world.
- Physical security — protecting millions of dollars of hardware and petabytes of sensitive data.
Each of these layers is an engineering discipline in its own right. Let’s walk through them.
Power: The Lifeblood

A data center without power is just an expensive warehouse. Power infrastructure is the foundation everything else sits on.
How much power are we talking about? A traditional enterprise data center might use 5–10 megawatts. A modern hyperscale facility built for AI — the kind operated by Google, Microsoft, Meta, or Amazon — can consume 100+ megawatts. That’s enough electricity to power a small city.
The power enters the facility from the utility grid and passes through multiple layers of protection:
Substations and transformers step the voltage down from high-voltage transmission lines to levels the equipment can use.
Uninterruptible Power Supplies (UPS) contain massive battery banks that kick in instantly if grid power fails. They bridge the gap — usually just seconds — until backup generators start.
Diesel generators provide long-term backup power. Most facilities maintain enough fuel to run for 24–72 hours without grid power. Some hyperscalers are now exploring natural gas and even small modular nuclear reactors as alternatives.
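To get a feel for the logistics, here is a back-of-envelope sketch in Python. The fuel-burn figure is an assumption (large diesel generators consume very roughly 0.3 liters per kWh at full load), so treat the output as order-of-magnitude only.

```python
# Back-of-envelope: how much diesel does a multi-day outage require?
# ASSUMPTION: ~0.3 L of diesel per kWh generated, a rough rule of
# thumb for large gensets at full load. Real figures vary by engine.
FUEL_L_PER_KWH = 0.3

def diesel_needed_liters(facility_mw: float, hours: float) -> float:
    """Fuel required to carry a facility through a grid outage."""
    kwh = facility_mw * 1_000 * hours   # MW -> kW, times hours of runtime
    return kwh * FUEL_L_PER_KWH

# A 10 MW enterprise facility riding out a 48-hour outage:
print(f"{diesel_needed_liters(10, 48):,.0f} L")   # ~144,000 L of diesel on site
```

That is tanker-truck territory, which is why fuel storage and refueling contracts are part of the facility design, not an afterthought.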
Power Distribution Units (PDUs) route electricity from the UPS to individual server racks.
The industry measures efficiency using a metric called Power Usage Effectiveness (PUE): total facility power divided by the power delivered to IT equipment. A PUE of 1.0 would mean every watt goes to computing. In reality, a significant portion goes to cooling and other infrastructure. The industry average in 2026 hovers around 1.3–1.5, meaning an extra 30–50 watts of overhead for every 100 watts of compute. The best hyperscale facilities achieve PUEs below 1.1.
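As a quick sanity check on those numbers, here is a minimal sketch of the PUE arithmetic:

```python
# PUE = total facility power / IT equipment power.
def pue(total_kw: float, it_kw: float) -> float:
    return total_kw / it_kw

def overhead_share(pue_value: float) -> float:
    """Fraction of TOTAL power spent on cooling and other overhead."""
    return (pue_value - 1) / pue_value

for p in (1.1, 1.3, 1.5):
    print(f"PUE {p}: {overhead_share(p):.0%} of total power is overhead")
# PUE 1.1:  9% of total power is overhead
# PUE 1.3: 23% of total power is overhead
# PUE 1.5: 33% of total power is overhead
```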
Why power matters now more than ever: GPU clusters for AI consume dramatically more power than traditional servers. A single rack of NVIDIA H100 GPUs can pull 40 kilowatts. The newest Blackwell-based racks exceed 120 kW. NVIDIA’s upcoming Vera Rubin NVL72 configuration — 72 GPUs in a single rack — pushes past 200 kW. Power availability has become the primary constraint for new data center construction. It’s no longer about finding land or laying fiber — it’s about securing megawatts.
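To see why megawatts have become the constraint, consider a rough capacity-planning sketch. The rack wattages are the figures quoted above; the PUE of 1.2 is an assumed value for a modern liquid-cooled site.

```python
# How many racks fit inside a fixed utility allocation?
# ASSUMPTION: PUE of 1.2 for a modern liquid-cooled facility.
PUE = 1.2

def racks_supported(site_mw: float, rack_kw: float) -> int:
    it_kw = site_mw * 1_000 / PUE    # power left for compute after overhead
    return int(it_kw // rack_kw)

# CPU rack, H100 rack, Blackwell-class rack, Vera Rubin-class rack:
for rack_kw in (10, 40, 120, 200):
    print(f"{rack_kw:>3} kW racks in a 100 MW site: {racks_supported(100, rack_kw):,}")
# 10 kW racks: 8,333 ... 200 kW racks: only 416
```

Same building, same grid connection, a twentieth of the racks. That is the arithmetic driving the land rush for megawatts.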
Cooling: The Silent Engineering Battle

Here’s a fact that surprises most people: in a typical facility, cooling can account for 30–40% of total energy consumption. The heat problem is so significant that cooling has evolved from a supporting system to a central component of data center architecture — influencing building design, layout, and even geographic location.
Every watt of electricity consumed by a processor is eventually converted to heat. That heat must be removed, or the equipment fails. Let’s look at how:
Air Cooling — The Traditional Approach
For decades, the standard method was simple: blow cold air over the servers.
Most data centers use a hot aisle / cold aisle layout. Servers are arranged in rows so that all the cool air intakes face one aisle (the cold aisle) and all the exhaust fans face the other (the hot aisle). Computer Room Air Conditioning (CRAC) units push cold air under a raised floor, where it flows up through perforated tiles into the cold aisle.
This works well for traditional servers generating 5–10 kW per rack. But air cooling hits a physical wall at roughly 20 kW per rack. Beyond that, you simply can’t move enough air fast enough to keep the chips cool.
And modern GPU racks start at 40 kW.
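You can see the wall in the basic heat-transfer relation Q = ṁ · c_p · ΔT: the heat an airstream can carry equals its mass flow times its specific heat times its temperature rise. Here is a minimal sketch; the 15 °C inlet-to-outlet temperature rise is an assumed, fairly typical value.

```python
# Airflow needed to carry heat out of a rack: Q = m_dot * cp * dT
CP_AIR = 1005      # J/(kg*K), specific heat of air
RHO_AIR = 1.2      # kg/m^3, air density at ~20 C
DELTA_T = 15       # K, ASSUMED inlet-to-outlet temperature rise

def airflow_m3_per_s(rack_kw: float) -> float:
    mass_flow = rack_kw * 1_000 / (CP_AIR * DELTA_T)   # kg/s of air
    return mass_flow / RHO_AIR                         # m^3/s of air

for kw in (10, 20, 40, 120):
    flow = airflow_m3_per_s(kw)
    print(f"{kw:>3} kW rack: {flow:4.1f} m^3/s ({flow * 2119:,.0f} CFM)")
# A 120 kW rack would need roughly 14,000 CFM: a literal wind tunnel.
```

Past roughly 20 kW, the fan power, noise, and pressure losses needed to move that much air grow faster than the cooling benefit.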
Liquid Cooling — The 2026 Standard
This is where things get fascinating. Liquid cooling has moved from experimental to standard specification for AI infrastructure:
Direct-to-chip cooling runs coolant — typically water or specialized fluid — through cold plates that sit directly on top of the hottest components (GPUs and CPUs). The liquid absorbs heat far more efficiently than air: water carries roughly 3,500 times more heat per unit volume. The heated liquid then circulates to a Coolant Distribution Unit (CDU), where it transfers heat to the building’s chilled water loop or an external heat rejection system.
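The same Q = ṁ · c_p · ΔT arithmetic shows why liquid wins. A sketch, assuming a 10 °C coolant temperature rise across the rack:

```python
# Water flow needed to remove the same heat: Q = m_dot * cp * dT
CP_WATER = 4186    # J/(kg*K), specific heat of water
DELTA_T = 10       # K, ASSUMED coolant temperature rise across the rack

def water_flow_l_per_min(rack_kw: float) -> float:
    mass_flow = rack_kw * 1_000 / (CP_WATER * DELTA_T)  # kg/s (~= L/s)
    return mass_flow * 60                               # L/min

for kw in (40, 120, 200):
    print(f"{kw:>3} kW rack: {water_flow_l_per_min(kw):5.0f} L/min of water")
# A 200 kW rack needs ~287 L/min of water, versus ~23,000 CFM of air:
# modest plumbing replaces a hurricane.
```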
Rear-door heat exchangers attach to the back of server racks. Hot exhaust air passes through a coil of chilled water on its way out, removing heat before it reaches the room.
Immersion cooling is the most dramatic approach: servers are literally submerged in tanks of non-conductive liquid. Some systems use two-phase immersion — the liquid boils on contact with hot components, turns to vapor, rises, condenses on a cool surface, and drips back down. It’s like a self-contained rain cycle inside a tank.
2026 reality check: NVIDIA’s Vera Rubin NVL72 rack design is fully liquid-cooled, fanless, and cable-free, exceeding 200 kW in a single enclosure. Air cooling these machines isn’t just inefficient — it’s physically impossible.
Where Does All That Heat Go?
An increasingly creative answer: it gets reused. AI data centers produce enormous waste heat, and forward-thinking operators are capturing it instead of venting it into the atmosphere. In Scandinavian countries, data center waste heat already warms thousands of homes through district heating networks. In 2026, heat-reuse infrastructure is being built directly into new data center designs worldwide.
Compute: The Machines That Do the Work

Inside those cooled, powered racks live the machines that actually process your requests.
Traditional Servers (CPU-Based)
Most of the internet still runs on CPU-based servers. Web applications, databases, email systems, and business software — these workloads use standard server hardware with multi-core processors, memory, and storage drives.
A typical server rack might hold 20–40 servers, each with one or two CPUs. These racks are well-understood, efficiently cooled with air, and represent the backbone of cloud computing as we know it.
GPU Clusters (The AI Revolution)
Then there are the GPU racks — and this is where data centers are being fundamentally redesigned.
Why GPUs for AI? A CPU is designed to handle a few complex tasks very quickly (like running your operating system). A GPU is designed to handle thousands of simple tasks simultaneously (like matrix multiplications — the core math behind neural networks). A single NVIDIA H100 GPU contains over 16,000 cores. A cluster of thousands of these GPUs can train AI models with billions of parameters.
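To make that concrete, here is a minimal NumPy sketch of the workload GPUs are built for. One neural-network layer is essentially one large matrix multiplication, and every element of the output is an independent dot product that can run in parallel (the layer sizes below are illustrative):

```python
import numpy as np

# One layer of a neural network, reduced to its essence:
# outputs = inputs @ weights. Every element of the result is an
# independent dot product, exactly the work a GPU spreads across
# its thousands of cores.
batch, d_in, d_out = 1024, 4096, 4096
inputs = np.random.randn(batch, d_in).astype(np.float32)
weights = np.random.randn(d_in, d_out).astype(np.float32)

outputs = inputs @ weights   # billions of multiply-adds in one call

print(outputs.shape)   # (1024, 4096)
print(f"{2 * batch * d_in * d_out / 1e9:.0f} GFLOPs in this single matmul")
```

A CPU grinds through those multiply-adds a few dozen at a time; a GPU dispatches them by the tens of thousands. Training runs this kind of operation trillions of times.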
But GPU clusters aren’t just “more computers.” They create unique infrastructure demands:
Extreme power density. Where a CPU server rack might draw 10 kW, a GPU rack pulls 40–200+ kW. The power infrastructure must be dramatically upgraded.
Continuous operation. GPU clusters run at near-100% utilization for days or weeks during AI model training. There’s no idle time, no downtime for cooling to catch up. The thermal load is sustained and relentless.
Ultra-fast networking. For thousands of GPUs to function as a single system, they need to communicate with each other at extraordinary speeds. Technologies like InfiniBand and advanced 800 Gbps Ethernet connect GPU nodes with latencies measured in microseconds. The performance of a cluster often depends more on the network connecting the GPUs than on the GPUs themselves.
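Here is a rough sketch of that dependency. During training, every GPU must exchange its gradients with the others each step (an all-reduce). The model size, link speed, and simplified ring all-reduce cost model below are all illustrative assumptions:

```python
# Rough time to synchronize gradients across a cluster each training step.
# ASSUMPTIONS: 70B-parameter model, fp16 gradients, 800 Gbps per-GPU link,
# and the classic ring all-reduce cost of ~2x the payload per link.
PARAMS = 70e9            # parameters in the model
BYTES_PER_GRAD = 2       # fp16 gradients
LINK_GBPS = 800          # per-GPU network link

payload_gb = PARAMS * BYTES_PER_GRAD / 1e9       # 140 GB of gradients
link_gb_per_s = LINK_GBPS / 8                    # Gbps -> GB/s
sync_seconds = 2 * payload_gb / link_gb_per_s    # ring all-reduce estimate

print(f"~{sync_seconds:.1f} s per step just moving gradients")  # ~2.8 s
# If a step's compute finishes faster than this, the GPUs sit idle
# waiting on the network. The interconnect becomes the bottleneck.
```

Real systems overlap communication with computation and shard the model to hide some of this cost, but the basic tension stands: double the GPUs without upgrading the fabric and you mostly buy idle silicon.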
Scale check: The largest AI clusters in 2026 exceed 100,000 GPUs in a single deployment. These aren’t incremental upgrades — they’re industrial-scale facilities purpose-built to produce artificial intelligence.
Networking: The Invisible Backbone

A data center’s network is what connects everything — machines to each other, and the facility to the outside world.
Inside the data center, a layered network architecture connects servers to switches, switches to aggregation layers, and aggregation to core routers. For AI clusters, specialized fabrics like InfiniBand create a mesh that allows any GPU to communicate with any other GPU at near wire speed.
Outside the data center, high-capacity fiber optic connections link the facility to internet backbone networks, other data centers, and CDN edge servers. Major data center campuses sit at the intersection of multiple fiber routes, providing redundancy and low-latency connectivity to global networks.
Why networking constrains AI more than you’d think: You can buy all the GPUs you want, but if the network between them can’t move data fast enough, they sit idle waiting for each other. In many large deployments, network port density and optical component supply now limit cluster size more than GPU availability. The interconnect has become the bottleneck.
Physical Security: More Than Locked Doors

Data centers hold some of the most valuable — and sensitive — information on the planet. Physical security is taken extremely seriously:
- Multi-layer perimeter security — fences, barriers, security guards, and vehicle checkpoints.
- Biometric access control (fingerprint, iris scan, or facial recognition) restricts entry to authorized personnel only.
- Mantrap entrances — small rooms where one door must close before the next opens — prevent tailgating.
- 24/7 video surveillance covers every angle of the facility.
- Environmental monitoring tracks temperature, humidity, water leaks, and fire risk across every zone.
Some hyperscale facilities are built in undisclosed locations. You could drive past one and never know it — they’re often designed to look like ordinary warehouses or office buildings.
The Shift: From Data Center to AI Factory

Here’s the biggest transformation happening in 2026: data centers are being redesigned from the ground up around AI workloads.
Traditional data centers were built for general-purpose computing — flexible facilities that could host any mix of web servers, databases, and cloud applications. The new generation is different. These are AI factories: specialized environments engineered for one primary purpose — training and running artificial intelligence at industrial scale.
What makes an AI factory different?
GPU-centric layout. Instead of rows of standard server racks, AI factories are designed around dense GPU clusters with liquid cooling infrastructure integrated from day one. The chip drives the building design — not the other way around.
Massive power density. AI factories require dedicated power substations, often with direct connections to the utility grid at medium or high voltage. Some are being co-located with power plants.
Integrated cooling. Liquid cooling isn’t retrofitted — it’s foundational. Piping for coolant runs through the facility like plumbing in a building, with CDUs positioned alongside every row of GPU racks.
Cluster-aware networking. The network isn’t just connectivity — it’s part of the compute architecture. High-speed fabrics are designed to make thousands of GPUs behave as a single, coordinated system.
NVIDIA’s CEO Jensen Huang has called data centers “the factories of the AI era” — and the analogy is precise. Just as factories in the industrial revolution were purpose-built to produce goods at scale, AI factories are purpose-built to produce intelligence at scale.
Why This Matters to You

You might never visit a data center. But every digital experience you have — every search, every stream, every AI conversation — depends on one.
Understanding how data centers work gives you a deeper appreciation for the physical infrastructure behind the digital world. The cloud isn’t floating in the sky. It’s running in carefully engineered buildings, consuming real electricity, generating real heat, and requiring real engineering to keep running 24/7/365.
In 2026, data centers are at the center of the biggest infrastructure buildout since the electrical grid. The AI revolution isn’t just a software story — it’s a story about power, cooling, chips, and the buildings that house them.
The next time someone says “it’s in the cloud,” you’ll know exactly where the cloud lives.
🙌 If You Learned Something New…
Thank you for making it to the end! These deep dives take serious research, and knowing that real people are reading them cover-to-cover is what keeps me going.
Here’s how you can help — and I truly mean it:
👏 Hold down that clap button until it hits 50 — each clap tells Medium’s algorithm to share this article with more curious engineers and tech enthusiasts like you.
➕ Follow me on Medium @elsa-andrea — I publish weekly deep dives on the real engineering behind AI, cloud, infrastructure, and software systems. No fluff. No hype. Just clarity.
Every clap, follow, and share is genuine fuel for me. Writing these articles takes hours, and knowing they’re helping people understand technology better gives me all the energy I need for the next one. 💙
Coming up next: The Hidden Engineering Behind Every AI Model: Storage, Compute, and the Data Pipeline Nobody Talks About — follow me so you don’t miss it!