Liquid Cooling Field Guide (D2C & Immersion) for Mixed Fleets

Vendor‑neutral, operationally focused guidance to design, commission, and operate direct‑to‑chip (D2C) and immersion cooling—side by side—from a single edge pod to a multi‑MW campus.

TL;DR

If you’re running modern accelerators or dense CPUs, air alone won’t cut it. This guide covers the design, commissioning, and operation of D2C and immersion in the same facility, and ends with practical checklists, formulas, and a light GridSite tie‑in.

1) When to choose liquid (and which kind)

Use liquid when:

  • Rack density is consistently > 15–20 kW (where air cooling starts to struggle) or > 30–40 kW (where air becomes impractical).
  • GPUs/CPUs draw > 400–700 W per device.
  • Hot climates or strict noise limits (airflow and fans become excessive).
  • You need lower PUE and stable inlet temps for performance consistency.

Pick D2C when:

You have mainstream servers with cold‑plate options, want standard 19″ racks, and need easy serviceability (swap a node without draining a whole tank).

Pick immersion when:

You want extreme rack density (40–100+ kW/rack), simpler airflow management, and sealed thermal islands—at the cost of different service procedures and fluid handling.

Yes, you can mix both: D2C rows for general fleets, immersion pods for the hottest SKUs—on a shared primary loop with proper isolation and chemistry control.
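
The selection rules above can be condensed into a rough screening function. This is a minimal sketch, not a design tool: the name `suggest_cooling` and the single cut-offs (the upper end of each range quoted above) are assumptions chosen for illustration, and climate, noise, and serviceability still need a human decision.

```python
def suggest_cooling(rack_kw: float, device_w: float = 0.0) -> str:
    """Rough screening of cooling approach from rack density and device power.

    Thresholds are illustrative assumptions taken from the ranges in this guide.
    """
    air_pain_kw = 20.0         # upper end of the 15-20 kW "air pain begins" band
    air_impractical_kw = 40.0  # upper end of the 30-40 kW "air impractical" band
    hot_device_w = 700.0       # upper end of the 400-700 W per-device band

    if rack_kw > air_impractical_kw:
        # 40-100+ kW/rack is where immersion becomes a candidate alongside D2C
        return "liquid required: D2C or immersion"
    if rack_kw > air_pain_kw or device_w > hot_device_w:
        return "liquid recommended: D2C"
    return "air may suffice"
```

For example, `suggest_cooling(25)` falls in the D2C band, while `suggest_cooling(60)` flags the rack as a candidate for either approach.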

2) Cooling loop building blocks (common to both)

  • Rack or row CDU: isolates facility loop from IT loop via plate heat exchangers (HX) and pumps; provides flow, pressure, and leak containment local to the IT.
  • Room CDU or plant CDU: aggregates multiple racks/rows; pumps with N+1 redundancy; dual power feeds.
  • Primary (facility) loop: glycol/water to dry coolers (preferable for water‑lite sites), adiabatic coolers, or chillers (where climate demands).
  • Quick‑disconnects (QDs): dry‑break couplings at rack/device to limit spill risk.
  • Instrumentation: supply/return temperature, ΔP, flow meters, leak sensors (pan/inline), conductivity, filters with ΔP sensors, air separators/deaeration.
  • Controls: VFD pump curves, supply temperature setpoints, automatic staging, interlocks (leak → isolate and stop pumps).
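
The leak interlock named in the controls bullet (leak → isolate and stop pumps) could be expressed as a small decision function. This is a sketch under stated assumptions: `LoopState`, `interlock_actions`, and the action strings are hypothetical names, and a real interlock would live in the CDU controller or PLC, not in Python.

```python
from dataclasses import dataclass


@dataclass
class LoopState:
    leak_detected: bool   # any pan or inline leak sensor tripped
    flow_lpm: float       # measured loop flow
    min_flow_lpm: float   # low-flow alarm threshold (site-specific assumption)


def interlock_actions(state: LoopState) -> list[str]:
    """Return ordered actions for one loop; a leak takes priority over low flow."""
    if state.leak_detected:
        # Isolate first so the pumps are not feeding the leak, then stop them.
        return ["close_isolation_valves", "stop_pumps", "raise_alarm"]
    if state.flow_lpm < state.min_flow_lpm:
        return ["raise_alarm"]
    return []
```

Keeping the logic as a pure function of sensor state makes it easy to replay recorded telemetry against it before trusting any automatic action.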

3) D2C (Direct‑to‑Chip) reference design

3.1 Topology

  • Server cold plates on CPUs/GPUs (optionally memory/VRMs), plumbed to a rack manifold.
  • Rack CDU with pumps (N+1), small buffer tank, filters (start at 50 μm, then 10 μm), air separator, and leak tray.
  • IT loop fluid: typically treated water or water/glycol (0–25%), with a corrosion inhibitor compatible with the metals in the path (copper/brass/stainless).
  • Facility loop: glycol/water (20–35% typical) to the HX, then to dry coolers/chillers.

3.2 Design targets (typical starting points)

  • Supply (IT loop): 25–35 °C—keeps fans slow, supports high economizer hours.
  • ΔT across IT loop: 8–15 °C (higher ΔT → lower flow, smaller pumps).
  • ΔP across cold plates/manifold: 20–60 kPa (3–9 psi), vendor‑specific.
  • Filtration: 50 μm new install/flushing, then 10 μm service; optional 5 μm polishers.
  • Materials: avoid aluminum in mixed‑metal loops; prefer copper/brass/stainless; elastomers EPDM/FKM per chem.
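
One way to put these starting targets to work is a range check against measured values during commissioning. The sketch below assumes hypothetical key names (`supply_c`, `delta_t_c`, `plate_dp_kpa`); the ranges are the typical starting points listed above and should be replaced with vendor-specific limits.

```python
# Typical starting ranges (min, max) from the design targets above; tune per site.
D2C_TARGETS = {
    "supply_c": (25.0, 35.0),      # IT loop supply temperature, deg C
    "delta_t_c": (8.0, 15.0),      # IT loop delta-T, deg C
    "plate_dp_kpa": (20.0, 60.0),  # pressure drop across cold plates/manifold
}


def out_of_range(measured: dict[str, float]) -> dict[str, float]:
    """Return the measured values that fall outside the starting ranges."""
    bad = {}
    for key, value in measured.items():
        low, high = D2C_TARGETS[key]
        if not low <= value <= high:
            bad[key] = value
    return bad
```

A rack reporting a 6 °C ΔT at normal supply temperature and pressure drop would be returned as `{"delta_t_c": 6.0}`, which usually points at excess flow or low load.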

3.3 Quick math (flow sizing)

Q̇ = ṁ * Cp * ΔT
ṁ = Q̇ / (Cp * ΔT)

Where:
  Q̇ = heat (W)
  ṁ = mass flow (kg/s)
  Cp ≈ 4180 J/kg·°C for water

Example: 60 kW rack, ΔT = 10 °C
ṁ = 60000 / (4180 * 10) ≈ 1.44 kg/s ≈ 1.44 L/s ≈ 86 L/min

Recommendation: add +10–20% headroom per rack for transients and fouling.
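
The sizing formula and headroom recommendation can be wrapped in a small helper. This is a minimal sketch: the function name `rack_flow_lpm` and its parameters are illustrative, and the defaults assume plain water (Cp ≈ 4180 J/kg·°C, about 1 kg/L).

```python
def rack_flow_lpm(heat_kw: float, delta_t_c: float, headroom: float = 0.0,
                  cp_j_per_kg_c: float = 4180.0,
                  density_kg_per_l: float = 1.0) -> float:
    """Coolant flow in L/min needed to carry heat_kw at a given loop delta-T."""
    if delta_t_c <= 0:
        raise ValueError("delta_t_c must be positive")
    # m_dot = Q / (Cp * dT), with Q converted from kW to W
    mass_flow_kg_s = heat_kw * 1000.0 / (cp_j_per_kg_c * delta_t_c)
    # kg/s -> L/s via density, then L/s -> L/min, then apply headroom
    return mass_flow_kg_s / density_kg_per_l * 60.0 * (1.0 + headroom)
```

For the 60 kW example, `rack_flow_lpm(60, 10)` gives about 86 L/min, and `rack_flow_lpm(60, 10, headroom=0.2)` about 103 L/min. Glycol mixes have a lower Cp and higher density than water, so pass the fluid supplier's values for those loops.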

3.4 Reliability notes

  • CDU pumps N+1 with auto‑failover (and test it monthly).
  • Dual power to CDU; interlock “CDU stop” → close rack isolation valves (QDs are passive and seal only when disconnected).
  • Leak trays + sensors beneath manifolds; dripless QDs at every service break.
  • Bypass line across rack manifold to maintain flow when a node is offline.

4) Immersion reference design

4.1 Single‑phase immersion (most common)

IT tank with dielectric fluid (non‑conductive), heat picked up directly by fluid. Tank‑side HX: fluid → plate HX → secondary water/glycol loop → plant. Pumps circulate the dielectric fluid through the tank and the water/glycol through the secondary loop (some tanks rely on buoyancy or jet mixing instead of forced circulation; verify with the vendor).

4.2 Two‑phase immersion (specialized)

Fluid boils on hot components, condenses on coils; more efficient but stricter on fluid handling, materials, and sealing.

4.3 Fluid selection (single‑phase)

  • Key properties: dielectric strength, viscosity, thermal stability, material compatibility, flash point, environmental profile.
  • EHS: SDS on file; spill kits; reclaim/recycle plan; avoid floor drains.
  • Compatibility: test boards, adhesives, potting; evaluate gaskets and plastics for swelling/embrittlement.

4.4 Service procedures

  • Ullage management to accommodate thermal expansion.
  • Board handling: drip‑tray staging, wipes, gloves; ESD practices still apply.
  • Filtration: inline 10 μm, polishers for fines; degassing to remove entrained air.
  • Leak containment: tank lip height, secondary mats, level sensors.

5) Shared plant: dry coolers, adiabatic, chillers

  • Dry coolers: simplest, water‑lite; size for full load at design ambient with N+1 fans.
  • Adiabatic assist: wetted pads for extreme days; manage water treatment and legionella controls.
  • Chillers: required in hot/humid regions or when lower supply temps are needed; consider free‑cooling chillers.
  • Noise/Electrical: low‑noise fan arrays, night derate; VFDs on fans/pumps; plan MCC and maintenance bypasses.

6) Chemistry & cleanliness (do not skip)

  • Fill & flush: pressure test → flush with inhibited water → filter to target counts → verify conductivity and pH.
  • Biocide program; compatible inhibitors; avoid mixed metals unless chemistry supports it.
  • Sampling cadence: monthly (first quarter), then quarterly; track ΔP across filters.
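
The sampling cadence above (monthly for the first quarter, then quarterly) can be turned into calendar dates for a CMMS. The helper names `add_months` and `sampling_dates` are hypothetical, and anchoring the schedule to the commissioning date is an assumption.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def sampling_dates(commissioned: date, years: int = 1) -> list[date]:
    """Monthly samples for the first three months, then one every three months."""
    offsets = [1, 2, 3] + list(range(6, years * 12 + 1, 3))
    return [add_months(commissioned, m) for m in offsets]
```

Over a one-year horizon this yields six sample dates: three monthly, then three quarterly.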

7) Controls & setpoints (starter library)

  • Facility loop supply: 28–32 °C for D2C, 30–35 °C for single‑phase immersion.
  • Pump control: maintain differential pressure at CDUs (e.g., 40–60 kPa) with VFD.
  • Fan staging: maintain approach temperature (outlet coolant − ambient dry bulb ≈ 8–12 °C).
  • Rack supply: 25–30 °C; ensure ΔT ≥ 8 °C returning from plates.
  • Alarms: low flow, high ΔT, leak detect, filter ΔP high, pump fail, HX fouling.
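
The fan-staging rule above can be sketched as a band controller on approach temperature. The function name `fan_stage_change` and the one-stage-at-a-time behavior are assumptions; a production controller would normally add timers and PID trim on top of this.

```python
def fan_stage_change(coolant_out_c: float, ambient_db_c: float,
                     low_c: float = 8.0, high_c: float = 12.0) -> int:
    """Return +1 to add a fan stage, -1 to shed one, 0 to hold.

    Approach is dry-cooler outlet coolant temperature minus ambient dry bulb.
    The 8-12 C band doubles as a deadband so stages do not chatter.
    """
    approach = coolant_out_c - ambient_db_c
    if approach > high_c:
        return 1    # coolant leaving too warm relative to ambient: more airflow
    if approach < low_c:
        return -1   # closer to ambient than needed: save fan energy
    return 0
```

With 30 °C ambient, a 45 °C coolant outlet (15 °C approach) asks for another stage, while 38 °C (8 °C approach) holds.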

8) Instrumentation & telemetry (NOC‑ready)

  • Temperatures: facility supply/return, CDU supply/return, cold‑plate inlet/outlet, tank gradients, ambient.
  • Flow: per rack (D2C) and per tank/loop (immersion).
  • Pressure: pump suction/discharge; ΔP across filters and plates.
  • Electrical: pump VFD %; fan RPMs; kW for PUE calc.
  • Events: leaks, door/lid open, chemical additions, filter swaps.
  • KPIs: loop ΔT, HX approach, pump/fan kW per kW IT, alarm rate per MW, MTTR on pump/fan swaps.
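
Several of these KPIs reduce to simple arithmetic on telemetry that is already collected. The sketch below uses hypothetical names (`cooling_kpis` and its dictionary keys), and its "partial PUE" counts only pump and fan power, which is an assumption; a full PUE also includes UPS losses, lighting, and other overheads.

```python
def cooling_kpis(it_kw: float, pump_kw: float, fan_kw: float,
                 supply_c: float, return_c: float,
                 alarms: int) -> dict[str, float]:
    """Summarize cooling KPIs for one reporting period."""
    if it_kw <= 0:
        raise ValueError("it_kw must be positive")
    cooling_kw = pump_kw + fan_kw
    return {
        "loop_delta_t_c": return_c - supply_c,
        "cooling_kw_per_it_kw": cooling_kw / it_kw,
        # Partial PUE: IT plus cooling power only, not total facility power.
        "partial_pue": (it_kw + cooling_kw) / it_kw,
        "alarms_per_mw": alarms / (it_kw / 1000.0),
    }
```

Tracking these per period against the commissioning baseline makes drift (fouling, mistuned setpoints) visible before it becomes an alarm.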

9) Reliability & maintainability

  • N+1 pumps at each CDU/plant; monthly failover tests (matching the Day‑2 cadence).
  • Concurrent maintainability: valves/bypasses to isolate a rack/tank without draining the row.
  • Spares: pumps, seals, QDs, filters, HX gaskets, leak sensors, dielectric fluid.
  • Drain & purge points at logical low/high spots; sloped piping; clear labeling mirrored in as‑builts.
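
The N+1 pump requirement can be sanity-checked at design time by removing the largest pump and comparing what remains against the required flow. This sketch assumes hypothetical names and treats parallel pump capacities as additive at the operating point; real parallel operation depends on the pump and system curves, so treat the result as a first screen.

```python
def meets_n_plus_1(pump_capacities_lpm: list[float], required_lpm: float) -> bool:
    """True if required flow is still covered with the largest pump out of service."""
    if len(pump_capacities_lpm) < 2:
        return False  # a single pump has no redundancy by definition
    remaining = sum(pump_capacities_lpm) - max(pump_capacities_lpm)
    return remaining >= required_lpm
```

Two 100 L/min pumps cover an 86 L/min rack with one pump failed, but not a 120 L/min row.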

10) Commissioning (IST) checklist

Pre‑power (mechanical)

  • Hydrostatic/pressure test to 1.1–1.5× operating pressure, hold 24 h.
  • Flush and filter to spec; chemical report signed off.
  • Verify valve positions, pump rotation, VFD control, sensor calibration.

Integrated thermal

  • Stage load banks (or IT burn‑in) 25% → 50% → 80% nameplate; hold 2 h each; capture P‑T‑F curves.
  • Simulate pump fail, fan fail, power transfer (UPS/gens where used).
  • Immersion: verify degassing performance, leak response, and ullage across thermal cycles.
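
The staged load plan (25% → 50% → 80% of nameplate, 2 h hold each) can be generated as a schedule for the load-bank operator. The function name `ist_load_steps` is hypothetical, and the sketch ignores ramp time between steps.

```python
def ist_load_steps(nameplate_kw: float,
                   fractions: tuple[float, ...] = (0.25, 0.50, 0.80),
                   hold_h: float = 2.0) -> list[tuple[float, float, float]]:
    """Return (start_hour, load_kw, hold_hours) for each staged load step."""
    steps = []
    start = 0.0
    for frac in fractions:
        steps.append((start, nameplate_kw * frac, hold_h))
        start += hold_h  # next step begins when the current hold ends
    return steps
```

For a 1 MW hall this gives steps at 250, 500, and 800 kW starting at hours 0, 2, and 4; capture the P‑T‑F curves at the end of each hold, once temperatures have settled.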

Network & alarms

  • Telemetry to NOC; alarms route with severities; safe auto‑actions within guardrails.

Handover artifacts

  • As‑builts (P&IDs, valve matrix), setpoint file, chemical certificates, spare parts list, golden curve packet.

11) Operating procedures (Day‑2)

  • Daily: visual walk, leak trays, sensor sanity; check ΔT and approach.
  • Weekly: filter ΔP, pump trends, chemical logs; NOC report.
  • Monthly: sample chemistry; test a pump failover.
  • Quarterly: thermal drill; verify alarms/auto‑actions; update golden curves.
  • Annual: HX inspection, bearing checks, megger tests; immersion fluid reclaim/replace readiness.

12) Edge‑site specifics

  • Noise/zoning: low‑noise dry coolers; night derate; acoustic screens.
  • Power quality: VFDs tolerate sags better; still test ride‑through.
  • Footprint/roof loads: verify structural loads; wind/snow.
  • Permits: immersion fluids storage, adiabatic water usage, external equipment line‑of‑sight.
  • Remote hands: tool‑less filters, color‑coded valves, QR codes to MOPs.

13) Economics (rules of thumb)

  • Capex: roughly $0.8–1.5M per MW incremental over basic air (climate/design dependent).
  • Opex: non‑IT energy falls 10–25% vs air; water use ≈ zero for dry coolers.
  • Capacity unlock: avoid throttling; higher sustained clocks on AI/HPC.
  • Risk reduction: fewer hot‑aisle issues, more predictable performance.
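
For a first budget pass, the capex and opex rules of thumb can be applied as ranges. The function names are hypothetical, and the figures are only the rough ranges quoted above, not quotes for any particular site.

```python
def liquid_capex_range_usd(it_mw: float) -> tuple[float, float]:
    """Incremental capex over basic air, using $0.8-1.5M per MW."""
    return (0.8e6 * it_mw, 1.5e6 * it_mw)


def non_it_energy_range_kwh(non_it_kwh_air: float) -> tuple[float, float]:
    """Non-IT energy after a 10-25% reduction versus air (low, high)."""
    return (non_it_kwh_air * 0.75, non_it_kwh_air * 0.90)
```

A 2 MW deployment lands at roughly $1.6–3.0M incremental capex under these assumptions.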

14) Ten common pitfalls (and fixes)

  • Plates starved of flow → size pump head for worst‑case ΔP; instrument each rack.
  • Mixed metals corrosion → harmonize materials/chemistry; avoid Al + Cu without proper inhibitors.
  • No air management on D2C → cold plates capture only part of the server heat, so fans still matter; maintain aisle containment.
  • Leaky QDs → use rated dry‑breaks; train techs; replace seals.
  • Under‑spec’d filtration → step down to 10 μm post‑flush; add ΔP gauges; schedule swaps.
  • Immersion fluid mishandling → SDS training, spill kits, level sensors, secondary containment.
  • No bypasses → add bypasses to service without downtime.
  • Poor telemetry → add flow, ΔP, and leak sensing.
  • Controls not tuned → tune PID; add deadbands.
  • Skipping IST → always thermal‑soak and drill failovers before go‑live.

15) Quick design tables

Starter setpoints (tune per site)

Item                     | D2C       | Immersion (single‑phase)
IT loop supply           | 25–30 °C  | 30–35 °C
IT loop ΔT               | 8–15 °C   | 6–12 °C
Facility loop supply     | 28–32 °C  | 30–35 °C
Pump ΔP (rack)           | 20–60 kPa | n/a (tank circulation)
Filtration (steady state)| 10 μm     | 10 μm (dielectric)

Rack flow (water) quick look

Flow (L/min) ≈ 14.35 × kW / ΔT(°C)   (14.35 = 60 / 4.18, water at ~1 kg/L)
Example: 60 kW @ ΔT 10 °C → ~86 L/min

16) Templates (copy/paste)

Leak response (D2C)

Alarm received → identify rack ID and valve map.
Close rack isolation valves; stop CDU pump; place absorbent pads.
Verify drip tray sensor dry; re-pressurize at low speed; monitor ΔP/flow.
If persistent, de-pressurize rack loop; replace seals/QD; document in CMMS.

Immersion service (single‑phase)

Pause workload; lift board with drip tooling and let it drain; wipe board; place on service tray.
Inspect, replace parts; check gaskets; re-insert; allow degas period.
Verify tank level/ullage; run mixer; check temp gradient and filter ΔP.
Log service in CMMS with photos.

17) How this ties to GridSite

You can absolutely build and run this yourself. If you want to go faster or de‑risk, the GridSite ecosystem includes specialized vendors and operators who can help with design, provisioning, commissioning, operations, and integrations.

  • Selection & design: D2C vs immersion mix, CDU sizing, manifold/QD specs, acoustics, chemistry.
  • Ordering & provisioning: BOMs, procurement, FATs, delivery phasing.
  • Commissioning & IST: load‑bank soaks, failover drills, telemetry hookups, golden curve package.
  • Operations: chemistry management, filter programs, spares, remote monitoring, seasonal tuning, 24×7 response.
  • Integrations: NOC telemetry, alarm→ticket webhooks, compliance evidence (e.g., CAM I‑COOL scoring).

Ask for the “Liquid Cooling Starter Pack”: setpoint library, sizing calculator, commissioning checklists, and Day‑2 runbooks you can adopt as‑is or tailor to your fleet. Contact us to get started.