We were spending $0.45 per request just on tool definitions.
Not on actual AI work. Just telling the model what tools exist.
With 30 tools in context, you're burning roughly 150,000 tokens before the conversation even starts. At $3 per million input tokens for Claude, that's the $0.45 right there, before the model does any real work.
So we rewrote everything.
Instead of loading tool schemas into the context window, we index lightweight skill metadata at build time, match incoming queries against triggers, and load full instructions only for the skills a request actually needs. The results:
| Metric | Before | After | Reduction |
|---|---|---|---|
| Tokens for 30 tool definitions | 84,000 | 600 | 99.3% |
| Cost per request | $0.45 | $0.003 | 99.3% |
Treat skills like files on disk, not system prompts.
Agents discover what they need. They don't carry everything everywhere.
Think about how you work: you don't read the entire manual before starting a task. You look up the specific section when you need it.
Each skill has a skill.md file with YAML frontmatter. At build time, we extract just the metadata:
```json
{
  "lcoe-calculator": {
    "name": "LCOE Calculator",
    "trigger": "lcoe|levelized cost|energy economics",
    "agents": ["INVESTMENT_INTELLIGENCE", "COST_PREDICTION"]
  }
}
```

That's it. ~50 bytes per skill instead of 5,000.
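The build step that produces this index is small. Here's a minimal sketch, assuming standard YAML frontmatter in each skill.md; the function name and the skills.index.json output path are illustrative, not our production code.

```python
import json
from pathlib import Path

import yaml  # PyYAML


def build_skill_index(skills_dir: str = "skills") -> dict:
    """Scan skills/*/skill.md and keep only the lightweight routing metadata."""
    index = {}
    for skill_md in Path(skills_dir).glob("*/skill.md"):
        text = skill_md.read_text(encoding="utf-8")
        # Frontmatter is the block between the first pair of '---' markers.
        _, frontmatter, _body = text.split("---", 2)
        meta = yaml.safe_load(frontmatter)
        index[skill_md.parent.name] = {
            "name": meta["name"],
            "trigger": meta["trigger"],
            "agents": meta.get("agents", []),
        }
    return index


if __name__ == "__main__":
    Path("skills.index.json").write_text(json.dumps(build_skill_index(), indent=2))
```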
When a query comes in, we do a simple regex match against triggers. If we hit, we load the full skill instructions.
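In code, the routing step stays tiny. A sketch, reusing the index built above (file paths and function names are illustrative):

```python
import json
import re
from pathlib import Path


def select_skills(query: str, index: dict) -> list[str]:
    """Return the ids of skills whose trigger pattern matches the query."""
    return [
        skill_id
        for skill_id, meta in index.items()
        if re.search(meta["trigger"], query, re.IGNORECASE)
    ]


def load_skill(skill_id: str) -> str:
    """Only a trigger hit pays the cost of the full skill.md body."""
    return Path("skills", skill_id, "skill.md").read_text(encoding="utf-8")


index = json.loads(Path("skills.index.json").read_text(encoding="utf-8"))
matched = select_skills("What's the levelized cost of energy for this project?", index)
instructions = [load_skill(skill_id) for skill_id in matched]
```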
Instead of 30 different tool schemas, we have one:
```
execute_skill(skill_id: string, script: string, input: object)
```

The agent knows how to call this. The skill instructions tell it what to put in the input.
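Registered with the model, that single tool might look roughly like this (a sketch in the Anthropic Messages API tool format; the description wording is illustrative):

```python
EXECUTE_SKILL_TOOL = {
    "name": "execute_skill",
    "description": (
        "Run a script from a named skill. Load the skill's instructions first "
        "to learn which scripts exist and what input they expect."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "skill_id": {"type": "string"},
            "script": {"type": "string"},
            "input": {"type": "object"},
        },
        "required": ["skill_id", "script", "input"],
    },
}
```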
On disk, each skill is a self-contained directory:

```
skills/
└── lcoe-calculator/
    ├── skill.md                 # Entry point with metadata
    ├── scripts/                 # Executable Python scripts
    │   ├── calculate_lcoe.py
    │   └── sensitivity.py
    ├── schemas/                 # JSON schemas for validation
    │   ├── input.schema.json
    │   └── output.schema.json
    └── examples/                # Sample inputs/outputs
```

We based this on Anthropic's "Code Execution with MCP" pattern but pushed it further. Their insight: agents should write and execute code, not just call predefined tools.
Our extension is to predefine the code patterns as skills, so agents get the best of both worlds: reliable, tested scripts plus the flexibility of dynamic invocation.
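On the execution side, a minimal dispatcher can enforce that reliability before any script runs. A sketch, assuming the directory layout above and the `jsonschema` package; the helper is illustrative, not our exact implementation:

```python
import json
import subprocess
from pathlib import Path

from jsonschema import validate  # pip install jsonschema

SKILLS_DIR = Path("skills")


def execute_skill(skill_id: str, script: str, input: dict) -> dict:
    """Validate the agent-supplied input against the skill's schema, then run the script."""
    skill_dir = SKILLS_DIR / skill_id
    schema = json.loads((skill_dir / "schemas" / "input.schema.json").read_text())
    validate(instance=input, schema=schema)  # reject malformed calls before execution

    result = subprocess.run(
        ["python", str(skill_dir / "scripts" / script)],
        input=json.dumps(input),  # scripts read JSON on stdin in this sketch
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Because validation happens outside the model, a malformed call fails deterministically instead of producing a plausible-looking wrong answer.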
If you're building multi-agent systems and hitting token limits or cost walls, look at your tool definitions first. That's probably where the waste is.
The solution isn't fewer tools. It's smarter tool loading.
Full implementation details are in our GitHub repo. Happy to share if anyone's hitting the same wall.