20 CPU/tick: how I hit the ceiling and found 4.8 CPU in untracked logic
The bot ate a steady 17.7 of 20 CPU, bucket sat around 2-22. Decomposition showed: 8.25 CPU is a hard intent ceiling, nothing to do. And 4.8 CPU is in untracked logic — that's where 90% of the potential lives. A story about module-level cache in an ephemeral runtime.
A continuation of the article on monitoring a Screeps bot via Grafana. That one was about infrastructure; this one is about the specific crisis that infrastructure helped solve.
In Screeps every player has a hard CPU budget. At GCL 6 it's 20 CPU/tick. Exceed it — the bucket drops; bucket runs out — creeps stop executing, the bot stalls. This isn't "optimization for optimization's sake," it's a hard limit past which the bot physically can't play.
In April my bot was steadily burning 17.7 CPU/tick out of the 20 ceiling, with the bucket bouncing 2-22 against a normal of 10,000. Every 5 minutes the dashboard showed 1-2 overrun spikes after which a couple of rooms skipped a tick. I knew I had to optimize. But I didn't know what exactly.
This article is about how I broke those 17.7 CPU down by category, found that 4.8 of them are "air" (untracked logic), and about the canonical module-level cache pattern, which works differently than it seems in Screeps' ephemeral runtime.
Bucket as a health indicator
First, the CPU model in Screeps. Every tick the bot has:
- CPU limit: fixed, 20 at GCL 6
- Bucket: a reservoir. If you spent under the limit, the leftover drips into the bucket (up to 10,000). If you spent over, it pulls the difference from the bucket.
- Overrun: when the bucket is empty and the tick didn't fit, code stops, Game.cpu.tickLimit = 5 for the next tick, and the bot effectively skips it.
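The bucket rules above can be sketched as a tiny simulation. This is a simplified model of the description in this section, not the engine's actual code; `simulateBucket`, `BUCKET_MAX` and the sample numbers are illustrative names and values.

```javascript
// Simplified bucket model, assuming only the rules described above:
// unused CPU flows into the bucket, overspend is paid out of it.
const BUCKET_MAX = 10000;

function simulateBucket(startBucket, limit, usedPerTick) {
  let bucket = startBucket;
  let overruns = 0;
  for (const used of usedPerTick) {
    bucket += limit - used;          // leftover drips in, overspend drains
    if (bucket < 0) {                // nothing left to pay for the overspend
      overruns += 1;
      bucket = 0;
    }
    bucket = Math.min(bucket, BUCKET_MAX);
  }
  return { bucket, overruns };
}

// A bot averaging 17.7 of 20 CPU gains only about 2.3 bucket per quiet tick,
// so a short run of 30-CPU ticks eats several ticks of savings at once.
const quietTicks = new Array(5).fill(17.7);
console.log(simulateBucket(10, 20, [...quietTicks, 30, 30, 30]));
```

With a bucket in the thousands the same spikes are absorbed without an overrun, which is why the bucket level matters more than the average.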
A healthy system: avg < 15, bucket steadily > 5000, you can afford a one-off 25-30 CPU spike without pain. My system: avg 17.7, bucket 2-22 — on the edge, any surge (spawn wave, combat) → overrun.
The bucket graph in Grafana looked like a dying patient's cardiogram: crawling near zero, sometimes twitching to 22, then back to zero. Without the chart I'd have written it off as "well, that happens."
Decomposition: where the 17.7 CPU goes
The bot writes Memory._cpu metrics every 100 ticks. That goes into segment 0 → PostgreSQL → Grafana. With a Grafana filter I broke down the 17.7 CPU/tick like this:
| Component | CPU | Reducible? |
|---|---|---|
| Intents (tracked) | 8.25 | No — hard ceiling |
| Task generation | ~1-2 | Yes |
| Overhead (findTask, Logger, pickup) | ~1.5 | Yes |
| Role state machines | ~0.3 | Don't touch |
| Init + Memory parse | 1.6 | A little |
| Rooms | 4.2 | A little |
| Untracked in creeps | 4.8 | Yes, the main potential |
Intents are creep.move(), creep.harvest(), creep.transfer(), any in-game actions. Each has a fixed CPU price you can't lower — that's the rule of the game. 8.25 CPU of intents at 43-45 creeps is the ceiling; the only way through is fewer creeps or fewer actions.
But 4.8 CPU untracked — that's interesting. That's not intents, that's all the rest of the JS logic: task selection, the runRoom loop, task-pool generation, state machines, filters. That's where 90% of the optimization potential sits.
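To make "tracked" versus "untracked" concrete, here is a minimal sketch of how such a split can be measured. It is not the bot's actual profiler: `createCpuTracker` and the category names are hypothetical, and `getUsed` stands in for `Game.cpu.getUsed()` so the sketch also runs outside the game.

```javascript
// Sketch of splitting a tick's CPU into tracked and untracked parts.
// In Screeps you would pass () => Game.cpu.getUsed() as getUsed.
function createCpuTracker(getUsed) {
  const tracked = {};
  const start = getUsed();
  return {
    // Run fn and charge the CPU it consumed to the named category.
    track(category, fn) {
      const before = getUsed();
      try {
        return fn();
      } finally {
        tracked[category] = (tracked[category] || 0) + (getUsed() - before);
      }
    },
    // Untracked = everything spent since creation minus what was wrapped.
    report() {
      const total = getUsed() - start;
      const trackedSum = Object.values(tracked).reduce((a, b) => a + b, 0);
      return { total, tracked: { ...tracked }, untracked: total - trackedSum };
    },
  };
}
```

Anything that runs between the wrapped calls (task selection, filters, loops) lands in `untracked`, which is the kind of number the table above reports as 4.8 CPU. Nested `track` calls would double count, so this sketch assumes categories don't overlap.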
CPU per role
| Role | count | CPU | CPU/creep |
|---|---|---|---|
| hauler | 17 | 5.9 | 0.35 |
| miner | 12 | 3.3 | 0.27 |
| upgrader | 3 | 1.2 | 0.40 |
| skMiner | 3 | 0.9 | 0.30 |
| skHauler | — | 0.9 | 0.30 |
| worker | 5 | 0.4 | 0.08 ✅ |
| skKiller | 1 | 0.3 | 0.30 |
The worker is cheap because its logic is simple: came to a build site, built, went for energy. The hauler is expensive because every tick it picks a task again from a pool of 30+ options with priorities. With 17 haulers that's 17 walks through the pool per tick — that's where the 5.9 CPU sits.
CPU per method
The bot also writes which game methods spent how much:
| Method | CPU | Calls | CPU/call |
|---|---|---|---|
| moveTo | 2.11 | 8 | 0.26 |
| transfer | 1.69 | 11 | 0.15 |
| harvest | 1.48 | 14 | 0.11 |
| withdraw | 0.81 | 5 | 0.16 |
| roomFind | 0.69 | 124 | 0.006 |
| upgrade | 0.59 | 3 | 0.20 |
| findInRange | 0.50 | 41 | 0.012 |
roomFind 124 times per tick sounds a lot, but it adds up to 0.69 CPU — not the worst pain. But transfer at 1.69 CPU on 11 calls is a hint that someone is calling transfer every tick.
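The same reading of the table can be done mechanically. The sketch below copies the per-method numbers from the table above and ranks them by total CPU; `rankByTotal` is just an illustrative helper name.

```javascript
// Per-method samples for one tick, copied from the table above.
const methods = [
  { name: 'moveTo', cpu: 2.11, calls: 8 },
  { name: 'transfer', cpu: 1.69, calls: 11 },
  { name: 'harvest', cpu: 1.48, calls: 14 },
  { name: 'withdraw', cpu: 0.81, calls: 5 },
  { name: 'roomFind', cpu: 0.69, calls: 124 },
  { name: 'upgrade', cpu: 0.59, calls: 3 },
  { name: 'findInRange', cpu: 0.50, calls: 41 },
];

// Rank by total CPU and keep the cost per call next to it.
function rankByTotal(samples) {
  return samples
    .map(s => ({ ...s, perCall: s.cpu / s.calls }))
    .sort((a, b) => b.cpu - a.cpu);
}

for (const m of rankByTotal(methods)) {
  console.log(m.name, m.cpu.toFixed(2), m.calls, m.perCall.toFixed(3));
}
```

Sorted by call count, roomFind would be at the top; sorted by total CPU, it drops to fifth place.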
Module-level cache: the canonical pattern
The most counterintuitive part of Screeps optimization is where to keep your cache.
In a regular Node.js process you can do room.taskPool = [...] — and it lives as long as the process lives. In Screeps that doesn't work:
Game.* objects are recreated every tick. room.foo, creep.bar, structure.baz do NOT survive the tick boundary.
My first two optimization attempts went through room._taskPool — the cache "worked" for that tick, but a tick later it was gone. And Memory.rooms[name].taskPool is also bad: Memory deserialization is expensive, plus task objects (with pos, target) after deserialization are dead pojos, no methods.
The right pattern is module-level state, which lives in the require cache of the global sandbox:
// top of module: persists via require cache
const _cache = {};

function getCached(key, ttl, compute) {
  const tick = Game.time;
  const e = _cache[key];
  if (e && (tick - e.tick) < ttl) return e.value;
  const v = compute();
  _cache[key] = { value: v, tick };
  return v;
}
The Screeps global sandbox lives between ticks, until the server does a global reset (on code deploy, or otherwise once every few hundred to few thousand ticks). So _cache in module scope survives the tick boundary, unlike fields on game objects.
What you can't cache in the module cache: the Room, Creep, Structure wrapper objects themselves. They're dead next tick — you can't call methods. You store ids, and resolve via Game.getObjectById(id) each time.
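A minimal sketch of the "store ids, resolve each tick" rule. The helper names (`getCachedIds`, `resolveIds`, `idCache`) are hypothetical, and `game` is passed in as a parameter so the sketch runs outside Screeps; in the bot it would be the global `Game`, whose `time` and `getObjectById` are used here.

```javascript
// Module scope: holds plain id strings only, never live game objects.
const idCache = {};

// Return cached ids while they are younger than ttl ticks, else recompute.
function getCachedIds(game, key, ttl, computeIds) {
  const entry = idCache[key];
  if (entry && game.time - entry.tick < ttl) return entry.ids;
  const ids = computeIds();
  idCache[key] = { ids, tick: game.time };
  return ids;
}

// Re-resolve on every tick; objects that no longer exist come back as null
// from getObjectById and are dropped.
function resolveIds(game, ids) {
  return ids.map(id => game.getObjectById(id)).filter(obj => obj !== null);
}
```

In a room loop this might look like `resolveIds(Game, getCachedIds(Game, room.name + ':containers', 10, () => findContainerIds(room)))`, where `findContainerIds` is again a made-up helper returning an array of id strings.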
This pattern is used by every well-known public bot (Overmind, the-international, bonzAI). I just didn't know.
Phase 1: Task pool TTL=2
The most expensive bottleneck — generating the task pool for a room. TaskGenerator.generate(room) walks structures, looks for drops, computes priorities — and returns an array of 30+ tasks. This was being called every tick in every room: 6 rooms × ~0.7 CPU = ~4.2 CPU in task generation.
But the pool changes rarely: new tasks appear on events such as a creep dying, a structure filling up, or a drop disappearing after pickup. Between events the pool of 30+ tasks stays identical, so it can be cached and held for 2-3 ticks.
Implementation:
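// core.task.queue.js
getPool: function(room) {
  // Assumption: Cache.init(room) attaches module-level storage as room._cache;
  // a plain field set on room would not survive the tick boundary.
  Cache.init(room);
  const TTL = 2;
  // Serve the cached pool while it is younger than TTL ticks.
  if (room._cache.taskPool && room._cache.taskPoolTick &&
      Game.time - room._cache.taskPoolTick < TTL) {
    return room._cache.taskPool;
  }
  // Cache miss: regenerate, sort by ascending priority value, and store.
  const pool = TaskGenerator.generate(room);
  pool.sort((a, b) => a.priority - b.priority);
  room._cache.taskPool = pool;
  room._cache.taskPoolTick = Game.time;
  return pool;
}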
And invalidations on events:
- TaskQueue.assign() with maxAssigned=1 → drop the cache (slot taken)
- TaskQueue.complete/release() → drop the cache (slot freed)
- A creep died in the room → drop the cache
- Pickup removed a ground resource → drop the cache
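The TTL plus event invalidation can be sketched as one small module. This is an illustration under assumptions, not the bot's code: `poolCache`, `stats` and `invalidatePool` are hypothetical names, and `now` and `generate` are parameters here so the sketch runs outside the game (in the bot they would be `Game.time` and a call to `TaskGenerator.generate(room)`).

```javascript
// Module scope: survives between ticks until the next global reset.
const poolCache = new Map(); // roomName -> { pool, tick }
const stats = { hits: 0, misses: 0 };

function getPool(roomName, now, ttl, generate) {
  const entry = poolCache.get(roomName);
  if (entry && now - entry.tick < ttl) {
    stats.hits += 1;
    return entry.pool;
  }
  stats.misses += 1;
  const pool = generate().sort((a, b) => a.priority - b.priority);
  poolCache.set(roomName, { pool, tick: now });
  return pool;
}

// Called from the event hooks listed above: assign, complete/release,
// creep death, pickup.
function invalidatePool(roomName) {
  poolCache.delete(roomName);
}

// Share of getPool calls served from cache.
function hitRate() {
  const total = stats.hits + stats.misses;
  return total === 0 ? 0 : stats.hits / total;
}
```

Deleting the entry on an event is the simplest form of invalidation: the next `getPool` call for that room regenerates the pool, everything else keeps hitting the cache.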
The risk was around pickup tasks: a dropped resource decays by ceil(amount/1000) per tick, so a 500e pile loses 1e per tick, about 50e over 50 ticks. At TTL=2 that's negligible. Tombstone decay is a bit faster, but still within acceptable limits.
Effect after deploy:
- Hit rate 73% (i.e. 73% of getPool calls return the cached value)
- avg CPU ~1 lower
- bucket crawled from 2-22 to 13-52
Not the 10,000 you'd want, but for the first time the bucket is steadily growing instead of pinned at zero.
What turned out NOT to be the problem
After profiling I thought the main culprits were room.find() (124 calls/tick) and moveTo. Turned out no. The real bottlenecks showed up in decomposition, not intuition. Candidates I rejected:
| Candidate | Why I didn't touch it |
|---|---|
| manager.colonize.js 26 finds/tick | Already throttled via Game.time % 50 === 0 |
| room.find in hauler dead branch | Only fires in starter colonies, not prod |
| Caching 124 roomFind calls | Sums to 0.69 CPU, bad ROI on refactoring 30 sites |
| Removing moveTo reusePath | 8 calls/tick, so pathfinding already fires rarely |
Lesson: intuition said "optimize find()", numbers said "optimize task generation". Numbers won.
Next step: miner transfer throttle
The funniest find came after Phase 1. I noticed that transfer at 1.69 CPU on 11 calls/tick was mostly miners doing creep.transfer(link, ENERGY) every tick when store > 0.
A 5W miner harvests 10 e/tick, store cap 50. So after the first harvest they have 10 in store, after transfer — 0, a tick later 10 again. Same cycle every tick, transfer every tick — 0.15 CPU × 12 miners = 1.8 CPU on a single action that could be done once every 5 ticks.
One-line fix:
// WAS:
if (creep.store[RESOURCE_ENERGY] > 0) {
  creep.transfer(link, RESOURCE_ENERGY);
}
// IS:
if (creep.store[RESOURCE_ENERGY] >= 40) {
  creep.transfer(link, RESOURCE_ENERGY);
}
The miner accumulates to 40, then dumps in one shot. Cycle 40→0 once every 4 ticks. Savings: 9 transfers/tick × 0.15 CPU = −1.35 CPU. That's more than all of Phase 1 gave me.
Risk is zero. harvest is in pipeline P1, transfer is in P3 — they don't conflict (see [the upcoming article on action pipelines, if I write it]). Threshold 40 + harvest 10 = 50 = capacity — overflow is impossible.
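The arithmetic behind the threshold can be checked with a small simulation of one miner. The model is an assumption on my part: the code sees the store as it was at the start of the tick, and harvest (+10) and transfer (dump everything) both take effect by the next tick. `countTransfers` is an illustrative name, and 12 miners at roughly 0.15 CPU per transfer are the figures used in the text.

```javascript
// Simulates one miner and counts how many ticks issue a transfer.
function countTransfers(threshold, ticks, harvestPerTick = 10, capacity = 50) {
  let store = 0;
  let transfers = 0;
  for (let t = 0; t < ticks; t++) {
    const dump = store >= threshold;   // decision made on start-of-tick store
    if (dump) transfers += 1;
    // Next tick's store: emptied if dumped, plus this tick's harvest.
    store = Math.min(capacity, (dump ? 0 : store) + harvestPerTick);
  }
  return transfers;
}

const TICKS = 100;
const before = countTransfers(1, TICKS) / TICKS;  // "store > 0": about one transfer per tick
const after = countTransfers(40, TICKS) / TICKS;  // ">= 40": about one transfer per 4 ticks
const saved = (before - after) * 12 * 0.15;       // 12 miners, ~0.15 CPU per transfer
console.log(before.toFixed(2), after.toFixed(2), saved.toFixed(2));
```

Because the store never exceeds 40 when the decision is made, adding one more harvest of 10 stays within the capacity of 50, which matches the overflow argument above.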
This one-liner is queued for deploy as the next phase.
What I learned
Hard ceiling vs soft ceiling. In Screeps intents cost a fixed amount — that's a hard ceiling. No matter how much you optimize, 8.25 CPU on 43 creeps doesn't go anywhere. If a system has a hard ceiling — find it first so you don't optimize the impossible. The SaaS analogue is the cost of a SQL query in the DB: if the query is mandatory, optimizing the code around it has a ceiling.
Hot path ≠ hot logic. room.find() 124 times/tick sounds scary, but adds up to 0.69 CPU. And TaskGenerator.generate() 6 times/tick is 4.2 CPU. Optimize logic that does a lot per call, not points that get called often.
Module-level state in ephemeral runtimes. In Screeps the global sandbox lives between ticks, but game objects don't. That's counterintuitive: it feels like room.foo is more persistent than a global variable, but it's the opposite. FaaS / Lambda has a similar rule: a warm container survives requests, request-scope state doesn't.
Numbers beat intuition. Until I broke the 17.7 CPU down piece by piece, I'd have optimized room.find() and saved 0.5 CPU. Decomposition showed that untracked logic holds 4.8 of the roughly 13.1 CPU spent in creeps, which is where 90% of the potential sits. Without a Grafana dashboard I wouldn't have seen this.
In the next article — about action pipelines in Screeps: how two methods can both return OK but one of them is silently ignored by the engine; and how a drainer (5T+18RA+17M+10H) becomes immortal under three towers precisely because of correct pipeline choices. That's not about performance anymore — it's about silent failures and why OK ≠ "done."