20 CPU/tick: how I hit the ceiling and found 4.8 CPU in untracked logic

The bot ate a steady 17.7 of 20 CPU, and the bucket sat around 2-22. Decomposition showed that 8.25 CPU is a hard intent ceiling with nothing to be done about it, while 4.8 CPU sits in untracked logic, which is where 90% of the potential lives. A story about module-level cache in an ephemeral runtime.

12 May 2026 #screeps #performance #profiling #architecture #javascript

A continuation of the article on monitoring a Screeps bot via Grafana. That one was about infrastructure; this one is about the specific crisis that infrastructure helped solve.

In Screeps every player has a hard CPU budget. At GCL 6 it's 20 CPU/tick. Exceed it — the bucket drops; bucket runs out — creeps stop executing, the bot stalls. This isn't "optimization for optimization's sake," it's a hard limit past which the bot physically can't play.

In April my bot was steadily burning 17.7 CPU/tick out of the 20 ceiling, with the bucket bouncing 2-22 against a normal of 10,000. Every 5 minutes the dashboard showed 1-2 overrun spikes after which a couple of rooms skipped a tick. I knew I had to optimize. But I didn't know what exactly.

This article is about how I broke those 17.7 CPU down by category, found that 4.8 of them go to untracked logic that no metric accounted for, and about the canonical module-level cache pattern, which behaves differently in Screeps' ephemeral runtime than you might expect.

Bucket as a health indicator

First, the CPU model in Screeps. Every tick the bot has:

CPU limit — fixed, 20 at GCL 6
Bucket — a reservoir. If you spent under the limit, the leftover drips into the bucket (up to 10,000). If you spent over, it pulls the difference from the bucket.
Overrun — when the bucket is empty and the tick didn't fit: code stops, Game.cpu.tickLimit = 5 for the next tick, the bot effectively skips.
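The accounting above can be sketched as a small function. This is a simplified model of the rules as described here, not the server's actual implementation; `nextBucket` and `BUCKET_MAX` are hypothetical names introduced for illustration.

```javascript
// Simplified bucket accounting, following the description above.
// nextBucket is an illustrative helper, not part of the Screeps API.
const BUCKET_MAX = 10000;

function nextBucket(bucket, limit, used) {
    // Leftover CPU (limit - used) drips in; an overspend is pulled out.
    const updated = bucket + (limit - used);
    return Math.max(0, Math.min(BUCKET_MAX, updated));
}

// A healthy bot absorbs a one-off 30 CPU tick: the bucket drops by 10.
console.log(nextBucket(5000, 20, 30)); // 4990
// With a bucket of 2 the same spike empties the reserve.
console.log(nextBucket(2, 20, 30)); // 0
```

Under this model the only thing that matters is the gap between the limit and the average spend, which is why a near-empty bucket leaves no room for any surge.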

A healthy system: avg < 15, bucket steadily > 5000, you can afford a one-off 25-30 CPU spike without pain. My system: avg 17.7, bucket 2-22 — on the edge, any surge (spawn wave, combat) → overrun.

The bucket graph in Grafana looked like a dying patient's cardiogram: crawling near zero, sometimes twitching to 22, then back to zero. Without the chart I'd have written it off as "well, that happens."

Decomposition: where the 17.7 CPU goes

The bot writes Memory._cpu metrics every 100 ticks. That goes into segment 0 → PostgreSQL → Grafana. With a Grafana filter I broke down the 17.7 CPU/tick like this:

Component	CPU	Reducible?
Intents (tracked)	8.25	No — hard ceiling
Task generation	~1-2	Yes
Overhead (findTask, Logger, pickup)	~1.5	Yes
Role state machines	~0.3	Don't touch
Init + Memory parse	1.6	A little
Rooms	4.2	A little
Untracked in creeps	4.8	Yes, the main potential

Intents are creep.move(), creep.harvest(), creep.transfer(), any in-game actions. Each has a fixed CPU price you can't lower — that's the rule of the game. 8.25 CPU of intents at 43-45 creeps is the ceiling; the only way through is fewer creeps or fewer actions.

But 4.8 CPU untracked — that's interesting. That's not intents, that's all the rest of the JS logic: task selection, the runRoom loop, task-pool generation, state machines, filters. That's where 90% of the optimization potential sits.
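One way to arrive at a tracked/untracked split is to time only the intent calls and treat the rest of the creep loop as the remainder. The sketch below is an assumption about how such a measurement could look, not the bot's actual profiler: `timed` and `runCreep` are hypothetical, and `cpu.getUsed` is a fake clock standing in for `Game.cpu.getUsed()` so the example runs outside Screeps.

```javascript
// Stand-in for Game.cpu.getUsed(): a fake clock that the stub work
// below advances by hand.
let fakeCpu = 0;
const cpu = { getUsed: () => fakeCpu };

// Hypothetical helper: time one call and add its cost to a named counter.
function timed(stats, name, fn) {
    const start = cpu.getUsed();
    const result = fn();
    stats[name] = (stats[name] || 0) + (cpu.getUsed() - start);
    return result;
}

// One creep's tick: decision logic first, then a single timed intent.
function runCreep(stats) {
    fakeCpu += 0.125; // task selection, filters, state machine
    timed(stats, 'intents', () => { fakeCpu += 0.25; }); // e.g. creep.harvest()
}

const stats = {};
const before = cpu.getUsed();
for (let i = 0; i < 3; i++) runCreep(stats);
const total = cpu.getUsed() - before;

// Whatever was spent in creep code but not inside a timed intent
// is the "untracked" remainder.
const untracked = total - stats.intents;
console.log(total, stats.intents, untracked); // 1.125 0.75 0.375
```

The remainder is only as meaningful as the wrapper coverage: anything not wrapped in `timed` silently lands in the untracked bucket.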

CPU per role

Role	count	CPU	CPU/creep
hauler	17	5.9	0.35
miner	12	3.3	0.27
upgrader	3	1.2	0.40
skMiner	3	0.9	0.30
skHauler	—	0.9	0.30
worker	5	0.4	0.08 ✅
skKiller	1	0.3	0.30

The worker is cheap because its logic is simple: came to a build site, built, went for energy. The hauler is expensive because every tick it picks a task again from a pool of 30+ options with priorities. With 17 haulers that's 17 walks through the pool per tick — that's where the 5.9 CPU sits.

CPU per method

The bot also writes which game methods spent how much:

Method	CPU	Calls	CPU/call
moveTo	2.11	8	0.26
transfer	1.69	11	0.15
harvest	1.48	14	0.11
withdraw	0.81	5	0.16
roomFind	0.69	124	0.006
upgrade	0.59	3	0.20
findInRange	0.50	41	0.012

roomFind 124 times per tick sounds a lot, but it adds up to 0.69 CPU — not the worst pain. But transfer at 1.69 CPU on 11 calls is a hint that someone is calling transfer every tick.
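A table like this can come from a small per-method accumulator that is flushed periodically. The sketch below assumes a shape for it: `record`, `endTick`, and the field names `methods`, `cpu`, `calls`, `cpuPerCall` are hypothetical, and `fakeMemory` stands in for the real `Memory` global. In the game, the `cpuUsed` argument would be a `Game.cpu.getUsed()` delta around the call.

```javascript
// Hypothetical per-method accumulator. Totals live in module scope and are
// flushed to memory._cpu as per-tick averages every FLUSH_EVERY ticks.
const FLUSH_EVERY = 100;
let totals = {};
let ticksSeen = 0;

function record(method, cpuUsed) {
    const t = totals[method] || (totals[method] = { cpu: 0, calls: 0 });
    t.cpu += cpuUsed;
    t.calls += 1;
}

function endTick(time, memory) {
    ticksSeen += 1;
    if (time % FLUSH_EVERY !== 0) return;
    const methods = {};
    for (const [name, t] of Object.entries(totals)) {
        methods[name] = {
            cpu: t.cpu / ticksSeen,       // average CPU per tick
            calls: t.calls / ticksSeen,   // average calls per tick
            cpuPerCall: t.cpu / t.calls,
        };
    }
    memory._cpu = { time, methods };
    totals = {};
    ticksSeen = 0;
}

// Stand-in for the Memory global; each tick records two 0.25 CPU transfers.
const fakeMemory = {};
for (let time = 1; time <= 100; time++) {
    record('transfer', 0.25);
    record('transfer', 0.25);
    endTick(time, fakeMemory);
}
console.log(fakeMemory._cpu.methods.transfer);
```

Keeping the running totals in module scope and writing to Memory only on flush keeps the profiler itself from adding serialization cost every tick.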

Module-level cache: the canonical pattern

The most counterintuitive part of Screeps optimization is where to keep your cache.

In a regular Node.js process you can do room.taskPool = [...] — and it lives as long as the process lives. In Screeps that doesn't work:

Game.* objects are recreated every tick. room.foo, creep.bar, structure.baz do NOT survive the tick boundary.

My first two optimization attempts went through room._taskPool: the cache "worked" within a tick, but a tick later it was gone. Memory.rooms[name].taskPool is no better: Memory deserialization is expensive, and task objects (with pos, target) come back from deserialization as plain objects with no methods.

The right pattern is module-level state, which lives in the require cache of the global sandbox:

// top of module — persists via require cache
const _cache = {};

function getCached(key, ttl, compute) {
    const tick = Game.time;
    const e = _cache[key];
    if (e && (tick - e.tick) < ttl) return e.value;
    const v = compute();
    _cache[key] = { value: v, tick };
    return v;
}

The Screeps global sandbox persists between ticks until the server does a global reset (on code deploy, or once every few hundred to few thousand ticks). So _cache in module scope survives the tick boundary, unlike fields on game objects. After a reset the cache simply starts empty and refills.
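To see the TTL behaviour of getCached in isolation, here is a minimal sketch that runs in plain Node. The `Game` object is a stand-in for the Screeps global, and the key `'W1N1:pool'` and the `expensive` function are made up for the example.

```javascript
// Stand-in for the Screeps global so the sketch runs in Node.
const Game = { time: 100 };

const _cache = {};
function getCached(key, ttl, compute) {
    const e = _cache[key];
    if (e && (Game.time - e.tick) < ttl) return e.value;
    const v = compute();
    _cache[key] = { value: v, tick: Game.time };
    return v;
}

let computeCalls = 0;
const expensive = () => { computeCalls += 1; return ['taskA', 'taskB']; };

getCached('W1N1:pool', 2, expensive); // tick 100: miss, computes
Game.time = 101;
getCached('W1N1:pool', 2, expensive); // tick 101: hit, age 1 < ttl
Game.time = 102;
getCached('W1N1:pool', 2, expensive); // tick 102: age 2, recomputes
console.log(computeCalls); // 2
```

With ttl = 2 a value is reused on the tick it was computed and on the following tick, then rebuilt.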

What you can't cache in the module cache: the Room, Creep, Structure wrapper objects themselves. They're dead next tick — you can't call methods. You store ids, and resolve via Game.getObjectById(id) each time.
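The id-only rule can be sketched as follows. `Game.getObjectById` is the real API call (it returns null when the object cannot be found), but here it is stubbed so the example runs in Node; `containerIdsByRoom`, `rememberContainers`, and `getContainers` are hypothetical names.

```javascript
// Stand-in for the Screeps Game global; in the real runtime
// Game.getObjectById returns null for objects that no longer exist.
const liveObjects = {
    a1: { id: 'a1', energy: 300 },
    b2: { id: 'b2', energy: 0 },
};
const Game = { getObjectById: (id) => liveObjects[id] || null };

// The module-level cache holds ids only, never the wrapper objects.
const containerIdsByRoom = {};

function rememberContainers(roomName, containers) {
    containerIdsByRoom[roomName] = containers.map((c) => c.id);
}

function getContainers(roomName) {
    const ids = containerIdsByRoom[roomName] || [];
    // Re-resolve on every call and drop ids whose objects are gone.
    return ids.map((id) => Game.getObjectById(id)).filter((obj) => obj !== null);
}

rememberContainers('W1N1', [liveObjects.a1, liveObjects.b2]);
delete liveObjects.b2; // the structure was destroyed between ticks
console.log(getContainers('W1N1').map((c) => c.id)); // [ 'a1' ]
```

Resolving by id costs a lookup per object per tick, but it guarantees that every object handed back belongs to the current tick.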

This pattern is used by every well-known public bot (Overmind, the-international, bonzAI). I just didn't know.

Phase 1: Task pool TTL=2

The most expensive bottleneck — generating the task pool for a room. TaskGenerator.generate(room) walks structures, looks for drops, computes priorities — and returns an array of 30+ tasks. This was being called every tick in every room: 6 rooms × ~0.7 CPU = ~4.2 CPU in task generation.

But the pool changes rarely: new tasks appear on events, such as a creep dying, a structure filling up, or a drop disappearing after pickup. Between events, a pool of 30+ tasks stays identical, so it can be cached and held for 2-3 ticks.

Implementation:

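// core.task.queue.js
getPool: function(room) {
    // Cache.init(room) is assumed to attach a module-level store as room._cache;
    // a plain property set on room would not survive the tick.
    Cache.init(room);
    const TTL = 2; // reuse the pool on the tick it was built and the next one
    if (room._cache.taskPool && room._cache.taskPoolTick &&
        Game.time - room._cache.taskPoolTick < TTL) {
        return room._cache.taskPool; // cache hit
    }
    // Cache miss: rebuild, sort by ascending priority value, and store.
    const pool = TaskGenerator.generate(room);
    pool.sort((a, b) => a.priority - b.priority);
    room._cache.taskPool = pool;
    room._cache.taskPoolTick = Game.time;
    return pool;
}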

And invalidations on events:

TaskQueue.assign() with maxAssigned=1 → drop the cache (slot taken)
TaskQueue.complete/release() → drop the cache (slot freed)
A creep died in the room → drop the cache
Pickup removed a ground resource → drop the cache
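The combination of a TTL and event-driven invalidation can be sketched like this. It is a simplified stand-alone model, not the bot's TaskQueue: `poolCache`, `invalidatePool`, the `stats` counters, and the sample tasks are hypothetical, and `generate` stands in for TaskGenerator.generate(room).

```javascript
// Hypothetical module-level pool cache keyed by room name.
const poolCache = {};
const stats = { hits: 0, misses: 0 };
const TTL = 2;

function getPool(roomName, now, generate) {
    const entry = poolCache[roomName];
    if (entry && now - entry.tick < TTL) {
        stats.hits += 1;
        return entry.pool;
    }
    stats.misses += 1;
    const pool = generate().sort((a, b) => a.priority - b.priority);
    poolCache[roomName] = { pool, tick: now };
    return pool;
}

// Called from the event sites: assign, complete/release, creep death, pickup.
function invalidatePool(roomName) {
    delete poolCache[roomName];
}

const generate = () => [
    { id: 'fill', priority: 2 },
    { id: 'pickup', priority: 1 },
];
getPool('W1N1', 100, generate); // miss: first build
getPool('W1N1', 100, generate); // hit: same tick
invalidatePool('W1N1');         // e.g. the pickup task completed
getPool('W1N1', 101, generate); // miss again, despite the TTL
console.log(stats); // { hits: 1, misses: 2 }
```

Counting hits and misses in the same place makes a hit rate like the one reported below cheap to measure.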

The risk was around pickup tasks: a dropped resource decays by ceil(amount/1000) per tick, so a 500e pile loses 1e per tick, or 50e over 50 ticks. At TTL=2 that's negligible. Tombstones decay a bit faster, but still within acceptable limits.

Effect after deploy:

Hit rate 73% (i.e. 73% of getPool calls return cached value)
avg CPU ~1 lower
bucket crawled from 2-22 to 13-52

Not the 10,000 you'd want, but for the first time the bucket is steadily growing instead of pinned at zero.

What turned out NOT to be the problem

After profiling I thought the main culprits were room.find() (124 calls/tick) and moveTo. Turned out no. The real bottlenecks showed up in decomposition, not intuition. Candidates I rejected:

Candidate	Why I didn't touch it
`manager.colonize.js` 26 finds/tick	Already throttled via `Game.time % 50 === 0`
`room.find` in hauler dead branch	Only fires in starter colonies, not prod
Caching 124 `roomFind` calls	Sums to 0.69 CPU — bad ROI on refactoring 30 sites
Removing `moveTo` reusePath	8 calls/tick — pathfinding fires rarely already

Lesson: intuition said "optimize find()", numbers said "optimize task generation". Numbers won.

Next step: miner transfer throttle

The funniest find came after Phase 1. I noticed that transfer at 1.69 CPU on 11 calls/tick was mostly miners doing creep.transfer(link, ENERGY) every tick when store > 0.

A 5W miner harvests 10 e/tick, store cap 50. So after the first harvest they have 10 in store, after transfer — 0, a tick later 10 again. Same cycle every tick, transfer every tick — 0.15 CPU × 12 miners = 1.8 CPU on a single action that could be done once every 5 ticks.

One-line fix:

// WAS:
if (creep.store[RESOURCE_ENERGY] > 0) {
    creep.transfer(link, RESOURCE_ENERGY);
}

// IS:
if (creep.store[RESOURCE_ENERGY] >= 40) {
    creep.transfer(link, RESOURCE_ENERGY);
}

The miner accumulates to 40, then dumps it in one shot: one transfer every 4 ticks instead of every tick. For 12 miners that is 3 transfers/tick instead of 12, so the savings are 9 transfers/tick × 0.15 CPU = 1.35 CPU. That's more than all of Phase 1 gave me.

Risk is zero. harvest is in pipeline P1, transfer is in P3 — they don't conflict (see [the upcoming article on action pipelines, if I write it]). Threshold 40 + harvest 10 = 50 = capacity — overflow is impossible.
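The cycle and the overflow argument can be checked with a tiny simulation. This is a simplified model of one miner under stated assumptions (10 energy harvested every tick, capacity 50, the whole store transferred at once), and it takes the worst case for overflow by letting the harvest land before the transfer. `simulateMiner` is a hypothetical helper, not game code.

```javascript
// Simplified model of one miner: 5 WORK parts (10 energy/tick), 50 capacity.
// Worst case is assumed for overflow: harvest lands before the transfer.
function simulateMiner(threshold, ticks) {
    const HARVEST = 10;
    const CAPACITY = 50;
    let store = 0;
    let transfers = 0;
    let peak = 0;
    for (let t = 0; t < ticks; t++) {
        const willTransfer = store >= threshold;
        peak = Math.max(peak, store + HARVEST);
        if (willTransfer) {
            transfers += 1;
            store = 0;
        }
        store = Math.min(CAPACITY, store + HARVEST);
    }
    return { transfers, peak };
}

console.log(simulateMiner(1, 100));  // old check (store > 0): { transfers: 99, peak: 20 }
console.log(simulateMiner(40, 100)); // new check (store >= 40): { transfers: 24, peak: 50 }
```

In this model the threshold of 40 gives roughly one transfer every 4 ticks per miner, and the peak never exceeds the 50 capacity.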

This one-liner is queued for deploy as the next phase.

What I learned

Hard ceiling vs soft ceiling. In Screeps intents cost a fixed amount — that's a hard ceiling. No matter how much you optimize, 8.25 CPU on 43 creeps doesn't go anywhere. If a system has a hard ceiling — find it first so you don't optimize the impossible. The SaaS analogue is the cost of a SQL query in the DB: if the query is mandatory, optimizing the code around it has a ceiling.

Hot path ≠ hot logic. room.find() 124 times/tick sounds scary, but adds up to 0.69 CPU. And TaskGenerator.generate() 6 times/tick is 4.2 CPU. Optimize logic that does a lot per call, not points that get called often.

Module-level state in ephemeral runtimes. In Screeps the global sandbox lives between ticks, but game objects don't. That's counterintuitive: it feels like room.foo is more persistent than a global variable, but it's the opposite. FaaS / Lambda has a similar rule: a warm container survives requests, request-scope state doesn't.

Numbers beat intuition. Until I broke the 17.7 CPU down piece by piece, I'd have optimized room.find() and saved 0.5 CPU. Decomposition showed that untracked logic holds 4.8 of the roughly 13.1 CPU spent in creeps (8.25 in intents plus 4.8 untracked), and that is where 90% of the potential sits. Without a Grafana dashboard I wouldn't have seen this.

The next article is about action pipelines in Screeps: how two methods can both return OK while one of them is silently ignored by the engine, and how a drainer (5T+18RA+17M+10H) becomes immortal under three towers precisely because of correct pipeline choices. That one isn't about performance anymore; it's about silent failures and why OK ≠ "done."
