“What gets measured
gets managed.”
— Peter Drucker
We mapped every dollar of a $100M+ cloud bill, found the waste, built the fix, and automated the whole thing.
Impact at a glance
$10M+
saved per year
Unit cost tracked per dollar of revenue and per user per day
Decision Core sits where your data already lives.
No new pipelines. No rip-and-replace. Decision Core ingests signals from the tools your team already uses, processes them in real time, and automatically surfaces the cost, demand, and utilisation insights those signals contain.
Telemetry, Agreements & Contracts, Cloud Cost Data, and Technical Documentation feed the Riklr Decision Core, which surfaces cost by service, unit cost trend, demand signals, infra utilisation, and unit cost.
Identified multiple levers projected
to save ~$11M annually.
The engagement opened with a full opportunity mapping exercise. Every potential savings initiative across the stack was surfaced, modelled against 12 months of billing data, and ranked by projected savings per unit of engineering effort. Only the highest-leverage levers made the programme.
Prediction-based Scaling
Scale infrastructure ahead of demand, not in reaction to it. Decision Core ingests event schedules and live signals to provision capacity before traffic arrives, eliminating both over-provisioning and the cost of scaling lag. A sketch of the scale-ahead logic is below.
$6.5M
projected / yr
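A minimal sketch of the scale-ahead idea, assuming a hypothetical demand-model call (forecast_concurrent_users) and a hypothetical autoscaling call (set_desired_capacity); the per-instance capacity and headroom constants are illustrative, not figures from the engagement:

```python
# Sketch only: size the fleet for the moment capacity provisioned now becomes ready.
import math
from datetime import datetime, timedelta
from typing import Callable

PROVISION_LEAD = timedelta(minutes=20)   # instances take ~15-20 min to boot and warm
USERS_PER_INSTANCE = 5_000               # assumed serving capacity per instance
SAFETY_HEADROOM = 1.15                   # small buffer for forecast error


def scale_ahead(now: datetime,
                forecast_concurrent_users: Callable[[datetime], int],
                set_desired_capacity: Callable[[int], None]) -> int:
    """Provision for predicted demand at the time new capacity would be warm."""
    ready_at = now + PROVISION_LEAD
    predicted_users = forecast_concurrent_users(ready_at)
    instances = math.ceil(predicted_users * SAFETY_HEADROOM / USERS_PER_INSTANCE)
    set_desired_capacity(instances)
    return instances
```

Run on a schedule ahead of each event window, the same call keeps capacity warm before kick-off instead of reacting once the spike has already landed.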
Instance Reservation
Commit the right baseline of capacity at reserved rates. ML models predict sustained-use patterns with enough accuracy to shift workloads off on-demand pricing without sacrificing headroom.
$2.7M
projected / yr
Additional Initiatives
A portfolio of targeted fixes surfaced during the diagnostic phase, individually smaller but collectively material.
$1.8M
projected / yr
Projections were validated against 12 months of billing data before any lever was committed to the programme. Realised savings of $10M+ were delivered within the first operating year.
Traffic spikes in a minute.
Infrastructure takes 15–20.
Reactive autoscaling was built for gradual load growth, not for a live sports platform where a single whistle can push concurrent users from 1M to 20M in under 60 seconds. By the time new instances are provisioned and warmed up, the match moment has already passed.
< 60 sec
Time for traffic to spike to 20M concurrent users
15–20 min
Time for new instances to provision, boot, and warm up
Every spike
Reactive infra arrives late. The window has already moved on.
Without prediction, the system kept
falling into two bad operating modes.
Engineering teams were trapped between two equally unsustainable choices. Neither was a strategy. Both were symptoms of running infrastructure without any view of what demand would do next.
Mode 01
Over-provisioned for safety
Burned by past incidents, teams ran capacity at 2–3× expected peak and held it there permanently. Outside active match windows, the majority of every day, that capacity sat completely idle.
~50%
avg idle
The waste wasn't visible on any single day. It accumulated quietly across thousands of hours of off-peak idle time every year.
Mode 02
Reactive, always 15 min late
The autoscaler fired on CPU and memory thresholds. By the time those signals appeared, new instances provisioned, and caches warmed, the match moment had already passed.
15 min
too late
The autoscaler was designed for gradual ramp-ups, not a system where a referee's whistle can move a million concurrent users in under 60 seconds.
Both modes share the same root cause: infrastructure decisions made without any knowledge of what demand will do next. The fix wasn't faster alerting or a better autoscaler. It was replacing the reactive loop entirely with a prediction-first architecture.
After prediction came quiet, always-on execution.
Riklr agents continuously consumed prediction events, checked policy boundaries, and executed bounded scale actions in the background. The operating model became calmer precisely because the system stopped waiting for humans to catch up.
15–20 min
The old reactive lag, eliminated. The loop now acts 15–30 min ahead.
Observe
Watch live demand, event signals, and current stack posture continuously.
Predict
Project the next operating horizon rather than reacting to the present minute.
Act
Issue scale-up or scale-down actions across the relevant systems inside policy guardrails.
Stable
Return to low-friction monitoring with fewer manual escalations and less wasted capacity.
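Taken together, the four stages form a policy-bounded control loop. A rough sketch, using placeholder interfaces (Signals, DemandModel, Scaler, Policy) rather than Riklr's actual agents:

```python
# Sketch only: observe -> predict -> act, with every action clamped to policy bounds.
import time
from dataclasses import dataclass


@dataclass
class Policy:
    min_instances: int = 10
    max_instances: int = 2_000
    max_step: int = 200                      # bound any single scale action

    def clamp(self, current: int, target: int) -> int:
        step = max(-self.max_step, min(self.max_step, target - current))
        return max(self.min_instances, min(self.max_instances, current + step))


def control_loop(signals, demand_model, scaler, policy: Policy, interval_s: int = 60):
    while True:
        snapshot = signals.observe()                              # live demand + stack posture
        target = demand_model.predict(snapshot, horizon_min=20)   # next operating horizon
        bounded = policy.clamp(scaler.current_capacity(), target)
        if bounded != scaler.current_capacity():
            scaler.apply(bounded)                                 # bounded scale-up / scale-down
        time.sleep(interval_s)                                    # back to quiet monitoring
```

Clamping every action to policy bounds is what keeps the loop calm: a bad forecast can move capacity only by a bounded step, never swing the whole fleet.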
Reservations are a living portfolio,
not a one-time procurement decision.
Every reserved instance carries an expiry date. As batches expire, total covered capacity drops, and anything uncovered defaults to on-demand rates at the worst possible moment. Decision Core tracks the full reservation portfolio, forecasts upcoming demand, and triggers renewals and resizes before each expiry window closes.
Projected annual savings
$2.7M
From shifting baseline capacity to reserved pricing, right-sized continuously by ML forecast
Rate reduction vs on-demand
~30%
Reserved instances cost significantly less if you commit the right size at the right time
Uncovered expiry windows
0
The system renews before every expiry. No gap where capacity reverts to on-demand billing.
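One way to picture the renew-before-expiry mechanic, with hypothetical placeholders for the portfolio store, the ML baseline forecast, and the reservation-purchase API:

```python
# Sketch only: keep covered capacity above the forecast baseline ahead of each expiry.
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Reservation:
    instances: int
    expires: date


RENEWAL_WINDOW = timedelta(days=30)   # act well before the expiry window closes


def covered_capacity(portfolio: list[Reservation], on: date) -> int:
    return sum(r.instances for r in portfolio if r.expires > on)


def renew_before_expiry(portfolio: list[Reservation], today: date,
                        forecast_baseline, purchase_reservation) -> None:
    horizon = today + RENEWAL_WINDOW
    needed = forecast_baseline(horizon)             # ML forecast of sustained demand
    covered = covered_capacity(portfolio, horizon)  # what survives the upcoming expiries
    if covered < needed:
        # Right-size the renewal to the latest forecast, not to last year's purchase.
        purchase_reservation(instances=needed - covered, start=horizon)
```

Run ahead of each expiry window, the check keeps covered capacity from dropping below the forecast baseline, which is the behaviour behind the zero uncovered windows figure above.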
ML-Driven Reservation Portfolio
12 months of continuous portfolio management. Reservations are bought before peaks, renewed before expiry, and right-sized each cycle based on the latest demand forecast. The staircase pattern is the system working: each step up is a purchase, each step down an expiry.
8
portfolio events / yr
≤ 5%
headroom over demand
0
uncovered windows
~30%
rate saving on floor
The key insight: the gap between reserved capacity and actual demand is intentionally tight. Wide enough to absorb forecast error, narrow enough to avoid idle waste. Getting that gap right, across every renewal cycle, is the optimisation. A static floor set once a year cannot do this.
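As a back-of-envelope illustration of that sizing (the numbers are invented for the example, not drawn from the engagement):

```python
# Illustrative arithmetic only: size the reserved floor to the forecast plus a
# buffer tied to forecast error, keeping headroom at or under the ~5% target.
forecast_baseline = 1_000        # sustained-demand forecast for this cycle (example)
forecast_error_p95 = 0.04        # 95th-percentile relative forecast error (assumed)

reserved_floor = round(forecast_baseline * (1 + forecast_error_p95))
headroom = reserved_floor / forecast_baseline - 1
print(reserved_floor, f"{headroom:.0%}")   # 1040 instances, 4% headroom
```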
Key outcomes
Seeing the same patterns in your infrastructure?
We work with engineering and platform teams to instrument cloud spend, model demand ahead of events, and build the automation layer that removes manual toil. If any of this resonates, let’s talk.