Cloud Run vs GKE cost: when serverless beats a node pool

Quick answer

Cloud Run bills for the vCPU and memory seconds your requests use and scales to zero, so it wins for spiky or low-traffic services: with min-instances at zero, an idle service costs nothing. GKE bills for nodes 24/7, so it wins for steady, high-utilization fleets you can pack densely and cover with committed-use discounts. The crossover is utilization: below roughly 50% node packing, Cloud Run is usually cheaper; once nodes stay packed above roughly 50-70%, GKE pulls ahead.

Cloud Run and GKE solve the same problem — run a container, serve traffic — with opposite cost models. Cloud Run charges for work done; GKE charges for capacity provisioned. Picking the cheaper one is almost entirely a question of how constant and how dense your traffic is.

Two different meters

Cloud Run (request-based billing) charges per vCPU-second and GiB-second only while a request is in flight, plus a small per-request fee, after a free tier. When no requests arrive and min-instances is zero, the meter stops. A service handling a few thousand requests a day can land in single-digit dollars per month.
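To make the request-based meter concrete, here is a minimal sketch of the arithmetic. The rates and free-tier allowances are illustrative assumptions, not authoritative figures, so check the current Cloud Run pricing page for your region. The function name and the concurrency simplification are also assumptions made for this example.

```python
# Illustrative rates (assumptions): verify against the current Cloud Run
# pricing page for your region before relying on them.
VCPU_SECOND_RATE = 0.000024      # USD per vCPU-second
GIB_SECOND_RATE = 0.0000025      # USD per GiB-second
REQUEST_RATE = 0.40 / 1_000_000  # USD per request

# Assumed monthly free tier.
FREE_VCPU_SECONDS = 180_000
FREE_GIB_SECONDS = 360_000
FREE_REQUESTS = 2_000_000


def cloud_run_monthly_cost(requests_per_month, avg_request_seconds,
                           vcpu=1.0, memory_gib=0.5, concurrency=1):
    """Estimate a request-based Cloud Run bill in USD per month.

    Simplification: billable instance time is total request time divided
    by the number of requests each instance handles concurrently.
    """
    busy_seconds = requests_per_month * avg_request_seconds / concurrency
    vcpu_seconds = max(0.0, busy_seconds * vcpu - FREE_VCPU_SECONDS)
    gib_seconds = max(0.0, busy_seconds * memory_gib - FREE_GIB_SECONDS)
    billable_requests = max(0, requests_per_month - FREE_REQUESTS)
    return (vcpu_seconds * VCPU_SECOND_RATE
            + gib_seconds * GIB_SECOND_RATE
            + billable_requests * REQUEST_RATE)
```

Under these assumptions, 150,000 requests a month at 0.2 seconds each stays inside the free tier, so `cloud_run_monthly_cost(150_000, 0.2)` comes out at zero.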

GKE charges for the node pool: node count × the machine-type rate, running continuously, plus a $0.10/hour cluster management fee (the GKE free tier credit covers roughly one zonal or Autopilot cluster per billing account). Three e2-standard-4 nodes are ~$294/month at on-demand rates, before the cluster fee, whether they serve one request or a million.
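The node-pool meter is simple enough to sketch in a few lines. The hourly rate of about $0.134 for an e2-standard-4 is an assumption back-derived from the ~$294/month figure above, and the function name is hypothetical. Substitute the rate for your own region and machine type.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month

# Assumed on-demand rate for one e2-standard-4 node, USD per hour.
E2_STANDARD_4_HOURLY = 0.134


def gke_standard_monthly_cost(node_count, node_hourly_rate,
                              cluster_fee_hourly=0.10,
                              cluster_fee_waived=False):
    """Estimate a GKE Standard bill in USD per month.

    Nodes are billed for every hour regardless of load; the cluster fee
    is added unless the free tier covers it.
    """
    node_cost = node_count * node_hourly_rate * HOURS_PER_MONTH
    fee = 0.0 if cluster_fee_waived else cluster_fee_hourly * HOURS_PER_MONTH
    return node_cost + fee
```

Note that traffic is not an input to the function at all. With three nodes and the fee waived, the estimate is about $293, and the cluster fee adds $73 on top.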

The crossover is utilization

Spiky or low traffic: Cloud Run wins decisively. You pay for the seconds you serve, not for idle nodes. A nightly job or a bursty internal API is far cheaper here.
Steady, high utilization: GKE wins. Once nodes stay packed above ~50-70%, the lower per-unit node price beats Cloud Run's per-second premium, and committed-use discounts widen the gap further.
Many small services: Cloud Run avoids the bin-packing problem entirely — you don't pay for the gaps between services that never quite fill a node.
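The crossover can be written as a ratio. This is a rough sketch under a simplifying assumption: the node costs a fixed amount per hour, while the serverless bill scales linearly with the fraction of that capacity actually in use. The function names and the example prices are hypothetical placeholders, not real rates. The real break-even depends on machine type, region, request concurrency, and discounts.

```python
def break_even_utilization(node_hourly_rate, serverless_hourly_rate_at_full_load):
    """Utilization (0..1) above which the always-on node is cheaper.

    serverless_hourly_rate_at_full_load is what the same capacity would
    cost on per-second billing if it were busy for the whole hour.
    """
    if serverless_hourly_rate_at_full_load <= 0:
        raise ValueError("serverless rate must be positive")
    return node_hourly_rate / serverless_hourly_rate_at_full_load


def cheaper_option(utilization, node_hourly_rate,
                   serverless_hourly_rate_at_full_load):
    """Return which meter costs less at a given utilization."""
    serverless_cost = utilization * serverless_hourly_rate_at_full_load
    return "cloud_run" if serverless_cost < node_hourly_rate else "gke"
```

With placeholder prices of $0.10/hour for the node and $0.20/hour for the same capacity billed per second, the break-even sits at 50% utilization: a service busy 30% of the time is cheaper on Cloud Run, and one busy 80% of the time is cheaper on GKE.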

The min-instances trap

Cloud Run's cost advantage depends on scaling to zero. Teams set min-instances above zero to dodge cold starts, and quietly turn a pay-per-use service into an always-on bill — now you're paying for warm containers around the clock, which is exactly GKE's model without GKE's density. If you need several warm instances continuously, re-run the comparison; GKE or Autopilot may now be cheaper.
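A quick way to see the size of the trap is to price the warm instances as if they ran all month. In this sketch the rates are passed in rather than hard-coded, because the rate that applies to kept-warm instances can differ from the active rate; take the real values from the Cloud Run pricing page. The function name is hypothetical.

```python
SECONDS_PER_MONTH = 730 * 3600  # average month, in seconds


def warm_instances_monthly_cost(min_instances, vcpu, memory_gib,
                                vcpu_second_rate, gib_second_rate):
    """Estimate the USD-per-month floor created by min-instances > 0.

    Assumes every warm instance is billed for the whole month at the
    given per-second rates.
    """
    per_instance_second = (vcpu * vcpu_second_rate
                           + memory_gib * gib_second_rate)
    return min_instances * per_instance_second * SECONDS_PER_MONTH
```

As an illustration, three warm instances with 1 vCPU and 2 GiB each, at assumed rates of $0.000024 per vCPU-second and $0.0000025 per GiB-second, come to roughly $229 a month before a single request is served. With min-instances at zero the same function returns zero.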

Where Autopilot fits

GKE Autopilot bills per pod resource request instead of per node, removing the empty-headroom waste that makes Standard GKE pricey for uneven workloads. It's the middle option: full Kubernetes, but you stop paying for half-empty nodes. If you want Kubernetes features but your traffic is too uneven to pack Standard nodes well, compare Autopilot before defaulting to Cloud Run — see GKE Standard vs Autopilot.
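The headroom effect can be illustrated with a CPU-only sketch. It ignores memory, system overhead, and discounts. The function names and the rates in the usage note are hypothetical placeholders, not published prices.

```python
import math


def standard_hourly_cost(requested_vcpu, node_vcpu, node_hourly_rate):
    """GKE Standard: pay for whole nodes, including unused headroom."""
    nodes_needed = math.ceil(requested_vcpu / node_vcpu)
    return nodes_needed * node_hourly_rate


def autopilot_hourly_cost(requested_vcpu, per_vcpu_hourly_rate):
    """GKE Autopilot: pay only for what the pods request."""
    return requested_vcpu * per_vcpu_hourly_rate
```

Suppose pods request 5 vCPU in total and nodes have 4 vCPU each. Standard needs two nodes, so at a placeholder $0.13 per node-hour it costs $0.26/hour for 8 vCPU of capacity. Autopilot at a placeholder $0.045 per vCPU-hour bills only the 5 requested vCPU, about $0.225/hour. The per-vCPU price is higher, but nothing is spent on the three idle vCPUs.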

A decision rule

Uneven traffic, few services, or anything that can idle? Cloud Run with min-instances = 0.
Steady high traffic you can pack densely? GKE Standard with a node pool and committed-use discounts.
Kubernetes features but uneven load? GKE Autopilot.
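The rule above can be encoded as a small function, which is handy for tagging services in an inventory. This is a sketch only: the function name and the boolean inputs are hypothetical, and a real decision should still be checked against actual prices.

```python
def recommend_platform(steady_high_traffic, can_pack_densely,
                       needs_kubernetes):
    """Map the three questions in the decision rule to a starting point."""
    if steady_high_traffic and can_pack_densely:
        return "GKE Standard with committed-use discounts"
    if needs_kubernetes:
        return "GKE Autopilot"
    return "Cloud Run with min-instances = 0"
```

For example, a bursty internal API with no Kubernetes dependencies maps to Cloud Run, while the same API with a hard requirement for Kubernetes features maps to Autopilot.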

FAQ

Is Cloud Run cheaper than GKE?

For spiky, low-traffic, or scale-to-zero workloads, yes. Cloud Run bills for the vCPU and memory seconds your requests use and drops to zero cost when idle, while a GKE node pool pays for nodes around the clock. For steady, high-utilization services that keep nodes packed, GKE is usually cheaper per unit. The crossover is utilization and how constant the traffic is.

How is Cloud Run priced?

Per vCPU-second and GiB-second of allocated resources while a request is being handled (request-based billing), plus a per-request fee, with a generous free tier. With CPU-always-allocated billing (also called instance-based billing) you pay for the full instance lifetime instead. A service that's idle most of the day can cost a few dollars a month.

When should I choose GKE over Cloud Run?

When you run many services at steady high utilization (nodes stay packed), when you need GPUs, DaemonSets, custom networking, or sidecars, or when you want to amortize cost with committed-use discounts. GKE's per-unit price is lower, but you pay for nodes whether or not pods fill them.

Does Cloud Run scale to zero?

Yes. With request-based billing and min-instances set to zero, an idle Cloud Run service costs nothing. Setting min-instances above zero (to avoid cold starts) means you pay for those warm instances continuously, which shifts the cost model toward GKE's.

What about GKE Autopilot in this comparison?

Autopilot bills per pod resource request rather than per node, which removes the empty-node-headroom waste that makes Standard GKE expensive for spiky workloads. It sits between Standard GKE and Cloud Run: more Kubernetes than Cloud Run, less node-management waste than Standard.

How does C3X compare the two?

C3X prices a google_cloud_run_v2_service from its CPU/memory and a google_container_node_pool from node count and machine type, so you can put the serverless and node-based options side by side on your real Terraform before committing to either.

What to do next

Put both options on the same page. C3X prices a google_container_node_pool from node count and machine type and a Cloud Run service from its CPU/memory request, so you can compare serverless against node-based on your real Terraform. For the broader version of this trade-off across clouds, see containers vs serverless, then run the quickstart on your own stack.