
`google_cloud_run_v2_service` cost estimation

A fully managed container service that scales to zero. Priced per vCPU-second and GiB-second while requests are being handled, plus per-request fees.

A google_cloud_run_v2_service runs a container that scales automatically, including down to zero when there's no traffic. That scale-to-zero is the defining cost characteristic: under request-based billing you pay only for the vCPU and memory allocated while a request is actually being processed, with billable time rounded up to the nearest 100 milliseconds.

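The two compute dimensions are vCPU allocation time (per vCPU-second) and memory allocation time (per GiB-second). A service handling steady traffic with 1 vCPU and 512 MiB might accrue a few million vCPU-seconds a month (one instance that is busy for an entire 730-hour month accrues about 2.6 million); an idle service accrues nothing. There's also a small per-request charge beyond the free tier. Instances kept warm by a minimum instance count are billed at a lower idle CPU rate, and always-on (instance-based) CPU billing charges for the full lifetime of each instance rather than only while requests are in flight.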
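As a rough illustration of how traffic turns into the two compute dimensions, the sketch below converts a monthly request count into vCPU-seconds and GiB-seconds. The function name, the traffic figures, and the way concurrency is handled (dividing total request time by an average number of requests sharing an instance) are assumptions for illustration, not c3x's or Google's exact method, and the 100 ms rounding is ignored.

```python
def monthly_allocation(requests_per_month, avg_request_ms, vcpu, memory_gib, concurrency=1):
    """Approximate billable vCPU-seconds and GiB-seconds under request-based billing.

    Assumption: requests that overlap on one instance share its billable time,
    approximated here by dividing by an average concurrency figure.
    """
    instance_seconds = requests_per_month * avg_request_ms / 1000 / concurrency
    vcpu_seconds = instance_seconds * vcpu
    gib_seconds = instance_seconds * memory_gib
    return vcpu_seconds, gib_seconds


# Hypothetical traffic: 5 million requests a month, 200 ms each,
# on the 1 vCPU / 512 MiB (0.5 GiB) container described above.
vcpu_s, gib_s = monthly_allocation(5_000_000, 200, vcpu=1, memory_gib=0.5)
print(f"{vcpu_s:,.0f} vCPU-seconds, {gib_s:,.0f} GiB-seconds")
```

With these inputs the service accrues about one million vCPU-seconds and half that many GiB-seconds; raising the concurrency argument lowers both figures proportionally.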

Because Cloud Run cost is entirely usage-driven, c3x prices it from the request and allocation figures you supply in c3x-usage.yml (monthly vCPU-seconds and GiB-seconds). Without usage input the standing cost is zero, which is the honest answer for a scale-to-zero service. The v2 API is the current generation; the older google_cloud_run_service prices the same way.

Terraform example

A minimal but realistic configuration that C3X can estimate.

resource "google_cloud_run_v2_service" "api" {
  name     = "api"
  location = "us-central1"

  template {
    containers {
      image = "us-docker.pkg.dev/proj/api/server:latest"
      resources {
        limits = {
          cpu    = "1"
          memory = "512Mi"
        }
      }
    }
    scaling {
      min_instance_count = 0
      max_instance_count = 10
    }
  }
}

Pricing dimensions

What you actually pay for when you provision google_cloud_run_v2_service.

Dimension	Unit	What's being charged	Rate
vCPU allocation time	per vCPU-second	Billed for vCPU allocated while handling requests, to the nearest 100 ms. Free tier covers the first 180,000 vCPU-seconds/month. Usage-based.	$0.000024/vCPU-second
Memory allocation time	per GiB-second	Billed for memory allocated while handling requests. Free tier covers the first 360,000 GiB-seconds/month. Usage-based.	$0.0000025/GiB-second
Requests	per million requests	Per-request charge after the first 2 million requests/month free. Usually a minor component.	$0.40 per million requests
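The table can be turned into a small calculator. This is a minimal sketch using only the rates and free-tier allowances listed on this page; the function and constant names are hypothetical, and whether c3x deducts the free tier in its own output is not stated here, so the deduction is left as a switch.

```python
# Rates and free-tier allowances copied from the pricing table on this page.
VCPU_RATE = 0.000024          # $ per vCPU-second
MEM_RATE = 0.0000025          # $ per GiB-second
REQUEST_RATE = 0.40           # $ per million requests

FREE_VCPU_SECONDS = 180_000
FREE_GIB_SECONDS = 360_000
FREE_REQUESTS = 2_000_000


def monthly_cost(vcpu_seconds, gib_seconds, requests=0, apply_free_tier=True):
    """Return a per-dimension cost breakdown in dollars, rounded to cents."""
    if apply_free_tier:
        vcpu_seconds = max(0, vcpu_seconds - FREE_VCPU_SECONDS)
        gib_seconds = max(0, gib_seconds - FREE_GIB_SECONDS)
        requests = max(0, requests - FREE_REQUESTS)
    breakdown = {
        "vcpu": round(vcpu_seconds * VCPU_RATE, 2),
        "memory": round(gib_seconds * MEM_RATE, 2),
        "requests": round(requests / 1_000_000 * REQUEST_RATE, 2),
    }
    breakdown["total"] = round(sum(breakdown.values()), 2)
    return breakdown


print(monthly_cost(1_000_000, 2_000_000, apply_free_tier=False))
print(monthly_cost(1_000_000, 2_000_000, requests=5_000_000))
```

The first call multiplies raw usage by the listed rates with no free-tier deduction; the second subtracts the free allowances first and adds the per-request charge for traffic beyond 2 million requests.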

Sample C3X output

Example output from c3x estimate with usage supplied (1M vCPU-seconds, 2M GiB-seconds):

google_cloud_run_v2_service.api
├─ vCPU allocation time    1,000,000  vCPU-seconds    $24.00
└─ Memory allocation time  2,000,000  GiB-seconds      $5.00

OVERALL TOTAL                                         $29.00

Optimization tips

Common ways to reduce google_cloud_run_v2_service cost without changing the workload.

Keep min_instance_count at zero where latency allows

Potential savings: all idle compute

Setting a minimum keeps instances warm and billing even with no traffic. Scale-to-zero is the cheapest mode; only set a minimum if cold-start latency is unacceptable for the endpoint.

Right-size CPU and memory limits

Potential savings: proportional to over-allocation

Cost is linear in allocated vCPU and memory. Allocating 2 vCPU when 1 suffices doubles the vCPU bill. Profile real usage and set the smallest limits that meet your latency SLO.
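To make the linearity concrete, here is a small sketch comparing an oversized and a right-sized configuration for the same amount of busy instance time. The busy-seconds figure and function name are hypothetical, the free tier is ignored, and the comparison assumes request duration does not grow when the limits shrink, which should be verified by profiling.

```python
VCPU_RATE = 0.000024   # $ per vCPU-second, request-based rate quoted on this page
MEM_RATE = 0.0000025   # $ per GiB-second, request-based rate quoted on this page


def allocation_cost(instance_seconds, vcpu, memory_gib):
    """Compute cost scales linearly with both allocated vCPU and allocated memory."""
    return instance_seconds * (vcpu * VCPU_RATE + memory_gib * MEM_RATE)


busy_seconds = 1_000_000  # hypothetical busy instance time per month
oversized = allocation_cost(busy_seconds, vcpu=2, memory_gib=1.0)
right_sized = allocation_cost(busy_seconds, vcpu=1, memory_gib=0.5)
print(f"oversized ${oversized:.2f}, right-sized ${right_sized:.2f}")
```

Halving both limits halves the compute bill in this model; in practice a smaller CPU limit can lengthen each request, which eats into the saving.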

Use request-based (not instance-based) CPU billing

Potential savings: idle CPU between requests

Unless you need background processing between requests, prefer request-based billing, which charges only while requests are being handled. Always-on (instance-based) CPU bills for the entire lifetime of each instance.

FAQ

How does c3x estimate Cloud Run v2 cost?

Cloud Run is fully usage-driven, so c3x prices it from monthly vCPU-seconds and GiB-seconds you provide in c3x-usage.yml, using the live request-based rates. With no usage supplied the standing cost is zero, which is correct for a scale-to-zero service.

Why does the estimate show $0 without a usage file?

A Cloud Run service with min_instance_count of 0 costs nothing when idle. There is no per-hour standing charge, so zero is the honest baseline. Supply expected traffic in the usage file to see a non-zero estimate.

What's the difference from google_cloud_run_service?

The v2 resource uses the newer Cloud Run Admin API v2 with a richer schema, but the billing model and rates are identical. c3x prices both the same way.

Does setting a minimum instance count change pricing?

Yes. With a minimum above zero, those instances stay warm and bill for allocated CPU and memory continuously (at the idle CPU rate), even with no requests. Model this by adding the always-on seconds to your usage file.
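One way to work out the always-on seconds is sketched below. The 730-hour month and the function name are assumptions for illustration. Because warm instances are billed at the idle CPU rate, check how your usage file distinguishes idle from active seconds before adding these figures to it.

```python
SECONDS_PER_MONTH = 730 * 3600  # assumption: a 730-hour billing month


def always_on_seconds(min_instances, vcpu, memory_gib, seconds=SECONDS_PER_MONTH):
    """vCPU-seconds and GiB-seconds accrued by instances that never scale down."""
    instance_seconds = min_instances * seconds
    return instance_seconds * vcpu, instance_seconds * memory_gib


# One warm instance with 1 vCPU and 512 MiB (0.5 GiB).
vcpu_s, gib_s = always_on_seconds(min_instances=1, vcpu=1, memory_gib=0.5)
print(f"{vcpu_s:,.0f} vCPU-seconds, {gib_s:,.0f} GiB-seconds")
```

A single warm 1 vCPU instance therefore adds roughly 2.6 million vCPU-seconds a month before any traffic is served, and the figure scales linearly with min_instance_count.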