azurerm_hdinsight_interactive_query_cluster cost estimation

A managed Hive LLAP (Interactive Query) cluster on HDInsight, billed per node-hour for head and worker VMs. A 2-head + 3-worker cluster is ~$1,894/month.

An azurerm_hdinsight_interactive_query_cluster runs Hive LLAP (Low-Latency Analytical Processing) on Azure HDInsight — interactive, in-memory SQL querying over data-lake data. Cost is per node-hour across all nodes (head plus workers) at the VM rate plus HDInsight surcharge. A 2-head + 3-worker D12v2-class cluster is ~$0.519/node-hour × 5 × 730 ≈ $1,894/month, billed continuously.
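The arithmetic above can be sketched as Terraform locals, using only the figures quoted in this section. The local and output names are illustrative and not part of C3X or the azurerm provider, and the sketch assumes every billed node runs at the same $0.519 rate.

```hcl
# Illustrative names; the rate, node counts, and hours are the figures quoted above.
locals {
  node_hour_rate  = 0.519 # USD per node-hour, D12v2-class, VM rate plus HDInsight surcharge
  head_nodes      = 2
  worker_nodes    = 3
  hours_per_month = 730

  billable_nodes = local.head_nodes + local.worker_nodes       # 5 nodes
  node_hours     = local.billable_nodes * local.hours_per_month # 3650 node-hours
  monthly_cost   = local.node_hour_rate * local.node_hours
}

output "monthly_cost_usd" {
  value = local.monthly_cost
}
```

Evaluating `local.monthly_cost` in `terraform console` should give about 1894.35, matching the estimate above.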

Interactive Query is built for fast, repeated queries with cached/in-memory data, so it's typically kept running during business hours rather than created per job — which means the always-on node cost is the dominant factor. Worker nodes hold the LLAP cache, so node count and memory drive both performance and cost.

Because it doesn't auto-pause, the cost-efficient pattern is to run it during query hours and delete it overnight/weekends (data persists in ADLS/Blob), or to use a modern alternative — Synapse Spark/SQL pools auto-pause, and Databricks SQL warehouses can auto-stop — for interactive analytics without standing HDInsight nodes.
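One way to express the run-then-delete pattern in Terraform is a boolean toggle on `count`, sketched below. This is an assumption about how you might wire it, not something C3X requires: the variable names, the usernames, and the `interactive_hive = "3.1"` component version are illustrative placeholders, and the resource group, storage, and credentials are passed in as variables to keep the sketch self-contained.

```hcl
provider "azurerm" {
  features {}
}

# Hypothetical toggle: set to false off-hours, true during query hours.
variable "cluster_enabled" {
  type    = bool
  default = true
}

# Hypothetical inputs standing in for resources defined elsewhere.
variable "resource_group_name" { type = string }
variable "location" { type = string }
variable "storage_container_id" { type = string }

variable "storage_account_key" {
  type      = string
  sensitive = true
}

variable "admin_password" {
  type      = string
  sensitive = true
}

resource "azurerm_hdinsight_interactive_query_cluster" "iq" {
  # 0 instances destroys the cluster and stops node billing; 1 recreates it.
  count = var.cluster_enabled ? 1 : 0

  name                = "interactive-query"
  resource_group_name = var.resource_group_name
  location            = var.location
  cluster_version     = "5.1"
  tier                = "Standard"

  component_version {
    interactive_hive = "3.1" # assumed to match cluster_version 5.1
  }

  gateway {
    username = "gatewayadmin" # placeholder
    password = var.admin_password
  }

  # The data lives here and survives cluster deletion.
  storage_account {
    storage_container_id = var.storage_container_id
    storage_account_key  = var.storage_account_key
    is_default           = true
  }

  roles {
    head_node {
      vm_size  = "Standard_D13_v2"
      username = "sshadmin" # placeholder
      password = var.admin_password
    }
    worker_node {
      vm_size               = "Standard_D14_v2"
      username              = "sshadmin"
      password              = var.admin_password
      target_instance_count = 3
    }
    zookeeper_node {
      vm_size  = "Standard_A4_v2"
      username = "sshadmin"
      password = var.admin_password
    }
  }
}
```

Applying with `-var="cluster_enabled=false"` from an evening scheduled job removes the cluster, and applying with `true` in the morning recreates it. Data in the storage account persists across that cycle. Hive table definitions kept in the cluster's default metastore generally do not, so this pattern usually assumes an external metastore, which is not shown here.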

C3X prices the cluster from the head and worker node counts and the node size, so the always-on cost is visible before deployment.

Terraform example

A minimal but realistic configuration that C3X can estimate.

resource "azurerm_hdinsight_interactive_query_cluster" "iq" {
  name                = "interactive-query"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  cluster_version     = "5.1"
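  tier                = "Standard"

  # component_version and gateway blocks are omitted for brevity; the provider
  # requires them to apply, but the estimate is driven by the node roles below.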

  roles {
    head_node {
      vm_size = "Standard_D13_v2"
      # ... credentials
    }
    worker_node {
      vm_size               = "Standard_D14_v2"
      target_instance_count = 3
    }
    zookeeper_node {
      vm_size = "Standard_A4_v2"
    }
  }
}

Pricing dimensions

What you actually pay for when you provision azurerm_hdinsight_interactive_query_cluster.

Dimension | Unit | What's being charged
Cluster nodes | per node-hour | All nodes (2 head + workers) bill per node-hour at the VM rate plus HDInsight surcharge, continuously. Worker nodes hold the LLAP cache.
Example: $0.519/node-hour (D12v2-class) → 5 nodes ≈ $1,894.35/month

Sample C3X output

2 head + 3 worker nodes (D12v2-class), running 24/7:

azurerm_hdinsight_interactive_query_cluster.iq
└─ Cluster nodes (5 × D12v2-class)   3650 node-hours   $1,894.35
                                     Monthly           $1,894.35

Optimization tips

Common ways to reduce azurerm_hdinsight_interactive_query_cluster cost without changing the workload.

Run during query hours, delete off-hours

Up to ~60% with off-hours teardown

Interactive Query clusters don't auto-pause and bill nodes 24/7. If queries are business-hours only, delete the cluster overnight/weekends (data persists in ADLS/Blob) and recreate it, rather than paying for idle LLAP cache nodes.
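As a rough check on the ~60% figure, the sketch below prorates the always-on estimate by weekly running hours. The schedule (13 hours a day, Monday to Friday) is an assumed example, the local names are illustrative, and cluster creation time is ignored.

```hcl
# Illustrative names; only the 1894.35 figure comes from this page.
locals {
  always_on_monthly_usd = 1894.35 # 5 nodes, 24/7, D12v2-class

  hours_on_per_week = 13 * 5 # assumed schedule: 13 h/day, weekdays only
  hours_per_week    = 24 * 7

  running_fraction      = local.hours_on_per_week / local.hours_per_week
  scheduled_monthly_usd = local.always_on_monthly_usd * local.running_fraction
  savings_fraction      = 1 - local.running_fraction
}

output "scheduled_monthly_usd" {
  value = local.scheduled_monthly_usd
}

output "savings_fraction" {
  value = local.savings_fraction
}
```

With that assumed schedule the cluster runs 65 of 168 hours a week, so the prorated cost is roughly $733/month and the saving is about 61%.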

Use auto-pausing alternatives for interactive SQL

Large savings vs. an always-on cluster

Synapse Spark/SQL pools auto-pause and Databricks SQL warehouses auto-stop when idle, providing interactive analytics without standing HDInsight nodes. For intermittent query patterns this is often cheaper than a 24/7 LLAP cluster.

Right-size worker nodes to the cache working set

Proportional to right-sizing

LLAP performance comes from in-memory caching on workers, so memory and count drive both speed and cost. Size to the hot data your queries touch rather than over-provisioning.
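Because cost scales with node count, comparing candidate worker counts is a short calculation. The sketch below reuses the $0.519/node-hour D12v2-class rate and 730 hours from this page; the candidate counts and local names are illustrative, and it assumes head and worker nodes bill at the same rate.

```hcl
# Illustrative names; the rate and hours are the figures used elsewhere on this page.
locals {
  sizing_node_hour_rate  = 0.519
  sizing_head_nodes      = 2
  sizing_hours_per_month = 730

  candidate_worker_counts = [2, 3, 4]

  # Map of worker count (as a string key) to estimated monthly cost in USD.
  monthly_cost_by_workers = {
    for w in local.candidate_worker_counts :
    tostring(w) => local.sizing_node_hour_rate * (local.sizing_head_nodes + w) * local.sizing_hours_per_month
  }
}

output "monthly_cost_by_workers" {
  value = local.monthly_cost_by_workers
}
```

At that rate, dropping from 3 workers to 2 moves the estimate from about $1,894 to about $1,515 a month, while 4 workers comes to about $2,273.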

Reserve nodes if the cluster must stay up

40–60% on a steady cluster

For a cluster that genuinely runs continuously, a 1-year or 3-year reservation on the node VMs discounts the always-on cost.

FAQ

How is an HDInsight Interactive Query cluster billed?

Per node-hour across all nodes (head + workers) at the VM rate plus HDInsight surcharge, continuously. A 2-head + 3-worker D12v2-class cluster is ~$1,894/month. It doesn't auto-pause, so the always-on node cost dominates.

Is there a cheaper way to do interactive SQL on a data lake?

Synapse Spark/SQL pools (auto-pause) and Databricks SQL warehouses (auto-stop) provide interactive analytics without standing HDInsight nodes, and are often cheaper than a 24/7 LLAP cluster — especially for intermittent query patterns.

How does C3X estimate the cost?

From the worker node count and node size plus head nodes, pricing node-hours at the HDInsight rate.

Estimate this resource in your own Terraform

Free, open source, no API key. C3X parses your Terraform and shows line-item cost for every resource, including azurerm_hdinsight_interactive_query_cluster.