ServiceWatch Metrics
You can view metrics in ServiceWatch for the Kubernetes Engine created from Quick Query. The metrics that default monitoring provides for Kubernetes Engine are collected at one-minute intervals.
Basic Metrics
The following are basic metrics for the Kubernetes Engine namespace.
The metrics whose names are shown in bold below are the key metrics selected from the default metrics provided by Kubernetes Engine. Key metrics are used to build service dashboards that are automatically created for each service in ServiceWatch.
For each metric, the user guide indicates which statistics are meaningful when querying that metric. Among the meaningful statistics, those shown in bold are the primary statistics. The service dashboard displays key metrics using their primary statistics.
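As a rough illustration of how a statistic relates to the one-minute data points, the sketch below groups samples into fixed periods and reduces each group to a single value. The function name `aggregate`, the statistic names (`Average`, `Minimum`, `Maximum`, `Sum`, `SampleCount`), and the sample values are assumptions made for this example; they are not taken from the ServiceWatch API.

```python
from statistics import mean

# Hypothetical statistic names mapped to reducers; ServiceWatch may use other names.
REDUCERS = {
    "Average": mean,
    "Minimum": min,
    "Maximum": max,
    "Sum": sum,
    "SampleCount": len,
}

def aggregate(samples, period_minutes, statistic):
    """Group one-minute samples into fixed periods and reduce each group."""
    reduce_group = REDUCERS[statistic]
    return [
        reduce_group(samples[i:i + period_minutes])
        for i in range(0, len(samples), period_minutes)
    ]

# Ten made-up one-minute samples, e.g. for node_number_of_running_pods.
samples = [2, 4, 6, 8, 10, 1, 1, 2, 1, 0]

print(aggregate(samples, 5, "Maximum"))  # one value per five-minute period
print(aggregate(samples, 5, "Average"))
```

Which statistic is meaningful depends on the metric: for a count such as running pods, a maximum or average over the period is usually easier to interpret than a sum.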
| Metric name | Description | Unit | Meaningful statistics |
|---|---|---|---|
| cluster_up | Cluster up | Count | |
| cluster_node_count | Number of nodes in the cluster | Count | |
| cluster_failed_node_count | Number of failed nodes in the cluster | Count | |
| cluster_namespace_phase_count | Number of namespaces in the cluster by phase | Count | |
| cluster_pod_phase_count | Number of pods in the cluster by phase | Count | |
| node_cpu_allocatable | Node allocatable CPU | - | |
| node_cpu_capacity | Node CPU capacity | - | |
| node_cpu_usage | Node CPU usage | - | |
| node_cpu_utilization | Node CPU usage rate | - | |
| node_memory_allocatable | Node allocatable memory | Bytes | |
| node_memory_capacity | Node memory capacity | Bytes | |
| node_memory_usage | Node memory usage | Bytes | |
| node_memory_utilization | Node memory usage rate | - | |
| node_network_rx_bytes | Node network received bytes | Bytes/Second | |
| node_network_tx_bytes | Node network transmitted bytes | Bytes/Second | |
| node_network_total_bytes | Total node network bytes | Bytes/Second | |
| node_number_of_running_pods | Number of pods running on the node | Count | |
| namespace_number_of_running_pods | Number of running pods in the namespace | Count | |
| namespace_deployment_pod_count | Number of Deployment pods in the namespace | Count | |
| namespace_statefulset_pod_count | Number of StatefulSet pods in the namespace | Count | |
| namespace_daemonset_pod_count | Number of DaemonSet pods in the namespace | Count | |
| namespace_job_active_count | Number of active Jobs in the namespace | Count | |
| namespace_cronjob_active_count | Number of active CronJobs in the namespace | Count | |
| pod_cpu_usage | Pod CPU usage | - | |
| pod_memory_usage | Pod memory usage | Bytes | |
| pod_network_rx_bytes | Pod network received bytes | Bytes/Second | |
| pod_network_tx_bytes | Pod network transmitted bytes | Bytes/Second | |
| pod_network_total_bytes | Total pod network bytes | Count | |
| container_cpu_usage | Container CPU usage | - | |
| container_cpu_limit | Container CPU limit | - | |
| container_cpu_utilization | Container CPU usage rate | - | |
| container_memory_usage | Container memory usage | Bytes | |
| container_memory_limit | Container memory limit | Bytes | |
| container_memory_utilization | Container memory usage rate | - | |
| node_gpu_count | Node GPU count | Count | |
| gpu_temp | GPU temperature | - | |
| gpu_power_usage | GPU power consumption | - | |
| gpu_util | GPU utilization | Percent | |
| gpu_sm_clock | GPU SM clock | - | |
| gpu_fb_used | GPU frame buffer (FB) usage | Megabytes | |
| gpu_tensor_active | GPU tensor utilization | - | |
| pod_gpu_util | Pod GPU utilization | Percent | |
| pod_gpu_tensor_active | Pod GPU tensor utilization | - | |
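
Several metrics in the table come in families (usage, capacity or limit, and utilization; received, transmitted, and total bytes). The sketch below shows one plausible way these values relate, which can be useful for sanity-checking queried data. It rests on assumptions that this page does not confirm: that utilization is usage divided by capacity (returned here as a fraction, since the unit is listed as "-"), and that total bytes is the sum of received and transmitted bytes. The helper names and sample values are hypothetical.

```python
def utilization(usage, capacity):
    """Return usage as a fraction of capacity, or None when capacity is not positive.

    Assumption: utilization = usage / capacity. ServiceWatch may instead
    report a percentage or use allocatable resources as the denominator.
    """
    if capacity <= 0:
        return None
    return usage / capacity

def total_bytes(rx_bytes, tx_bytes):
    """Assumption: total network throughput is received plus transmitted."""
    return rx_bytes + tx_bytes

# Made-up data points for a single node at one timestamp.
node = {
    "node_memory_usage": 4 * 1024**3,       # Bytes
    "node_memory_capacity": 16 * 1024**3,   # Bytes
    "node_network_rx_bytes": 1500,          # Bytes/Second
    "node_network_tx_bytes": 500,           # Bytes/Second
}

memory_fraction = utilization(node["node_memory_usage"], node["node_memory_capacity"])
network_total = total_bytes(node["node_network_rx_bytes"], node["node_network_tx_bytes"])

print(f"memory utilization (fraction): {memory_fraction}")
print(f"network total (Bytes/Second): {network_total}")
```

If the values computed this way differ consistently from the reported `node_memory_utilization` or `node_network_total_bytes`, the assumptions above likely do not match how the service derives those metrics.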