ServiceWatch metric
In ServiceWatch, you can view Kubernetes Engine metrics for the Kubernetes Engine created by Data Flow. As with Kubernetes Engine, the metrics provided by default monitoring are data collected at one‑minute intervals.
Basic Metrics
The following are the default metrics for the Kubernetes Engine namespace.
The metrics whose names are displayed in bold below are the key metrics selected from the default metrics provided by Kubernetes Engine. Key metrics are used to build service dashboards that are automatically created for each service in ServiceWatch.
Each metric provides guidance in the user guide on which statistical values are meaningful when querying that metric, and among the meaningful statistics, the values shown in bold are the primary statistics. In the service dashboard, you can view key metrics using primary statistical values.
| Indicator name | Detailed description | unit | meaningful statistics |
|---|---|---|---|
| cluster_up | Cluster up | Count |
|
| cluster_node_count | Number of cluster nodes | Count |
|
| cluster_failed_node_count | Number of failed nodes in the cluster | Count |
|
| cluster_namespace_phase_count | Number of cluster namespace phases | Count |
|
| cluster_pod_phase_count | Cluster pod phase count | Count |
|
| node_cpu_allocatable | Node CPU allocatable | - |
|
| node_cpu_capacity | Node CPU capacity | - |
|
| node_cpu_usage | Node CPU usage | - |
|
| node_cpu_utilization | Node CPU usage | - |
|
| node_memory_allocatable | Node memory allocatable amount | Bytes |
|
| node_memory_capacity | Node memory capacity | Bytes |
|
| node_memory_usage | Node memory usage | Bytes |
|
| node_memory_utilization | Node memory usage rate | - |
|
| node_network_rx_bytes | Node network received bytes | Bytes/Second |
|
| node_network_tx_bytes | Node network transmitted bytes | Bytes/Second |
|
| node_network_total_bytes | Total bytes of the node network | Bytes/Second |
|
| node_number_of_running_pods | Number of pods running on a node | Count |
|
| namespace_number_of_running_pods | Number of running pods in the namespace | Count |
|
| namespace_deployment_pod_count | Namespace deployment pod count | Count |
|
| namespace_statefulset_pod_count | Namespace StatefulSet pod count | Count |
|
| namespace_daemonset_pod_count | Namespace daemonset pod count | Count |
|
| namespace_job_active_count | Active namespace job count | Count |
|
| namespace_cronjob_active_count | Number of active namespace cronjobs | Count |
|
| pod_cpu_usage | Pod CPU usage | - |
|
| pod_memory_usage | Pod memory usage | Bytes |
|
| pod_network_rx_bytes | Pod network received bytes | Bytes/Second |
|
| pod_network_tx_bytes | Pod network transmitted bytes | Bytes/Second |
|
| pod_network_total_bytes | Pod network total bytes | Count |
|
| container_cpu_usage | Container CPU usage | - |
|
| container_cpu_limit | Container CPU limit | - |
|
| container_cpu_utilization | Container CPU usage | - |
|
| container_memory_usage | Container memory usage | Bytes |
|
| container_memory_limit | Container memory limit | Bytes |
|
| container_memory_utilization | Container memory usage | - |
|
| node_gpu_count | Node GPU count | Count |
|
| gpu_temp | GPU temperature | - |
|
| gpu_power_usage | GPU power usage | - |
|
| gpu_util | GPU utilization | Percent |
|
| gpu_sm_clock | GPU SM clock | - |
|
| gpu_fb_used | GPU FB usage | Megabytes |
|
| gpu_tensor_active | GPU tensor utilization | - |
|
| pod_gpu_util | Pod GPU utilization | Percent |
|
| pod_gpu_tensor_active | Pod GPU Tensor Utilization Rate | - |
|