Overview

1: Monitoring Metrics
2: ServiceWatch Metrics

Service Overview

Kubernetes Engine is a service that provides lightweight virtual computing, containers, and a Kubernetes cluster to manage them. Because the service installs, operates, and maintains the Kubernetes control plane, users can use a Kubernetes environment without complex preparation.

Features

Standard Kubernetes Environment Setup: You can use a standard Kubernetes environment without additional configuration through the built-in Kubernetes Control Plane. It is compatible with applications in other standard Kubernetes environments, allowing you to use standard Kubernetes applications without modifying code.
Easy Kubernetes Deployment: The service provides secure communication between worker nodes and the managed control plane and provisions worker nodes quickly, so users can focus on building applications on the provided container environment.
Convenient Kubernetes Management: For enterprise environments, we provide various management features to conveniently use the created Kubernetes clusters, including cluster information lookup and management via a dashboard, namespace management, and workload management functions.

Service Diagram

Provided features

Kubernetes Engine provides the following features.

Cluster Management: You can create and manage clusters to use the Kubernetes Engine service. After creating a cluster, you can add services needed for operation such as nodes, namespaces, and workloads.
Node Management: Nodes are the machines that run containerized applications. Every cluster must have at least one worker node to deploy applications. Nodes are organized into node pools. All nodes in a node pool must share the same server type, size, and OS image; creating multiple node pools enables flexible deployment strategies.
Namespace Management: A namespace is a logical partition within a Kubernetes cluster and is used to specify access permissions or resource usage limits per namespace.
Workload Management: A workload is an application running on Kubernetes Engine. After creating a namespace, you can add or delete workloads. Workloads are created and managed per item such as Deployment, Pod, StatefulSet, DaemonSet, Job, and CronJob.
Service and Ingress Management: A service is an abstraction that exposes applications running in a set of pods as a network service, and an ingress is used to expose HTTP and HTTPS paths from outside the cluster to inside the cluster. After creating a namespace, you can create or delete services, endpoints, ingresses, and ingress classes.
Storage Management: You can create and manage the storage to be used when using Kubernetes Engine. Storage is created and managed per PVC, PV, and StorageClass items.
Configuration Management: When values used inside containers differ across environments such as Dev and Prod, building a separate image for each environment is inconvenient and wasteful. In Kubernetes, you can instead keep environment variables and configuration settings outside the image and inject them when a Pod is created, using ConfigMaps and Secrets.
Permission Management: When multiple users access a Kubernetes cluster, you can assign permissions per specific API or namespace to define the access scope. By applying Kubernetes’ role-based access control (RBAC) feature, you can set permissions for clusters or namespaces. You can create and manage ClusterRoles, ClusterRoleBindings, Roles, and RoleBindings.
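
As an illustration of the Configuration Management item above, the following is a minimal sketch of keeping settings outside the image and injecting them when a Pod is created. It only builds standard Kubernetes manifests as Python dictionaries; the resource names, namespaces, image path, and the LOG_LEVEL key are hypothetical and not taken from the Kubernetes Engine service.

```python
import json
from typing import Dict


def config_map(name: str, namespace: str, data: Dict[str, str]) -> dict:
    """Build a ConfigMap manifest holding environment-specific settings."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": data,
    }


def pod_using_config(name: str, namespace: str, image: str, config_name: str) -> dict:
    """Build a Pod manifest whose container receives every ConfigMap key as an env var."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # envFrom injects all keys of the referenced ConfigMap at Pod creation.
                    "envFrom": [{"configMapRef": {"name": config_name}}],
                }
            ]
        },
    }


# Same image in both environments; only the ConfigMap differs (hypothetical values).
dev_config = config_map("app-config", "dev", {"LOG_LEVEL": "debug"})
prod_config = config_map("app-config", "prod", {"LOG_LEVEL": "warn"})
pod = pod_using_config("demo-app", "dev", "registry.example.com/demo-app:1.0", "app-config")

if __name__ == "__main__":
    for manifest in (dev_config, pod):
        print(json.dumps(manifest, indent=2))
```

Each printed manifest can be saved to its own JSON file and applied with kubectl. Sensitive values would go in a Secret rather than a ConfigMap, referenced with secretRef in place of configMapRef.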

Components

Control Plane

The control plane is the component that serves as the master node in the Kubernetes Engine service. The master node is the cluster’s management node: it assigns work to the cluster, monitors node status, and handles data communication between nodes. A cluster is the basic unit of creation in the Kubernetes Engine service and is used to manage the node pools, objects, controllers, and other resources that belong to it. Users configure the cluster name, control plane, network, and File Storage, and then create node pools within the cluster.

The cluster name creation rules are as follows.

It must start with a letter, may contain only letters, numbers, and hyphens (-), and must be 3 to 30 characters long.
It must not duplicate an already existing cluster name.
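
The naming rules above can be checked before submitting a creation request. The following is a minimal sketch, assuming that "letters" means ASCII letters in either case (the rules do not say whether uppercase is accepted) and that the list of existing cluster names is supplied by the caller. The function name is hypothetical.

```python
import re

# One leading letter followed by 2 to 29 letters, digits, or hyphens: 3 to 30 characters total.
# Assumption: ASCII letters, upper or lower case.
_CLUSTER_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]{2,29}")


def is_valid_cluster_name(name, existing_names=()):
    """Return True if the name follows the pattern and is not already in use."""
    return _CLUSTER_NAME.fullmatch(name) is not None and name not in existing_names


if __name__ == "__main__":
    for candidate in ("dev-cluster-01", "1cluster", "ab", "dev_cluster"):
        print(candidate, is_valid_cluster_name(candidate, existing_names={"prod"}))
```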

Worker Node

A worker node is a compute node in the cluster that performs tasks. It receives task assignments from the cluster’s master node, executes them, and reports the results back to the master node. All nodes created within a node pool and namespace serve as worker nodes.

The rules for creating a node pool, which is a collection of worker nodes, are as follows.

A node pool must contain at least one node for application deployment to be possible.
A maximum of 100 nodes can be created within a node pool.
Because the maximum is 100 nodes, you can distribute them freely across node pools; for example, 100 node pools with 1 node each, or 50 node pools with 2 nodes each.
It is possible to configure block storage attached to a node pool.
You can configure the server type, size, and OS image for nodes in a node pool, and they must all be identical.
Through the Auto-Scaling service, you can configure node pools to scale out and in automatically according to the requirements of the deployed application.
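
The node count rules above can be sketched as a simple pre-check on a planned layout. This is only an illustration: it assumes, following the 100-pool and 50-pool examples, that the 100-node limit applies to the total across all node pools, and the function and pool names are hypothetical.

```python
MAX_NODES = 100  # limit stated in the rules above


def check_node_pool_plan(plan):
    """Check a mapping of node pool name -> node count; return a list of problems."""
    problems = []
    for pool, count in plan.items():
        if count < 1:
            problems.append(f"{pool}: a node pool needs at least 1 node")
    total = sum(plan.values())
    # Assumption: the limit is applied to the sum over all pools.
    if total > MAX_NODES:
        problems.append(f"total of {total} nodes exceeds the limit of {MAX_NODES}")
    return problems


if __name__ == "__main__":
    print(check_node_pool_plan({"general": 60, "gpu": 41}))
```

An empty list means the plan satisfies both rules; for instance, 50 pools of 2 nodes each passes, while 60 plus 41 nodes does not.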

Prerequisite Services

The following services must be configured before you create this service. Refer to the guide for each service for details and prepare them in advance.

Service Category	Service	Detailed Description
Networking	VPC	A service that provides an isolated virtual network in a cloud environment
Networking	Security Group	Virtual firewall that controls server traffic
Storage	File Storage	Storage that lets multiple clients share files over the network; used as a Persistent Volume

Table. Kubernetes Engine Prerequisite Services

1 - Monitoring Metrics

Cloud Monitoring service termination notice

According to Samsung Cloud Platform’s policy, the Cloud Monitoring service is scheduled to be discontinued in September 2026.
Accordingly, after the September 2026 release, resource monitoring of the Samsung Cloud Platform via Cloud Monitoring will no longer be possible.

You can continue monitoring resources with ServiceWatch, the replacement service released in October 2025.
ServiceWatch provides more modern and powerful features, replacing Cloud Monitoring to deliver a seamless monitoring environment.

Detailed information about ServiceWatch is available in the ServiceWatch Overview.

Kubernetes Engine monitoring metrics

The table below shows the monitoring metrics of Kubernetes Engine that can be viewed through Cloud Monitoring. For detailed usage of Cloud Monitoring, refer to the Cloud Monitoring guide.

Performance Item	Detailed Description	Unit
Cluster Namespaces [Active]	Number of namespaces in active state	cnt
Cluster Namespaces [Total]	Total number of namespaces in the cluster	cnt
Cluster Nodes [Ready]	Number of nodes in READY state	cnt
Cluster Nodes [Total]	Total number of nodes in the cluster	cnt
Cluster Pods [Failed]	Number of failed-state pods in the cluster	cnt
Cluster Pods [Pending]	Number of pending pods in the cluster	cnt
Cluster Pods [Running]	Number of pods in running state within the cluster	cnt
Cluster Pods [Succeeded]	Number of succeeded pods in the cluster	cnt
Cluster Pods [Unknown]	Number of pods in unknown state within the cluster	cnt
Instance Status	cluster status	status
Namespace Pods [Failed]	Number of failed-state pods in a namespace	cnt
Namespace Pods [Pending]	Number of pending pods in a namespace	cnt
Namespace Pods [Running]	Number of running pods in a namespace	cnt
Namespace Pods [Succeeded]	Number of succeeded-state pods in a namespace	cnt
Namespace Pods [Unknown]	Number of pods in unknown state within a namespace	cnt
Namespace GPU Clock Frequency	SM clock frequency in the Namespace	MHz
Namespace GPU Memory Usage	Memory utilization in the Namespace	%
Namespace GPU Usage	GPU utilization in the Namespace	%
Node CPU Size [Allocatable]	Node CPU allocatable	cnt
Node CPU Size [Capacity]	CPU capacity in the node	cnt
Node CPU Usage	CPU usage per node	%
Node CPU Usage [Request]	CPU request_ratio within node	%
Node CPU Used	CPU utilization within the node	status
Node Filesystem Usage	Node FS utilization	%
Node Memory Size [Allocatable]	memory allocatable within the node	bytes
Node Memory Size [Capacity]	Memory capacity in the node	bytes
Node Memory Usage	Node memory utilization	%
Node Memory Usage [Request]	memory request_ratio within node	%
Node Memory Workingset	memory working set within the node	bytes
Node Network In Bytes	Node network rx bytes	bytes
Node Network Out Bytes	Node network tx bytes	bytes
Node Network Total Bytes	Node network total bytes	bytes
Node Pods [Failed]	Number of pods in failed state within the node	cnt
Node Pods [Pending]	Number of pending pods in the node	cnt
Node Pods [Running]	Number of running pods per node	cnt
Node Pods [Succeeded]	Number of succeeded pods in the node	cnt
Node Pods [Unknown]	Number of unknown‑state pods in the node	cnt
Pod CPU Usage [Limit]	CPU usage_limit_ratio in the pod	%
Pod CPU Usage [Request]	CPU request_ratio in the pod	%
Pod CPU Usage	CPU usage within the pod	%
Pod GPU Clock Frequency	SM clock frequency in the Pod	MHz
Pod GPU Memory Usage	Memory utilization within the Pod	%
Pod GPU Usage	GPU utilization within the Pod	%
Pod Memory Usage [Limit]	memory usage_limit_ratio in pod	%
Pod Memory Usage [Request]	memory request_ratio in pod	%
Pod Memory Usage	Memory usage within pod	bytes
Pod Network In Bytes	network rx bytes in pod	bytes
Pod Network Out Bytes	network tx bytes in pod	bytes
Pod Network Total Bytes	Network total bytes in pod	bytes
Pod Restart Containers	container restart count in pod	cnt
Workload Pods [Running]	-	cnt

Table. Kubernetes Engine monitoring metrics

2 - ServiceWatch Metrics

Kubernetes Engine sends metrics to ServiceWatch. Metrics provided by basic monitoring are collected at 1-minute intervals.

Reference

To view metrics in ServiceWatch, refer to the ServiceWatch guide.

Basic Metrics

The following are the basic metrics for the Kubernetes Engine namespace.

Metrics whose names appear in bold below are the key metrics selected from the basic metrics provided by Kubernetes Engine. Key metrics are used to build the service dashboards that ServiceWatch generates automatically for each service.

For each metric, the user guide indicates which statistics are meaningful when viewing it; among those, the statistics shown in bold are the primary statistics. The service dashboard displays key metrics using these primary statistics.

Metric Name	Detailed Description	Unit	Meaningful Statistics
cluster_up	Cluster up	Count	Total Average Maximum Minimum
cluster_node_count	Cluster node count	Count	Total Average Maximum Minimum
cluster_failed_node_count	Number of failed nodes in the cluster	Count	Total Average Maximum Minimum
cluster_namespace_phase_count	Number of cluster namespace phases	Count	Total Average Maximum Minimum
cluster_pod_phase_count	Number of cluster pod phases	Count	Total Average Maximum Minimum
node_cpu_allocatable	Node CPU allocatable amount	-	Total Average Maximum Minimum
node_cpu_capacity	Node CPU capacity	-	Total Average Maximum Minimum
node_cpu_usage	Node CPU usage	-	Total Average Maximum Minimum
node_cpu_utilization	Node CPU utilization	-	Total Average Maximum Minimum
node_memory_allocatable	Node memory allocatable amount	Bytes	Total Average Maximum Minimum
node_memory_capacity	Node memory capacity	Bytes	Total Average Maximum Minimum
node_memory_usage	Node memory usage	Bytes	Total Average Maximum Minimum
node_memory_utilization	Node memory utilization	-	Total Average Maximum Minimum
node_network_rx_bytes	Node network received bytes	Bytes/Second	Total Average Maximum Minimum
node_network_tx_bytes	Node network transmitted bytes	Bytes/Second	Total Average Maximum Minimum
node_network_total_bytes	Total bytes of the node network	Bytes/Second	Total Average Maximum Minimum
node_number_of_running_pods	Number of pods running on a node	Count	Total Average Maximum Minimum
namespace_number_of_running_pods	Number of running pods in a namespace	Count	Total Average Maximum Minimum
namespace_deployment_pod_count	Namespace deployment pod count	Count	Total Average Maximum Minimum
namespace_statefulset_pod_count	Namespace StatefulSet pod count	Count	Total Average Maximum Minimum
namespace_daemonset_pod_count	Namespace DaemonSet Pod Count	Count	Total Average Maximum Minimum
namespace_job_active_count	Active namespace job count	Count	Total Average Maximum Minimum
namespace_cronjob_active_count	Number of active namespace cron jobs	Count	Total Average Maximum Minimum
pod_cpu_usage	Pod CPU usage	-	Total Average Maximum Minimum
pod_memory_usage	Pod memory usage	Bytes	Total Average Maximum Minimum
pod_network_rx_bytes	Pod network received bytes	Bytes/Second	Total Average Maximum Minimum
pod_network_tx_bytes	Pod network transmit bytes	Bytes/Second	Total Average Maximum Minimum
pod_network_total_bytes	Pod network total bytes	Bytes/Second	Total Average Maximum Minimum
container_cpu_usage	Container CPU usage	-	Total Average Maximum Minimum
container_cpu_limit	Container CPU limit	-	Total Average Maximum Minimum
container_cpu_utilization	Container CPU utilization	-	Total Average Maximum Minimum
container_memory_usage	Container memory usage	Bytes	Total Average Maximum Minimum
container_memory_limit	Container memory limit	Bytes	Total Average Maximum Minimum
container_memory_utilization	Container memory utilization	-	Total Average Maximum Minimum
node_gpu_count	Number of node GPUs	Count	Total Average Maximum Minimum
gpu_temp	GPU temperature	-	Total Average Maximum Minimum
gpu_power_usage	GPU power consumption	-	Total Average Maximum Minimum
gpu_util	GPU utilization	Percent	Total Average Maximum Minimum
gpu_sm_clock	GPU SM clock	-	Total Average Maximum Minimum
gpu_fb_used	GPU FB usage	Megabytes	Total Average Maximum Minimum
gpu_tensor_active	GPU Tensor Utilization	-	Total Average Maximum Minimum
pod_gpu_util	Pod GPU utilization	Percent	Total Average Maximum Minimum
pod_gpu_tensor_active	Pod GPU Tensor Utilization	-	Total Average Maximum Minimum

Table. Kubernetes Engine Basic Metrics
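
To make the four statistics in the table concrete, the following is a minimal sketch of how Total, Average, Maximum, and Minimum relate to a series of 1-minute samples. It illustrates the conventional meaning of these statistic names only; how ServiceWatch aggregates data internally is not described here, and the sample values and function name are hypothetical.

```python
from statistics import mean


def summarize(datapoints):
    """Return the four statistics for a list of 1-minute samples."""
    if not datapoints:
        raise ValueError("no datapoints in the selected period")
    return {
        "Total": sum(datapoints),
        "Average": mean(datapoints),
        "Maximum": max(datapoints),
        "Minimum": min(datapoints),
    }


# Five hypothetical 1-minute samples of node_number_of_running_pods.
samples = [12, 12, 13, 15, 13]
stats = summarize(samples)

if __name__ == "__main__":
    print(stats)
```

For a count that is sampled repeatedly, such as running pods, Average or Maximum over the period is usually more informative than Total, which simply adds the same pods once per sample.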