1 - Server Type

GPU Server Server Type

GPU Servers are classified by the GPU type they provide, and the server type you select when creating a GPU Server determines which GPU it uses. Select a server type that matches the requirements of the application you plan to run on the GPU Server.

GPU Server supports the following server types.

GPU-H100-2 g2v12h1
Category | Example | Detailed description
Server Type | GPU-H100-2 | Provided server type classification
  • GPU-H100-2
    • GPU-H100 means the provided GPU type
    • 2 means the generation
  • GPU-A100-1
    • GPU-A100 means the provided GPU type
    • 1 means the generation
Server specifications | g2 | Provided server type classification and generation
  • g2
    • g means GPU server specifications
    • 2 means generation
Server specifications | v12 | Number of vCores
  • v12: 12 virtual cores
Server specifications | h1 | GPU type and quantity
  • h1
    • h means GPU-H100
    • 1 means 1 GPU
  • a2
    • a means GPU-A100
    • 2 means 2 GPUs
Table. GPU Server server type format
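
The naming format in the table above can be decoded mechanically. The following Python sketch is illustrative only: `ServerSpec`, `parse_server_spec`, and `GPU_CODES` are hypothetical names, not part of any GPU Server API, and the pattern assumes that every specification string follows the g / v / GPU-letter layout described in the table.

```python
import re
from dataclasses import dataclass

# Letter-to-GPU mapping taken from the format table above.
GPU_CODES = {"h": "GPU-H100", "a": "GPU-A100"}

# g<generation> v<vCores> <GPU letter><GPU count>, for example "g2v12h1".
_PATTERN = re.compile(r"g(\d+)v(\d+)([a-z])(\d+)")


@dataclass(frozen=True)
class ServerSpec:
    generation: int
    vcores: int
    gpu_type: str
    gpu_count: int


def parse_server_spec(name: str) -> ServerSpec:
    """Split a specification string such as 'g2v12h1' into its fields."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a GPU Server specification: {name!r}")
    generation, vcores, gpu_code, gpu_count = match.groups()
    if gpu_code not in GPU_CODES:
        raise ValueError(f"unknown GPU code {gpu_code!r} in {name!r}")
    return ServerSpec(int(generation), int(vcores), GPU_CODES[gpu_code], int(gpu_count))


print(parse_server_spec("g2v12h1"))
# ServerSpec(generation=2, vcores=12, gpu_type='GPU-H100', gpu_count=1)
```

Strings that do not match the layout raise `ValueError` rather than being guessed at, which keeps the sketch from silently accepting names of other services.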

g1 server type

The g1 server type is a GPU Server that uses NVIDIA A100 Tensor Core GPUs and is suitable for high-performance applications.

  • Provides up to 8 NVIDIA A100 Tensor Core GPUs
  • Equipped with 6,912 CUDA cores and 432 Tensor cores per GPU
  • Supports up to 128 vCPUs and 1,920 GB of memory
  • Up to 40 Gbps networking speed
  • 600 GB/s GPU P2P communication via NVIDIA NVSwitch
Category | Server Type | GPU | CPU | Memory | GPU Memory | Network Bandwidth
GPU-A100-1 | g1v16a1 | 1 | 16 vCore | 234 GB | 80 GB | up to 20 Gbps
GPU-A100-1 | g1v32a2 | 2 | 32 vCore | 468 GB | 160 GB | up to 20 Gbps
GPU-A100-1 | g1v64a4 | 4 | 64 vCore | 936 GB | 320 GB | up to 40 Gbps
GPU-A100-1 | g1v128a8 | 8 | 128 vCore | 1872 GB | 640 GB | up to 40 Gbps
Table. GPU Server server type > GPU-A100-1 server type

g2 server type

The g2 server type is a GPU Server that uses NVIDIA H100 Tensor Core GPUs and is suitable for high-performance applications.

  • Provides up to 8 NVIDIA H100 Tensor Core GPUs
  • Equipped with 16,896 CUDA cores and 528 Tensor cores per GPU
  • Supports up to 96 vCPUs and 1,920 GB of memory
  • Up to 40 Gbps networking speed
  • 900 GB/s GPU P2P communication via NVIDIA NVSwitch
Category | Server Type | GPU | CPU | Memory | GPU Memory | Network Bandwidth
GPU-H100-2 | g2v12h1 | 1 | 12 vCore | 234 GB | 80 GB | up to 20 Gbps
GPU-H100-2 | g2v24h2 | 2 | 24 vCore | 468 GB | 160 GB | up to 20 Gbps
GPU-H100-2 | g2v48h4 | 4 | 48 vCore | 936 GB | 320 GB | up to 40 Gbps
GPU-H100-2 | g2v96h8 | 8 | 96 vCore | 1872 GB | 640 GB | up to 40 Gbps
Table. GPU Server server type > GPU-H100-2 server type
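
To make the selection advice concrete, here is a minimal Python sketch that picks the smallest server type meeting a workload's minimums. The rows are copied from the two server type tables; `ServerType`, `CATALOG`, and `smallest_fit` are hypothetical helper names, not part of any GPU Server API. The GPU memory figure is the total across all GPUs of a server type, so a workload that needs more than 80 GB is assumed to be able to split itself across several GPUs.

```python
from typing import NamedTuple, Optional


class ServerType(NamedTuple):
    name: str
    gpus: int
    vcores: int
    memory_gb: int
    gpu_memory_gb: int  # total across all GPUs of the server type


# Rows copied from the GPU-A100-1 and GPU-H100-2 tables, smallest first.
CATALOG = {
    "GPU-A100-1": [
        ServerType("g1v16a1", 1, 16, 234, 80),
        ServerType("g1v32a2", 2, 32, 468, 160),
        ServerType("g1v64a4", 4, 64, 936, 320),
        ServerType("g1v128a8", 8, 128, 1872, 640),
    ],
    "GPU-H100-2": [
        ServerType("g2v12h1", 1, 12, 234, 80),
        ServerType("g2v24h2", 2, 24, 468, 160),
        ServerType("g2v48h4", 4, 48, 936, 320),
        ServerType("g2v96h8", 8, 96, 1872, 640),
    ],
}


def smallest_fit(category: str, min_gpu_memory_gb: int, min_vcores: int = 0) -> Optional[ServerType]:
    """Return the smallest server type that meets both minimums, or None."""
    for server_type in CATALOG[category]:
        if server_type.gpu_memory_gb >= min_gpu_memory_gb and server_type.vcores >= min_vcores:
            return server_type
    return None


print(smallest_fit("GPU-H100-2", min_gpu_memory_gb=200))
# ServerType(name='g2v48h4', gpus=4, vcores=48, memory_gb=936, gpu_memory_gb=320)
```

A request that no listed server type can satisfy returns `None`, so the caller has to decide whether to relax the requirement.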

2 - Monitoring Metrics

GPU Server Monitoring Metrics

The following tables show the GPU Server monitoring metrics that can be checked through Cloud Monitoring.

Basic monitoring metrics are provided even without installing an Agent; see Table. GPU Server Monitoring Metrics (Basic) below. Metrics that become available after installing an Agent are listed in Table. GPU Server Additional Monitoring Metrics (Agent Installation Required) below.

For details on using Cloud Monitoring, refer to the Cloud Monitoring guide.

Performance Item Name | Description | Unit
Memory Total [Basic] | Total available memory in bytes | bytes
Memory Used [Basic] | Currently used memory in bytes | bytes
Memory Swap In [Basic] | Memory swapped in, in bytes | bytes
Memory Swap Out [Basic] | Memory swapped out, in bytes | bytes
Memory Free [Basic] | Unused memory in bytes | bytes
Disk Read Bytes [Basic] | Read bytes | bytes
Disk Read Requests [Basic] | Number of read requests | cnt
Disk Write Bytes [Basic] | Written bytes | bytes
Disk Write Requests [Basic] | Number of write requests | cnt
CPU Usage [Basic] | Average system CPU usage over 1 minute | %
Instance State [Basic] | Instance state | state
Network In Bytes [Basic] | Received bytes | bytes
Network In Dropped [Basic] | Dropped received packets | cnt
Network In Packets [Basic] | Received packets | cnt
Network Out Bytes [Basic] | Sent bytes | bytes
Network Out Dropped [Basic] | Dropped sent packets | cnt
Network Out Packets [Basic] | Sent packets | cnt
Table. GPU Server Monitoring Metrics (Basic)

Performance Item Name | Description | Unit
GPU Count | Number of GPUs | cnt
GPU Memory Usage | GPU memory usage rate | %
GPU Memory Used | Used GPU memory | MB
GPU Temperature | GPU temperature |
GPU Usage | GPU utilization | %
GPU Usage [Avg] | Average GPU usage rate | %
GPU Power Cap | Maximum power capacity of the GPU | W
GPU Power Usage | Current power usage of the GPU | W
GPU Memory Usage [Avg] | Average GPU memory usage rate | %
GPU Count in use | Number of GPUs in use by jobs on the node | cnt
Execution Status for nvidia-smi | Execution result of the nvidia-smi command | status
Core Usage [IO Wait] | CPU time spent in IO wait state | %
Core Usage [System] | CPU time spent in system space | %
Core Usage [User] | CPU time spent in user space | %
CPU Cores | Number of CPU cores on the host | cnt
CPU Usage [Active] | CPU time used, excluding idle and IO wait states | %
CPU Usage [Idle] | CPU time spent in idle state | %
CPU Usage [IO Wait] | CPU time spent in IO wait state | %
CPU Usage [System] | CPU time used by the kernel | %
CPU Usage [User] | CPU time used by user space | %
CPU Usage/Core [Active] | CPU time used per core, excluding idle and IO wait states | %
CPU Usage/Core [Idle] | CPU time spent in idle state per core | %
CPU Usage/Core [IO Wait] | CPU time spent in IO wait state per core | %
CPU Usage/Core [System] | CPU time used by the kernel per core | %
CPU Usage/Core [User] | CPU time used by user space per core | %
Disk CPU Usage [IO Request] | CPU time spent on IO requests | %
Disk Queue Size [Avg] | Average queue length of requests | num
Disk Read Bytes | Bytes read from the device per second | bytes
Disk Read Bytes [Delta Avg] | Average delta of bytes read from the device | bytes
Disk Read Bytes [Delta Max] | Maximum delta of bytes read from the device | bytes
Disk Read Bytes [Delta Min] | Minimum delta of bytes read from the device | bytes
Disk Read Bytes [Delta Sum] | Sum of delta of bytes read from the device | bytes
Disk Read Bytes [Delta] | Delta of bytes read from the device | bytes
Disk Read Bytes [Success] | Total bytes successfully read | bytes
Disk Read Requests | Number of read requests to the device per second | cnt
Disk Read Requests [Delta Avg] | Average delta of read requests to the device | cnt
Disk Read Requests [Delta Max] | Maximum delta of read requests to the device | cnt
Disk Read Requests [Delta Min] | Minimum delta of read requests to the device | cnt
Disk Read Requests [Delta Sum] | Sum of delta of read requests to the device | cnt
Disk Read Requests [Success Delta] | Delta of successful read requests to the device | cnt
Disk Read Requests [Success] | Total successful read requests | cnt
Disk Request Size [Avg] | Average size of requests to the device | num
Disk Service Time [Avg] | Average service time of requests to the device | ms
Disk Wait Time [Avg] | Average wait time of requests to the device | ms
Disk Wait Time [Read] | Average read wait time of the device | ms
Disk Wait Time [Write] | Average write wait time of the device | ms
Disk Write Bytes [Delta Avg] | Average delta of bytes written to the device | bytes
Disk Write Bytes [Delta Max] | Maximum delta of bytes written to the device | bytes
Disk Write Bytes [Delta Min] | Minimum delta of bytes written to the device | bytes
Disk Write Bytes [Delta Sum] | Sum of delta of bytes written to the device | bytes
Disk Write Bytes [Delta] | Delta of bytes written to the device | bytes
Disk Write Bytes [Success] | Total bytes successfully written | bytes
Disk Write Requests | Number of write requests to the device per second | cnt
Disk Write Requests [Delta Avg] | Average delta of write requests to the device | cnt
Disk Write Requests [Delta Max] | Maximum delta of write requests to the device | cnt
Disk Write Requests [Delta Min] | Minimum delta of write requests to the device | cnt
Disk Write Requests [Delta Sum] | Sum of delta of write requests to the device | cnt
Disk Write Requests [Success Delta] | Delta of successful write requests to the device | cnt
Disk Write Requests [Success] | Total successful write requests | cnt
Disk Writes Bytes | Bytes written to the device per second | bytes
Filesystem Hang Check | Filesystem hang check (normal: 1, abnormal: 0) | status
Filesystem Nodes | Total number of filesystem nodes | cnt
Filesystem Nodes [Free] | Total number of available filesystem nodes | cnt
Filesystem Size [Available] | Available disk space in bytes | bytes
Filesystem Size [Free] | Free disk space in bytes | bytes
Filesystem Size [Total] | Total disk space in bytes | bytes
Filesystem Usage | Disk space usage rate | %
Filesystem Usage [Avg] | Average disk space usage rate | %
Filesystem Usage [Inode] | Inode usage rate | %
Filesystem Usage [Max] | Maximum disk space usage rate | %
Filesystem Usage [Min] | Minimum disk space usage rate | %
Filesystem Usage [Total] | Total disk space usage rate | %
Filesystem Used | Used disk space in bytes | bytes
Filesystem Used [Inode] | Used inode space in bytes | bytes
Memory Free | Total available memory in bytes | bytes
Memory Free [Actual] | Actual available memory in bytes | bytes
Memory Free [Swap] | Available swap memory in bytes | bytes
Memory Total | Total memory in bytes | bytes
Memory Total [Swap] | Total swap memory in bytes | bytes
Memory Usage | Memory usage rate | %
Memory Usage [Actual] | Actual memory usage rate | %
Memory Usage [Cache Swap] | Cache swap usage rate | %
Memory Usage [Swap] | Swap memory usage rate | %
Memory Used | Used memory in bytes | bytes
Memory Used [Actual] | Actual used memory in bytes | bytes
Memory Used [Swap] | Used swap memory in bytes | bytes
Collisions | Network collisions | cnt
Network In Bytes | Received bytes | bytes
Network In Bytes [Delta Avg] | Average delta of received bytes | bytes
Network In Bytes [Delta Max] | Maximum delta of received bytes | bytes
Network In Bytes [Delta Min] | Minimum delta of received bytes | bytes
Network In Bytes [Delta Sum] | Sum of delta of received bytes | bytes
Network In Bytes [Delta] | Delta of received bytes | bytes
Network In Dropped | Dropped received packets | cnt
Network In Errors | Received errors | cnt
Network In Packets | Received packets | cnt
Network In Packets [Delta Avg] | Average delta of received packets | cnt
Network In Packets [Delta Max] | Maximum delta of received packets | cnt
Network In Packets [Delta Min] | Minimum delta of received packets | cnt
Network In Packets [Delta Sum] | Sum of delta of received packets | cnt
Network In Packets [Delta] | Delta of received packets | cnt
Network Out Bytes | Sent bytes | bytes
Network Out Bytes [Delta Avg] | Average delta of sent bytes | bytes
Network Out Bytes [Delta Max] | Maximum delta of sent bytes | bytes
Network Out Bytes [Delta Min] | Minimum delta of sent bytes | bytes
Network Out Bytes [Delta Sum] | Sum of delta of sent bytes | bytes
Network Out Bytes [Delta] | Delta of sent bytes | bytes
Network Out Dropped | Dropped sent packets | cnt
Network Out Errors | Sent errors | cnt
Network Out Packets | Sent packets | cnt
Network Out Packets [Delta Avg] | Average delta of sent packets | cnt
Network Out Packets [Delta Max] | Maximum delta of sent packets | cnt
Network Out Packets [Delta Min] | Minimum delta of sent packets | cnt
Network Out Packets [Delta Sum] | Sum of delta of sent packets | cnt
Network Out Packets [Delta] | Delta of sent packets | cnt
Open Connections [TCP] | Open TCP connections | cnt
Open Connections [UDP] | Open UDP connections | cnt
Port Usage | Port usage rate | %
SYN Sent Sockets | Number of sockets in SYN_SENT state | cnt
Kernel PID Max | Maximum PID value | cnt
Kernel Thread Max | Maximum thread value | cnt
Process CPU Usage | CPU time used by the process | %
Process CPU Usage/Core | CPU time used by the process per core | %
Process Memory Usage | Resident Set size | %
Process Memory Used | Used memory by the process | bytes
Process PID | Process ID | PID
Process PPID | Parent process ID | PID
Processes [Dead] | Number of dead processes | cnt
Processes [Idle] | Number of idle processes | cnt
Processes [Running] | Number of running processes | cnt
Processes [Sleeping] | Number of sleeping processes | cnt
Processes [Stopped] | Number of stopped processes | cnt
Processes [Total] | Total number of processes | cnt
Processes [Unknown] | Number of unknown processes | cnt
Processes [Zombie] | Number of zombie processes | cnt
Running Process Usage | Process usage rate | %
Running Processes | Number of running processes | cnt
Running Thread Usage | Thread usage rate | %
Running Threads | Number of running threads | cnt
Context Switches | Context switches per second | cnt
Load/Core [1 min] | Load per core over 1 minute | cnt
Load/Core [15 min] | Load per core over 15 minutes | cnt
Load/Core [5 min] | Load per core over 5 minutes | cnt
Multipaths [Active] | Number of active multipath connections | cnt
Multipaths [Failed] | Number of failed multipath connections | cnt
Multipaths [Faulty] | Number of faulty multipath connections | cnt
NTP Offset | Measured offset from the NTP server | num
Run Queue Length | Run queue length | num
Uptime | System uptime in milliseconds | ms
Context Switchies | Context switches per second | cnt
Disk Read Bytes [Sec] | Bytes read from the device per second | cnt
Disk Read Time [Avg] | Average read time from the device | sec
Disk Transfer Time [Avg] | Average disk transfer time | sec
Disk Usage | Disk usage rate | %
Disk Write Bytes [Sec] | Bytes written to the device per second | cnt
Disk Write Time [Avg] | Average write time to the device | sec
Pagingfile Usage | Paging file usage rate | %
Pool Used [Non Paged] | Non-paged pool usage | bytes
Pool Used [Paged] | Paged pool usage | bytes
Process [Running] | Number of running processes | cnt
Threads [Running] | Number of running threads | cnt
Threads [Waiting] | Number of waiting threads | cnt
Table. GPU Server Additional Monitoring Metrics (Agent Installation Required)
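
The GPU rows in this table include a metric for the execution result of the nvidia-smi command, so the GPU figures are presumably related to what nvidia-smi reports. The Python sketch below is not the Agent's implementation. It only illustrates how values resembling GPU Count, GPU Memory Usage, GPU Usage [Avg], and GPU Memory Usage [Avg] could be derived from nvidia-smi CSV output. The `summarize` function and the sample values are hypothetical, and treating [Avg] as an average across the GPUs of one server is an assumption.

```python
import csv
import io

# Text in the shape produced by the following nvidia-smi invocation
# (the values are made up for illustration):
#   nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total,power.draw,power.limit \
#              --format=csv,noheader,nounits
SAMPLE = """\
0, 90, 40960, 81920, 300.0, 700.0
1, 50, 20480, 81920, 200.0, 700.0
"""


def summarize(csv_text: str) -> dict:
    """Derive table-style GPU metrics from nvidia-smi CSV text.

    Fields reported as "[N/A]" are not handled in this sketch.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True) if row]
    gpus = []
    for index, util, mem_used, mem_total, power, power_cap in rows:
        gpus.append({
            "index": int(index),
            "usage_pct": float(util),
            "memory_used_mib": float(mem_used),  # nvidia-smi reports MiB
            "memory_usage_pct": 100.0 * float(mem_used) / float(mem_total),
            "power_usage_w": float(power),
            "power_cap_w": float(power_cap),
        })
    count = len(gpus)
    return {
        "gpu_count": count,
        "gpu_usage_avg_pct": sum(g["usage_pct"] for g in gpus) / count if count else 0.0,
        "gpu_memory_usage_avg_pct": sum(g["memory_usage_pct"] for g in gpus) / count if count else 0.0,
        "gpus": gpus,
    }


summary = summarize(SAMPLE)
print(summary["gpu_count"], summary["gpu_usage_avg_pct"], summary["gpu_memory_usage_avg_pct"])
# 2 70.0 37.5
```

Parsing a captured string instead of invoking nvidia-smi keeps the sketch runnable on machines without a GPU; on a GPU Server the same text could be obtained by running the command shown in the comment.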

3 - ServiceWatch Metrics

GPU Server sends metrics to ServiceWatch. With basic monitoring, metrics are collected at 5-minute intervals. With detailed monitoring enabled, metrics are collected at 1-minute intervals.

Information
  • Basic and detailed monitoring for GPU Server provide the same metrics as Virtual Server, and the metrics are published under the Virtual Server namespace.
  • GPU-related metrics are provided through the ServiceWatch Agent. To collect metrics with the ServiceWatch Agent, refer to the ServiceWatch Agent guide.
Reference
To check metrics in ServiceWatch, refer to the ServiceWatch guide.

To enable detailed monitoring for GPU Server, refer to How-to guides > ServiceWatch Enable Detailed Monitoring.

Basic Metrics

The following are the basic metrics for the Virtual Server namespace.

Performance Item | Detailed Description | Unit | Meaningful Statistics
Instance State | Instance state display | - | -
CPU Usage | CPU usage | % | Average, Maximum, Minimum
Disk Read Bytes | Amount of data read from the block device (bytes) | Bytes | Total, Average, Maximum, Minimum
Disk Read Requests | Number of read requests on the block device | Count | Total, Average, Maximum, Minimum
Disk Write Bytes | Amount of data written to the block device (bytes) | Bytes | Total, Average, Maximum, Minimum
Disk Write Requests | Number of write requests on the block device | Count | Total, Average, Maximum, Minimum
Network In Bytes | Amount of data received on the network interface (bytes) | Bytes | Total, Average, Maximum, Minimum
Network In Dropped | Number of received packets dropped on the network interface | Count | Total, Average, Maximum, Minimum
Network In Packets | Number of packets received on the network interface | Count | Total, Average, Maximum, Minimum
Network Out Bytes | Amount of data transmitted from the network interface (bytes) | Bytes | Total, Average, Maximum, Minimum
Network Out Dropped | Number of outgoing packets dropped on the network interface | Count | Total, Average, Maximum, Minimum
Network Out Packets | Number of packets transmitted from the network interface | Count | Total, Average, Maximum, Minimum
Table. Virtual Server Basic Metrics
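
The Meaningful Statistics column names the statistics that are useful for each metric. As a rough illustration of what those statistics mean over a collection period, the Python sketch below groups datapoints into fixed periods (300 seconds by default, matching the 5-minute basic interval) and computes Total, Average, Maximum, and Minimum for each period. The `aggregate` function and the sample values are hypothetical, and how ServiceWatch itself aggregates data is not described in this guide.

```python
from collections import defaultdict


def aggregate(datapoints, period_seconds=300):
    """Group (unix_timestamp, value) pairs into fixed periods and compute statistics."""
    buckets = defaultdict(list)
    for timestamp, value in datapoints:
        period_start = timestamp - timestamp % period_seconds
        buckets[period_start].append(value)
    return {
        start: {
            "Total": sum(values),
            "Average": sum(values) / len(values),
            "Maximum": max(values),
            "Minimum": min(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical Network In Bytes samples collected at 1-minute intervals.
samples = [(0, 100), (60, 300), (120, 200), (180, 400), (240, 500), (300, 50)]
for start, stats in aggregate(samples).items():
    print(start, stats)
# 0 {'Total': 1500, 'Average': 300.0, 'Maximum': 500, 'Minimum': 100}
# 300 {'Total': 50, 'Average': 50.0, 'Maximum': 50, 'Minimum': 50}
```

Passing `period_seconds=60` leaves each 1-minute sample in its own period, which mirrors the finer granularity available when detailed monitoring is enabled.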