The page has been translated by Gen AI.

Cluster Fabric Management

Cluster Fabric is a service that helps manage servers (GPU Nodes) included in a GPU Cluster. Using Cluster Fabric, you can move servers between GPU Clusters in the same Node pool and optimize the performance and speed of GPUs within the same GPU Cluster.

Creating Cluster Fabric

A Cluster Fabric is created together with a GPU Node; it cannot be created or deleted separately. When all GPU Nodes within a Cluster Fabric are terminated, the Cluster Fabric is automatically deleted. If you have not created a GPU Node yet, create one first. For more information, refer to Creating a GPU Node.

Checking Cluster Fabric Details

On the Cluster Fabric List page and the Cluster Fabric Details page, you can view the list and details of the created Cluster Fabrics and move servers between clusters.

  1. Click the All Services > Compute > Multi-node GPU Server menu. The Service Home page of Multi-node GPU Cluster opens.

  2. On the Service Home page, click the Cluster Fabric menu. The Cluster Fabric List page opens.

    • On the Cluster Fabric List page, you can view the list of resources of the GPU Cluster created by the user.
    • Resource items other than required columns can be added through the Settings button.
      Category | Required | Description
      Resource ID | Optional | Cluster Fabric ID created by the user
      Cluster Fabric Name | Required | Cluster Fabric name created by the user
      Node Pool | Optional | A collection of nodes that can be bundled into the same Cluster Fabric
      Number of Servers | Optional | Number of GPU Nodes
      Server Type | Optional | Server type of the GPU Node
        • The user can check the number of cores, memory capacity, and GPU type and count of the created resource
      Status | Optional | Status of the Cluster Fabric created by the user
      Creation Time | Optional | Time when the Cluster Fabric was created
      Table. Cluster Fabric resource list items
  3. On the Cluster Fabric List page, click the resource whose details you want to check. The Cluster Fabric Details page opens.

    • At the top of the Cluster Fabric Details page, status information and additional feature descriptions are displayed.
      Category | Description
      Cluster Fabric Status | Status of the Cluster Fabric created by the user
        • Creating: Cluster creation in progress
        • Active: Creation completed and available
        • Editing: IP change in progress
        • Deleting: Termination in progress
        • Deleted: Termination completed
      Add Target Server | Function to move a server from another cluster to this cluster
      Table. Cluster Fabric status information and additional features
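The status values in the table above can be modeled in client-side tooling, for example when a script needs to wait until a Cluster Fabric is usable. The following is a minimal sketch, not an official SDK: the names FabricStatus, ASSUMED_TRANSITIONS, is_transient, and can_transition are hypothetical, and the transition map is an assumption inferred from the status descriptions, not a documented state machine.

```python
from enum import Enum


class FabricStatus(Enum):
    """Status values listed for a Cluster Fabric (hypothetical client-side model)."""
    CREATING = "Creating"   # cluster creation in progress
    ACTIVE = "Active"       # creation completed and available
    EDITING = "Editing"     # IP change in progress
    DELETING = "Deleting"   # termination in progress
    DELETED = "Deleted"     # termination completed


# ASSUMPTION: these transitions are inferred from the status descriptions only;
# the service may allow other transitions.
ASSUMED_TRANSITIONS = {
    FabricStatus.CREATING: {FabricStatus.ACTIVE},
    FabricStatus.ACTIVE: {FabricStatus.EDITING, FabricStatus.DELETING},
    FabricStatus.EDITING: {FabricStatus.ACTIVE},
    FabricStatus.DELETING: {FabricStatus.DELETED},
    FabricStatus.DELETED: set(),
}


def is_transient(status: FabricStatus) -> bool:
    """Return True while an operation on the Cluster Fabric is still in progress."""
    return status in {
        FabricStatus.CREATING,
        FabricStatus.EDITING,
        FabricStatus.DELETING,
    }


def can_transition(current: FabricStatus, target: FabricStatus) -> bool:
    """Check a status change against the assumed transition map."""
    return target in ASSUMED_TRANSITIONS[current]
```

A polling script could, under these assumptions, keep waiting while is_transient(status) is true and proceed once the status is Active.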

Details

On the Details tab of the Cluster Fabric Details page, you can check the details of the selected resource and bring in servers from other clusters.

Category | Description
Service | Service category
Resource Type | Service name
SRN | Unique resource ID in Samsung Cloud Platform
  • In Cluster Fabric, it means the Cluster Fabric SRN
Resource Name | Resource name
  • In the Cluster Fabric service, it means the Cluster Fabric name
Resource ID | Unique resource ID in the service
Creator | User who created the service
Creation Time | Time when the service was created
Modifier | User who modified the service information
Modification Time | Time when the service information was modified
Cluster Fabric Name | Cluster Fabric name created by the user
Node Pool | A collection of nodes that can be bundled into the same Cluster Fabric
Target Server | List of GPU Nodes bound to the Cluster Fabric
  • Server name, server type, IP, status
Table. Cluster Fabric details tab items

Bringing in Cluster Fabric Servers

Using the Add Target Server feature on the Cluster Fabric Details page, you can bring in servers from other clusters and add them to the selected cluster.

  1. Click the All Services > Compute > Multi-node GPU Server menu. The Service Home page of Multi-node GPU Cluster opens.
  2. On the Service Home page, click the Cluster Fabric menu. The Cluster Fabric List page opens.
  3. On the Cluster Fabric List page, click the resource whose details you want to check. The Cluster Fabric Details page opens.
  4. On the Details tab, click the Add button to the right of Target Server.
    • The Add Target Server popup window opens.
      • Cluster Fabric: Select a cluster.
      • The GPU Nodes bound to the selected cluster are retrieved, and you can select the GPU Node to bring in.
      • The selected GPU Node is listed at the bottom by its GPU Node name.
      • Click the Confirm button to complete the task.
      • Click the Cancel button to cancel the task.
    • Check that the added GPU Node appears under Target Server.
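The effect of the steps above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: ClusterFabric and add_target_server are hypothetical names rather than a Samsung Cloud Platform API, and the only rule enforced is the one stated in the overview, that servers move between clusters in the same Node pool.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterFabric:
    """Hypothetical in-memory stand-in for a Cluster Fabric resource."""
    name: str
    node_pool: str
    servers: list[str] = field(default_factory=list)  # GPU Node names


def add_target_server(source: ClusterFabric, target: ClusterFabric, server: str) -> None:
    """Move one GPU Node from the source cluster into the target cluster."""
    # Servers can only move between clusters in the same Node pool.
    if source.node_pool != target.node_pool:
        raise ValueError("source and target must belong to the same Node pool")
    if server not in source.servers:
        raise ValueError(f"{server} is not bound to {source.name}")
    source.servers.remove(server)
    target.servers.append(server)


# Example with made-up names: bring gpu-node-1 from fabric-a into fabric-b.
fabric_a = ClusterFabric("fabric-a", "pool-1", ["gpu-node-1", "gpu-node-2"])
fabric_b = ClusterFabric("fabric-b", "pool-1")
add_target_server(fabric_a, fabric_b, "gpu-node-1")
```

After the call, gpu-node-1 is listed under fabric-b, which corresponds to the final check in step 4.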

Terminating Cluster Fabric

When all GPU Nodes within a Cluster Fabric are terminated, the Cluster Fabric is automatically deleted. For more information, refer to Terminating a GPU Node.
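The automatic deletion rule can be expressed as a short sketch. The function terminate_gpu_node and the dictionary layout are hypothetical and exist only to illustrate the rule; the actual service performs this cleanup itself.

```python
def terminate_gpu_node(fabrics: dict[str, set[str]], fabric_name: str, node_name: str) -> bool:
    """Remove a GPU Node from a fabric; return True if the fabric was deleted.

    `fabrics` maps a Cluster Fabric name to the set of GPU Node names bound to it
    (a hypothetical in-memory model, not a service API).
    """
    nodes = fabrics[fabric_name]
    nodes.discard(node_name)
    # When the last GPU Node is terminated, the Cluster Fabric is deleted as well.
    if not nodes:
        del fabrics[fabric_name]
        return True
    return False


# Example with made-up names: the fabric disappears with its last node.
fabrics = {"fabric-a": {"gpu-node-1", "gpu-node-2"}}
first_deleted = terminate_gpu_node(fabrics, "fabric-a", "gpu-node-1")
second_deleted = terminate_gpu_node(fabrics, "fabric-a", "gpu-node-2")
```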
