How-to guides

Create AI&MLOps Platform

Users can create the service by entering the required information for the AI&MLOps Platform and selecting detailed options through the Samsung Cloud Platform Console.

To create an AI&MLOps Platform, follow these steps.

  1. Click the All Services > AI/ML > AI&MLOps Platform menu. You will be taken to the Service Home page of AI&MLOps Platform.
  2. On the Service Home page, click the AI&MLOps Platform Create button. You will be taken to the AI&MLOps Platform Create page.
  3. On the Service Type Selection page of AI&MLOps Platform creation, enter the information required to create the service and select detailed options.
    • Select the service type in the Service Type and Version Selection area.
      | Category | Required | Detailed description |
      |---|---|---|
      | Service type | Required | Service type selected by the user: AI&MLOps Platform or Kubeflow Mini |
      | Service type version | Required | Version of the selected service, chosen from the list of versions offered for that service |
      Table. AI&MLOps Platform service types and version selection items
    • In the Cluster Deployment Area Classification area, select the options required to create the service.
      | Category | Required | Detailed description |
      |---|---|---|
      | Cluster deployment area | Required | Deploy from Kubernetes Engine: select a previously created Kubernetes Engine. Deploy to a new cluster: create a Kubernetes Engine together with the AI&MLOps Platform. |
      Table. AI&MLOps Platform Service Cluster Deployment Area Classification Items
      Reference
      The configuration elements on the following Service Information Input page vary depending on the cluster deployment settings.
  4. On the Service Information Input page of AI&MLOps Platform Creation, enter the information required to create the service and select detailed options.
  5. On the Creation Information Check page of AI&MLOps Platform creation, review the details you entered and the estimated billing amount, then click the Complete button.
    • Once creation is complete, check the created resources on the AI&MLOps Platform Service List page.

Check detailed information of AI&MLOps Platform

You can view the full list of AI&MLOps Platform resources and their detailed information, and edit that information. The AI&MLOps Platform Service Details page consists of the Details, Tags, and Activity History tabs.

To view detailed information about the AI&MLOps Platform service, follow the steps below.

  1. Click the All Services > AI/ML > AI&MLOps Platform Service menu. You will be taken to the Service Home page of the AI&MLOps Platform Service.
  2. On the Service Home page, click the AI&MLOps Platform menu. You will be taken to the AI&MLOps Platform Service List page.
  3. On the AI&MLOps Platform Service List page, click the resource whose details you want to view. You will be taken to the AI&MLOps Platform Service Details page.
    • The AI&MLOps Platform Service Details page displays status information and additional feature information, and consists of the Details, Tags, and Activity History tabs.

Details

On the AI&MLOps Platform Service List page, you can view the detailed information of the selected resource and edit it if needed.

| Category | Detailed description |
|---|---|
| Service | Service name |
| Resource Type | Resource type |
| SRN | Unique resource ID in Samsung Cloud Platform |
| Resource name | Resource name. In the AI&MLOps Platform service, this is the cluster name. |
| Resource ID | Unique resource ID in the service |
| Creator | User who created the service |
| Creation date and time | Date and time the service was created |
| Editor | User who edited the service information |
| Modification date and time | Date and time the service information was modified |
| Dashboard status | Dashboard status value |
| Service name | Service name |
| Admin Email Address | Administrator email address |
| Image name | Service image name |
| Version | Image version |
| Service type | Deployed service type |
Table. AI&MLOps Platform Service Detailed Information Items

Tags

On the AI&MLOps Platform Service List page, you can view the tag information of the selected resource and add, modify, or delete tags.

| Category | Detailed description |
|---|---|
| Tag list | List of tags. You can view the Key and Value of each tag. Up to 50 tags can be added per resource. When entering a tag, you can search and select from the list of previously created Keys and Values. |
Table. Cluster Tag Tab Items

Activity History

On the AI&MLOps Platform Service List page, you can view the activity history of the selected resource.

| Category | Detailed description |
|---|---|
| Activity history list | Resource change history. You can view the activity details, time, resource type, resource name, result, and operator. Click a resource in the activity history list to open the Activity History Details popup window. |
Table. AI&MLOps Platform Service Activity History Tab Detailed Information Items

Access AI&MLOps Platform

To access the AI&MLOps Platform dashboard, you must complete the prerequisite steps.

Prerequisites

To access the AI&MLOps Platform, you must first allow the relevant ports and the IP addresses required for connection in the Security Group and, if you use one, the VPC Firewall.

  • Kubeflow Mini: port 31390 (inbound rules of the Security Group and the VPC Firewall)

  • To access the cluster's worker nodes, you must add an inbound rule for port 22 to the Security Group and, if you use one, the VPC Firewall.
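
After the inbound rules above are in place, a plain TCP connection test can show whether a port is reachable from your machine. The following is a minimal Python sketch and is not part of the Samsung Cloud Platform tooling; the function name `is_port_open` and the default timeout are illustrative assumptions.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open("<node-ip>", 31390)` checks the Kubeflow Mini port and `is_port_open("<node-ip>", 22)` checks worker node access, where `<node-ip>` is a placeholder for the address you connect to. A False result only means the connection failed; it does not distinguish a blocked port from a service that is not running.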

Access Dashboard

To access the AI&MLOps Platform service, follow these steps.

  1. Click the All Services > AI/ML > AI&MLOps Platform Service menu. You will be taken to the Service Home page of the AI&MLOps Platform service.
  2. On the Service Home page, click the AI&MLOps Platform Service menu. You will be taken to the AI&MLOps Platform Service List page.
  3. On the AI&MLOps Platform Service List page, click the resource you want to access. You will be taken to the AI&MLOps Platform Details page.
  4. On the AI&MLOps Platform Details page, click the Access Guide button. The Access Guide popup window opens.
  5. In the Access Guide popup window, click the dashboard's URL link. You will be taken to the corresponding dashboard page.
Caution
When using a public subnet and assigning a public IP, you may be exposed to security attacks such as external hacking and malware infection.

Terminate AI&MLOps Platform

You can terminate an unused service to reduce operating costs. However, terminating the service may stop the running service immediately, so carefully consider the impact of the interruption before proceeding.

Caution
Please note that data cannot be recovered after terminating the service.

To terminate the AI&MLOps Platform, follow the steps below.

  1. Click the All Services > AI/ML > AI&MLOps Platform Service menu. You will be taken to the Service Home page of the AI&MLOps Platform Service.
  2. On the Service Home page, click the AI&MLOps Platform Service menu. You will be taken to the AI&MLOps Platform Service List page.
  3. On the AI&MLOps Platform Service List page, click the resource you want to terminate. You will be taken to the AI&MLOps Platform Details page.
  4. On the AI&MLOps Platform Details page, click the Cancel Service button. The Cancel Service popup opens.
  5. Enter the service name for verification, then click Confirm.
  6. When termination is complete, check the AI&MLOps Platform Service List page to confirm that the resource has been terminated.

1 - Cluster Deployment

Cluster deployment area

When you create an AI&MLOps Platform in Samsung Cloud Platform, the Service Type Selection page offers two cluster deployment areas.

Common

Before proceeding with the cluster deployment, verify the Kubernetes cluster specifications required for installation.

  • This check is required regardless of the selected cluster deployment area.
  • For detailed specification information, refer to the Cluster Specification guide.

The installation details on the Service Information Input page of AI&MLOps Platform creation differ depending on the selected cluster deployment area.

Deploy from SCP Kubernetes Engine

  1. Click the All Services > AI/ML > AI&MLOps Platform menu. You will be taken to the Service Home page of AI&MLOps Platform.
  2. On the Service Home page, click the AI&MLOps Platform Create button. You will be taken to the AI&MLOps Platform Create page.
  3. On the Service Type Selection page of AI&MLOps Platform creation, enter the information required to create the service and select detailed options.
    Cluster deployment
    Select the Deploy from SCP Kubernetes Engine option.
  4. On the Service Information Input page of AI&MLOps Platform Creation, enter the information required to create the service, and select detailed options.
    • In the Service Information Input area, enter or view the information required to create a service.
      | Category | Required | Detailed description |
      |---|---|---|
      | Service name | Required | Enter the AI&MLOps Platform name. The name cannot be duplicated within a project. |
      | Storage Class | Required | The Storage Class is registered automatically |
      | Installation node information | View only | View the node information of the selected Kubernetes Engine |
      | Admin Email Address | Required | Enter the administrator (Admin) email address to use for login |
      | Password | Required | Enter the password to use for login |
      | Confirm Password | Required | Re-enter the password to guard against typing errors |
      Table. AI&MLOps Platform service information input fields
    • In the Additional Information Input area, enter or select the information required to create the service.
      | Category | Required | Detailed description |
      |---|---|---|
      | Tag | Optional | Select tags to add to the AI&MLOps Platform. Click 'Add Tag' to create a new tag or add an existing tag. You can register up to 50 tags. Newly added tags are applied after service creation is completed. |
      Table. AI&MLOps Platform service additional information input fields

Deploy to a new cluster

  1. Click the All Services > AI/ML > AI&MLOps Platform menu. You will be taken to the Service Home page of AI&MLOps Platform.
  2. On the Service Home page, click the AI&MLOps Platform Create button. You will be taken to the AI&MLOps Platform Create page.
  3. On the Service Type Selection page of AI&MLOps Platform creation, enter the information required to create the service and select detailed options.
    Cluster deployment
    Select the Deploy to a new cluster option.
  4. On the Service Information Input page of AI&MLOps Platform creation, enter the information required to create a service and select detailed options.
    • In the Service Information Input area, enter or view the information required to create a service.

      | Category | Required | Detailed description |
      |---|---|---|
      | Service name | Required | Enter the AI&MLOps Platform name. The name cannot be duplicated within a project. |
      | Storage Class | Required | The Storage Class is registered automatically |
      | Installation node information | View only | View the node information of the selected Kubernetes Engine |
      | Admin Email Address | Required | Enter the administrator (Admin) email address to use for login |
      | Password | Required | Enter the password to use for login |
      | Confirm Password | Required | Re-enter the password to guard against typing errors |
      Table. AI&MLOps Platform Service Information Input Items

    • In the Kubernetes Engine Information Input area, enter or select the required information.

      | Category | Required | Detailed description |
      |---|---|---|
      | Cluster name | Required | Cluster name. Must start with an English letter and may contain English letters, numbers, and the hyphen (-). Must be 3 to 30 characters long. |
      | Control plane settings > Kubernetes version | Required | Select the Kubernetes version |
      | Control plane settings > Control plane logging | Optional | Select whether to enable control plane logging. Audit and event logs from the cluster control plane can be viewed in Cloud Monitoring's log analysis. Up to 1 GB of log storage is provided free for all services within the account; logs exceeding 1 GB are deleted sequentially. |
      | Network Settings | Required | Network connection settings for the node pool. VPC: select a pre-created VPC. Subnet: select a standard Subnet from the subnets of the chosen VPC. Security Group: click the Search button, then select a Security Group in the Select Security Group popup. Load Balancer: provides the type: LoadBalancer feature of a Kubernetes Service object; select whether to use it and, if so, select a load balancer on the same network. This setting cannot be changed after configuration. |
      | File Storage Settings | Required | Select the file storage volume to use in the cluster. Default Volume (NFS): select a File Storage using the Search button. The default volume supports only the NFS format. |
      Table. Kubernetes Engine Service Information Input Items

    • In the Node Pool Information Input area, enter or select the required information.

      | Category | Required | Detailed description |
      |---|---|---|
      | Node pool configuration | Required | Select the node pool information. Items marked with an asterisk (*) are required. Because image size can keep growing with use of the AI&MLOps Platform, set Block Storage to at least 200 GB for smooth system configuration. |
      Table. AI&MLOps Platform Service Information Input Items
      Reference
      • A Windows OS node pool can be created only when an additional storage (CIFS) volume is in use in the cluster.
      • Volume encryption for node pool Block Storage can only be set at initial creation.
        • Enabling encryption may cause performance degradation in some features.
      • You can enter the node count, minimum node count, and maximum node count only when the node pool auto-scaling (scale-out and scale-in) feature is selected.

    • In the Additional Information Input area, enter or select the required information.

      | Category | Required | Detailed description |
      |---|---|---|
      | Tag | Optional | Select tags to add to the AI&MLOps Platform. Click 'Add Tag' to create a new tag or add an existing tag. You can register up to 50 tags. Newly added tags are applied after service creation is completed. |
      Table. AI&MLOps Platform Service Information Input Items
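
The cluster name rule listed in the Kubernetes Engine Information Input table above (starts with an English letter; letters, numbers, and hyphens only; 3 to 30 characters) can be checked before submitting the form. The following is a minimal Python sketch of that rule. The function name is hypothetical, and it assumes that both uppercase and lowercase letters are accepted, which the guide does not state explicitly.

```python
import re

# One leading letter, then 2 to 29 more letters, digits, or hyphens,
# giving a total length of 3 to 30 characters.
# Assumption: uppercase and lowercase English letters are both accepted.
_CLUSTER_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]{2,29}")


def is_valid_cluster_name(name: str) -> bool:
    """Check a candidate name against the documented cluster name rule."""
    return _CLUSTER_NAME_RE.fullmatch(name) is not None
```

With this sketch, a name such as `mlops-cluster-01` passes, while `1cluster` (starts with a digit), `ab` (too short), and `my_cluster` (underscore) are rejected. The Console remains the authority on which names are accepted.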

Cluster specifications

To use the AI&MLOps Platform, you need a Kubernetes Engine to install the AI&MLOps Platform. You can select an existing Kubernetes Engine, or you can create a Kubernetes Engine together when creating the AI&MLOps Platform.

The specifications of the Kubernetes cluster required for installation are as follows.

  • Node pool resource size (composed of 2 or more nodes)

    • AI&MLOps Platform: vCPU 32, Memory 128G or more
    • Kubeflow Mini: vCPU 24, Memory 96G or more
  • Kubernetes version

    • AI&MLOps Platform v1.9.1 (k8s v1.30)
    • Kubeflow Mini v1.9.1 (k8s v1.30)
Information
Only one AI&MLOps Platform can be installed per Kubernetes cluster, and a cluster that is being used for other purposes cannot have the AI&MLOps Platform installed.
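
The minimum sizes above can be turned into a quick pre-check for a planned node pool. The following Python sketch is illustrative only. It assumes that the vCPU and memory figures are totals across the node pool and that "G" means GB; the guide does not spell out either point, so adjust the comparison if the figures apply per node. The names `NodePool` and `meets_minimum` are hypothetical.

```python
from dataclasses import dataclass

# Minimum node pool sizes taken from the list above.
# Assumption: the values are totals across all nodes in the pool.
MIN_SPECS = {
    "AI&MLOps Platform": {"vcpu": 32, "memory_gb": 128},
    "Kubeflow Mini": {"vcpu": 24, "memory_gb": 96},
}
MIN_NODES = 2  # "composed of 2 or more nodes"


@dataclass
class NodePool:
    node_count: int
    vcpu_per_node: int
    memory_gb_per_node: int


def meets_minimum(service_type: str, pool: NodePool) -> bool:
    """Return True if the pool satisfies the documented minimum size."""
    spec = MIN_SPECS[service_type]
    total_vcpu = pool.node_count * pool.vcpu_per_node
    total_memory = pool.node_count * pool.memory_gb_per_node
    return (
        pool.node_count >= MIN_NODES
        and total_vcpu >= spec["vcpu"]
        and total_memory >= spec["memory_gb"]
    )
```

For instance, `meets_minimum("Kubeflow Mini", NodePool(3, 8, 32))` evaluates a pool of three nodes with 8 vCPU and 32 GB each, which totals 24 vCPU and 96 GB.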

2 - Kubeflow Usage Guide

This guide explains how to use Kubeflow after it has been created.

Add Kubeflow User

At installation, Kubeflow creates only one account: the Admin user entered on the initial installation screen.

When using the Kubeflow Dashboard, to add users other than the initial user, you must modify the settings of Dex (the authentication integration component of Kubeflow).

  • Dex is deployed in the auth namespace, and its configuration is stored in a configmap named dex.
Reference
Kubeflow separates namespaces for each user.

The following is an example of Dex configuration.

apiVersion: v1
kind: ConfigMap
metadata:
  name: dex
  namespace: auth
data:
  config.yaml: |
    issuer: http://dex.auth.svc.cluster.local:5556/dex
    storage:
      type: kubernetes
      config:
        inCluster: true
    web:
      http: 0.0.0.0:5556
    logger:
      level: "debug"
      format: text
    oauth2:
      skipApprovalScreen: true
    enablePasswordDB: true
    staticPasswords:
    - email: admin@kubeflow.org
      hash: $2y$10$Yb9WVbn8pzVSM6fBgKdFae1Bh6Z.XTihi7bNu3sB6/h5bt1JuUOgq
      username: admin
      userID: 9cb67307-fd6d-4441-9b59-52acd78f4c9e
    staticClients:
    - id: kubeflow-oidc-authservice
      redirectURIs: ["/login/oidc"]
      name: 'Dex Login Application'
      secret: pUBnBOY80SnXgjibTYM9ZWNzY2xreNGQok    
Code block. Dex environment configuration example

When the enablePasswordDB value in the configuration is true, Dex stores the list of users defined in staticPasswords from the configmap into its internal storage when the service starts. Therefore, by adding new user entries composed of email, hash, username, and userID to staticPasswords, you can freely add users beyond the initial ones and use the Kubeflow service.

The attribute values for adding a user can be defined as follows.

| Parameter | Explanation |
|---|---|
| email | A value in standard email format |
| hash | The user's password hashed with the bcrypt algorithm. Enter the bcrypt-generated hash value directly. |
| username | User name. Follows the Kubernetes namespace naming conventions: up to 63 characters, using only lowercase letters, numbers, and hyphens (-). |
| userID | A uniquely identifiable ID value. The initial user's userID is generated using the uuidgen command. |
Table. Attribute values for adding a user
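
The attribute rules in the table above can be sketched as a small helper that assembles one staticPasswords entry. This is a minimal Python illustration, not a tool provided by Kubeflow or Dex; the function names are hypothetical. It assumes the bcrypt hash has already been generated elsewhere, and it applies the standard Kubernetes namespace rule that a name must also start and end with a letter or digit.

```python
import re
import uuid

# Kubernetes namespace names are DNS-1123 labels: lowercase letters,
# digits, and '-', starting and ending with a letter or digit.
_NAMESPACE_RE = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_username(username: str) -> bool:
    """Check the username against the namespace naming rule (max 63 chars)."""
    return len(username) <= 63 and _NAMESPACE_RE.fullmatch(username) is not None


def build_static_password_entry(email: str, bcrypt_hash: str, username: str) -> dict:
    """Assemble one staticPasswords entry with a freshly generated userID."""
    if not is_valid_username(username):
        raise ValueError(f"invalid username: {username!r}")
    return {
        "email": email,
        "hash": bcrypt_hash,  # assumed to be a bcrypt hash produced beforehand
        "username": username,
        "userID": str(uuid.uuid4()),  # random UUID, comparable to uuidgen output
    }
```

The returned dictionary mirrors the four keys of a staticPasswords entry, so its values can be copied into the dex configmap by hand.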

On a node where kubectl is available, open the dex configmap for editing with the following command.

kubectl edit configmap dex -n auth
Code block. kubectl - modify dex configmap
In the editor, add the new user entry under staticPasswords, as in the following example.

staticPasswords:
    - email: admin@kubeflow.org
      hash: $2y$10$Yb9WVbn8pzVSM6fBgKdFae1Bh6Z.XTihi7bNu3sB6/h5bt1JuUOgq
      username: admin
      userID: 9cb67307-fd6d-4441-9b59-52acd78f4c9e
    - email: sds@samsung.com
      hash: $2y$12$0g5.y86jnrt0v6In5NRCZ.YVuvrAUQ6j/RJYO3rV.kNulaDALOKfq
      username: sds
      userID: 8961d517-3498-4148-90c9-7e442ee91154
Code block. Modify dex configmap

Since the staticPasswords value in the configmap is applied when the Dex service starts, restart the Dex service using the following command.

kubectl rollout restart deployment dex -n auth
Code block. kubectl - dex restart

Log in with the new user's credentials.

Figure 1
New user information login

Verify that, after a successful login, you are taken to the screen for creating a new Namespace (profile).

Figure 2
Create Namespace Name

The above content was written with reference to the official Kubeflow site. For more details, see Kubeflow Profiles.

How to use Custom Image in Kubeflow Jupyter Notebook

To use a custom image with the Kubeflow Notebook Controller, which manages the life cycle of Kubeflow Notebooks, the image must meet several requirements.

Kubeflow assumes that Jupyter will start automatically when a Notebook image is run. Therefore, you need to set the default command to start Jupyter in the container image.

The following is an example of what should be included in a Dockerfile.

# Fallback base URL prefix; an ENV instruction needs a value
ENV NB_PREFIX=/

# Start Jupyter automatically when the container runs
CMD ["sh","-c", "jupyter notebook --notebook-dir=/home/${NB_USER} --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]
Code block. Dockerfile example

The above items are explained as follows.

| Parameter | Explanation |
|---|---|
| --notebook-dir=/home/jovyan | Sets the working directory. The /home/jovyan directory is mounted to a Kubernetes persistent volume (PV). |
| --ip=0.0.0.0 | Allows Jupyter Notebook to accept connections from any IP |
| --allow-root | Allows Jupyter Notebook to be run as root |
| --port=8888 | Sets the port |
| --NotebookApp.token='' --NotebookApp.password='' | Disables Jupyter authentication. Because Kubeflow relies on Istio for authentication, the authentication feature provided by Jupyter is disabled. With this configuration, you can access the Jupyter Notebook Server without a password. |
| --NotebookApp.allow_origin='*' | Allows all origins |
| --NotebookApp.base_url=${NB_PREFIX} | Sets the base URL |
Table. Settings to include in Dockerfile

You can create a Custom Image by referring to the Dockerfile that builds the TensorFlow notebook image.

Reference
The Custom Image must be pushed to a registry, either a public one such as Docker Hub or a private one, from which Kubeflow can pull it.
  1. On the Notebook Servers page, click the +NEW SERVER button.

    Figure 3

  2. If you have created a Custom Image, check Custom Image on the Kubeflow Notebook Server screen and enter the Custom Image address to create a new Notebook Server.

    Figure 4

Information

The above content was written with reference to the Kubeflow official site.