How-to Guides

    Using AIOS

    AIOS provides an environment in which LLMs are available by default within each resource you create with the Virtual Server, GPU Server, and Kubernetes Engine services.

    Reference

    Refer to the table below for the guide to creating each service.

    | Service | Guide |
    |---|---|
    | Virtual Server | Create Virtual Server |
    | GPU Server | Create GPU Server |
    | Cloud Functions | Create Cloud Functions |
    | Kubernetes Engine | Create Cluster |
    Table. AIOS Service Creation Guide

    Using LLM

    You can use an LLM by calling the LLM Endpoint from within Virtual Server, GPU Server, Cloud Functions, and Kubernetes Engine resources created on Samsung Cloud Platform. The LLM Endpoint is shown in the LLM Endpoint usage guide on each service's details page.

    Check the LLM Endpoint of Virtual Server

    On the Virtual Server Details page of the created Virtual Server, you can view the usage guide for the LLM Endpoint.

    To view the usage guide for the LLM Endpoint, follow these steps.

    1. Click the All Services > Compute > Virtual Server menu. You will be taken to the Service Home page of Virtual Server.
    2. On the Service Home page, click the Virtual Server menu. You will be taken to the Virtual Server List page.
    3. On the Virtual Server List page, click the resource to connect to the LLM Endpoint. You will be taken to the Virtual Server Details page.
    4. On the Virtual Server Details page, click the User Guide link for the LLM Endpoint item. The LLM User Guide popup will open.
    Reference
    For detailed information about the LLM usage guide, see the LLM Usage Guide.

    Check the LLM Endpoint of the GPU Server

    You can view the usage guide for the LLM Endpoint on the GPU Server Details page of the created GPU Server.

    To view the usage guide for the LLM Endpoint, follow these steps.

    1. Click the All Services > Compute > GPU Server menu. You will be taken to the Service Home page of GPU Server.
    2. On the Service Home page, click the GPU Server menu. You will be taken to the GPU Server List page.
    3. On the GPU Server List page, click the resource to connect to the LLM Endpoint. You will be taken to the GPU Server Details page.
    4. On the GPU Server Details page, click the User Guide link for the LLM Endpoint item. It will open the LLM User Guide popup window.
    Reference
    For detailed information about the LLM usage guide, see the LLM Usage Guide.

    Check the LLM Endpoint of Cloud Functions

    You can view the usage guide for the LLM Endpoint on the Cloud Functions Details page of the created Cloud Functions.

    To view the usage guide for the LLM Endpoint, follow these steps.

    1. Click the All Services > Compute > Cloud Functions menu. You will be taken to the Service Home page of Cloud Functions.
    2. On the Service Home page, click the Functions menu. You will be taken to the Functions List page.
    3. On the Functions List page, click the resource to connect to the LLM Endpoint. You will be taken to the Functions Details page.
    4. On the Functions Details page, click the User Guide link for the LLM Endpoint item. The LLM User Guide popup window will open.
    Reference
    For detailed information about the LLM usage guide, see the LLM Usage Guide.

    Check the LLM Endpoint of a Kubernetes Engine cluster

    On the Cluster Details page of the created Kubernetes Engine cluster, you can view the usage guide for the LLM Endpoint.

    To view the usage guide for the LLM Endpoint, follow these steps.

    1. Click the All Services > Container > Kubernetes Engine menu. You will be taken to the Service Home page of Kubernetes Engine.
    2. On the Service Home page, click the Cluster menu. You will be taken to the Cluster List page.
    3. On the Cluster List page, click the resource to connect to the LLM Endpoint. You will be taken to the Cluster Details page.
    4. On the Cluster Details page, click the Usage Guide link for the LLM Endpoint item. It will open the LLM Usage Guide popup.
    Reference
    For detailed information about the LLM usage guide, see the LLM Usage Guide.

    LLM Usage Guide

    The LLM Endpoint usage guide shows the AIOS LLM private endpoint, the provided models, and sample code.

    AIOS LLM Private Endpoint

    The URL of the AIOS LLM private endpoint is displayed. Check the URL and use it from within the resources created with the Virtual Server, GPU Server, and Kubernetes Engine services.

    AIOS LLM provided models

    The AIOS LLM provided models are as follows.

    | Model name | Model ID | Context size | RPM (Requests per minute) | TPM (Tokens per minute) | Purpose | License | Discontinued |
    |---|---|---|---|---|---|---|---|
    | gpt-oss-120b | openai/gpt-oss-120b | 131,072 | 50 RPM | 200K | Research, experiments, advanced language understanding | Apache 2.0 | 2026.05.21 |
    | Qwen3-Coder-30B-A3B-Instruct | Qwen/Qwen3-Coder-30B-A3B-Instruct | 65,536 | 20 RPM | 30K | Code generation, analysis, and debugging support | Apache 2.0 | 2026.05.21 |
    | Qwen3-30B-A3B-Thinking-2507 | Qwen/Qwen3-30B-A3B-Thinking-2507 | 32,768 | 10 RPM | 30K | Deep reasoning, long-form analysis, essay writing | Apache 2.0 | 2026.05.21 |
    | Llama-4-Scout | meta-llama/Llama-4-Scout | 32,768 | 20 RPM | 35K | The latest Llama model with multimodal capability | llama4 | 2026.05.21 |
    | Llama-Guard-4-12B | meta-llama/Llama-Guard-4-12B | 32,768 | 20 RPM | 200K | Core security and moderation model for enhancing reliability and safety in cutting-edge large language models and multimodal AI services | llama4 | 2026.05.21 |
    | bge-m3 | sds/bge-m3 | 8,192 | 100 RPM | 200K | A multilingual embedding model that supports multiple languages | Samsung SDS | 2026.05.21 |
    | bge-reranker-v2-m3 | sds/bge-reranker-v2-m3 | 8,192 | 100 RPM | 200K | A lightweight multilingual reranker that provides fast computation and high performance | Samsung SDS | 2026.05.21 |
    Table. AIOS LLM Provided Models
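
The RPM column in the table above implies a minimum spacing between requests for a client that sends them at an even pace. The sketch below is one way to derive that spacing and apply it on the client side. The `Throttle` class and `min_interval_seconds` function are hypothetical helpers, not part of AIOS, and the sketch covers only the RPM limit, not the TPM limit. How AIOS responds when a limit is exceeded is not described in this guide.

```python
import time

# Requests-per-minute limits copied from the provided-models table.
MODEL_RPM = {
    "openai/gpt-oss-120b": 50,
    "Qwen/Qwen3-Coder-30B-A3B-Instruct": 20,
    "Qwen/Qwen3-30B-A3B-Thinking-2507": 10,
    "meta-llama/Llama-4-Scout": 20,
    "meta-llama/Llama-Guard-4-12B": 20,
    "sds/bge-m3": 100,
    "sds/bge-reranker-v2-m3": 100,
}


def min_interval_seconds(model_id: str) -> float:
    """Gap between requests that keeps an evenly paced client within the RPM limit."""
    return 60.0 / MODEL_RPM[model_id]


class Throttle:
    """Hypothetical helper: spaces calls at least `interval` seconds apart."""

    def __init__(self, interval: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self) -> float:
        """Sleep until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following call one interval after this one starts.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

Calling `Throttle(min_interval_seconds(model_id)).wait()` before each request is enough for a single-threaded client. Several processes sharing one limit would need a shared limiter instead.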

    Sample code

    Refer to the following for AIOS LLM sample code examples.

    curl -H "Content-Type: application/json" \
      -d '{
            "model": "openai/gpt-oss-120b",
            "prompt": "Write a haiku about recursion in programming.",
            "temperature": 0,
            "max_tokens": 100,
            "stream": false
          }' \
      {AIOS LLM private endpoint}/{API}
    Code block. AIOS LLM sample code
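
For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The function names `build_completion_request` and `send` are hypothetical. The endpoint and API path are left as arguments because this guide shows them only as the placeholders `{AIOS LLM private endpoint}` and `{API}`, and the shape of the response is not described here, so `send` returns the decoded JSON as is.

```python
import json
import urllib.request


def build_completion_request(endpoint: str, api_path: str, prompt: str,
                             model: str = "openai/gpt-oss-120b",
                             max_tokens: int = 100) -> urllib.request.Request:
    """Build (but do not send) a POST request with the same JSON body as the curl sample."""
    body = {
        "model": model,
        "prompt": prompt,
        "temperature": 0,
        "max_tokens": max_tokens,
        "stream": False,
    }
    # Join endpoint and path with exactly one slash between them.
    url = endpoint.rstrip("/") + "/" + api_path.lstrip("/")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Take the endpoint URL and API path from the LLM Usage Guide popup of your resource, then call `send(build_completion_request(endpoint, api_path, "Write a haiku about recursion in programming."))` from within that resource.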

    Check usage per LLM model

    On AIOS’s Service Home page, you can view the list of LLMs and token usage per model.

    1. Click the All Services > AI-ML > AIOS menu. You will be taken to the Service Home page of AIOS.
    2. In the LLM usage by model list, check each LLM's model name, model type, and token usage (1 week).
      | Category | Detailed description |
      |---|---|
      | Model name | LLM name. Click the name to go to the model's Report page. |
      | Model type | LLM type: chat, reasoning, vision, moderation, embedding, rerank. For information per model, see Provided Models. |
      | Token usage (1 week) | Token usage for the past week as of today. |
      Table. AIOS LLM list items

    Check Report

    On the Report page of AIOS, you can view the daily LLM call count and token usage.

    You can filter the report by service type (Virtual Server, GPU Server, or Kubernetes Engine), by the name of a resource actually created in that service, and by the LLM model used.

    1. Click the All Services > AI-ML > AIOS menu. You will be taken to the Service Home page of AIOS.
    2. Click the Report menu on the Service Home page. You will be taken to the Report page of AIOS.
      • In the LLM usage by model list, clicking an LLM model name takes you directly to that LLM's Report page.
    3. On the Report page, select the LLM model whose Report you want to view, then click the Query button. The Report information for that LLM model will be displayed.
      | Category | Detailed description |
      |---|---|
      | Service type | Select the type of service that uses the LLM: Virtual Server, GPU Server, Kubernetes Engine. |
      | Resource name | Select the name of the service resource. If you do not select a service type, only All can be selected. If you select a specific service type, you can select a specific resource name. |
      | Model | Select the LLM model type. |
      | Query period | Select the period to view in the report. The period can be selected in weekly increments. Previous periods can be queried up to a maximum of 3 months back. Data is provided up to 30 minutes before the current time. |
      | Number of calls | Number of calls per day during the query period, displayed per day as total count, success count, and failure count. Total call count: the total number of calls during the period, broken down by model. |
      | Token usage | Daily token input and output amounts during the query period. Total token count: total token usage during the query period. Average token count per request: average number of tokens used per LLM call during the query period. |
      Table. AIOS Report Items
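
To make the Report totals concrete, the sketch below recomputes them from a list of daily figures. The `DailyUsage` record and its field names are hypothetical, since this guide does not describe an export format. The sketch also assumes that the average per request divides by all calls, successful and failed, which the guide does not state.

```python
from dataclasses import dataclass


@dataclass
class DailyUsage:
    """Hypothetical record for one day of the Report (field names are illustrative)."""
    success_calls: int
    failed_calls: int
    input_tokens: int
    output_tokens: int


def summarize(days: list[DailyUsage]) -> dict:
    """Aggregate daily figures into the totals described in the Report items table."""
    total_calls = sum(d.success_calls + d.failed_calls for d in days)
    total_tokens = sum(d.input_tokens + d.output_tokens for d in days)
    # Assumption: the average divides by every call, including failed ones.
    average = total_tokens / total_calls if total_calls else 0.0
    return {
        "total_calls": total_calls,
        "total_tokens": total_tokens,
        "average_tokens_per_request": average,
    }
```

For example, two days with 10 calls each and 4,000 tokens in total give an average of 200 tokens per request under this assumption.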