How-to Guides

    Using AIOS

    AIOS provides an environment where an LLM can be used by default within each resource created through the Virtual Server, GPU Server, Cloud Functions, and Kubernetes Engine services.

    Note

    For detailed information on each service creation, refer to the table below.

    Service              Guide
    Virtual Server       Virtual Server Create
    GPU Server           Create GPU Server
    Cloud Functions      Cloud Functions Create
    Kubernetes Engine    Create Cluster
    Table. AIOS Available Service Creation Guide

    Using LLM

    The LLM can be used via the LLM Endpoint within service resources created on Samsung Cloud Platform, such as Virtual Server, GPU Server, Cloud Functions, and Kubernetes Engine. The LLM Endpoint can be found in the Usage Guide for the LLM Endpoint on each service's details page.

    Check the LLM Endpoint of Virtual Server

    You can check the usage guide for the LLM Endpoint on the Virtual Server Details page of the created Virtual Server.

    To check the usage guide for the LLM Endpoint, follow the steps below.

    1. Click the All Services > Compute > Virtual Server menu. You will be taken to the Service Home page of Virtual Server.
    2. Click the Virtual Server menu on the Service Home page. You will be taken to the Virtual Server list page.
    3. On the Virtual Server list page, click the resource whose LLM Endpoint you want to check. You will be taken to the Virtual Server Details page.
    4. On the Virtual Server Details page, click the User Guide link of the LLM Endpoint item. The LLM User Guide popup window will open.
    Note
    For detailed information about the LLM usage guide, see LLM Usage Guide.

    Check the LLM Endpoint of GPU Server

    You can check the usage guide for the LLM Endpoint on the GPU Server Details page of the created GPU Server.

    To view the usage guide for the LLM Endpoint, follow the steps below.

    1. Click the All Services > Compute > GPU Server menu. You will be taken to the Service Home page of GPU Server.
    2. Click the GPU Server menu on the Service Home page. You will be taken to the GPU Server list page.
    3. On the GPU Server list page, click the resource whose LLM Endpoint you want to check. You will be taken to the GPU Server Details page.
    4. On the GPU Server Details page, click the User Guide link of the LLM Endpoint item. The LLM User Guide popup window will open.
    Note
    For detailed information about the LLM usage guide, see the LLM Usage Guide.

    Check the LLM Endpoint of Cloud Functions

    You can view the usage guide for the LLM Endpoint on the Functions Details page of the created function.

    To view the usage guide for the LLM Endpoint, follow the steps below.

    1. Click the All Services > Compute > Cloud Functions menu. You will be taken to the Service Home page of Cloud Functions.
    2. Click the Functions menu on the Service Home page. You will be taken to the Functions list page.
    3. On the Functions list page, click the resource whose LLM Endpoint you want to check. You will be taken to the Functions Details page.
    4. On the Functions Details page, click the User Guide link of the LLM Endpoint item. The LLM User Guide popup window will open.
    Note
    For detailed information about the LLM usage guide, see LLM Usage Guide.

    Check the LLM Endpoint of the Kubernetes Engine cluster

    You can check the usage guide for the LLM Endpoint on the Cluster Details page of the created Kubernetes Engine cluster.

    To view the usage guide for the LLM Endpoint, follow the steps below.

    1. Click the All Services > Container > Kubernetes Engine menu. You will be taken to the Service Home page of Kubernetes Engine.
    2. Click the Cluster menu on the Service Home page. You will be taken to the Cluster list page.
    3. On the Cluster list page, click the resource whose LLM Endpoint you want to check. You will be taken to the Cluster Details page.
    4. On the Cluster Details page, click the User Guide link of the LLM Endpoint item. The LLM User Guide popup window will open.
    Note
    For detailed information about the LLM usage guide, see LLM Usage Guide.

    LLM Usage Guide

    The usage guide for the LLM Endpoint shows the AIOS LLM Private Endpoint, the provided models, and sample code examples.

    AIOS LLM Private Endpoint

    The URL of the AIOS LLM Private Endpoint is displayed. Use this URL within the resources created through the Virtual Server, GPU Server, and Kubernetes Engine services.

    AIOS LLM Provided Model

    The models provided by AIOS LLM are as follows.

    Model Name                    Model ID                           Context Size  RPM (req/min)  TPM (tokens/min)  Purpose                                                        License      Discontinuation Date
    gpt-oss-120b                  openai/gpt-oss-120b                131,072       50             200K              Research, experimentation, advanced language understanding    Apache 2.0   No plans
    Qwen3-Coder-30B-A3B-Instruct  Qwen/Qwen3-Coder-30B-A3B-Instruct  65,536        20             30K               Code generation, analysis, debugging support                  Apache 2.0   No plans
    Qwen3-30B-A3B-Thinking-2507   Qwen/Qwen3-30B-A3B-Thinking-2507   32,768        10             30K               Deep reasoning, long-text analysis, essay writing             Apache 2.0   No plans
    Llama-4-Scout                 meta-llama/Llama-4-Scout           32,768        20             35K               Latest Llama model with multimodal capability                 llama4       No plans
    Llama-Guard-4-12B             meta-llama/Llama-Guard-4-12B       32,768        20             200K              Core security and moderation model that enhances reliability and safety in the latest LLM and multimodal AI services  llama4  No plans
    bge-m3                        sds/bge-m3                         8,192         100            200K              Multilingual embedding model supporting multiple languages    Samsung SDS  No plans
    bge-reranker-v2-m3            sds/bge-reranker-v2-m3             8,192         100            200K              Lightweight multilingual reranker with fast computation and high performance  Samsung SDS  No plans
    Table. AIOS LLM provided models
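    Each model enforces per-minute request (RPM) and token (TPM) limits. A client that exceeds them should back off and retry. Below is a minimal Python sketch of such a retry wrapper; it assumes the endpoint signals rate limiting with HTTP 429, and the `send` callable is an illustrative placeholder, not part of the documented AIOS API.

```python
import random
import time

def call_with_backoff(send, max_retries=5):
    """Retry `send` when the per-model RPM/TPM limit is hit.

    `send` is any zero-argument callable returning a (status, body) tuple.
    Assumes the endpoint returns HTTP 429 when a limit is exceeded; this
    convention is an assumption, not documented AIOS behavior.
    """
    for attempt in range(max_retries):
        status, body = send()
        if status != 429:
            return status, body
        # Exponential backoff with jitter before the next attempt.
        time.sleep(min(2 ** attempt, 30) + random.random())
    return status, body
```
    Code block. Rate-limit retry sketch (illustrative)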

    Sample code

    Refer to the following for AIOS LLM sample code examples.

    curl -X POST "{AIOS LLM private endpoint}/{API}" \
      -H "Content-Type: application/json" \
      -d '{
            "model": "openai/gpt-oss-120b",
            "prompt": "Write a haiku about recursion in programming.",
            "temperature": 0,
            "max_tokens": 100,
            "stream": false
          }'
    Code block. AIOS LLM sample code
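    The same request can also be sent from Python inside the resource. The sketch below mirrors the curl sample using only the standard library; the endpoint URL and API path are placeholders that must be replaced with the values shown in your LLM usage guide.

```python
import json
import urllib.request

# Placeholder: replace with the values from your resource's LLM usage guide.
ENDPOINT = "{AIOS LLM private endpoint}/{API}"

PAYLOAD = {
    "model": "openai/gpt-oss-120b",
    "prompt": "Write a haiku about recursion in programming.",
    "temperature": 0,
    "max_tokens": 100,
    "stream": False,
}

def build_request(endpoint=ENDPOINT):
    """Build the same POST request as the curl sample."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(PAYLOAD).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Inside the resource, send it with:
# with urllib.request.urlopen(build_request()) as resp:
#     print(resp.read().decode("utf-8"))
```
    Code block. Python equivalent of the curl sample (illustrative)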

    Check usage per LLM model

    You can view the list of LLMs and token usage per model on the Service Home page of AIOS.

    1. Click the All Services > AI-ML > AIOS menu. You will be taken to the Service Home page of AIOS.
    2. In the LLM usage by model list, check each LLM's model name, model type, and token usage (1 week).
      Category              Detailed description
      Model Name            Name of the LLM
                            • Click the name to go to the model's Report page
      Model Type            Type of the LLM
                            • chat, reasoning, vision, moderation, embedding, rerank
                            • For model-specific information, see Provided Model
      Token Usage (1 Week)  Token usage for one week as of today
      Table. AIOS LLM list items

    Check Report

    You can check the daily LLM call count and token usage on AIOS’s Report page.

    You can select Virtual Server, GPU Server, or Kubernetes Engine as the service type, query by selecting a resource name from the resources actually created in that service, and also query by the LLM model used.

    1. Click the All Services > AI-ML > AIOS menu. You will be taken to the Service Home page of AIOS.
    2. Click the Report menu on the Service Home page. You will be taken to the Report page of AIOS.
      • In the LLM usage by model list, clicking an LLM model name takes you directly to that model's Report page.
    3. On the Report page, select the LLM model whose Report you want to view, then click the Query button. The Report information for that LLM model is displayed.
      Category       Detailed description
      Service Type   Select the service type that uses the LLM
                     • Virtual Server, GPU Server, Kubernetes Engine
      Resource Name  Select the resource name
                     • If no service type is selected, only All can be selected; if a specific service type is selected, a specific resource name can be selected
      Model          Select the LLM model type
      Query Period   Select the period to view the Report
                     • Selectable in weekly units
                     • Past periods can be queried up to a maximum of 3 months back
                     • Retrieved data is provided up to 30 minutes before the current time
      Call Count     Daily call count during the query period
                     • Displayed per day as total, success, and failure counts
                     • Total call count: total number of calls per model during the period
      Token Usage    Daily token input and output amounts during the query period
                     • Total number of tokens: total token usage during the query period
                     • Average tokens per request: average token amount used per LLM call during the query period
      Table. AIOS Report items
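    As a sanity check on how the Report totals relate to each other, the sketch below aggregates daily rows into the period totals and the average tokens per request. The field names are illustrative, not the AIOS data schema.

```python
from dataclasses import dataclass

@dataclass
class DailyUsage:
    """One day of LLM usage (illustrative fields, not the AIOS schema)."""
    calls: int          # total call count for the day
    input_tokens: int   # input token amount for the day
    output_tokens: int  # output token amount for the day

def summarize(days):
    """Aggregate daily rows into the period totals shown in the Report."""
    total_calls = sum(d.calls for d in days)
    total_tokens = sum(d.input_tokens + d.output_tokens for d in days)
    # Average tokens per request = total tokens / total calls over the period.
    avg_per_request = total_tokens / total_calls if total_calls else 0.0
    return total_calls, total_tokens, avg_per_request
```
    Code block. Report aggregation sketch (illustrative)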