
Overview

Service Overview

AIOS provides an environment in which, after you create Virtual Server, GPU Server, or Kubernetes Engine resources on Samsung Cloud Platform, you can develop AI applications that use LLMs on those resources without installing or configuring a separate LLM service.

Features

  • Convenient LLM usage: An LLM Endpoint is provided by default, so you can use LLMs directly from Samsung Cloud Platform resources such as Virtual Server, GPU Server, and Kubernetes Engine.
  • Improved AI development productivity: AI developers can use various models through the same API. Compatibility with the OpenAI and LangChain SDKs is supported, allowing easy integration with existing development environments and frameworks.
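Because the endpoint follows the OpenAI chat completions API shape, a request can be built and sent with nothing but the standard library. The sketch below is illustrative only: the endpoint URL is a placeholder, and the model name is taken from the model list later in this page; substitute the values shown on your resource's detail page.

```python
# Minimal sketch of calling an OpenAI-compatible chat completions
# endpoint using only the Python standard library.
import json
import urllib.request

# Placeholder: use the LLM Endpoint shown on your resource's detail page.
ENDPOINT = "http://<your-llm-endpoint>/v1/chat/completions"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

def extract_answer(response: dict) -> str:
    """Pull the assistant's reply out of an OpenAI-style response."""
    return response["choices"][0]["message"]["content"]

def call_endpoint(payload: dict) -> dict:
    """POST the payload to the endpoint and decode the JSON response."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Since the request and response shapes are OpenAI-compatible, the same payload also works through the OpenAI SDK by pointing its `base_url` at the endpoint.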

Service Configuration Diagram

Figure. AIOS diagram

Provided Features

AIOS provides the following features.

  • AIOS LLM Endpoint: When you apply for the Virtual Server, GPU Server, or Kubernetes Engine service, the detail page of the created resource shows the LLM Endpoint information and a usage guide. By following the guide, you can connect to and use the LLM from that resource.
  • AIOS Report: You can check the number of calls and token usage by type, resource, and model, as well as total usage per LLM.
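The per-model breakdown the report shows can also be reproduced client-side from the `usage` field of OpenAI-style responses. The sketch below assumes that response shape; it is not an AIOS Report API (none is documented here), just an illustration of the counters involved.

```python
# Sketch: tally calls and token usage per model from a list of
# OpenAI-style chat completion responses.
from collections import defaultdict

def tally_usage(responses: list) -> dict:
    """Return {model: {"calls", "prompt_tokens", "completion_tokens"}}."""
    totals = defaultdict(
        lambda: {"calls": 0, "prompt_tokens": 0, "completion_tokens": 0}
    )
    for r in responses:
        t = totals[r["model"]]
        t["calls"] += 1
        t["prompt_tokens"] += r["usage"]["prompt_tokens"]
        t["completion_tokens"] += r["usage"]["completion_tokens"]
    return dict(totals)
```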

Provided Models

The LLM models provided by AIOS are as follows.

gpt-oss-120b (Chat+Reasoning)
  Introduction: Open-source GPT-series model with 120 billion parameters; the latest model in the series
  Main uses: Research and experimentation, large-scale language understanding, AI services requiring complex reasoning/analysis, building agent-type systems
  Features:
  • Ultra-large parameter count
  • Broad knowledge coverage and general-purpose usability
  • Full chain-of-thought (CoT) generation

Qwen3-Coder-30B-A3B-Instruct (Code)
  Introduction: Qwen3-series code model optimized for code generation and debugging
  Main uses: Software development, AI code assistants, long document/repository analysis
  Features:
  • Trained on large-scale code knowledge
  • Multilingual support
  • Long-context understanding

Qwen3-30B-A3B-Thinking-2507 (Chat+Reasoning)
  Introduction: Qwen3 model enhanced for long-form reasoning and deep thinking (Thinking)
  Main uses: Research, analysis reports, logical writing, mathematics, science, coding
  Features:
  • Specialized in long-form and complex reasoning
  • Consistent chain-of-thought (CoT) generation

Llama-4-Scout (Chat+Vision)
  Introduction: Latest Llama model with multimodal capability
  Main uses: Document analysis and summarization, customer support and chatbots
  Features:
  • Multimodal (text + image), fast inference, runnable on a single GPU
  • Summarization/analysis of very long texts and multiple documents
  • Top performance among models of its class on various benchmarks
  • Accepts up to 4 input images

Llama-Guard-4-12B (Moderation)
  Introduction: Core security and moderation model for improving the reliability and safety of the latest large language models and multimodal AI services
  Main uses: Automatic filtering of harmful user inputs and model responses
  Features:
  • Multimodal safety classification
  • Specialized in content moderation
  • Multilingual support

bge-m3 (Embedding)
  Introduction: Core embedding model with three characteristics: multi-functionality, multilingual support, and large-scale input handling
  Main uses: Retrieving external knowledge and providing answer evidence in generative AI, combining dense and sparse retrieval to ensure both accuracy and generalization performance
  Features:
  • Multi-functionality: dense embedding retrieval (Dense Retrieval), token-based weighted retrieval (Sparse Retrieval), and multi-vector retrieval (Multi-Vector Retrieval)
  • Multi-linguality: supports more than 100 languages
  • Multi-granularity: handles inputs of up to 8,192 tokens

bge-reranker-v2-m3 (Rerank)
  Introduction: Core component for information retrieval, question answering, and chatbot systems that require fast, accurate re-ranking of search results in multilingual environments
  Main uses: Re-ranking candidate answers or documents for a query in order of relevance
  Features:
  • Lightweight, high-speed inference
  • Multilingual support
  • Easy integration: compatible with Hugging Face Transformers and FlagEmbedding
Table. LLM models provided by AIOS
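To make the dense-retrieval role of the embedding model above concrete, the sketch below ranks documents against a query by cosine similarity of their embedding vectors. The tiny vectors in the test are made up; in practice each vector would come from the endpoint's embeddings API for bge-m3.

```python
# Sketch: dense retrieval by cosine similarity over embedding vectors.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    scores = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

A reranker such as bge-reranker-v2-m3 would then re-score the top candidates from this first-pass ranking using the query and document text directly.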

Region-specific provision status

AIOS is available in the following regions.

Region | Availability
Korea West (kr-west1) | Provided
Korea East (kr-east1) | Not provided
Korea South 1 (kr-south1) | Not provided
Korea South 2 (kr-south2) | Not provided
Korea South 3 (kr-south3) | Not provided
Table. AIOS regional provision status

Prerequisite Services

The following services must be configured before you create the service. For details, refer to the guide provided for each service and prepare them in advance.

Service Category | Service | Detailed Description
Compute | Virtual Server | Virtual server optimized for cloud computing
Compute | GPU Server | Virtual server suited to tasks requiring fast computation, such as AI model experiments, prediction, and inference in a cloud environment
Compute | Cloud Functions | Serverless-computing-based FaaS (Function as a Service)
Container | Kubernetes Engine | Service providing lightweight virtual computing, containers, and Kubernetes clusters to manage them
Table. AIOS prerequisite services