
Overview

Service Overview

AIOS provides an environment for developing AI applications with LLMs on Virtual Server, GPU Server, and Kubernetes Engine resources created on Samsung Cloud Platform, without requiring separate LLM installation or configuration.

Key Features

  • Convenient LLM Usage: LLM Endpoints are provided by default, allowing LLMs to be used directly on Virtual Server, GPU Server, and Kubernetes Engine resources on Samsung Cloud Platform.
  • Improved AI Development Productivity: AI developers can use various models with the same API, and compatibility with OpenAI and LangChain SDKs allows easy integration with existing development environments and frameworks.
  • ServiceWatch Integration: Data can be monitored through the ServiceWatch service.
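Because the endpoints are OpenAI-compatible, a request can be built with nothing but the standard library. The sketch below constructs a chat completion request; the endpoint URL and API key are placeholders, since the real values come from the resource's detail page in the console.

```python
import json
from urllib import request

# Placeholder values: the actual endpoint URL and key are shown on the
# detail page of the Virtual Server, GPU Server, or Kubernetes Engine resource.
AIOS_ENDPOINT = "https://your-aios-endpoint.example/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

def build_chat_request(model: str, user_message: str) -> request.Request:
    """Build an OpenAI-compatible chat completion request for an AIOS endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return request.Request(
        AIOS_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = build_chat_request("gpt-oss-120b", "Summarize AIOS in one sentence.")
# Sending the request requires a reachable endpoint:
# with request.urlopen(req) as resp:
#     answer = json.load(resp)["choices"][0]["message"]["content"]
```

The same payload shape works through the OpenAI SDK (by pointing `base_url` at the endpoint) or through LangChain's OpenAI-compatible integrations.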

Service Architecture

Fig. AIOS Architecture

Provided Features

The following features are provided:

  • AIOS LLM Endpoint Provision: When applying for Virtual Server, GPU Server, or Kubernetes Engine services, LLM Endpoint information and usage guides are provided on the detail page of the created resources. You can access and use LLM on those resources by following the usage guide.
  • AIOS Report Provision: You can check the number of calls and token usage by type, resource, and model, as well as total usage by LLM.

Provided Models

The LLM models provided by AIOS are as follows:

gpt-oss-120b
  • Model Type: Chat+Reasoning
  • Description: Latest GPT-series open-source model based on 120 billion parameters
  • Main Use Cases: Research/experiments, large-scale language understanding, AI services requiring complex reasoning/analysis, building agent-type systems
  • Features: Ultra-large parameter count; broad knowledge coverage suitable for general-purpose use; complete CoT (chain-of-thought) generation

Qwen3-Coder-30B-A3B-Instruct
  • Model Type: Code
  • Description: Qwen3-series code model optimized for code generation and debugging
  • Main Use Cases: Software development, AI code assistant, long document/repository analysis
  • Features: Large-scale code knowledge; multilingual support; long-context understanding

Qwen3-30B-A3B-Thinking-2507
  • Model Type: Chat+Reasoning
  • Description: Qwen3 model enhanced for long-form reasoning and deep thinking
  • Main Use Cases: Research, analysis reports, logical writing, mathematics, science, coding
  • Features: Specialized in long-form and complex reasoning; consistent CoT chain generation

Llama-4-Scout
  • Model Type: Chat+Vision
  • Description: Latest Llama model with multimodal capabilities
  • Main Use Cases: Document analysis/summarization, customer support/chatbots
  • Features: Multimodal (text + image) with fast inference and single-GPU operation; ultra-long text and multi-document summarization/analysis; top-tier performance across various benchmarks; up to 4 images per input

Llama-Guard-4-12B
  • Model Type: Moderation
  • Description: Key security and moderation model for enhancing reliability and safety in the latest large language models and multimodal AI services
  • Main Use Cases: Automatic filtering of harmful content in user input and model responses
  • Features: Multimodal safety classification; specialized in content moderation; multilingual support

bge-m3
  • Model Type: Embedding
  • Description: Key embedding model with three characteristics: multi-functionality, multilingual support, and large input capacity
  • Main Use Cases: Retrieving external knowledge and providing answer evidence in generative AI; combining dense and sparse retrieval to ensure both accuracy and generalization
  • Features: Multi-Functionality (dense embedding retrieval, token-based weighted Sparse Retrieval, Multi-Vector Retrieval); Multi-Linguality (100+ languages); Multi-Granularity (inputs up to 8,192 tokens)

bge-reranker-v2-m3
  • Model Type: Rerank
  • Description: Key component for information retrieval, question answering, and chatbot systems that need fast, accurate reranking of search results in multilingual environments
  • Main Use Cases: Reranking candidate answers or documents for a question in order of relevance
  • Features: Lightweight and fast inference; multilingual support; easy integration (compatible with Hugging Face Transformers and FlagEmbedding)
Table. AIOS Provided LLM Models
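As a sketch of how the retrieval models above fit together: dense embeddings from a model such as bge-m3 are compared by cosine similarity to rank candidate documents, after which a reranker such as bge-reranker-v2-m3 can reorder the top hits. The vectors below are toy values standing in for real model output (actual bge-m3 dense embeddings are 1024-dimensional).

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors in place of real bge-m3 embeddings.
query = [0.9, 0.1, 0.0]
docs = {
    "doc_a": [0.8, 0.2, 0.1],   # semantically close to the query
    "doc_b": [0.0, 0.1, 0.9],   # unrelated to the query
}

# Rank documents by similarity to the query, highest first.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
```

In a real pipeline the embeddings would come from the bge-m3 endpoint, and only the top-ranked candidates would be passed to the reranker, which scores query–document pairs directly.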

Regional Availability

AIOS is available in the following regions:

Region | Availability
Korea West (kr-west1) | Available
Korea East (kr-east1) | Not Available
Korea South1 (kr-south1) | Not Available
Korea South2 (kr-south2) | Not Available
Korea South3 (kr-south3) | Not Available
Table. AIOS Regional Availability

Prerequisite Services

The following services must be configured before creating this service. For details, refer to the guide provided for each service.

Service Category | Service | Detailed Description
Compute | Virtual Server | Virtual server optimized for cloud computing
Compute | GPU Server | Virtual server suited to tasks requiring fast computation, such as AI model experimentation, prediction, and inference in cloud environments
Compute | Cloud Functions | FaaS (Function as a Service) based on serverless computing
Container | Kubernetes Engine | Service providing lightweight virtual computing, containers, and Kubernetes clusters to manage them
Table. AIOS Prerequisite Services