Service Overview
AIOS provides an environment where, after creating Virtual Server, GPU Server, and Kubernetes Engine resources on the Samsung Cloud Platform, you can develop AI applications using LLM on those resources without separate LLM service installation or configuration.
Features
- Convenient LLM usage Provides LLM Endpoint as a default, allowing you to use LLM directly from resources such as Virtual Server, GPU Server, Kubernetes Engine on Samsung Cloud Platform.
- AI Development Productivity Improvement : AI developers can use various models with the same API, and support compatibility with OpenAI and LangChain SDKs, allowing easy integration with existing development environments and frameworks.
Service Configuration Diagram
Provided Features
We provide the following features.
- AIOS LLM Endpoint provided: If you apply for Virtual Server, GPU Server, or Kubernetes Engine services, the detailed page of the created resource provides LLM Endpoint information and a usage guide, and according to the guide you can connect to the LLM from that resource and use it.
- AIOS Report provided: You can check the number of calls and token usage by type, resource, and model, as well as the total usage by LLM.
Provided Model
The LLM models provided by AIOS are as follows.
| Model Name | Model Type | Introduction | Main Uses | Features |
|---|---|---|---|---|
| gpt-oss-120b | Chat+Reasoning | ko) Open-source GPT series model based on 120 billion parameters, latest model | Research·experimentation, large-scale language understanding, AI services requiring complex reasoning/analysis, building agent-type systems |
|
| Qwen3-Coder-30B-A3B-Instruct | Code | ko) Qwen3 series code model optimized for code generation and debugging | Software development, AI code assistant, long document/repository analysis |
|
| Qwen3-30B-A3B-Thinking-2507 | Chat+Reasoning | ko) Qwen3 model enhanced for long-form reasoning and deep thinking (Thinking) | Research, analysis reports, logical writing, mathematics, science, coding |
|
| Llama-4-Scout | Chat+Vision | Latest Llama model with multimodal capability | Document analysis·summarization, customer support·chatbot |
|
| Llama-Guard-4-12B | moderation | Core security and moderation model to enhance reliability and safety in the latest large language models and multimodal AI services | Used for automatic filtering of harmful user inputs and model responses |
|
| bge-m3 | embedding | Core embedding model with three characteristics: multi-functionality, multilingual support, and large-scale input handling | Used in generative AI to retrieve external knowledge and provide answer evidence by combining dense and sparse retrieval to ensure both accuracy and generalization performance |
|
| bge-reranker-v2-m3 | rerank | A core component for various information retrieval, question answering, and chatbot systems that require fast and accurate re-ranking of search results in multilingual environments | Re-rank candidate answers or documents for a question in order of relevance |
|
Table. LLM models provided by AIOS
Region-specific provision status
AIOS is available in the following environment.
| Region | Availability |
|---|---|
| Korea West (kr-west1) | Provided |
| Korea East (kr-east1) | Not provided |
| South Korea 1(kr-south1) | Not provided |
| South Korea South2(kr-south2) | Not provided |
| South Korea South 3(kr-south3) | Not provided |
Table. AIOS regional provision status
Pre-service
This is a list of services that must be pre-configured before creating the service. For details, refer to the guide provided for each service and prepare in advance.
| Service Category | Service | Detailed Description |
|---|---|---|
| Compute | Virtual Server | Virtual server optimized for cloud computing |
| Compute | GPU Server | A virtual server suitable for tasks that require fast computation speed, such as AI model experiments, predictions, and inference in a cloud environment. |
| Compute | Cloud Functions | Serverless computing based Faas (Function as a Service) |
| Container | Kubernetes Engine | A service that provides lightweight virtual computing and containers, and Kubernetes clusters for managing them |
Table. AIOS Preliminary Service
