Service Overview
AIOS provides an environment for developing AI applications using LLM on Virtual Server, GPU Server, and Kubernetes Engine resources created on Samsung Cloud Platform, without the need for separate LLM service installation or configuration.
Key Features
- Convenient LLM Usage: Provides LLM Endpoints by default that allow direct use of LLM on Virtual Server, GPU Server, and Kubernetes Engine resources on Samsung Cloud Platform.
- Improved AI Development Productivity: AI developers can use various models with the same API, and compatibility with OpenAI and LangChain SDKs allows easy integration with existing development environments and frameworks.
- ServiceWatch Integration: Data can be monitored through the ServiceWatch service.
Service Architecture
Provided Features
The following features are provided:
- AIOS LLM Endpoint Provision: When applying for Virtual Server, GPU Server, or Kubernetes Engine services, LLM Endpoint information and usage guides are provided on the detail page of the created resources. You can access and use LLM on those resources by following the usage guide.
- AIOS Report Provision: You can check the number of calls and token usage by type, resource, and model, as well as total usage by LLM.
Provided Models
The LLM models provided by AIOS are as follows:
| Model Name | Model Type | Description | Main Use Cases | Features |
|---|---|---|---|---|
| gpt-oss-120b | Chat+Reasoning | Latest GPT series open-source model based on 120 billion parameters | Research/experiments, large-scale language understanding, AI services requiring complex reasoning/analysis, building agent-type systems |
|
| Qwen3-Coder-30B-A3B-Instruct | Code | Qwen3 series code model optimized for code generation and debugging | Software development, AI code assistant, long document/repository analysis |
|
| Qwen3-30B-A3B-Thinking-2507 | Chat+Reasoning | Qwen3 model enhanced for long-form reasoning and deep thinking | Research, analysis reports, logical writing, mathematics, science, coding |
|
| Llama-4-Scout | Chat+Vision | Latest Llama model with multimodal capabilities | Document analysis/summarization, customer support/chatbots |
|
| Llama-Guard-4-12B | moderation | Key security and moderation model for enhancing reliability and safety in the latest large language models and multimodal AI services | Used for automatic filtering of harmfulness in user input and model responses |
|
| bge-m3 | embedding | Key embedding model with three characteristics: multi-functionality, multilingual support, and large input capacity | Used when retrieving external knowledge and providing answer evidence in generative AI, combining Dense and Sparse search to ensure both accuracy and generalization performance |
|
| bge-reranker-v2-m3 | rerank | Key component for various information retrieval, question answering, and chatbot systems that require fast and accurate search result reranking in multilingual environments | Rerank candidate answers or documents for questions in relevance order |
|
Regional Availability
AIOS can be provided in the following environments:
| Region | Availability |
|---|---|
| Korea West (kr-west1) | Available |
| Korea East (kr-east1) | Not Available |
| Korea South1 (kr-south1) | Not Available |
| Korea South2 (kr-south2) | Not Available |
| Korea South3 (kr-south3) | Not Available |
Prerequisite Services
This is a list of services that must be configured in advance before creating this service. For detailed information, please prepare in advance by referring to the guides provided for each service.
| Service Category | Service | Detailed Description |
|---|---|---|
| Compute | Virtual Server | Virtual server optimized for cloud computing |
| Compute | GPU Server | Virtual server suitable for tasks requiring fast computation speed such as AI model experimentation, prediction, and inference in cloud environments |
| Compute | Cloud Functions | FaaS (Function as a Service) based on serverless computing |
| Container | Kubernetes Engine | Service providing lightweight virtual computing and containers and Kubernetes clusters to manage them |
