We provide AI/ML services that let you easily and conveniently build ML/DL (Machine Learning/Deep Learning) model development and training environments.
AI-ML
- 1: AIOS
- 1.1: Overview
- 1.1.1: ServiceWatch Metrics
- 1.2: How-to Guides
- 1.3: References
- 1.3.1: API Reference
- 1.3.2: SDK Reference
- 1.3.3: Tutorial
- 1.3.3.1: Chat Playground
- 1.3.3.2: RAG
- 1.3.3.3: Autogen
- 1.3.4: Request Examples
- 1.4: Release Note
- 1.5: Licenses
- 1.5.1: Llama-4-Scout
- 1.5.2: Llama-Guard-4-12B
- 1.5.3: bge-m3
- 1.5.4: bge-reranker-v2-m3
- 1.5.5: gpt-oss-120b
- 1.5.6: Qwen3-Coder-30B-A3B-Instruct
- 1.5.7: Qwen3-30B-A3B-Thinking-2507
- 2: CloudML
- 2.1: Overview
- 2.2: How-to guides
- 2.3: API Reference
- 2.4: CLI Reference
- 2.5: Release Note
- 3: AI&MLOps Platform
- 3.1: Overview
- 3.2: How-to guides
- 3.2.1: Cluster Deployment
- 3.2.2: Kubeflow User Guide
- 3.3: API Reference
- 3.4: CLI Reference
- 3.5: Release Note
1 - AIOS
1.1 - Overview
Service Overview
AIOS provides an environment for developing AI applications using LLM on Virtual Server, GPU Server, and Kubernetes Engine resources created on Samsung Cloud Platform, without the need for separate LLM service installation or configuration.
Key Features
- Convenient LLM Usage: Provides LLM Endpoints by default that allow direct use of LLM on Virtual Server, GPU Server, and Kubernetes Engine resources on Samsung Cloud Platform.
- Improved AI Development Productivity: AI developers can use various models with the same API, and compatibility with OpenAI and LangChain SDKs allows easy integration with existing development environments and frameworks.
- ServiceWatch Integration: Data can be monitored through the ServiceWatch service.
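Because the endpoints are OpenAI-compatible (see Improved AI Development Productivity above), an existing client can target AIOS simply by pointing it at the resource's endpoint URL. Below is a minimal sketch using only the Python standard library; the endpoint placeholder and model ID are illustrative and should be taken from your resource's LLM usage guide:

```python
import json
import urllib.request

# Placeholder: take the real URL from your resource's LLM Endpoint usage guide
AIOS_ENDPOINT = "{AIOS LLM private endpoint}"

def build_chat_request(model: str, user_message: str) -> dict:
    """Assemble an OpenAI-compatible chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0,
    }

def call_llm(payload: dict) -> dict:
    """POST the payload to the AIOS chat completions endpoint."""
    req = urllib.request.Request(
        f"{AIOS_ENDPOINT}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("openai/gpt-oss-120b", "Hello")
# call_llm(payload) requires network access to the private endpoint,
# so it is not executed here.
print(json.dumps(payload))
```

The same payload shape works across the provided models; only the `model` field changes.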
Service Architecture
Provided Features
The following features are provided:
- AIOS LLM Endpoint Provision: When applying for Virtual Server, GPU Server, or Kubernetes Engine services, LLM Endpoint information and usage guides are provided on the detail page of the created resources. You can access and use LLM on those resources by following the usage guide.
- AIOS Report Provision: You can check the number of calls and token usage by type, resource, and model, as well as total usage by LLM.
Provided Models
The LLM models provided by AIOS are as follows:
| Model Name | Model Type | Description | Main Use Cases |
|---|---|---|---|
| gpt-oss-120b | Chat+Reasoning | Latest GPT-series open-source model with 120 billion parameters | Research/experiments, large-scale language understanding, AI services requiring complex reasoning/analysis, building agent-type systems |
| Qwen3-Coder-30B-A3B-Instruct | Code | Qwen3-series code model optimized for code generation and debugging | Software development, AI code assistants, long document/repository analysis |
| Qwen3-30B-A3B-Thinking-2507 | Chat+Reasoning | Qwen3 model enhanced for long-form reasoning and deep thinking | Research, analysis reports, logical writing, mathematics, science, coding |
| Llama-4-Scout | Chat+Vision | Latest Llama model with multimodal capabilities | Document analysis/summarization, customer support/chatbots |
| Llama-Guard-4-12B | Moderation | Security and moderation model for enhancing the reliability and safety of large language model and multimodal AI services | Automatic filtering of harmful content in user input and model responses |
| bge-m3 | Embedding | Embedding model with three strengths: multi-functionality, multilingual support, and large input capacity | Retrieving external knowledge and providing answer evidence in generative AI; combines Dense and Sparse search for both accuracy and generalization |
| bge-reranker-v2-m3 | Rerank | Reranker for information retrieval, question answering, and chatbot systems that need fast, accurate reranking in multilingual environments | Reranking candidate answers or documents by relevance to a query |
Regional Availability
AIOS can be provided in the following environments:
| Region | Availability |
|---|---|
| Korea West (kr-west1) | Available |
| Korea East (kr-east1) | Not Available |
| Korea South1 (kr-south1) | Not Available |
| Korea South2 (kr-south2) | Not Available |
| Korea South3 (kr-south3) | Not Available |
Prerequisite Services
The following services must be configured before creating this service. For details, refer to the guide for each service and prepare in advance.
| Service Category | Service | Detailed Description |
|---|---|---|
| Compute | Virtual Server | Virtual server optimized for cloud computing |
| Compute | GPU Server | Virtual server suitable for tasks requiring fast computation speed such as AI model experimentation, prediction, and inference in cloud environments |
| Compute | Cloud Functions | FaaS (Function as a Service) based on serverless computing |
| Container | Kubernetes Engine | Service providing lightweight virtual computing and containers and Kubernetes clusters to manage them |
1.1.1 - ServiceWatch Metrics
AIOS sends metrics to ServiceWatch. The metrics provided by default monitoring are collected at 1-minute intervals.
Basic Indicators
The following are the basic metrics for the AIOS namespace.
| Performance Item | Detailed Description | Unit | Meaningful Statistics |
|---|---|---|---|
1.2 - How-to Guides
Using AIOS
AIOS provides an environment where an LLM is available by default within each resource when you create Virtual Server, GPU Server, Cloud Functions, or Kubernetes Engine services.
For detailed information on each service creation, refer to the table below.
| Service | Guide |
|---|---|
| Virtual Server | Virtual Server Create |
| GPU Server | Create GPU Server |
| Cloud Functions | Cloud Functions Create |
| Kubernetes Engine | Create Cluster |
Using LLM
The LLM can be used via the LLM Endpoint within service resources (Virtual Server, GPU Server, Cloud Functions, Kubernetes Engine) created on Samsung Cloud Platform. The LLM Endpoint can be found in the LLM Endpoint Usage Guide on each service's detail page.
Check the LLM Endpoint of Virtual Server
You can check the usage guide for the LLM Endpoint on the Virtual Server Details page of the created Virtual Server.
To check the usage guide for the LLM Endpoint, follow the steps below.
- Click the All Services > Compute > Virtual Server menu. You will be taken to the Virtual Server Service Home page.
- Click the Virtual Server menu on the Service Home page. You will be taken to the Virtual Server list page.
- On the Virtual Server list page, click the resource whose LLM Endpoint you want to use. You will be taken to the Virtual Server Details page.
- On the Virtual Server Details page, click the User Guide link of the LLM Endpoint item. The LLM User Guide popup window opens.
Check GPU Server’s LLM Endpoint
You can check the usage guide for the LLM Endpoint on the GPU Server Details page of the created GPU Server.
To view the usage guide for LLM Endpoint, follow the steps below.
- Click the All Services > Compute > GPU Server menu. You will be taken to the GPU Server Service Home page.
- Click the GPU Server menu on the Service Home page. You will be taken to the GPU Server List page.
- On the GPU Server List page, click the resource whose LLM Endpoint you want to use. You will be taken to the GPU Server Details page.
- On the GPU Server Details page, click the User Guide link of the LLM Endpoint item. The LLM User Guide popup window opens.
Checking the LLM Endpoint of Cloud Functions
You can view the usage guide for the LLM Endpoint on the Cloud Functions Details page of the created Cloud Functions.
To view the usage guide for the LLM Endpoint, follow the steps below.
- Click the All Services > Compute > Cloud Functions menu. You will be taken to the Cloud Functions Service Home page.
- Click the Functions menu on the Service Home page. Go to the Functions list page.
- On the Functions list page, click the resource to connect to the LLM Endpoint. You will be taken to the Functions details page.
- Click the User Guide link of the LLM Endpoint item on the Functions Details page. It will open the LLM User Guide popup.
Check the LLM Endpoint of the Kubernetes Engine cluster
You can check the usage guide for the LLM Endpoint on the Cluster Details page of the created Kubernetes Engine cluster.
To view the usage guide for LLM Endpoint, follow the steps below.
- Click the All Services > Container > Kubernetes Engine menu. Navigate to the Service Home page of Kubernetes Engine.
- Click the Cluster menu from the Service Home page. Go to the Cluster List page.
- Click the resource to connect to the LLM Endpoint on the Cluster List page. You will be taken to the Cluster Details page.
- On the Cluster Details page, click the User Guide link of the LLM Endpoint item. It will open the LLM User Guide popup.
LLM Usage Guide
In the usage guide of LLM Endpoint, you can see AIOS LLM Private Endpoint, the provided model, and sample code examples.
AIOS LLM Private Endpoint
The URL of the AIOS LLM private endpoint is displayed. Check the URL to use it within the resources created for the Virtual Server, GPU Server, Kubernetes Engine services.
AIOS LLM Provided Model
The AIOS LLM provided models are as follows.
| Model Name | Model ID | Context Size | RPM (Requests per minute) | TPM (Tokens per minute) | Purpose | License | Discontinuation Date |
|---|---|---|---|---|---|---|---|
| gpt-oss-120b | openai/gpt-oss-120b | 131,072 | 50 RPM | 200K | Research, experiments, advanced language understanding | Apache 2.0 | No plans |
| Qwen3-Coder-30B-A3B-Instruct | Qwen/Qwen3-Coder-30B-A3B-Instruct | 65,536 | 20 RPM | 30K | Code generation, analysis, debugging support | Apache 2.0 | No plans |
| Qwen3-30B-A3B-Thinking-2507 | Qwen/Qwen3-30B-A3B-Thinking-2507 | 32,768 | 10 RPM | 30K | Deep reasoning, long text analysis, essay writing | Apache 2.0 | No plans |
| Llama-4-Scout | meta-llama/Llama-4-Scout | 32,768 | 20 RPM | 35K | Latest Llama model with multimodal capability | llama4 | No plans |
| Llama-Guard-4-12B | meta-llama/Llama-Guard-4-12B | 32,768 | 20 RPM | 200K | Security and moderation for enhancing the reliability and safety of large language model and multimodal AI services | llama4 | No plans |
| bge-m3 | sds/bge-m3 | 8,192 | 100 RPM | 200K | Multilingual embedding model | Samsung SDS | No plans |
| bge-reranker-v2-m3 | sds/bge-reranker-v2-m3 | 8,192 | 100 RPM | 200K | Lightweight multilingual reranker with fast computation and high performance | Samsung SDS | No plans |
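To stay under the per-model RPM quotas above, a client can space out its own requests. A minimal client-side limiter sketch (illustrative only; how AIOS enforces the quota server-side is not described here):

```python
import time

class RpmLimiter:
    """Client-side spacing of requests to stay under a per-model RPM quota."""

    def __init__(self, rpm: int):
        self.min_interval = 60.0 / rpm  # seconds between requests
        self._last = 0.0

    def wait(self) -> None:
        """Sleep just long enough to keep at most `rpm` requests per minute."""
        now = time.monotonic()
        delay = self.min_interval - (now - self._last)
        if delay > 0:
            time.sleep(delay)
        self._last = time.monotonic()

# e.g., the gpt-oss-120b quota from the table above is 50 RPM,
# so calls are spaced at least 1.2 s apart
limiter = RpmLimiter(50)
# limiter.wait()  # call before each request
```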
Sample code
Refer to the following for AIOS LLM sample code examples.
curl -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-120b",
    "prompt": "Write a haiku about recursion in programming.",
    "temperature": 0,
    "max_tokens": 100,
    "stream": false
  }' \
  {AIOS LLM private endpoint}/{API}
Check usage per LLM model
You can view the list of LLMs and token usage per model on the Service Home page of AIOS.
- Click the All Services > AI-ML > AIOS menu. You will be taken to the AIOS Service Home page.
- In the LLM usage by model list, check each LLM's model name, model type, and token usage (1 week).

| Category | Detailed Description |
|---|---|
| Model Name | LLM name. Click the name to go to that model's Report page. |
| Model Type | LLM type: chat, reasoning, vision, moderation, embedding, rerank. For model-specific information, see Provided Models. |
| Token usage (1 Week) | Token usage for the one week up to today. |

Table. AIOS LLM list items
Report Check
You can check the daily LLM call count and token usage on AIOS’s Report page.
You can select Virtual Server, GPU Server, or Kubernetes Engine as the service type, query by the names of resources actually created in that service, and also query by the LLM model used.
- Click the All Services > AI-ML > AIOS menu. You will be taken to the AIOS Service Home page.
- Click the Report menu on the Service Home page. You will be taken to the AIOS Report page.
- In the LLM usage by model list, click an LLM model name to go directly to that LLM's Report page.
- On the Report page, select the LLM model to view and click the Query button. The Report information for that LLM model is displayed.
| Category | Detailed Description |
|---|---|
| Service Type | Select the service type using the LLM: Virtual Server, GPU Server, Kubernetes Engine. |
| Resource Name | Select the resource name. If no service type is selected, only All can be selected; if a specific service type is selected, a specific resource name can be selected. |
| Model | Select the LLM model type. For information per model, see Provided Models. |
| Query Period | Select the period to view the Report, in weekly units. Previous periods can be queried up to a maximum of 3 months. Data is provided up to a maximum of 30 minutes prior to the current time. |
| Call Count | Daily call count during the query period, displayed per day as total, success, and failure counts. Total call count: the total number of calls during the period by model. |
| Token usage | Daily token input and output amounts during the query period. Total number of Tokens: total token usage during the query period. Average number of Tokens per Request: average token amount used per LLM call during the query period. |

Table. AIOS Report items
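The derived Report figure follows directly from the daily counts; for example (a sketch of the arithmetic with made-up numbers, not AIOS code):

```python
def average_tokens_per_request(total_tokens: int, total_calls: int) -> float:
    """Average Token amount per LLM call over the query period."""
    return total_tokens / total_calls if total_calls else 0.0

# e.g., 62,000 tokens consumed over 500 calls during the query period
print(average_tokens_per_request(62_000, 500))  # 124.0
```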
1.3 - References
References
In AIOS, you can check the API, SDK reference, and tutorials to help you get started.
| Category | Description |
|---|---|
| API Reference | List of APIs supported by AIOS |
| SDK Reference | Information on SDKs compatible with AIOS, including OpenAI's SDK |
| Tutorial | Tutorials to help you get started with AIOS |
1.3.1 - API Reference
API Reference Overview
The API Reference supported by AIOS is as follows.
| API Name | API | Detailed Description |
|---|---|---|
| Rerank API | POST /rerank, /v1/rerank, /v2/rerank | Applies an embedding model or cross-encoder model to predict the relevance between a single query and each item in a document list. |
| Score API | POST /score, /v1/score | Predicts the similarity between two sentences. |
| Chat Completions API | POST /v1/chat/completions | Compatible with OpenAI’s Completions API and can be used with the OpenAI Python client. |
| Completions API | POST /v1/completions | Compatible with OpenAI’s Completions API and can be used with the OpenAI Python client. |
| Embedding API | POST /v1/embeddings | Converts text into a high-dimensional vector (embedding) that can be used for various natural language processing (NLP) tasks, such as calculating text similarity, clustering, and searching. |
Rerank API
POST /rerank, /v1/rerank, /v2/rerank
Overview
The Rerank API applies an embedding model or cross-encoder model to predict the relevance between a single query and each item in a document list. Generally, the score of a sentence pair represents the similarity between the two sentences on a scale of 0 to 1.
- Embedding-based model: Converts the query and document into vectors and measures the similarity between the vectors (e.g., cosine similarity) to calculate the score.
- Reranker (Cross-Encoder) based model: Evaluates the query and document as a pair.
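The embedding-based path can be illustrated with plain cosine similarity (a toy sketch with made-up low-dimensional vectors; real bge-m3 embeddings are much higher-dimensional):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """dot(a, b) / (|a| * |b|) -- the embedding-based relevance score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score ~1.0; orthogonal vectors score 0.0
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```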
Request
Context
| Key | Type | Description | Example |
|---|---|---|---|
| Base URL | string | AIOS URL for API requests | https://aios.private.kr-west1.e.samsungsdscloud.com |
| Request Method | string | HTTP method used for API requests | POST |
| Headers | object | Header information required for requests | { “accept”: “application/json”, “Content-Type”: “application/json” } |
| Body Parameters | object | Parameters included in the request body | { “model”: “sds/bge-m3”, “query”: …, “documents”: […] } |
Path Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Query Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Body Parameters
| Name | Name Sub | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|---|
| model | - | string | ✅ | Model used for response generation | “sds/bge-reranker-v2-m3” | ||
| query | - | string | ✅ | User’s search query or question | “What is the capital of France?" | ||
| documents | - | array | ✅ | List of documents to be re-ranked | Maximum model input length limit | [“The capital of France is Paris.”] | |
| top_n | - | integer | ❌ | Number of top documents to return (0 returns all) | 0 | > 0 | 5 |
| truncate_prompt_tokens | - | integer | ❌ | Limits the number of input tokens | > 0 | 100 |
Example
curl -X 'POST' \
'https://aios.private.kr-west1.e.samsungsdscloud.com/rerank' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model": "sds/bge-reranker-v2-m3",
"query": "What is the capital of France?",
"documents": [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France."
],
"top_n": 2,
"truncate_prompt_tokens": 512
}'
Response
200 OK
| Name | Type | Description |
|---|---|---|
| id | string | API response’s unique identifier (UUID format) |
| model | string | Name of the model that generated the result |
| usage | object | Object containing information about the resources used in the request |
| usage.total_tokens | integer | Total number of tokens used in processing the request |
| results | array | Array containing the results for the query-related documents |
| results[].index | integer | Order number in the result array |
| results[].document | object | Object containing the content of the searched document |
| results[].document.text | string | Actual text content of the searched document |
| results[].relevance_score | float | Score indicating the relevance between the query and the document (0 ~ 1) |
Error Code
| HTTP status code | Error Code Description |
|---|---|
| 400 | Bad Request |
| 422 | Validation Error |
| 500 | Internal Server Error |
Example
{
"id": "rerank-scp-aios-rerank",
"model": "sds/bge-reranker-v2-m3",
"usage": {
"total_tokens": 65
},
"results": [
{
"index": 0,
"document": {
"text": "The capital of France is Paris."
},
"relevance_score": 0.8291233777999878
},
{
"index": 1,
"document": {
"text": "France capital city is known for the Eiffel Tower."
},
"relevance_score": 0.6996355652809143
}
]
}
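Given a response shaped like the example above, the most relevant document is the entry with the highest relevance_score (the results array is also already ordered by relevance). A small parsing sketch:

```python
# Response shaped like the Rerank API example above
response = {
    "results": [
        {"index": 0,
         "document": {"text": "The capital of France is Paris."},
         "relevance_score": 0.8291233777999878},
        {"index": 1,
         "document": {"text": "France capital city is known for the Eiffel Tower."},
         "relevance_score": 0.6996355652809143},
    ]
}

# Pick the document with the highest relevance score
top = max(response["results"], key=lambda r: r["relevance_score"])
print(top["document"]["text"])  # The capital of France is Paris.
```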
Reference
Score API
POST /score, /v1/score
Overview
The Score API predicts the similarity between two sentences. This API uses one of two models to calculate the score:
- Reranker (Cross-Encoder) model: Takes a pair of sentences as input and directly predicts the similarity score.
- Embedding model: Generates embedding vectors for each sentence and calculates the cosine similarity to derive the score.
Request
Context
| Key | Type | Description | Example |
|---|---|---|---|
| Base URL | string | AIOS URL for API requests | https://aios.private.kr-west1.e.samsungsdscloud.com |
| Request Method | string | HTTP method used for API requests | POST |
| Headers | object | Header information required for requests | { “accept”: “application/json”, “Content-Type”: “application/json” } |
| Body Parameters | object | Parameters included in the request body | { “model”: “sds/bge-reranker-v2-m3”, “text_1”: […], “text_2”: […] } |
Path Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Query Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Body Parameters
| Name | Name Sub | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|---|
| model | - | string | ✅ | Specify the model to use for response generation | “sds/bge-reranker-v2-m3” | ||
| encoding_format | - | string | ❌ | Score return format | "float" | | "float" |
| text_1 | - | string, array | ✅ | First text to compare | | | "What is the capital of France?" |
| text_2 | - | string, array | ✅ | Second text to compare | | | ["The capital of France is Paris."] |
| truncate_prompt_tokens | - | integer | ❌ | Limit input token count | | > 0 | 100 |
Example
curl -X 'POST' \
'https://aios.private.kr-west1.e.samsungsdscloud.com/score' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model": "sds/bge-reranker-v2-m3",
"encoding_format": "float",
"text_1": [
"What is the largest planet in the solar system?",
"What is the chemical symbol for water?"
],
"text_2": [
"Jupiter is the largest planet in the solar system.",
"The chemical symbol for water is H₂O."
]
}'
Response
200 OK
| Name | Type | Description |
|---|---|---|
| id | string | Unique identifier for the response |
| object | string | Type of response object (e.g., “list” ) |
| created | integer | Creation time (Unix timestamp, seconds) |
| model | string | Name of the model used |
| data | array | List of score calculation results |
| data[].index | integer | Index of the item in the data array |
| data[].object | string | Type of data item (e.g., "score") |
| data[].score | number | Calculated score value, normalized to 0 ~ 1 |
| usage | object | Token usage statistics |
| usage.prompt_tokens | integer | Number of tokens used in the input prompt |
| usage.total_tokens | integer | Total number of tokens (input + output) |
| usage.completion_tokens | integer | Number of tokens used in the generated response |
| usage.prompt_tokens_details | object or null | Detailed information about prompt tokens |
Error Code
| HTTP status code | Error Code Description |
|---|---|
| 400 | Bad Request |
| 422 | Validation Error |
| 500 | Internal Server Error |
Example
{
"id": "score-scp-aios-score",
"object": "list",
"created": 1748574112,
"model": "sds/bge-reranker-v2-m3",
"data": [
{
"index": 0,
"object": "score",
"score": 1.0
},
{
"index": 1,
"object": "score",
"score": 1.0
}
  ],
  "usage": {
    "prompt_tokens": 53,
    "total_tokens": 53,
    "completion_tokens": 0,
    "prompt_tokens_details": null
  }
}
Reference
- [Score API vLLM documentation](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#score-api_1)
Chat Completions API
POST /v1/chat/completions
Overview
Chat Completions API is compatible with OpenAI’s Completions API and can be used with the OpenAI Python client.
Request
Context
| Key | Type | Description | Example |
|---|---|---|---|
| Base URL | string | AIOS URL for API requests | https://aios.private.kr-west1.e.samsungsdscloud.com |
| Request Method | string | HTTP method used for API requests | POST |
| Headers | object | Header information required for requests | { “accept”: “application/json”, “Content-Type”: “application/json” } |
Path Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Query Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Body Parameters
| Name | Name Sub | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|---|
| model | - | string | ✅ | Specifies the model to use for generating responses | “meta-llama/Llama-3.3-70B-Instruct” | ||
| messages | role, content | array | ✅ | List of messages containing conversation history | [ { “role” : “user” , “content” : “message” }] | ||
| frequency_penalty | - | number | ❌ | Adjusts the penalty for repeating tokens | 0 | -2.0 ~ 2.0 | 0.5 |
| logit_bias | - | object | ❌ | Adjusts the probability of specific tokens (e.g., { “100”: 2.0 }) | null | Key: token ID, Value: -100 ~ 100 | { “100”: 2.0 } |
| logprobs | - | boolean | ❌ | Returns the probabilities of the top logprobs number of tokens | false | true, false | true |
| max_completion_tokens | - | integer | ❌ | Limits the maximum number of generated tokens | None | 0 ~ model maximum | 100 |
| max_tokens (Deprecated) | - | integer | ❌ | Limits the maximum number of generated tokens | None | 0 ~ model maximum | 100 |
| n | - | integer | ❌ | Specifies the number of responses to generate | 1 | 3 | |
| presence_penalty | - | number | ❌ | Adjusts the penalty for tokens already present in the text | 0 | -2.0 ~ 2.0 | 1.0 |
| seed | - | integer | ❌ | Specifies the seed value for controlling randomness | None | ||
| stop | - | string / array / null | ❌ | Stops generating when a specific string is encountered | null | "\n" | |
| stream | - | boolean | ❌ | Returns the result in streaming mode | false | true/false | true |
| stream_options | include_usage, continuous_usage_stats | object | ❌ | Controls streaming options (e.g., including usage statistics) | null | { “include_usage”: true } | |
| temperature | - | number | ❌ | Adjusts the creativity of the generated response (higher means more random) | 1 | 0.0 ~ 1.0 | 0.7 |
| tool_choice | - | string | ❌ | Specifies which tool to call | | | |
| tools | - | array | ❌ | List of tools that the model can call | None | | |
| top_logprobs | - | integer | ❌ | Specifies the number of top logprobs tokens to return | None | 0 ~ 20 | 3 |
| top_p | - | number | ❌ | Limits the sampling probability of tokens (higher means more tokens are considered) | 1 | 0.0 ~ 1.0 | 0.9 |
Example
curl -X 'POST' \
  'https://aios.private.kr-west1.e.samsungsdscloud.com/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta-llama/Meta-Llama-3.3-70B-Instruct",
    "messages": [ { "role": "user", "content": "What is the capital of Korea?" } ]
  }'
Response
200 OK
| Name | Type | Description |
|---|---|---|
| id | string | Response’s unique identifier |
| object | string | Type of response object (e.g., “chat.completion”) |
| created | integer | Creation time (Unix timestamp, in seconds) |
| model | string | Name of the model used |
| choices | array | List of generated response choices |
| choices[].index | integer | Index of the choice |
| choices[].message | object | Generated message object |
| choices[].message.role | string | Role of the message author (e.g., “assistant”) |
| choices[].message.content | string | Actual content of the generated message |
| choices[].message.reasoning_content | string | Actual content of the generated reasoning message |
| choices[].message.tool_calls | array (optional) | Tool call information (may be included depending on the model/settings) |
| choices[].finish_reason | string or null | Reason why the response was terminated (e.g., “stop”, “length”, etc.) |
| choices[].stop_reason | object or null | Additional termination reason details |
| choices[].logprobs | object or null | Token-wise log probability information (may be included depending on the settings) |
| usage | object | Token usage statistics |
| usage.prompt_tokens | integer | Number of tokens used in the input prompt |
| usage.completion_tokens | integer | Number of tokens used in the generated response |
| usage.total_tokens | integer | Total number of tokens (input + output) |
Error Code
| HTTP status code | Error Code Description |
|---|---|
| 400 | Bad Request |
| 422 | Validation Error |
| 500 | Internal Server Error |
Example
{
"id": "chatcmpl-scp-aios-chat-completions",
"object": "chat.completion",
"created": 1749702816,
"model": "meta-llama/Meta-Llama-3.3-70B-Instruct",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"reasoning_content": null,
"content": "The capital of Korea is Seoul.",
"tool_calls": []
},
"logprobs": null,
"finish_reason": "stop",
"stop_reason": null
}
],
"usage": {
"prompt_tokens": 54,
"total_tokens": 62,
"completion_tokens": 8,
"prompt_tokens_details": null
},
"prompt_logprobs": null
}
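A response shaped like the example above can be unpacked as follows; the fields used here match the 200 OK table (a sketch, assuming a non-streaming call):

```python
# Response shaped like the Chat Completions API example above
response = {
    "choices": [
        {"index": 0,
         "message": {"role": "assistant",
                     "content": "The capital of Korea is Seoul."},
         "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 54, "completion_tokens": 8, "total_tokens": 62},
}

answer = response["choices"][0]["message"]["content"]
finish = response["choices"][0]["finish_reason"]  # "stop" = finished naturally
print(answer, finish)
```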
Reference
- Chat Completions API vLLM documentation
- Chat Completions API OpenAI documentation
Completions API
POST /v1/completions
Overview
Completions API is compatible with OpenAI's Completions API and can be used with the OpenAI Python client.
Request
Context
| Key | Type | Description | Example |
|---|---|---|---|
| Base URL | string | API request URL for AIOS | https://aios.private.kr-west1.e.samsungsdscloud.com |
| Request Method | string | HTTP method used for the API request | POST |
| Headers | object | Header information required for the request | { “accept”: “application/json”, “Content-Type”: “application/json” } |
| Body Parameters | object | Parameters included in the request body | { “model”: “meta-llama/Llama-3.3-70B-Instruct”, “prompt”: “hello”, “stream”: true } |
Table. Completions API - Context
Path Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Table. Completions API - Path Parameters
Query Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Table. Completions API - Query Parameters
Body Parameters
| Name | Name Sub | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|---|
| model | - | string | ✅ | Model used to generate the response | | | “meta-llama/Llama-3.3-70B-Instruct” |
| prompt | - | array, string | ✅ | User input text | "" | | |
| echo | - | boolean | ❌ | Whether to include the input text in the output | false | true/false | true |
| frequency_penalty | - | number | ❌ | Adjust the penalty for repeating tokens | 0 | -2.0 ~ 2.0 | 0.5 |
| logit_bias | - | object | ❌ | Adjust the probability of specific tokens | null | Key: token ID, Value: -100 ~ 100 | { “100”: 2.0 } |
| logprobs | - | integer | ❌ | Return the probabilities of the top logprobs tokens | null | 1 ~ 5 | 5 |
| max_completion_tokens | - | integer | ❌ | Limit the maximum number of generated tokens | None | 0 ~ model maximum | 100 |
| max_tokens (Deprecated) | - | integer | ❌ | Limit the maximum number of generated tokens | None | 0 ~ model maximum | 100 |
| n | - | integer | ❌ | Specify the number of responses to generate | 1 | | 3 |
| presence_penalty | - | number | ❌ | Adjust the penalty for tokens already present in the text | 0 | -2.0 ~ 2.0 | 1.0 |
| seed | - | integer | ❌ | Specify a seed value for randomness control | None | | |
| stop | - | string / array / null | ❌ | Stop generating when a specific string is encountered | null | | "\n" |
| stream | - | boolean | ❌ | Whether to return the results in a streaming manner | false | true/false | true |
| stream_options | include_usage, continuous_usage_stats | object | ❌ | Control streaming options (e.g., include usage statistics) | null | | { “include_usage”: true } |
| temperature | - | number | ❌ | Control the creativity of the generated response (higher means more random) | 1 | 0.0 ~ 1.0 | 0.7 |
| top_p | - | number | ❌ | Limit the sampling probability of tokens (higher means more tokens considered) | 1 | 0.0 ~ 1.0 | 0.9 |
Table. Completions API - Body Parameters
Example
curl -X 'POST' \
'https://aios.private.kr-west1.e.samsungsdscloud.com/v1/completions' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model": "meta-llama/Meta-Llama-3.3-70B-Instruct",
"prompt": "What is the capital of Korea?",
"temperature": 0.7
}'
Response
200 OK
| Name | Type | Description |
|---|---|---|
| id | string | Unique identifier of the response |
| object | string | Type of the response object (e.g., “text_completion”) |
| created | integer | Creation time (Unix timestamp, seconds) |
| model | string | Name of the model used |
| choices | array | List of generated response choices |
| choices[].index | number | Index of the choice |
| choices[].text | string | Generated text |
| choices[].logprobs | object | Token-wise log probability information (included based on settings) |
| choices[].finish_reason | string or null | Reason why the response was terminated (e.g., “stop”, “length” etc.) |
| choices[].stop_reason | object or null | Additional termination reason details |
| choices[].prompt_logprobs | object or null | Log probability of input prompt tokens (may be null) |
| usage | object | Token usage statistics |
| usage.prompt_tokens | number | Number of tokens used in the input prompt |
| usage.total_tokens | number | Total number of tokens (input + output) |
| usage.completion_tokens | number | Number of tokens used in the generated response |
| usage.prompt_tokens_details | object | Details of prompt token usage |
Table. Completions API - 200 OK
Error Code
| HTTP status code | Error Code Description |
|---|---|
| 400 | Bad Request |
| 422 | Validation Error |
| 500 | Internal Server Error |
Example
{
"id": "cmpl-scp-aios-completions",
"object": "text_completion",
"created": 1749702612,
"model": "meta-llama/Meta-Llama-3.3-70B-Instruct",
"choices": [
{
"index": 0,
"text": " \nOur capital city is Seoul. \n\nA. 1\nB. ",
"logprobs": null,
"finish_reason": "length",
"stop_reason": null,
"prompt_logprobs": null
}
],
"usage": {
"prompt_tokens": 9,
"total_tokens": 25,
"completion_tokens": 16,
"prompt_tokens_details": null
}
}
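As a sketch of how to consume this response, the generated text and token usage can be pulled out of the parsed JSON body. The short JSON below is a made-up stand-in shaped like the 200 OK example above:

```python
import json

# Hypothetical response body, shaped like the 200 OK example above.
raw = '''
{
  "id": "cmpl-scp-aios-completions",
  "object": "text_completion",
  "created": 1749702612,
  "model": "meta-llama/Meta-Llama-3.3-70B-Instruct",
  "choices": [{"index": 0, "text": " Seoul.", "finish_reason": "length"}],
  "usage": {"prompt_tokens": 9, "total_tokens": 25, "completion_tokens": 16}
}
'''

body = json.loads(raw)
answer = body["choices"][0]["text"]   # generated text
used = body["usage"]["total_tokens"]  # input + output tokens
print(answer, used)
```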
Reference
Embedding API
POST /v1/embeddings
Overview
The Embedding API converts text into high-dimensional vectors (embeddings) that can be used for various natural language processing (NLP) tasks, such as calculating text similarity, clustering, and search.
Request
Context
| Key | Type | Description | Example |
|---|---|---|---|
| Base URL | string | URL for AIOS API requests | https://aios.private.kr-west1.e.samsungsdscloud.com |
| Request Method | string | HTTP method used for API requests | POST |
| Headers | object | Header information required for requests | { "accept": "application/json", "Content-Type": "application/json" } |
| Body Parameters | object | Parameters included in the request body | { "model": "sds/bge-m3", "input": "What is the capital of France?" } |
Path Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Query Parameters
| Name | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|
| None |
Body Parameters
| Name | Name Sub | type | Required | Description | Default value | Boundary value | Example |
|---|---|---|---|---|---|---|---|
| model | - | string | ✅ | Specify the model to use for generating embeddings | - | - | "sds/bge-m3" |
| input | - | array&lt;string&gt; or string | ✅ | User's search query or question | - | - | "What is the capital of France?" |
| encoding_format | - | string | ❌ | Specify the format to return the embedding | "float" | "float", "base64" | [0.01319122314453125, 0.057220458984375, ... (omitted)] |
| truncate_prompt_tokens | - | integer | ❌ | Limit the number of input tokens | - | > 0 | 100 |
Example
```bash
curl -X 'POST' \
  'https://aios.private.kr-west1.e.samsungsdscloud.com/v1/embeddings' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "sds/bge-m3",
  "input": "What is the capital of France?",
  "encoding_format": "float"
}'
```
Response
200 OK
| Name | Type | Description |
|---|---|---|
| id | string | Unique identifier of the response |
| object | string | Type of the response object (e.g., “list”) |
| created | number | Creation time (Unix timestamp, seconds) |
| model | string | Name of the model used |
| data | array | Array of objects containing embedding results |
| data.index | number | Index of the input text (e.g., order of input texts) |
| data.object | string | Type of data item |
| data.embedding | array | Embedding vector values of the input text (sds-bge-m3 is a 1024-dimensional float array) |
| usage | object | Token usage statistics |
| usage.prompt_tokens | number | Number of tokens used in the input prompt |
| usage.total_tokens | number | Total number of tokens (input + output) |
| usage.completion_tokens | number | Number of tokens used in the generated response |
| usage.prompt_tokens_details | object | Detailed information about prompt tokens |
Error Code
| HTTP status code | Error Code Description |
|---|---|
| 400 | Bad Request |
| 422 | Validation Error |
| 500 | Internal Server Error |
Example
{
"id":"embd-scp-aios-embeddings",
"object":"list","created":1749035024,
"model":"sds/bge-m3",
"data":[
{
"index":0,
"object":"embedding",
"embedding":
[0.01319122314453125,0.057220458984375,-0.028533935546875,-0.0008697509765625,-0.01422119140625,0.033416748046875,-0.0062408447265625,-0.04364013671875,-0.004497528076171875,0.0008072853088378906,-0.0193328857421875,0.041168212890625,-0.019317626953125,-0.0188751220703125,-0.047088623046875,
... (omitted)
-0.05706787109375,-0.0147705078125]
}
],
"usage":
{
"prompt_tokens":9,
"total_tokens":9,
"completion_tokens":0,
"prompt_tokens_details":null
}
}
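Embedding vectors returned by this API are typically compared with cosine similarity. The sketch below uses only the standard library; the two short vectors are made-up truncations for illustration, not real model output:

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); assumes equal-length, non-zero vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

v1 = [0.0132, 0.0572, -0.0285]  # e.g., truncated embedding of a query
v2 = [0.0131, 0.0570, -0.0290]  # e.g., truncated embedding of a document
print(cosine_similarity(v1, v2))  # close to 1.0 for similar vectors
```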
Reference
1.3.2 - SDK Reference
SDK Reference Overview
AIOS models are compatible with OpenAI’s API, so they are also compatible with OpenAI’s SDK. The following is a list of OpenAI and Cohere compatible APIs supported by Samsung Cloud Platform AIOS service.
| API Name | API | Detailed Description | Supported SDK |
|---|---|---|---|
| Text Completion API | /v1/completions | Generates a natural sentence that follows the given input string. | OpenAI |
| Conversation Completion API | /v1/chat/completions | Generates a response that follows the conversation content. | OpenAI |
| Embeddings API | /v1/embeddings | Converts text into a high-dimensional vector (embedding) that can be used for various natural language processing (NLP) tasks such as text similarity calculation, clustering, and search. | OpenAI |
| Rerank API | /v2/rerank | Applies an embedding model or a cross-encoder model to predict the relevance between a single query and each item in a document list. | Cohere |
- The SDK Reference guide is based on a Virtual Server environment with Python installed.
- The actual execution may differ from the example in terms of token count and message content.
OpenAI SDK
Installing the openai Package
Install the OpenAI package.
pip install openai
Text Completion API
The Text Completion API generates a natural sentence that follows the given input string.
/v1/completions
Request
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.completions.create(
model=model,
prompt="Hi"
)
Response
The text field in choices contains the model’s response.
Completion(
id='cmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
CompletionChoice(
finish_reason='length',
index=0,
logprobs=None,
text=' future president of the United States, I hope you’re doing well. As a',
stop_reason=None,
prompt_logprobs=None
)
],
created=1750000000,
model='<<model>>',
object='text_completion',
...
)
Stream Request
With stream, you can receive the answer token by token as the model generates it, rather than waiting to receive the entire answer at once.
Request
Set the stream parameter value to True.
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.completions.create(
model=model,
prompt="Hi",
stream=True
)
# Receive the response as the model generates tokens.
for chunk in response:
print(chunk)
Response
A chunk is returned for each generated token, and each token can be checked in the text field of choices.
Completion(
id='cmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
CompletionChoice(
finish_reason=None,
index=0,
logprobs=None,
text='.',
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='text_completion',
system_fingerprint=None,
usage=None
)
Completion(..., choices=[CompletionChoice(..., text=' I', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text="'m", ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' looking', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' for', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' a', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' way', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' to', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' check', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' if', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' a', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' specific', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' process', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' is', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' running', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' on', ...)], ...)
Completion(..., choices=[], ...,
usage=CompletionUsage(
completion_tokens=16,
prompt_tokens=2,
total_tokens=18,
completion_tokens_details=None,
prompt_tokens_details=None
)
)
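The streamed chunks above can be concatenated to recover the full completion. The sketch below uses plain strings in place of the CompletionChoice objects shown above; in real code you would read `chunk.choices[0].text` (guarding against the final usage-only chunk, whose choices list is empty):

```python
# Stand-ins for chunk.choices[0].text values from a streamed response.
chunk_texts = [".", " I", "'m", " looking", " for", " a", " way"]

full_text = ""
for text in chunk_texts:
    full_text += text  # append each token as it arrives

print(full_text)
```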
Conversation Completion API
The Conversation Completion API takes an ordered list of messages as input and responds with a message that fits the current context as the next turn.
/v1/chat/completions
Request
For text-only messages, you can call the API as follows:
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hi"}
]
)
Response
You can check the model's answer in the message field of choices.
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content='Hello. How can I assist you today?',
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[],
reasoning_content=None
),
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=10,
prompt_tokens=42,
total_tokens=52,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None
)
Stream Request
Using stream, instead of waiting for the model to generate the entire answer and receiving it at once, you can receive and process the response for each token the model generates.
Request
Enter True as the value of the stream parameter.
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hi"}
],
stream=True
)
# You can receive a response each time the model generates a token.
for chunk in response:
print(chunk)
Response
A chunk is returned for each generated token, and each token can be checked in the delta field of choices.
ChatCompletionChunk(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
delta=ChoiceDelta(
content='',
function_call=None,
refusal=None,
role='assistant',
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion.chunk',
service_tier=None,
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='It', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content="'s", ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' nice', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' to', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='meet', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' you', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='.', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' Is', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' there', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' something', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' I', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' can', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' help', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' you', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' with', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' or', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' would', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' you', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' like', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' to', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' chat', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='?', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='', ...), ...)], ...)
ChatCompletionChunk(..., choices=[], ...,
usage=CompletionUsage(
completion_tokens=23,
prompt_tokens=42,
total_tokens=65,
completion_tokens_details=None,
prompt_tokens_details=None
)
)
Tool Calling
Tool calling lets you define interfaces to external tools outside the model, so that the model can generate answers that invoke the tool appropriate to the current context.
With tool calling, you define metadata for functions the model can execute, and the model uses that metadata when generating answers.
Request
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # AIOS model call endpoint URL
model = "<<model>>" # AIOS model ID
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Function to get weather information
tools = [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current temperature for provided coordinates in celsius.",
"parameters": {
"type": "object",
"properties": {
"latitude": {"type": "number"},
"longitude": {"type": "number"}
},
"required": ["latitude", "longitude"],
"additionalProperties": False
},
"strict": True
}
}]
messages = [{"role": "user", "content": "What is the weather like in Paris today?"}]
response = client.chat.completions.create(
    model=model,
    messages=messages,
    tools=tools  # Inform the model of the metadata of the tools that can be used.
)
Response
The message.tool_calls field of choices shows how the model decided to invoke the tool.
In the following example, you can see that the function in tool_calls uses the get_weather function, and which arguments should be passed to it.
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='tool_calls',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content=None,
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[
ChatCompletionMessageToolCall(
id='chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
function=Function(
arguments='{"latitude": 48.8566, "longitude": 2.3522}',
name='get_weather'
),
type='function'
)
],
reasoning_content=None
),
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=19,
prompt_tokens=194,
total_tokens=213,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None
)
Tool Message
After adding the function's result to the conversation as a tool message and calling the model again, the model can generate an answer using that result.
Request
Using function.arguments from tool_calls in the response data, you can call the actual function.
import json
# example function, always responds with 14 degrees.
def get_weather(latitude, longitude):
return "14℃"
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = get_weather(args["latitude"], args["longitude"]) # "14℃"
After adding the result value of the function as a tool message to the conversation context and calling the model again,
the model can create an appropriate answer using the result value of the function.
# Add the model's tool call message to messages
messages.append(response.choices[0].message)
# Add the result of the actual function call to messages
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": str(result)
})
response_2 = client.chat.completions.create(
    model=model,
    messages=messages,
    # tools=tools
)
Response
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content='The current weather in Paris is 14℃.',
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[],
reasoning_content=None
),
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=11,
prompt_tokens=74,
total_tokens=85,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None
)
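When the model may request any of several tools, the steps above generalize to a dispatch table keyed by the tool name. This is a minimal sketch, not part of the AIOS SDK: TOOLS and run_tool_call are hypothetical helper names, and get_weather is the same stub as above:

```python
import json

def get_weather(latitude, longitude):
    return "14℃"  # stub: always responds with 14 degrees, as in the example above

# Hypothetical dispatch table: tool name -> local implementation.
TOOLS = {"get_weather": get_weather}

def run_tool_call(name, arguments_json):
    # Look up the local implementation and call it with the model-provided args.
    args = json.loads(arguments_json)
    return TOOLS[name](**args)

result = run_tool_call("get_weather", '{"latitude": 48.8566, "longitude": 2.3522}')
print(result)
```

In a real loop, `name` and `arguments_json` would come from `tool_call.function.name` and `tool_call.function.arguments` for each entry in `message.tool_calls`.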
Reasoning
Request
Reasoning is supported by models that provide reasoning values, and it can be checked as follows:
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.chat.completions.create(
model=model,
messages=[
{"role": "user", "content": "9.11 and 9.8, which is greater?"}
],
)
Response
In the message field of choices, you can check the content as well as reasoning_content, which contains the reasoning tokens.
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content='''
To determine whether 9.11 or 9.8 is larger, we compare the decimal parts since both numbers have the same whole number part (9).
1. Convert both numbers to the same decimal places for easier comparison:
- 9.11 remains as is.
- 9.8 can be written as 9.80.
2. Compare the tenths place:
- The tenths place of 9.11 is 1.
- The tenths place of 9.80 is 8.
3. Since 8 (from 9.80) is greater than 1 (from 9.11), 9.80 (or 9.8) is larger.
4. Verification by subtraction:
- Subtracting 9.11 from 9.8 gives \(9.80 - 9.11 = 0.69\), which is positive, confirming 9.8 is larger.
Thus, the larger number is \(\boxed{9.8}\).
''',
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[],
reasoning_content="""Okay, so I need to figure out whether 9.11 is bigger than 9.8 or vice versa.
Hmm, let me think. Both numbers start with 9, so the whole number part is the same.
That means the difference must be in the decimal parts.
First, I remember that when comparing decimals, you look at the digits one by one after the decimal point.
The first digit after the decimal is the tenths place, then hundredths, and so on.
Since both numbers have 9 in the units place, I can focus on the decimal parts: 0.11 versus 0.08.
Wait, let me write them out to be clear. 9.11 is 9 and 11 hundredths, and 9.8 is 9 and 8 tenths.
Wait a second, maybe I should convert them to the same decimal places to make it easier.
So, 9.8 is the same as 9.80, right? Because adding a zero at the end of a decimal doesn't change its value.
That way, both numbers have two decimal places. Now comparing 9.11 and 9.80.
Looking at the tenths place first: 9.11 has 1 tenth, and 9.80 has 8 tenths.
Since 8 is more than 1, that means 9.80 is larger than 9.11. Wait, but hold on, the original numbers were 9.11 and 9.8. Did I make a mistake here?
Let me double-check. 9.8 is indeed equal to 9.80.
So when comparing tenths, 8 tenths (from 9.80) is definitely more than 1 tenth (from 9.11). So that would mean 9.8 is bigger.
But wait, maybe I confused the decimal places. Let me think again.
The first number after the decimal is tenths, so in 9.11, the first digit is 1 (tenths place) and the second is 1 (hundredths).
In 9.8, the first digit is 8 (tenths place), and there's no hundredths, which is equivalent to 0.
So comparing the tenths place first: 1 vs 8. Since 8 is greater than 1, 9.8 is larger.
But why did I initially think maybe 9.11 was bigger? Maybe because 11 is a two-digit number?
But no, in decimals, each position is a different place value.
The tenths place is more significant than the hundredths. So even though 11 hundredths is 0.11, that's still less than 0.8 (which is 8 tenths).
Another way to think about it: Convert both to fractions.
9.11 is 9 + 11/100, and 9.8 is 9 + 8/10.
Converting 8/10 to hundredths, that's 80/100.
So 9.8 is 9 + 80/100, whereas 9.11 is 9 + 11/100. Comparing the fractions: 80/100 is more than 11/100, so 9.8 is bigger.
Alternatively, maybe subtract them to see the difference. 9.8 minus 9.11 is 0.69, which is positive, so 9.8 is larger.
Wait, let me do the subtraction step by step.
9.8 minus 9.11:
First, write them aligned by decimal:
9.80
-9.11
-------
0.69
Yes, so the result is positive, so 9.8 is bigger.
Alternatively, if I had to do it without converting, maybe I can think in terms of money.
If these were dollars, 9.11 is $9.11 and 9.8 is $9.80.
Clearly, $9.80 is more than $9.11.
Hmm, so all these methods point to 9.8 being larger.
Maybe my initial confusion was because I saw 11 as a two, but
...omitted...
**Final Answer**
The number 9.8 is larger than 9.11. This is because when comparing the decimal parts, 0.8 (from 9.8) is greater than 0.11 (from 9.11).
Specifically, 9.8 can be written as 9.80, and comparing the tenths place (8 vs. 1) shows that 9.8 is larger.
The difference between them is 0.69, confirming that 9.8 is indeed the larger number.
**Final Answer**
\\boxed{9.8}"""
),
stop_reason=None
)
    ],
    created=1750000000,
    model='<<model>>',
    object='chat.completion',
    service_tier=None,
    system_fingerprint=None,
    usage=CompletionUsage(
        completion_tokens=4167,
        prompt_tokens=27,
        total_tokens=4194,
        completion_tokens_details=None,
        prompt_tokens_details=None
    ),
    prompt_logprobs=None,
    kv_transfer_params=None
)
### Image to Text
For models that support **vision**, you can input an image as follows.

<div class="scp-textbox scp-textbox-type-error">
<div class="scp-textbox-title">Note</div>
<div class="scp-textbox-contents">
<p>For models that support <strong>vision</strong>, there are limitations on the size and number of input images.</p>
<p>Please refer to <a href="/en/userguide/ai_ml/aios/overview/#provided-models">Provided Models</a> for more information on image input limitations.</p>
</div>
</div>
#### Request
You can input an image with **MIME type** and **base64**.
```python
import base64
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # AIOS endpoint-url for model calls
model = "<<model>>" # Model ID for AIOS model calls
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
image_path = "image/path.jpg"
def encode_image(image_path: str):
with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")
base64_image = encode_image(image_path)
response = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "what's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}",
                    },
                },
            ]
        },
    ],
)
```
Response
The model analyzes the image and generates text as follows.
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content="""Here's what's in the image:
* **A golden retriever puppy:** The main subject is a light-colored golden retriever puppy lying on green grass.
* **A bone:** The puppy is holding a large bone in its paws and appears to be enjoying chewing on it.
* **Grass:** The puppy is lying on a well-maintained lawn.
* **Vegetation:** Behind the puppy, there are some shrubs and other greenery.
* **Outdoor setting:** The scene is outdoors, likely a backyard.""",
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[],
reasoning_content=None
),
stop_reason=106
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=114,
prompt_tokens=276,
total_tokens=390,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None,
kv_transfer_params=None
)
Embeddings API
Embeddings converts input text into a high-dimensional vector of a fixed dimension. The generated vector can be used for various natural language processing tasks such as text similarity, clustering, and search.
/v1/embeddings
Request
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # AIOS endpoint-url for model calls
model = "<<model>>" # Model ID for AIOS model calls
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.embeddings.create(
input="What is the capital of France?",
model=model
)
Response
The data field of the response contains the input converted into vector form.
CreateEmbeddingResponse(
data=[
Embedding(
embedding=[
0.01319122314453125,
0.057220458984375,
-0.028533935546875,
-0.0008697509765625,
-0.01422119140625,
...omitted...
],
index=0,
object='embedding'
)
],
model='<<model>>',
object='list',
usage=Usage(
prompt_tokens=9,
total_tokens=9,
completion_tokens=0,
prompt_tokens_details=None
),
id='embd-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
created=1750000000
)
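A common use of the returned vectors is finding the document closest to a query. The sketch below uses made-up 3-dimensional vectors for brevity (real sds/bge-m3 vectors are 1024-dimensional):

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length, non-zero vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

query_vec = [0.9, 0.1, 0.0]
doc_vecs = {
    "doc_a": [0.8, 0.2, 0.1],  # similar direction to the query
    "doc_b": [0.0, 0.1, 0.9],  # mostly orthogonal to the query
}

# Pick the document whose embedding is most similar to the query embedding.
best = max(doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]))
print(best)
```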
Cohere SDK
The Rerank API is compatible with the Cohere SDK.
Installing the Cohere Package
The Cohere SDK can be used by installing the Cohere package.
pip install cohere
Rerank API
Rerank calculates the relevance between the given query and documents, and ranks them. It can help improve the performance of RAG (Retrieval-Augmented Generation) structure applications by adjusting relevant documents to the front.
/v2/rerank
Request
import cohere
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = cohere.ClientV2("EMPTY_KEY", base_url=aios_base_url)
docs = [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France."
]
response = client.rerank(
model=model,
query="What is the capital of France?",
documents=docs,
top_n=3,
)
Response
In results, you can check the documents sorted in order of relevance to the query.
V2RerankResponse(
id='rerank-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
results=[
V2RerankResponseResultsItem(
document=V2RerankResponseResultsItemDocument(
text='The capital of France is Paris.'
),
index=0,
relevance_score=1.0
),
V2RerankResponseResultsItem(
document=V2RerankResponseResultsItemDocument(
text='France capital city is known for the Eiffel Tower.'
),
index=1,
relevance_score=1.0
),
V2RerankResponseResultsItem(
document=V2RerankResponseResultsItemDocument(
text='Paris is located in the north-central part of France.'
),
index=2,
relevance_score=0.982421875
)
    ],
    meta=None,
    model='<<model>>',
    usage={'total_tokens': 62}
)
Langchain SDK
LangChain's integrations are built on the OpenAI and Cohere SDKs, so the LangChain SDK can also be used with AIOS.
Installing the langchain Packages
After installing the langchain packages, the LangChain SDK can be used with AIOS models.
pip install langchain langchain-openai langchain-cohere langchain-together
The langchain-openai package can be used to utilize the text completion API and conversation completion API.
langchain_openai.OpenAI
When the text completion model (langchain_openai.OpenAI) is invoked, the result is returned as plain text.
Request
from langchain_openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
llm = OpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
llm.invoke("Can you introduce yourself in 5 words?")
Response
"""Hi, I'm a fun artist!
...omitted..."""
langchain_openai.ChatOpenAI
When the conversation completion model (langchain_openai.ChatOpenAI) is invoked, the result is returned as an AIMessage object.
Request
from langchain_openai import ChatOpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
chat_llm = ChatOpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
chat_completion = chat_llm.invoke("Can you introduce yourself in 5 words?")
chat_completion.pretty_print()
Response
================================== Ai Message ==================================
I am an AI assistant.
embeddings
Embedding models can be used via packages such as langchain-together and langchain-fireworks.
Request
from langchain_together import TogetherEmbeddings
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model invocation.
model = "<<model>>" # Enter the model ID for AIOS model invocation.
embedding = TogetherEmbeddings(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
embedding.embed_query("What is the capital of France?")
Response
[
0.01319122314453125,
0.057220458984375,
-0.028533935546875,
-0.0008697509765625,
-0.01422119140625,
...omitted...
]
rerank
Rerank models can utilize langchain-cohere’s CohereRerank.
Request
from langchain_cohere.rerank import CohereRerank
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model invocation.
model = "<<model>>" # Enter the model ID for AIOS model invocation.
rerank = CohereRerank(
base_url=aios_base_url,
cohere_api_key="EMPTY_KEY",
model=model
)
docs = [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France."
]
rerank.rerank(
documents=docs,
query="What is the capital of France?",
top_n=3
)
Response
[
{'index': 0, 'relevance_score': 1.0},
{'index': 1, 'relevance_score': 1.0},
{'index': 2, 'relevance_score': 0.982421875}
]
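The returned index/relevance_score pairs can be used to reorder the original document list. A sketch using the example output above (sorting is stable, so tied scores keep their original order):

```python
docs = [
    "The capital of France is Paris.",
    "France capital city is known for the Eiffel Tower.",
    "Paris is located in the north-central part of France.",
]

# Shaped like the rerank response above.
results = [
    {"index": 0, "relevance_score": 1.0},
    {"index": 1, "relevance_score": 1.0},
    {"index": 2, "relevance_score": 0.982421875},
]

# Reorder documents from most to least relevant.
ranked = [docs[r["index"]]
          for r in sorted(results, key=lambda r: r["relevance_score"], reverse=True)]
print(ranked[0])
```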
1.3.3 - Tutorial
Tutorial
We provide tutorials that can be used with AIOS.
| Category | Description |
|---|---|
| Chat Playground | How to create and use a web-based Playground |
| RAG | Creating a RAG-based PR review assistance chatbot |
| Autogen | Creating an agent application using Autogen |
1.3.3.1 - Chat Playground
Goal
This tutorial introduces how to create and utilize a web-based Playground to easily test the APIs of various AI models provided by AIOS using Streamlit in an SCP for Enterprise environment.
Environment
To proceed with this tutorial, the following environment must be prepared:
System Environment
- Python 3.10 +
- pip
Required installation packages
pip install streamlit
Streamlit is a Python-based open-source web application framework, well suited to visually presenting and sharing data science, machine learning, and data analysis results. Without complex web development knowledge, you can quickly build a web interface with just a few lines of code.
Implementation
Pre-check
Before running the application, use curl to check that model calls work in the environment where it will run. For the value of AIOS_LLM_Private_Endpoint below, refer to the LLM usage guide.
- Example: {AIOS LLM Private Endpoint}/{API}
curl -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.3-70B-Instruct"
  , "prompt" : "Hello, I am jihye, who are you"
  , "temperature": 0
  , "max_tokens": 100
  , "stream": false}' -L AIOS_LLM_Private_Endpoint
The model's answer is contained in the text field of choices.
{"id":"cmpl-4ac698a99c014d758300a3ec5583d73b","object":"text_completion","created":1750140201,"model":"meta-llama/Llama-3.3-70B-Instruct","choices":[{"index":0,"text":"?\nI am a student who is studying English.\nI am interested in learning about different cultures and making friends from around the world.\nI like to watch movies, listen to music, and read books in my free time.\nI am looking forward to chatting with you and learning more about your culture and way of life.\nNice to meet you, jihye! I'm happy to chat with you and learn more about culture. What kind of movies, music, and books do you enjoy? Do","logprobs":null,"finish_reason":"length","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":11,"total_tokens":111,"completion_tokens":100}}
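The same fields can be read programmatically; a minimal sketch that parses an abbreviated copy of the response above (the long text field is shortened here):

```python
import json

# Abbreviated copy of the pre-check response shown above.
raw = """{"object": "text_completion",
 "model": "meta-llama/Llama-3.3-70B-Instruct",
 "choices": [{"index": 0, "text": "?\\nI am a student who is studying English.", "finish_reason": "length"}],
 "usage": {"prompt_tokens": 11, "total_tokens": 111, "completion_tokens": 100}}"""

res = json.loads(raw)
answer = res["choices"][0]["text"]           # the model's answer
finish = res["choices"][0]["finish_reason"]  # "length" means max_tokens was reached
used = res["usage"]["completion_tokens"]
print(finish, used)  # → length 100
```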
Project Structure
chat-playground
├── app.py # streamlit main web app file
├── endpoints.json # AIOS model's call type definition
├── img
│ └── aios.png
└── models.json # AIOS model list
Chat Playground code
- models.json, endpoints.json files must exist and be configured in the appropriate format, please refer to the code below.
- The BASE_URL in the code must be changed to the AIOS LLM Private Endpoint address; refer to the LLM usage guide.
- This Playground uses a one-shot request structure: the user enters input, presses a button, and the result of a single request is displayed. This allows quick testing and response verification without complex session management.
- The Model, Type, Temperature, and Max Tokens controls in the sidebar are built with st.sidebar and can be freely extended or modified as needed.
- Images (files) uploaded with st.file_uploader() exist as temporary BytesIO objects in server memory and are not automatically saved to disk.
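One detail worth noting when setting BASE_URL: the code builds request URLs with urllib.parse.urljoin, which only appends the API path cleanly when the base URL ends with a trailing slash; otherwise the last path segment of the base is replaced. The host and path below are hypothetical:

```python
from urllib.parse import urljoin

# With a trailing slash, the API path is appended under the base path.
u1 = urljoin("http://10.0.0.1:8000/serve/", "v1/chat/completions")
print(u1)  # → http://10.0.0.1:8000/serve/v1/chat/completions

# Without the trailing slash, the last segment of the base is replaced.
u2 = urljoin("http://10.0.0.1:8000/serve", "v1/chat/completions")
print(u2)  # → http://10.0.0.1:8000/v1/chat/completions
```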
app.py
The Streamlit main web app file. For the BASE_URL value (AIOS_LLM_Private_Endpoint), refer to the LLM usage guide.
import streamlit as st
import base64
import json
import requests
from urllib.parse import urljoin

BASE_URL = "AIOS_LLM_Private_Endpoint"

# ===== Setting =====
st.set_page_config(page_title="AIOS Chat Playground", layout="wide")
st.title("🤖 AIOS Chat Playground")

# ===== Common Functions =====
def load_models():
    with open("models.json", "r") as f:
        return json.load(f)

def load_endpoints():
    with open("endpoints.json", "r") as f:
        return json.load(f)

models = load_models()
endpoints_config = load_endpoints()

# ===== Sidebar Settings =====
st.sidebar.title('Hello!')
st.sidebar.image("img/aios.png")
st.sidebar.header("⚙️ Setting")
model = st.sidebar.selectbox("Model", models)
endpoint_labels = [ep["label"] for ep in endpoints_config]
endpoint_label = st.sidebar.selectbox("Type", endpoint_labels)
selected_endpoint = next(ep for ep in endpoints_config if ep["label"] == endpoint_label)
temperature = st.sidebar.slider("🔥 Temperature", 0.0, 1.0, 0.7)
max_tokens = st.sidebar.number_input("🧮 Max Tokens", min_value=1, max_value=5000, value=100)
base_url = BASE_URL
path = selected_endpoint["path"]
endpoint_type = selected_endpoint["type"]
api_style = selected_endpoint.get("style", "openai")  # openai or cohere

# ===== Input UI =====
prompt = ""
docs = []
image_base64 = None
if endpoint_type == "image":
    prompt = st.text_area("✍️ Enter your question:", "Explain this image.")
    uploaded_image = st.file_uploader("🖼️ Upload an image", type=["png", "jpg", "jpeg"])
    if uploaded_image:
        st.image(uploaded_image, caption="Uploaded image", width=300)
        image_bytes = uploaded_image.read()
        image_base64 = base64.b64encode(image_bytes).decode("utf-8")
elif endpoint_type == "rerank":
    prompt = st.text_area("✍️ Enter your query:", "What is the capital of France?")
    raw_docs = st.text_area("📄 Documents (one per line)", "The capital of France is Paris.\nFrance capital city is known for the Eiffel Tower.\nParis is located in the north-central part of France.")
    docs = raw_docs.strip().splitlines()
elif endpoint_type == "reasoning":
    prompt = st.text_area("✍️ Enter prompt:", "9.11 and 9.8, which is greater?")
elif endpoint_type == "embedding":
    prompt = st.text_area("✍️ Enter prompt:", "What is the capital of France?")
else:
    prompt = st.text_area("✍️ Enter prompt:", "Hello, who are you?")
    uploaded_image = st.file_uploader("🖼️ Upload an image (Optional)", type=["png", "jpg", "jpeg"])
    if uploaded_image:
        image_bytes = uploaded_image.read()
        image_base64 = base64.b64encode(image_bytes).decode("utf-8")

# ===== Call Button =====
if st.button("🚀 Invoke model"):
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer EMPTY_KEY"
    }
    try:
        if endpoint_type == "chat":
            url = urljoin(base_url, "v1/chat/completions")
            payload = {
                "model": model,
                "messages": [
                    {"role": "system", "content": "You are a helpful assistant."},
                    {"role": "user", "content": prompt}
                ],
                "temperature": temperature,
                "max_tokens": max_tokens
            }
        elif endpoint_type == "completion":
            url = urljoin(base_url, "v1/completions")
            payload = {
                "model": model,
                "prompt": prompt,
                "temperature": temperature,
                "max_tokens": max_tokens
            }
        elif endpoint_type == "embedding":
            url = urljoin(base_url, "v1/embeddings")
            payload = {
                "model": model,
                "input": prompt
            }
        elif endpoint_type == "reasoning":
            url = urljoin(base_url, "v1/chat/completions")
            payload = {
                "model": model,
                "messages": [
                    {"role": "user", "content": prompt}
                ],
                "temperature": temperature,
                "max_tokens": max_tokens
            }
        elif endpoint_type == "image":
            url = urljoin(base_url, "v1/chat/completions")
            if not image_base64:
                st.warning("🖼️ Upload an image")
                st.stop()
            payload = {
                "model": model,
                "messages": [
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt},
                            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_base64}"}}
                        ]
                    }
                ]
            }
        elif endpoint_type == "rerank":
            url = urljoin(base_url, "v2/rerank")
            payload = {
                "model": model,
                "query": prompt,
                "documents": docs,
                "top_n": len(docs)
            }
        else:
            st.error("❌ Unknown endpoint type")
            st.stop()
        st.expander("📤 Request payload").code(json.dumps(payload, indent=2), language="json")
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        res = response.json()
        # ===== Response Parsing =====
        if endpoint_type == "chat" or endpoint_type == "image":
            output = res["choices"][0]["message"]["content"]
        elif endpoint_type == "completion":
            output = res["choices"][0]["text"]
        elif endpoint_type == "embedding":
            vec = res["data"][0]["embedding"]
            output = f"🔢 Vector dimensions: {len(vec)}"
            st.expander("📐 Vector preview").code(vec[:20])
        elif endpoint_type == "rerank":
            results = res["results"]
            output = "\n\n".join(
                [f"{i+1}. {docs[r['index']]} (score: {r['relevance_score']:.3f})" for i, r in enumerate(results)]
            )
        elif endpoint_type == "reasoning":
            message = res.get("choices", [{}])[0].get("message", {})
            reasoning = message.get("reasoning_content", "❌ No reasoning_content")
            content = message.get("content", "❌ No content")
            output = f"""📘 <b>response:</b><br>{content}<br><br>🧠 <b>Reasoning:</b><br>{reasoning}"""
        st.success("✅ Model response:")
        st.markdown(f"<div style='padding:1rem;background:#f0f0f0;border-radius:8px'>{output}</div>", unsafe_allow_html=True)
        st.expander("📦 View full response").json(res)
    except requests.RequestException as e:
        st.error("❌ Request failed")
        st.code(str(e))
models.json
AIOS model list. Refer to the LLM usage guide to set the model to be used.
[
  "meta-llama/Llama-3.3-70B-Instruct",
  "qwen/Qwen3-30B-A3B",
  "qwen/QwQ-32B",
  "google/gemma-3-27b-it",
  "meta-llama/Llama-4-Scout",
  "meta-llama/Llama-Guard-4-12B",
  "sds/bge-m3",
  "sds/bge-reranker-v2-m3"
]
endpoints.json
Defines the call types of the AIOS models; the input screen and result display differ according to the type.
[
  {
    "label": "Chat Model",
    "path": "/v1/chat/completions",
    "type": "chat"
  },
  {
    "label": "Completion Model",
    "path": "/v1/completions",
    "type": "completion"
  },
  {
    "label": "Embedding Model",
    "path": "/v1/embeddings",
    "type": "embedding"
  },
  {
    "label": "Image Chat Model",
    "path": "/v1/chat/completions",
    "type": "image"
  },
  {
    "label": "Rerank Model",
    "path": "/v2/rerank",
    "type": "rerank"
  },
  {
    "label": "Reasoning Model",
    "path": "/v1/chat/completions",
    "type": "reasoning"
  }
]
Playground usage method
This document covers two ways to run the Playground.
Run on Virtual Server
1. Running Streamlit on a Virtual Server
streamlit run app.py --server.port 8501 --server.address 0.0.0.0
You can now view your Streamlit app in your browser.
URL: http://0.0.0.0:8501
Access http://{your_server_ip}:8501 in the browser, or http://localhost:8501 after setting up SSH tunneling to the server. Refer to the following for SSH tunneling:
2. Accessing Virtual Server through tunneling on a local PC (when accessing http://localhost:8501)
ssh -i {your_pemkey.pem} -L 8501:localhost:8501 ubuntu@{your_server_ip}
Running on SCP Kubernetes Engine
1. Deployment and Service startup
Apply the following YAML to start the Deployment and Service. A container image packaged with the code and Python library files for the Chat Playground tutorial is provided.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: streamlit-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: streamlit
  template:
    metadata:
      labels:
        app: streamlit
    spec:
      containers:
      - name: streamlit-app
        image: aios-zcavifox.scr.private.kr-west1.e.samsungsdscloud.com/tutorial/chat-playground:v1.0
        ports:
        - containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
  name: streamlit-service
spec:
  type: NodePort
  selector:
    app: streamlit
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8501
    nodePort: 30081
kubectl apply -f run.yaml
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
streamlit-deployment-8bfcd5959-6xpx9 1/1 Running 0 17s
$ kubectl logs streamlit-deployment-8bfcd5959-6xpx9
Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
You can now view your Streamlit app in your browser.
URL: http://0.0.0.0:8501
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 172.20.0.1 <none> 443/TCP 46h
streamlit-service NodePort 172.20.95.192 <none> 80:30081/TCP 130m
You can access it in the browser at http://{worker_node_ip}:30081, or at http://localhost:8501 after setting up SSH tunneling to the server. Refer to the following for SSH tunneling.
2. Accessing worker nodes through tunneling on a local PC (when accessing http://localhost:8501)
ssh -i {your_pemkey.pem} -L 8501:{worker_node_ip}:30081 ubuntu@{worker_node_ip}
3. Accessing worker nodes through a relay server by tunneling from a local PC (when accessing http://localhost:8501)
ssh -i {your_pemkey.pem} -L 8501:{worker_node_ip}:30081 ubuntu@{your_server_ip}
Usage example
Main screen composition
| No. | Item | Description |
|---|---|---|
| 1 | Model | The list of callable models set in the models.json file. |
| 2 | Endpoint type | Must be selected to match the model, according to the call types set in the endpoints.json file. |
| 3 | Temperature | Controls the degree of randomness or creativity of the model output. In this tutorial, it is specified in the range 0.00 to 1.00. |
| 4 | Max Tokens | Limits output length by setting the maximum number of tokens that can be generated in the response text. In this tutorial, it is specified in the range 1 to 5000. |
| 5 | Input Area | The way prompts, images, etc. are received varies depending on the endpoint type. |
Calling the Chat Model
Image model calling
Reasoning model calling
Conclusion
This tutorial showed how to build and use a Playground UI for easily testing the various AI model APIs provided by AIOS. You can flexibly customize it to fit the models and endpoint structure of your actual service.
1.3.3.3 - Autogen
Goal
Using the AI model provided by AIOS, create an Autogen AI Agent application.
Autogen is an open-source framework that can easily build and manage LLM-based multi-agent collaboration and event-driven automation workflows.
Environment
To proceed with this tutorial, the following environment must be prepared.
System Environment
- Python 3.10 +
- pip
Required packages for installation
pip install autogen-agentchat==0.6.1 autogen-ext[openai,mcp]==0.6.1 mcp-server-time==0.6.2
System Architecture
The following shows the overall flow of the agent architecture, combining a multi-agent design with MCP.
Travel Planning Agent Flow
- The user requests a 3-day Nepal travel plan
- The GroupChat manager adjusts the execution order of the registered agents (travel planning, local information, language tips, comprehensive summary)
- Each agent collaboratively performs its given tasks according to its role
- Once the final travel plan is produced, it is delivered to the user
MCP Flow
MCP
MCP (Model Context Protocol) is an open standard protocol that coordinates interactions between a model and external data or tools.
An MCP server implements this protocol, using tool metadata to mediate and execute function calls.
- The user asks for the current time in Korea
- The model request includes metadata for the get_current_time tool, provided via the mcp_server_time server
- The model generates a tool-call message invoking the get_current_time function
- The MCP server executes the get_current_time function and passes the result back into the model request, which generates the final response delivered to the user
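The dispatch step in this flow can be sketched with a mocked tool table; the function and message shapes below are hypothetical stand-ins for what the mcp_server_time server exposes, since the real wiring is handled by the MCP server and autogen-ext at runtime:

```python
from datetime import datetime, timezone, timedelta

# Hypothetical stand-in for the tool the mcp_server_time server exposes.
def get_current_time(tz_offset_hours: int = 9) -> str:
    """Return the current time at the given UTC offset (KST = UTC+9)."""
    tz = timezone(timedelta(hours=tz_offset_hours))
    return datetime.now(tz).isoformat()

# Tool table: maps tool names from the model's tool-call message to functions.
TOOLS = {"get_current_time": get_current_time}

def execute_tool_call(tool_call: dict) -> str:
    """Dispatch a model-generated tool-call message to the matching function."""
    return TOOLS[tool_call["name"]](**tool_call.get("arguments", {}))

# A tool-call message as the model might emit it for "current time in Korea".
result = execute_tool_call({"name": "get_current_time", "arguments": {"tz_offset_hours": 9}})
print(result)  # ISO-8601 timestamp, e.g. 2025-01-01T12:00:00.000000+09:00
```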
Implementation
Travel Planning Agent
- Refer to the LLM usage guide for the AIOS_LLM_Private_Endpoint value of AIOS_BASE_URL and the MODEL_ID value of MODEL.
autogen_travel_planning.py
from urllib.parse import urljoin
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_core.models import ModelFamily

# Set the API URL and model name for model access.
AIOS_BASE_URL = "AIOS_LLM_Private_Endpoint"
MODEL = "MODEL_ID"

# Create a model client using OpenAIChatCompletionClient.
model_client = OpenAIChatCompletionClient(
    model=MODEL,
    base_url=urljoin(AIOS_BASE_URL, "v1"),
    api_key="EMPTY_KEY",
    model_info={
        # Set to True if images are supported.
        "vision": False,
        # Set to True if function calls are supported.
        "function_calling": True,
        # Set to True if JSON output is supported.
        "json_output": True,
        # If the model you want to use is not provided by ModelFamily, use UNKNOWN.
        # "family": ModelFamily.UNKNOWN,
        "family": ModelFamily.LLAMA_3_3_70B,
        # Set to True if structured output is supported.
        "structured_output": True,
    },
)

# Create multiple agents.
# Each agent performs a role: travel planning, local activity recommendations, language tips, or summarizing the travel plan.
planner_agent = AssistantAgent(
    "planner_agent",
    model_client=model_client,
    description="A helpful assistant that can plan trips.",
    system_message=("You are a helpful assistant that can suggest a travel plan "
                    "for a user based on their request."),
)
local_agent = AssistantAgent(
    "local_agent",
    model_client=model_client,
    description="A local assistant that can suggest local activities or places to visit.",
    system_message=("You are a helpful assistant that can suggest authentic and "
                    "interesting local activities or places to visit for a user "
                    "and can utilize any context information provided."),
)
language_agent = AssistantAgent(
    "language_agent",
    model_client=model_client,
    description="A helpful assistant that can provide language tips for a given destination.",
    system_message=("You are a helpful assistant that can review travel plans, "
                    "providing feedback on important/critical tips about how best to address "
                    "language or communication challenges for the given destination. "
                    "If the plan already includes language tips, "
                    "you can mention that the plan is satisfactory, with rationale."),
)
travel_summary_agent = AssistantAgent(
    "travel_summary_agent",
    model_client=model_client,
    description="A helpful assistant that can summarize the travel plan.",
    system_message=("You are a helpful assistant that can take in all of the suggestions "
                    "and advice from the other agents and provide a detailed final travel plan. "
                    "You must ensure that the final plan is integrated and complete. "
                    "YOUR FINAL RESPONSE MUST BE THE COMPLETE PLAN. "
                    "When the plan is complete and all perspectives are integrated, "
                    "you can respond with TERMINATE."),
)

# Group the agents and create a RoundRobinGroupChat.
# RoundRobinGroupChat has the agents take turns performing tasks in the order they are registered.
# This group enables the agents to interact and make travel plans.
# The termination condition uses TextMentionTermination to end the group chat when the text "TERMINATE" is mentioned.
termination = TextMentionTermination("TERMINATE")
group_chat = RoundRobinGroupChat(
    [planner_agent, local_agent, language_agent, travel_summary_agent],
    termination_condition=termination,
)

async def main():
    """Main function: runs the group chat and makes travel plans."""
    # Run the group chat to make travel plans.
    # The user requests the task "Plan a 3 day trip to Nepal."
    # Print the results using the console.
    await Console(group_chat.run_stream(task="Plan a 3 day trip to Nepal."))
    await model_client.close()

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
When you run the file with python, you can see multiple agents working together, each performing its role on a single task.
python autogen_travel_planning.py
Execution Result
---------- TextMessage (user) ----------
Plan a 3 day trip to Nepal.
---------- TextMessage (planner_agent) ----------
Nepal! A country with a rich cultural heritage, breathtaking natural beauty, and warm hospitality. Here's a suggested 3-day itinerary for your trip to Nepal:
**Day 1: Arrival in Kathmandu and Exploration of the City**
* Arrive at Tribhuvan International Airport in Kathmandu, the capital city of Nepal.
* Check-in to your hotel and freshen up.
* Visit the famous **Boudhanath Stupa**, one of the largest Buddhist stupas in the world.
* Explore the **Thamel** area, a popular tourist hub known for its narrow streets, shops, and restaurants.
* In the evening, enjoy a traditional Nepali dinner and watch a cultural performance at a local restaurant.
**Day 2: Kathmandu Valley Tour**
* Start the day with a visit to the **Pashupatinath Temple**, a sacred Hindu temple dedicated to Lord Shiva.
* Next, head to the **Kathmandu Durbar Square**, a UNESCO World Heritage Site and the former royal palace of the Malla kings.
* Visit the **Swayambhunath Stupa**, also known as the Monkey Temple, which offers stunning views of the city.
* In the afternoon, take a short drive to the **Patan City**, known for its rich cultural heritage and traditional crafts.
* Explore the **Patan Durbar Square** and visit the **Krishna Temple**, a beautiful example of Nepali architecture.
**Day 3: Bhaktapur and Nagarkot**
* Drive to **Bhaktapur**, a medieval town and a UNESCO World Heritage Site (approximately 1 hour).
* Explore the **Bhaktapur Durbar Square**, which features stunning architecture, temples, and palaces.
* Visit the **Pottery Square**, where you can see traditional pottery-making techniques.
* In the afternoon, drive to **Nagarkot**, a scenic hill station with breathtaking views of the Himalayas (approximately 1.5 hours).
* Watch the sunset over the Himalayas and enjoy the peaceful atmosphere.
**Additional Tips:**
* Make sure to try some local Nepali cuisine, such as momos, dal bhat, and gorkhali lamb.
* Bargain while shopping in the markets, as it's a common practice in Nepal.
* Respect local customs and traditions, especially when visiting temples and cultural sites.
* Stay hydrated and bring sunscreen, as the sun can be strong in Nepal.
**Accommodation:**
Kathmandu has a wide range of accommodation options, from budget-friendly guesthouses to luxury hotels. Some popular areas to stay include Thamel, Lazimpat, and Boudha.
**Transportation:**
You can hire a taxi or a private vehicle for the day to travel between destinations. Alternatively, you can use public transportation, such as buses or microbuses, which are affordable and convenient.
**Budget:**
The budget for a 3-day trip to Nepal can vary depending on your accommodation choices, transportation, and activities. However, here's a rough estimate:
* Accommodation: $20-50 per night
* Transportation: $10-20 per day
* Food: $10-20 per meal
* Activities: $10-20 per person
Total estimated budget for 3 days: $200-500 per person
I hope this helps, and you have a wonderful trip to Nepal!
---------- TextMessage (local_agent) ----------
Your 3-day itinerary for Nepal is well-planned and covers many of the country's cultural and natural highlights. Here are a few additional suggestions and tips to enhance your trip:
**Day 1:**
* After visiting the Boudhanath Stupa, consider exploring the surrounding streets, which are filled with Tibetan shops, restaurants, and monasteries.
* In the Thamel area, be sure to try some of the local street food, such as momos or sel roti.
* For dinner, consider trying a traditional Nepali restaurant, such as the Kathmandu Guest House or the Northfield Cafe.
**Day 2:**
* At the Pashupatinath Temple, be respectful of the Hindu rituals and customs. You can also take a stroll along the Bagmati River, which runs through the temple complex.
* At the Kathmandu Durbar Square, consider hiring a guide to provide more insight into the history and significance of the temples and palaces.
* In the afternoon, visit the Patan Museum, which showcases the art and culture of the Kathmandu Valley.
**Day 3:**
* In Bhaktapur, be sure to try some of the local pottery and handicrafts. You can also visit the Bhaktapur National Art Gallery, which features traditional Nepali art.
* At Nagarkot, consider taking a short hike to the nearby villages, which offer stunning views of the Himalayas.
* For sunset, find a spot with a clear view of the mountains, and enjoy the peaceful atmosphere.
**Additional Tips:**
* Nepal is a relatively conservative country, so dress modestly and respect local customs.
* Try to learn some basic Nepali phrases, such as "namaste" (hello) and "dhanyabaad" (thank you).
* Be prepared for crowds and chaos in the cities, especially in Thamel and Kathmandu Durbar Square.
* Consider purchasing a local SIM card or portable Wi-Fi hotspot to stay connected during your trip.
**Accommodation:**
* Consider staying in a hotel or guesthouse that is centrally located and has good reviews.
* Look for accommodations that offer amenities such as free Wi-Fi, hot water, and a restaurant or cafe.
**Transportation:**
* Consider hiring a private vehicle or taxi for the day, as this will give you more flexibility and convenience.
* Be sure to negotiate the price and agree on the itinerary before setting off.
**Budget:**
* Be prepared for variable prices and exchange rates, and have some local currency (Nepali rupees) on hand.
* Consider budgeting extra for unexpected expenses, such as transportation or food.
Overall, your itinerary provides a good balance of culture, history, and natural beauty, and with these additional tips and suggestions, you'll be well-prepared for an unforgettable trip to Nepal!
---------- TextMessage (language_agent) ----------
Your 3-day itinerary for Nepal is well-planned and covers many of the country's cultural and natural highlights. The additional suggestions and tips you provided are excellent and will help enhance the trip experience.
One aspect that is well-covered in your plan is the cultural and historical significance of the destinations. You have included a mix of temples, stupas, and cultural sites, which will give visitors a good understanding of Nepal's rich heritage.
Regarding language and communication challenges, your tip to "try to learn some basic Nepali phrases, such as 'namaste' (hello) and 'dhanyabaad' (thank you)" is excellent. This will help visitors show respect for the local culture and people, and can also facilitate interactions with locals.
Additionally, your suggestion to "consider purchasing a local SIM card or portable Wi-Fi hotspot to stay connected during your trip" is practical and will help visitors stay in touch with family and friends back home, as well as navigate the local area.
Your plan is satisfactory, and with the additional tips and suggestions, visitors will be well-prepared for an unforgettable trip to Nepal. The itinerary provides a good balance of culture, history, and natural beauty, and the tips on language, communication, and logistics will help ensure a smooth and enjoyable journey.
Overall, your plan is well-thought-out, and the additional suggestions and tips will help visitors make the most of their trip to Nepal. Well done!
However, one minor suggestion I might make is to consider including a few phrases in the local language for emergency situations, such as "where is the hospital?" or "how do I get to the airport?" This can help visitors in case of an unexpected situation, and can also give them more confidence when navigating unfamiliar areas.
But overall, your plan is excellent, and with these minor suggestions, it can be even more comprehensive and helpful for visitors to Nepal.
---------- TextMessage (travel_summary_agent) ----------
TERMINATE
Here is the complete and integrated 3-day travel plan to Nepal:
**Day 1: Arrival in Kathmandu and Exploration of the City**
* Arrive at Tribhuvan International Airport in Kathmandu, the capital city of Nepal.
* Check-in to your hotel and freshen up.
* Visit the famous **Boudhanath Stupa**, one of the largest Buddhist stupas in the world.
* Explore the surrounding streets, which are filled with Tibetan shops, restaurants, and monasteries.
* Explore the **Thamel** area, a popular tourist hub known for its narrow streets, shops, and restaurants. Be sure to try some of the local street food, such as momos or sel roti.
* In the evening, enjoy a traditional Nepali dinner and watch a cultural performance at a local restaurant, such as the Kathmandu Guest House or the Northfield Cafe.
**Day 2: Kathmandu Valley Tour**
* Start the day with a visit to the **Pashupatinath Temple**, a sacred Hindu temple dedicated to Lord Shiva. Be respectful of the Hindu rituals and customs, and take a stroll along the Bagmati River, which runs through the temple complex.
* Next, head to the **Kathmandu Durbar Square**, a UNESCO World Heritage Site and the former royal palace of the Malla kings. Consider hiring a guide to provide more insight into the history and significance of the temples and palaces.
* Visit the **Swayambhunath Stupa**, also known as the Monkey Temple, which offers stunning views of the city.
* In the afternoon, visit the **Patan City**, known for its rich cultural heritage and traditional crafts. Explore the **Patan Durbar Square** and visit the **Krishna Temple**, a beautiful example of Nepali architecture. Also, visit the Patan Museum, which showcases the art and culture of the Kathmandu Valley.
**Day 3: Bhaktapur and Nagarkot**
* Drive to **Bhaktapur**, a medieval town and a UNESCO World Heritage Site (approximately 1 hour). Explore the **Bhaktapur Durbar Square**, which features stunning architecture, temples, and palaces. Be sure to try some of the local pottery and handicrafts, and visit the Bhaktapur National Art Gallery, which features traditional Nepali art.
* In the afternoon, drive to **Nagarkot**, a scenic hill station with breathtaking views of the Himalayas (approximately 1.5 hours). Consider taking a short hike to the nearby villages, which offer stunning views of the Himalayas. Find a spot with a clear view of the mountains, and enjoy the peaceful atmosphere during sunset.
**Additional Tips:**
* Make sure to try some local Nepali cuisine, such as momos, dal bhat, and gorkhali lamb.
* Bargain while shopping in the markets, as it's a common practice in Nepal.
* Respect local customs and traditions, especially when visiting temples and cultural sites.
* Stay hydrated and bring sunscreen, as the sun can be strong in Nepal.
* Dress modestly and respect local customs, as Nepal is a relatively conservative country.
* Try to learn some basic Nepali phrases, such as "namaste" (hello), "dhanyabaad" (thank you), "where is the hospital?" and "how do I get to the airport?".
* Consider purchasing a local SIM card or portable Wi-Fi hotspot to stay connected during your trip.
* Be prepared for crowds and chaos in the cities, especially in Thamel and Kathmandu Durbar Square.
**Accommodation:**
* Consider staying in a hotel or guesthouse that is centrally located and has good reviews.
* Look for accommodations that offer amenities such as free Wi-Fi, hot water, and a restaurant or cafe.
**Transportation:**
* Consider hiring a private vehicle or taxi for the day, as this will give you more flexibility and convenience.
* Be sure to negotiate the price and agree on the itinerary before setting off.
**Budget:**
* The budget for a 3-day trip to Nepal can vary depending on your accommodation choices, transportation, and activities. However, here's a rough estimate:
+ Accommodation: $20-50 per night
+ Transportation: $10-20 per day
+ Food: $10-20 per meal
+ Activities: $10-20 per person
* Total estimated budget for 3 days: $200-500 per person
* Be prepared for variable prices and exchange rates, and have some local currency (Nepali rupees) on hand.
* Consider budgeting extra for unexpected expenses, such as transportation or food.
Agent-specific conversation summary
| Agent | Conversation summary |
|---|---|
| planner_agent | Proposes a 3-day travel itinerary for Nepal. Additional tips: respect local customs, try local food, transportation options, etc. |
| local_agent | Provides additional suggestions and tips based on planner_agent's 3-day itinerary: respect local customs, learn basic Nepali, use local facilities, etc. |
| language_agent | Evaluates the itinerary and provides additional suggestions: learning basic Nepali, using local facilities, language preparation for emergencies, etc. |
| travel_summary_agent | Summarizes the overall 3-day travel plan. Additional tips: respect local customs, try local food, transportation options, etc. |
MCP Utilization Agent
- For the AIOS_LLM_Private_Endpoint value of AIOS_BASE_URL and the MODEL_ID value of MODEL, refer to the LLM usage guide.
autogen_mcp.py
from urllib.parse import urljoin

from autogen_core.models import ModelFamily
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import McpWorkbench, StdioServerParams
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.ui import Console

# Set the API URL and model name for model access.
AIOS_BASE_URL = "AIOS_LLM_Private_Endpoint"
MODEL = "MODEL_ID"

# Create a model client using OpenAIChatCompletionClient.
model_client = OpenAIChatCompletionClient(
    model=MODEL,
    base_url=urljoin(AIOS_BASE_URL, "v1"),
    api_key="EMPTY_KEY",
    model_info={
        # Set to True if images are supported.
        "vision": False,
        # Set to True if function calls are supported.
        "function_calling": True,
        # Set to True if JSON output is supported.
        "json_output": True,
        # If the model you want to use is not provided by ModelFamily, use UNKNOWN.
        # "family": ModelFamily.UNKNOWN,
        "family": ModelFamily.LLAMA_3_3_70B,
        # Set to True if structured output is supported.
        "structured_output": True,
    },
)

# Set MCP server parameters.
# mcp_server_time is an MCP server implemented in Python.
# It provides the get_current_time function, which returns the current time,
# and the convert_time function, which converts between time zones.
# The --local-timezone argument sets the MCP server's local time zone.
# For example, "Asia/Seoul" returns the time in the Korean time zone.
mcp_server_params = StdioServerParams(
    command="python",
    args=["-m", "mcp_server_time", "--local-timezone", "Asia/Seoul"],
)

async def main():
    """Runs the agent that checks the time using the MCP workbench."""
    # Create and run an agent that checks the time using the MCP workbench.
    # The agent performs the task "What time is it now in South Korea?"
    # and prints the results to the console in streaming mode.
    # When the MCP workbench shuts down, the agent terminates as well.
    async with McpWorkbench(mcp_server_params) as workbench:
        time_agent = AssistantAgent(
            "time_assistant",
            model_client=model_client,
            workbench=workbench,
            reflect_on_tool_use=True,
        )
        await Console(time_agent.run_stream(task="What time is it now in South Korea?"))
        await model_client.close()

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
When you run the file using python, it fetches the tools' metadata from the MCP server and calls the model; when the model generates a tool-call message, you can see that the get_current_time function is executed to retrieve the current time.
python autogen_mcp.py
Execution result
# TextMessage (user): Input message given by the user
---------- TextMessage (user) ----------
What time is it now in South Korea?
# Query metadata of tools that can be used on the MCP server
INFO:mcp.server.lowlevel.server:Processing request of type ListToolsRequest
...omission...
INFO:autogen_core.events:{
# Metadata of tools available on the MCP server
"tools": [
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Get current time in a specific timezones",
"parameters": {
"type": "object",
"properties": {
"timezone": {
"type": "string",
"description": "IANA timezone name (e.g., 'America/New_York', 'Europe/London'). Use 'Asia/Seoul' as local timezone if no timezone provided by the user."
}
},
"required": [
"timezone"
],
"additionalProperties": false
},
"strict": false
}
},
{
"type": "function",
"function": {
"name": "convert_time",
"description": "Convert time between timezones",
"parameters": {
"type": "object",
"properties": {
"source_timezone": {
"type": "string",
"description": "Source IANA timezone name (e.g., 'America/New_York', 'Europe/London'). Use 'Asia/Seoul' as local timezone if no source timezone provided by the user."
},
"time": {
"type": "string",
"description": "Time to convert in 24-hour format (HH:MM)"
},
"target_timezone": {
"type": "string",
"description": "Target IANA timezone name (e.g., 'Asia/Tokyo', 'America/San_Francisco'). Use 'Asia/Seoul' as local timezone if no target timezone provided by the user."
}
},
"required": [
"source_timezone",
"time",
"target_timezone"
],
"additionalProperties": false
},
"strict": false
}
}
],
"type": "LLMCall",
# input message
"messages": [
{
"content": "You are a helpful AI assistant. Solve tasks using your tools. Reply with TERMINATE when the task has been completed.",
"role": "system"
},
{
"role": "user",
"name": "user",
"content": "What time is it now in South Korea?"
}
],
# Model Response
"response": {
"id": "chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"choices": [
{
"finish_reason": "tool_calls",
"index": 0,
"logprobs": null,
"message": {
"content": null,
"refusal": null,
"role": "assistant",
"annotations": null,
"audio": null,
"function_call": null,
"tool_calls": [
{
"id": "chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"function": {
"arguments": "{\"timezone\": \"Asia/Seoul\"}",
"name": "get_current_time"
},
"type": "function"
}
],
"reasoning_content": null
},
"stop_reason": 128008
}
],
"created": 1751278737,
"model": "MODEL_ID",
"object": "chat.completion",
"service_tier": null,
"system_fingerprint": null,
"usage": {
"completion_tokens": 21,
"prompt_tokens": 508,
"total_tokens": 529,
"completion_tokens_details": null,
"prompt_tokens_details": null
},
"prompt_logprobs": null
},
"prompt_tokens": 508,
"completion_tokens": 21,
"agent_id": null
}
# ToolCallRequestEvent: Receiving a tool call message from the model
---------- ToolCallRequestEvent (time_assistant) ----------
[FunctionCall(id='chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', arguments='{"timezone": "Asia/Seoul"}', name='get_current_time')]
INFO:mcp.server.lowlevel.server:Processing request of type ListToolsRequest
# Execute function of tool call message via MCP server
INFO:mcp.server.lowlevel.server:Processing request of type CallToolRequest
# ToolCallExecutionEvent: Deliver the function execution result to the model
---------- ToolCallExecutionEvent (time_assistant) ----------
[FunctionExecutionResult(content='{
"timezone": "Asia/Seoul",
"datetime": "2025-06-30T19:18:58+09:00",
"is_dst": false
}', name='get_current_time', call_id='chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', is_error=False)]
...omission...
# TextMessage (time_assistant): Final answer generated by the model
---------- TextMessage (time_assistant) ----------
The current time in South Korea is 19:18:58 KST.
TERMINATE
MCP Server Time Query System Log Analysis
The following is an analysis of the execution logs of the MCP (Model Context Protocol) server-based time query system.
Request Information
| Item | Content |
|---|---|
| User request | What time is it now in South Korea? |
| Request Time | 2025-06-30 19:18:58 KST |
| Processing method | MCP server tool call |
Available tools
| Tool Name | Description | Parameter | Default Value |
|---|---|---|---|
| get_current_time | Retrieve current time of a specific timezone | timezone (IANA timezone name) | Asia/Seoul |
| convert_time | Time conversion between time zones | source_timezone, time, target_timezone | Asia/Seoul |
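The convert_time tool listed above can be approximated with a standard-library-only sketch. This is a simplified stand-in for illustration, not the actual mcp_server_time implementation:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def convert_time(source_timezone: str, time: str, target_timezone: str) -> str:
    """Stdlib stand-in for the convert_time tool: converts HH:MM between IANA zones."""
    hour, minute = map(int, time.split(":"))
    # Anchor the wall-clock time to today's date in the source zone.
    src = datetime.now(ZoneInfo(source_timezone)).replace(
        hour=hour, minute=minute, second=0, microsecond=0
    )
    return src.astimezone(ZoneInfo(target_timezone)).strftime("%H:%M")

# Asia/Seoul is UTC+9 with no daylight saving time, so 19:00 KST is 10:00 UTC.
print(convert_time("Asia/Seoul", "19:00", "UTC"))  # 10:00
```

The real MCP tool additionally validates the timezone names and returns a structured JSON payload rather than a bare string.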
Processing steps
| Step | Action | Details |
|---|---|---|
| 1 | Tool metadata lookup | Retrieve the list of tools available on the MCP server |
| 2 | AI model response | get_current_time function called in the Asia/Seoul timezone |
| 3 | Function execution | MCP server runs time lookup tool |
| 4 | Return result | Provide time information in structured JSON format |
| 5 | Final Answer | Deliver time to the user in an easy-to-read format |
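Steps 2 through 4 above (dispatching the model's tool-call message by name and returning structured JSON) can be sketched without a running MCP server. The dispatch table and get_current_time below are simplified stand-ins, not the actual MCP machinery:

```python
import json
from datetime import datetime
from zoneinfo import ZoneInfo

def get_current_time(timezone: str) -> dict:
    """Simplified stand-in for the MCP server's get_current_time tool."""
    now = datetime.now(ZoneInfo(timezone))
    return {
        "timezone": timezone,
        "datetime": now.isoformat(timespec="seconds"),
        "is_dst": bool(now.dst()),
    }

# Step 2: the model responds with a tool-call message (function name + JSON arguments).
tool_call = {"name": "get_current_time",
             "arguments": "{\"timezone\": \"Asia/Seoul\"}"}

# Step 3: look up the function by name and execute it with the parsed arguments.
tools = {"get_current_time": get_current_time}
result = tools[tool_call["name"]](**json.loads(tool_call["arguments"]))

# Step 4: return the result as structured JSON, as the MCP server does.
print(json.dumps(result))
```

In the real flow, the MCP server performs the lookup and execution, and the agent relays the JSON result back to the model for the final answer (step 5).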
Function Call Details
| Item | Value |
|---|---|
| Function name | get_current_time |
| Parameter | {"timezone": "Asia/Seoul"} |
| Call ID | chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
| type | function |
Execution result
| Field | Value | Description |
|---|---|---|
| timezone | Asia/Seoul | Time zone |
| datetime | 2025-06-30T19:18:58+09:00 | ISO 8601 format time |
| is_dst | false | Whether daylight saving time is in effect |
Final response
| Item | Content |
|---|---|
| Response Message | The current time in South Korea is 19:18:58 KST. |
| Completion mark | TERMINATE |
| Response Time | 19:18:58 KST |
Usage metric table
| Indicator | Value |
|---|---|
| Prompt tokens | 508 |
| Completion tokens | 21 |
| Total token usage | 529 |
| Processing time | Immediate (real-time) |
Main features
| Feature | Description |
|---|---|
| MCP protocol utilization | Smooth integration with external tools |
| Korean time zone default setting | Asia/Seoul used as default |
| Structured response | Clear data return in JSON format |
| Auto-complete display | Work completion notification with TERMINATE |
| Real-time information provision | Accurate current time lookup |
Technical significance
This is an example of a modern architecture where an AI assistant integrates with external systems to provide real-time information. Through MCP, the AI model can access various external tools and services, enabling more practical and dynamic responses.
Conclusion
In this tutorial, we implemented an application that creates travel itineraries using multiple agents by leveraging the AI model provided by AIOS and autogen, and an agent application that can use external tools by utilizing the MCP server. Through this, we learned that problems can be solved from multiple angles using several agents with different perspectives, and external tools can be utilized. This system can be expanded and customized to fit user environments in the following ways.
- Agent flow control: Various techniques can be used to select which agent performs each task. For reproducible results, you can fix the order of agents; for flexible processing, you can let the AI model choose the next agent. You can also use event-driven techniques so that multiple agents process tasks in parallel.
- Introduction of various MCP servers: Beyond mcp_server_time, many MCP servers have already been implemented. By utilizing them, the AI model can flexibly use a variety of external tools to build useful applications.
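As a rough illustration of the first point, the two selection strategies can be sketched in plain Python. This is not the Autogen API (a real implementation would use Autogen's team abstractions); the agents here are simple functions and the chooser stands in for the AI model:

```python
from typing import Callable

# An agent is modeled as a function from a task string to a response string.
Agent = Callable[[str], str]

def round_robin(agents: list[Agent], task: str) -> list[str]:
    """Fixed order: every agent responds in turn, giving reproducible results."""
    return [agent(task) for agent in agents]

def select_and_run(agents: dict[str, Agent], choose: Callable[[str], str], task: str) -> str:
    """Flexible order: a chooser (standing in for the AI model) picks one agent."""
    return agents[choose(task)](task)

planner = lambda task: f"planner: 3-day itinerary for {task}"
reviewer = lambda task: f"reviewer: feedback on the {task} plan"

print(round_robin([planner, reviewer], "Nepal"))
print(select_and_run({"planner": planner, "reviewer": reviewer},
                     lambda task: "planner", "Nepal"))
```

The fixed-order variant trades flexibility for predictability; the selector variant delegates routing to the model, which suits open-ended tasks.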
Based on this tutorial, we hope you will directly build a suitable AIOS-based collaborative assistant according to the actual service purpose.
Reference link
https://microsoft.github.io/autogen
https://modelcontextprotocol.io/
https://github.com/modelcontextprotocol/servers
1.3.4 - Request Examples
SDKs Supported by API
AIOS models are compatible with OpenAI and Cohere APIs, so they are also compatible with OpenAI and Cohere SDKs. The following is the list of OpenAI and Cohere compatible APIs supported by Samsung Cloud Platform AIOS service.
| API Name | Description |
|---|---|
| Text Completion API | Generates natural sentences that follow the given string as input. |
| Chat Completion API | Generates a response that follows the conversation history. |
| Embeddings API | Converts text to high-dimensional vectors (embeddings), which can be used for various natural language processing (NLP) tasks such as text similarity calculation, clustering, and search. |
| Rerank API | Predicts the relevance between a single query and each item in a list of documents by applying embedding models or cross-encoder models. |
- The Request Examples guide is written based on a Virtual Server environment with Python/NodeJS/Go runtime environments configured.
- Actual execution may result in different token counts and message content compared to the examples.
Package Installation
You can install SDK packages that support AIOS model API requests according to your execution environment.
pip install requests openai cohere \
langchain langchain-openai langchain-cohere langchain-together
npm install openai cohere-ai langchain \
@langchain/core @langchain/openai @langchain/cohere
go get github.com/openai/openai-go \
github.com/cohere-ai/cohere-go/v2
Text Completion API
The Text Completion API generates natural sentences that immediately follow the given string as input.
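A note on URL construction: the examples below combine the endpoint and the API path with urljoin, which drops the last path segment of the base URL unless it ends with a slash. A quick check (the hostname is hypothetical):

```python
from urllib.parse import urljoin

# With a trailing slash, the base path is kept.
print(urljoin("https://aios.example.com/api/", "v1/completions"))
# -> https://aios.example.com/api/v1/completions

# Without it, the last segment of the base path is replaced.
print(urljoin("https://aios.example.com/api", "v1/completions"))
# -> https://aios.example.com/v1/completions
```

If your endpoint URL contains a path component, make sure it ends with "/" before passing it to urljoin.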
Non-stream request
Request
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Configure the request data.
# This includes the model ID to use and the prompt.
data = {
"model": model,
"prompt": "Hi"
}
# Send a POST request to the AIOS API's v1/completions endpoint.
# Use urljoin function to combine the base URL and endpoint path.
response = requests.post(urljoin(aios_base_url, "v1/completions"), json=data)
# Parse the response body in JSON format.
body = json.loads(response.text)
# response.choices[0].text is the response text generated by the AI model.
print(body["choices"][0]["text"])
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an OpenAI client.
# base_url points to the v1 endpoint of the AIOS API,
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Generate a completion using the AIOS model.
# model parameter specifies the model ID to use,
# prompt parameter is the input text to provide to the AI.
response = client.completions.create(
model=model,
prompt="Hi"
)
# response.choices[0].text is the response text generated by the AI model.
print(response.choices[0].text)
from langchain_openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an LLM (Large Language Model) instance using LangChain's OpenAI class.
# base_url points to the v1 endpoint of the AIOS API,
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
# model parameter specifies the model ID to use.
llm = OpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
# Pass the prompt "Hi" to the LLM and receive a response.
# The invoke method returns the model's output.
print(llm.invoke("Hi"))
const aios_base_url = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>" // Enter the model ID for calling the AIOS model.
// Configure the request data.
// This includes the model ID to use and the prompt.
const data = {
model: model,
prompt: "Hi",
};
// Create the AIOS API's v1/completions endpoint URL.
let url = new URL("/v1/completions", aios_base_url);
// Send a POST request to the AIOS API.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
// Parse the response body in JSON format.
const body = await response.json();
// response.choices[0].text is the response text generated by the AI model.
console.log(body.choices[0].text);
import OpenAI from "openai";
const aios_base_url = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>" // Enter the model ID for calling the AIOS model.
// Create an OpenAI client.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// baseURL points to the v1 endpoint of the AIOS API.
const client = new OpenAI({
apiKey: "EMPTY_KEY",
baseURL: new URL("v1", aios_base_url).href,
});
// Generate a completion using the AIOS model.
// model parameter specifies the model ID to use,
// prompt parameter is the input text to provide to the AI.
const completions = await client.completions.create({
model: model,
prompt: "Hi",
});
// response.choices[0].text is the response text generated by the AI model.
console.log(completions.choices[0].text);
import { OpenAI } from "@langchain/openai";
const aios_base_url = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>" // Enter the model ID for calling the AIOS model.
// Create an LLM (Large Language Model) instance using LangChain's OpenAI class.
// model parameter specifies the model ID to use.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// configuration.baseURL points to the v1 endpoint of the AIOS API.
const llm = new OpenAI({
model: model,
apiKey: "EMPTY_KEY",
configuration: {
baseURL: new URL("v1", aios_base_url).href,
},
});
// Pass the prompt "Hi" to the LLM and receive a response.
// The invoke method returns the generated text once the model has finished.
const completion = await llm.invoke("Hi");
// Output the generated response.
// This text is the response generated by the AI model.
console.log(completion);
package main
import (
"bytes"
"encoding/json"
"fmt"
"io"
"net/http"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define the data structure to be used for POST requests.
// Model: Model ID to use
// Prompt: Input text to provide to the AI
// Stream: Whether to stream response (optional)
type PostData struct {
Model string `json:"model"`
Prompt string `json:"prompt"`
Stream bool `json:"stream,omitempty"`
}
func main() {
// Create request data.
data := PostData{
Model: model,
Prompt: "Hi",
}
// Marshal data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v1/completions endpoint.
response, err := http.Post(aiosBaseUrl + "/v1/completions", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// Read the entire response body.
body, err := io.ReadAll(response.Body)
if err != nil {
panic(err)
}
var v map[string]interface{}
json.Unmarshal(body, &v)
// Extract the choices array from the response.
choices := v["choices"].([]interface{})
// Extract the first data's text.
choice := choices[0].(map[string]interface{})
text := choice["text"]
// Output the response text generated by the AI model.
fmt.Println(text)
}
package main
import (
"context"
"fmt"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
"github.com/openai/openai-go/packages/param"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
func main() {
// Create an OpenAI client.
// Use option.WithBaseURL to set the v1 endpoint of the AIOS API.
client := openai.NewClient(
option.WithBaseURL(aiosBaseUrl+"/v1"),
)
// Generate a completion using the AIOS model.
// Use openai.CompletionNewParams to set the model and prompt.
completion, err := client.Completions.New(context.TODO(), openai.CompletionNewParams{
Model: openai.CompletionNewParamsModel(model),
Prompt: openai.CompletionNewParamsPromptUnion{OfString: param.Opt[string]{Value: "Hi"}},
})
if err != nil {
panic(err)
}
// completion.Choices[0].Text is the response text generated by the AI model.
fmt.Println(completion.Choices[0].Text)
}
Response
You can see that the model’s answer is included in the text field of choices.
future president of the United States, I hope you're doing well. As a
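The text field sits inside a larger response body. The sketch below shows the overall shape, assuming the standard OpenAI-compatible completions format; the exact field names are illustrative and should be checked against the actual AIOS response.

```python
# Example response body in the OpenAI-compatible completions format.
# Field names are assumed from OpenAI compatibility, not confirmed from AIOS.
body = {
    "object": "text_completion",
    "model": "<<model>>",
    "choices": [
        {
            "index": 0,
            "text": " there! How can I help you today?",
            "finish_reason": "length",
        }
    ],
    "usage": {"prompt_tokens": 1, "completion_tokens": 10, "total_tokens": 11},
}

# The generated answer lives in choices[0]["text"], as in the examples above.
print(body["choices"][0]["text"])

# finish_reason indicates whether generation stopped naturally ("stop")
# or hit the token limit ("length").
print(body["choices"][0]["finish_reason"])
```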
stream request
Using the stream feature, you can receive responses token by token as the model generates tokens, without waiting for the model to complete the entire response.
Request
Set the stream parameter to True.
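Under the hood, the streamed response arrives as server-sent events: each chunk is a `data: {...}` line, and the stream ends with a `data: [DONE]` line. A minimal, self-contained sketch of decoding one such line (the payload shown is illustrative; real chunks carry more fields):

```python
import json

# One server-sent-event line as it appears in a streaming response.
line = 'data: {"choices": [{"index": 0, "text": "Hi"}]}'

# Strip the "data: " prefix before parsing -- the same step the
# requests-based example below performs for every line.
payload = line[len("data: "):]

# The end-of-stream marker is not JSON, so it must be checked first.
if payload.strip() != "[DONE]":
    chunk = json.loads(payload)
    # Each chunk carries one generated token in choices[0]["text"].
    print(chunk["choices"][0]["text"])
```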
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Configure the request data.
# This includes the model ID to use, the prompt, and whether to stream.
data = {
"model": model,
"prompt": "Hi",
"stream": True
}
# Send a POST request to the AIOS API's v1/completions endpoint.
# Set stream=True to receive real-time streaming responses.
response = requests.post(urljoin(aios_base_url, "v1/completions"), json=data, stream=True)
# You can receive responses as the model generates tokens.
# The response body is sent line by line, so process it with iter_lines().
for line in response.iter_lines():
if line:
try:
# Remove the 'data: ' prefix and parse the JSON data.
body = json.loads(line[len("data: "):])
# response.choices[0].text is the response text generated by the AI model.
print(body["choices"][0]["text"])
except:
pass
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an OpenAI client.
# base_url points to the v1 endpoint of the AIOS API,
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Generate a completion using the AIOS model.
# model parameter specifies the model ID to use,
# prompt parameter is the input text to provide to the AI.
# Set stream=True to receive real-time streaming responses.
response = client.completions.create(
model=model,
prompt="Hi",
stream=True
)
# You can receive responses as the model generates tokens.
# response is sent in stream format, so you can process it iteratively.
for chunk in response:
# Each chunk's choices[0].text is the response text generated by the AI model.
print(chunk.choices[0].text)
from langchain_openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an LLM (Large Language Model) instance using LangChain's OpenAI class.
# base_url points to the v1 endpoint of the AIOS API,
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
# model parameter specifies the model ID to use.
llm = OpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
# Pass the prompt "Hi" to the LLM and receive a streaming response.
# The stream method returns a stream that generates tokens in real-time.
response = llm.stream("Hi")
# You can receive responses as the model generates tokens.
# response is sent in stream format, so you can process it iteratively.
for chunk in response:
# Output each chunk.
# This chunk is the response token generated by the AI model.
print(chunk)
const aios_base_url = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>" // Enter the model ID for calling the AIOS model.
// Configure the request data.
// This includes the model ID to use, the prompt, and whether to stream.
const data = {
model: model,
prompt: "Hi",
stream: true,
};
// Create the AIOS API's v1/completions endpoint URL.
let url = new URL("/v1/completions", aios_base_url);
// Send a POST request to the AIOS API.
// Set stream: true to receive real-time streaming responses.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
// You can receive responses as the model generates tokens.
// Convert the response body to a text decoder stream and read it.
const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
let buf = "";
while (true) {
const { value, done } = await reader.read();
if (done) break;
// Add received data to buffer.
buf += value;
let sep;
// Find the event separator (\n\n) in the buffer and split out complete events.
while ((sep = buf.indexOf("\n\n")) >= 0) {
const data = buf.slice(0, sep);
buf = buf.slice(sep + 2);
// Process each line.
for (const rawLine of data.split("\n")) {
const line = rawLine.trim();
if (!line.startsWith("data: ")) continue;
// Remove the "data: " prefix and extract JSON data.
const payload = line.slice("data: ".length).trim();
if (payload === "[DONE]") break;
// Parse the JSON data.
const json = JSON.parse(payload);
// choices[0].text is the response text generated by the AI model.
console.log(json.choices[0].text);
}
}
}
import OpenAI from "openai";
const aios_base_url = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>" // Enter the model ID for calling the AIOS model.
// Create an OpenAI client.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// baseURL points to the v1 endpoint of the AIOS API.
const client = new OpenAI({
apiKey: "EMPTY_KEY",
baseURL: new URL("v1", aios_base_url).href,
});
// Generate a completion using the AIOS model.
// model parameter specifies the model ID to use,
// prompt parameter is the input text to provide to the AI.
// Set stream: true to receive real-time streaming responses.
const completions = await client.completions.create({
model: model,
prompt: "Hi",
stream: true,
});
// You can receive responses as the model generates tokens.
// Use for await...of loop to sequentially process stream events.
for await (const event of completions) {
// Each event's choices[0].text is the response text generated by the AI model.
console.log(event.choices[0].text);
}
import { OpenAI } from "@langchain/openai";
const aios_base_url = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>" // Enter the model ID for calling the AIOS model.
// Create an LLM (Large Language Model) instance using LangChain's OpenAI class.
// model parameter specifies the model ID to use.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// configuration.baseURL points to the v1 endpoint of the AIOS API.
const llm = new OpenAI({
model: model,
apiKey: "EMPTY_KEY",
configuration: {
baseURL: new URL("v1", aios_base_url).href,
},
});
// Pass the prompt "Hi" to the LLM and receive a streaming response.
// The stream method returns a stream that generates tokens in real-time.
const completion = await llm.stream("Hi");
// You can receive responses as the model generates tokens.
// Use for await...of loop to sequentially process stream chunks.
for await (const chunk of completion) {
// Output each chunk.
// This chunk is the response token generated by the AI model.
console.log(chunk);
}
package main
import (
"bufio"
"bytes"
"encoding/json"
"fmt"
"net/http"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define the data structure to be used for POST requests.
// Model: Model ID to use
// Prompt: Input text to provide to the AI
// Stream: Whether to stream response (optional)
type PostData struct {
Model string `json:"model"`
Prompt string `json:"prompt"`
Stream bool `json:"stream,omitempty"`
}
func main() {
// Create request data.
// Set Stream: true to receive real-time streaming responses.
data := PostData{
Model: model,
Prompt: "Hi",
Stream: true,
}
// Marshal data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v1/completions endpoint.
response, err := http.Post(aiosBaseUrl+"/v1/completions", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// You can receive responses as the model generates tokens.
// Scan the HTTP response body and process line by line.
var v map[string]interface{}
scanner := bufio.NewScanner(response.Body)
for scanner.Scan() {
line := bytes.TrimSpace(scanner.Bytes())
// Skip lines that don't start with "data: ".
if !bytes.HasPrefix(line, []byte("data: ")) {
continue
}
// Remove the "data: " prefix.
payload := bytes.TrimPrefix(line, []byte("data: "))
// If payload is "[DONE]", end streaming.
if bytes.Equal(payload, []byte("[DONE]")) {
break
}
// Parse the JSON data.
json.Unmarshal(payload, &v)
// Extract the choices array from the response.
choices := v["choices"].([]interface{})
// Extract the first data.
choice := choices[0].(map[string]interface{})
// Extract the response token generated by the AI model.
text := choice["text"]
fmt.Println(text)
}
}
package main
import (
"context"
"fmt"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
"github.com/openai/openai-go/packages/param"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
func main() {
// Create an OpenAI client.
// Use option.WithBaseURL to set the v1 endpoint of the AIOS API.
client := openai.NewClient(
option.WithBaseURL(aiosBaseUrl + "/v1"),
)
// Generate a streaming completion using the AIOS model.
// Use openai.CompletionNewParams to set the model and prompt.
completion := client.Completions.NewStreaming(context.TODO(), openai.CompletionNewParams{
Model: openai.CompletionNewParamsModel(model),
Prompt: openai.CompletionNewParamsPromptUnion{OfString: param.Opt[string]{Value: "Hi"}},
})
// You can receive responses as the model generates tokens.
// The Next() method returns true when there is a next chunk.
for completion.Next() {
// Get the choices slice of the current chunk.
chunk := completion.Current().Choices
// choices[0].text is the response text generated by the AI model.
fmt.Println(chunk[0].Text)
}
}
Response
The answer is generated token by token, and each token can be checked in the text field of choices.
I
'm
looking
for
a
way
to
check
if
a
specific
process
is
running
on
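The token-by-token output above is often easier to use once the chunks are joined back into a single string. A short self-contained sketch (the chunk list below stands in for the choices[0].text values a real stream would yield):

```python
# Simulated streamed tokens; in a real request these would be the
# choices[0].text values read from each stream chunk.
chunks = ["I", "'m", " looking", " for", " a", " way"]

parts = []
for token in chunks:
    # Print tokens as they "arrive", without inserting extra newlines.
    print(token, end="", flush=True)
    parts.append(token)

# Join the collected tokens into the complete answer.
full_answer = "".join(parts)
print()
print(full_answer)
```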
Chat Completion API
The Chat Completion API receives an ordered list of messages (the conversation context) and generates an appropriate message as the next response.
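To carry a conversation across multiple turns, append the previous assistant reply to the messages list before adding the next user message, so the model always receives the full context. A minimal sketch (the reply content is illustrative):

```python
# Start with a system message and the first user turn,
# as in the request examples below.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi"},
]

# Suppose the model answered with this message (choices[0].message);
# append it so the next request includes the full history.
messages.append({"role": "assistant", "content": "Hello! How can I help you?"})

# Then append the next user turn and send the whole list again.
messages.append({"role": "user", "content": "What can you do?"})

# Roles now alternate user/assistant after the initial system message.
print([m["role"] for m in messages])
```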
non-stream request
Request
If the messages contain only text, you can call the API as follows.
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Configure the request data.
# This includes the model ID to use and the messages list.
# The messages list includes system messages and user messages.
data = {
"model": model,
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hi"}
]
}
# Send a POST request to the AIOS API's v1/chat/completions endpoint.
response = requests.post(urljoin(aios_base_url, "v1/chat/completions"), json=data)
# Parse the response body in JSON format.
body = json.loads(response.text)
# choices[0].message is the response generated by the AI model.
print(body["choices"][0]["message"])
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an OpenAI client.
# base_url points to the v1 endpoint of the AIOS API,
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Generate a chat completion using the AIOS model.
# model parameter specifies the model ID to use.
# messages parameter is a list of messages including system and user messages.
response = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hi"}
]
)
# Output choices[0].message from the generated response.
print(response.choices[0].message.model_dump())
from langchain_openai import ChatOpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
# base_url points to the v1 endpoint of the AIOS API,
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
# model parameter specifies the model ID to use.
chat_llm = ChatOpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
# Configure the chat messages list.
# Include system messages and user messages.
messages = [
("system", "You are a helpful assistant."),
("human", "Hi"),
]
# Pass the messages list to the chat LLM and receive a response.
# The invoke method returns the model's output.
chat_completion = chat_llm.invoke(messages)
# Output the generated response.
print(chat_completion.model_dump())
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Configure the request data.
// This includes the model ID to use and the messages list.
// The messages list includes system messages and user messages.
const data = {
model: model,
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Hi" },
],
};
// Create the AIOS API's v1/chat/completions endpoint URL.
let url = new URL("/v1/chat/completions", aios_base_url);
// Send a POST request to the AIOS API.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
// Parse the response body in JSON format.
const body = await response.json();
// Output choices[0].message from the generated response.
console.log(body.choices[0].message);
import OpenAI from "openai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create an OpenAI client.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// baseURL points to the v1 endpoint of the AIOS API.
const client = new OpenAI({
apiKey: "EMPTY_KEY",
baseURL: new URL("v1", aios_base_url).href,
});
// Generate a chat completion using the AIOS model.
// model parameter specifies the model ID to use.
// messages parameter is a list of messages including system and user messages.
const response = await client.chat.completions.create({
model: model,
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Hi" },
],
});
// Output choices[0].message from the generated response.
console.log(response.choices[0].message);
import { HumanMessage, SystemMessage } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
// model parameter specifies the model ID to use.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// configuration.baseURL points to the v1 endpoint of the AIOS API.
const llm = new ChatOpenAI({
model: model,
apiKey: "EMPTY_KEY",
configuration: {
baseURL: new URL("v1", aios_base_url).href,
},
});
// Configure the chat messages list.
// Include system messages and user messages using SystemMessage and HumanMessage objects.
const messages = [
new SystemMessage("You are a helpful assistant."),
new HumanMessage("Hi"),
];
// Pass the messages list to the chat LLM and receive a response.
// The invoke method returns the model's output.
const response = await llm.invoke(messages);
// Output the content of the generated response.
// This content is the response text generated by the AI model.
console.log(response.content);
package main
import (
"bytes"
"encoding/json"
"fmt"
"io"
"net/http"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define the message structure.
// Role: Message role (e.g., system, user)
// Content: Message content
type Message struct {
Role string `json:"role"`
Content string `json:"content"`
}
// Define the data structure to be used for POST requests.
// Model: Model ID to use
// Messages: List of messages
// Stream: Whether to stream response (optional)
type PostData struct {
Model string `json:"model"`
Messages []Message `json:"messages"`
Stream bool `json:"stream,omitempty"`
}
func main() {
// Create request data.
// The messages list includes system messages and user messages.
data := PostData{
Model: model,
Messages: []Message{
{
Role: "system",
Content: "You are a helpful assistant.",
},
{
Role: "user",
Content: "Hi",
},
},
}
// Marshal data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v1/chat/completions endpoint.
response, err := http.Post(aiosBaseUrl+"/v1/chat/completions", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// Read the entire response body.
body, err := io.ReadAll(response.Body)
if err != nil {
panic(err)
}
// Unmarshal the response body into map format.
var v map[string]interface{}
json.Unmarshal(body, &v)
// Extract the choices array from the response.
choices := v["choices"].([]interface{})
// Extract the first data.
choice := choices[0].(map[string]interface{})
// Format and output the response message generated by the AI model in JSON format.
message, err := json.MarshalIndent(choice["message"], "", " ")
if err != nil {
panic(err)
}
fmt.Println(string(message))
}
package main
import (
"context"
"fmt"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
func main() {
// Create an OpenAI client.
// Use option.WithBaseURL to set the v1 endpoint of the AIOS API.
client := openai.NewClient(
option.WithBaseURL(aiosBaseUrl + "/v1"),
)
// Generate a chat completion using the AIOS model.
// Use openai.ChatCompletionNewParams to set the model and messages list.
// The messages list includes system messages and user messages.
response, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
Model: model,
Messages: []openai.ChatCompletionMessageParamUnion{
openai.SystemMessage("You are a helpful assistant."),
openai.UserMessage("Hi"),
},
})
if err != nil {
panic(err)
}
// Format and output the response message generated by the AI model in JSON format.
fmt.Println(response.Choices[0].Message.RawJSON())
}
Response
You can find the model's answer in the message field of choices.
{
'annotations': None,
'audio': None,
'content': 'Hello! How can I help you today?',
'function_call': None,
'reasoning_content': 'The user says "Hi". We respond politely.',
'refusal': None,
'role': 'assistant',
'tool_calls': []
}
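For illustration only (not part of the guide's examples): given a parsed response shaped like the sample above, the user-facing answer text is in the content field, while reasoning_content, when a model returns it, holds intermediate reasoning that is normally not shown to end users.

```python
# Minimal sketch: extract the answer text from a parsed chat completion
# response shaped like the sample output above. Field names follow that
# sample; actual responses may carry additional fields.
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Hello! How can I help you today?",
                "reasoning_content": 'The user says "Hi". We respond politely.',
            }
        }
    ]
}

message = response["choices"][0]["message"]
answer = message["content"]  # the user-facing answer text
print(answer)
```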
Stream request
With streaming, you can receive and process the response token by token as the model generates it, instead of waiting for the model to finish and return the full answer at once.
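As background, streaming responses are delivered as Server-Sent Events: each chunk arrives on a line prefixed with "data: ", and the stream ends with "data: [DONE]". A minimal sketch of parsing one such line (the payload shape here is assumed from the sample code; real chunks typically carry more fields):

```python
import json

# One SSE line as it might appear in a streaming chat completion response.
# The payload shape is illustrative, not an exact server response.
sse_line = 'data: {"choices": [{"delta": {"content": "Hel"}}]}'

# Strip the "data: " prefix, skip the terminal "[DONE]" marker, then parse.
payload = sse_line[len("data: "):]
if payload != "[DONE]":
    delta = json.loads(payload)["choices"][0]["delta"]
    print(delta["content"])
```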
Request
Set the stream parameter to True.
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Configure the request data.
# This includes the model ID to use, the messages list, and whether to stream.
# The messages list includes system messages and user messages.
data = {
"model": model,
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hi"}
],
"stream": True
}
# Send a POST request to the AIOS API's v1/chat/completions endpoint.
# Set stream=True to receive real-time streaming responses.
response = requests.post(urljoin(aios_base_url, "v1/chat/completions"), json=data, stream=True)
# You can receive responses as the model generates tokens.
# Responses are sent separated by each line, so process with iter_lines().
for line in response.iter_lines():
if line:
try:
# Remove the 'data: ' prefix and parse the JSON data.
body = json.loads(line[len("data: "):])
# Output the delta (choices[0].delta).
# The delta is the response token generated by the AI model.
print(body["choices"][0]["delta"])
except:
            pass
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an OpenAI client.
# base_url points to the v1 endpoint of the AIOS API,
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Generate a chat completion using the AIOS model.
# model parameter specifies the model ID to use.
# messages parameter is a list of messages including system and user messages.
# Set stream=True to receive real-time streaming responses.
response = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hi"}
],
stream=True
)
# You can receive responses as the model generates tokens.
# response is sent in stream format, so you can process it iteratively.
for chunk in response:
# Output the delta (choices[0].delta).
# The delta is the response token generated by the AI model.
    print(chunk.choices[0].delta.model_dump())
from langchain_openai import ChatOpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
# base_url points to the v1 endpoint of the AIOS API,
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
# model parameter specifies the model ID to use.
llm = ChatOpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
# Configure the chat messages list.
# Include system messages and user messages.
messages = [
("system", "You are a helpful assistant."),
("human", "Hi"),
]
# You can receive responses as the model generates tokens.
# The llm.stream method returns a stream that generates tokens in real-time.
for chunk in llm.stream(messages):
# Output each chunk.
# This chunk is the response token generated by the AI model.
    print(chunk)
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Configure the request data.
// This includes the model ID to use, the messages list, and whether to stream.
// The messages list includes system messages and user messages.
const data = {
model: model,
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Hi" },
],
stream: true,
};
// Create the AIOS API's v1/chat/completions endpoint URL.
let url = new URL("/v1/chat/completions", aios_base_url);
// Send a POST request to the AIOS API.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
// You can receive responses as the model generates tokens.
// Convert the response body to a text decoder stream and read it.
const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
let buf = "";
while (true) {
const { value, done } = await reader.read();
if (done) break;
// Add received data to buffer.
buf += value;
let sep;
// Find newline characters (\n\n) in the buffer and separate data.
while ((sep = buf.indexOf("\n\n")) >= 0) {
const data = buf.slice(0, sep);
buf = buf.slice(sep + 2);
// Process each line.
for (const rawLine of data.split("\n")) {
const line = rawLine.trim();
if (!line.startsWith("data: ")) continue;
// Remove the "data: " prefix and extract JSON data.
const payload = line.slice("data: ".length).trim();
if (payload === "[DONE]") break;
// Parse the JSON data.
const json = JSON.parse(payload);
// Output the delta (choices[0].delta).
// The delta is the response token generated by the AI model.
console.log(json.choices[0].delta);
}
}
}
import OpenAI from "openai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create an OpenAI client.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// baseURL points to the v1 endpoint of the AIOS API.
const client = new OpenAI({
apiKey: "EMPTY_KEY",
baseURL: new URL("v1", aios_base_url).href,
});
// Generate a chat completion using the AIOS model.
// model parameter specifies the model ID to use.
// messages parameter is a list of messages including system and user messages.
// Set stream: true to receive real-time streaming responses.
const response = await client.chat.completions.create({
model: model,
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Hi" },
],
stream: true,
});
// You can receive responses as the model generates tokens.
// Use for await...of loop to sequentially process stream events.
for await (const event of response) {
// Output the delta (choices[0].delta).
// The delta is the response token generated by the AI model.
console.log(event.choices[0].delta);
}
import { ChatOpenAI } from "@langchain/openai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
// model parameter specifies the model ID to use.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// configuration.baseURL points to the v1 endpoint of the AIOS API.
const llm = new ChatOpenAI({
model: model,
apiKey: "EMPTY_KEY",
configuration: {
baseURL: new URL("v1", aios_base_url).href,
},
});
// Configure the chat messages list.
// Include system messages and user messages.
const messages = [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Hi" },
];
// You can receive responses as the model generates tokens.
// The llm.stream method returns a stream that generates tokens in real-time.
const completion = await llm.stream(messages);
for await (const chunk of completion) {
// Output the content of each chunk.
// This content is the response token generated by the AI model.
console.log(chunk.content);
}
package main
import (
"bufio"
"bytes"
"encoding/json"
"fmt"
"net/http"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define the message structure.
// Role: Message role (e.g., system, user)
// Content: Message content
type Message struct {
Role string `json:"role"`
Content string `json:"content"`
}
// Define the data structure to be used for POST requests.
// Model: Model ID to use
// Messages: List of messages
// Stream: Whether to stream response (optional)
type PostData struct {
Model string `json:"model"`
Messages []Message `json:"messages"`
Stream bool `json:"stream,omitempty"`
}
func main() {
// Create request data.
// The messages list includes system messages and user messages.
// Set Stream: true to receive real-time streaming responses.
data := PostData{
Model: model,
Messages: []Message{
{
Role: "system",
Content: "You are a helpful assistant.",
},
{
Role: "user",
Content: "Hi",
},
},
Stream: true,
}
// Marshal data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v1/chat/completions endpoint.
response, err := http.Post(aiosBaseUrl+"/v1/chat/completions", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// You can receive responses as the model generates tokens.
// Scan the HTTP response body and process line by line.
var v map[string]interface{}
scanner := bufio.NewScanner(response.Body)
for scanner.Scan() {
line := bytes.TrimSpace(scanner.Bytes())
// Skip lines that don't start with "data: ".
if !bytes.HasPrefix(line, []byte("data: ")) {
continue
}
// Remove the "data: " prefix.
payload := bytes.TrimPrefix(line, []byte("data: "))
// If payload is "[DONE]", end streaming.
if bytes.Equal(payload, []byte("[DONE]")) {
break
}
// Parse the JSON data.
json.Unmarshal(payload, &v)
// Extract the choices array from the response.
choices := v["choices"].([]interface{})
// Extract the first data.
choice := choices[0].(map[string]interface{})
// Serialize the delta to JSON format and output it.
message, err := json.Marshal(choice["delta"])
if err != nil {
panic(err)
}
fmt.Println(string(message))
}
}
package main
import (
"context"
"fmt"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
func main() {
// Create an OpenAI client.
// Use option.WithBaseURL to set the v1 endpoint of the AIOS API.
client := openai.NewClient(
option.WithBaseURL(aiosBaseUrl + "/v1"),
)
// Generate a streaming chat completion using the AIOS model.
// Use openai.ChatCompletionNewParams to set the model and messages list.
completion := client.Chat.Completions.NewStreaming(context.TODO(), openai.ChatCompletionNewParams{
Model: model,
Messages: []openai.ChatCompletionMessageParamUnion{
openai.SystemMessage("You are a helpful assistant."),
openai.UserMessage("Hi"),
},
})
// You can receive responses as the model generates tokens.
// The Next() method returns true when there is a next chunk.
for completion.Next() {
// Get the choices slice of the current chunk.
chunk := completion.Current().Choices
// choices[0].delta.content holds the response text generated by the AI model.
fmt.Println(chunk[0].Delta.Content)
}
// Report any error that ended the stream.
if err := completion.Err(); err != nil {
panic(err)
}
}
Response
The answer is generated token by token, and each token can be checked in the delta field of choices.
I
'm
looking
for
a
way
to
check
if
a
specific
process
is
running
on
}
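The per-token output above comes from the SSE `data:` payloads of the streaming response. As a reference, the following is a minimal Python sketch of accumulating the streamed `delta` fragments into a full answer; the chunks below are illustrative samples, not actual server output.

```python
import json

# Example SSE lines as returned by v1/chat/completions with "stream": true.
# These chunks are illustrative samples, not real AIOS output.
sse_lines = [
    'data: {"choices": [{"delta": {"role": "assistant", "content": ""}}]}',
    'data: {"choices": [{"delta": {"content": "I"}}]}',
    'data: {"choices": [{"delta": {"content": "\'m"}}]}',
    'data: {"choices": [{"delta": {"content": " looking"}}]}',
    'data: [DONE]',
]

def accumulate(lines):
    """Join the delta.content fragments of each streamed chunk."""
    parts = []
    for line in lines:
        # Skip keep-alives and anything that is not a data line.
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        # "[DONE]" is the end-of-stream sentinel.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content") or "")
    return "".join(parts)

print(accumulate(sse_lines))  # → I'm looking
```

The same accumulation applies to the Go example above, which prints each serialized delta instead of joining them.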
Tool Calling
Tool calling allows the model to call external functions to perform specific tasks.
The model analyzes the user's request, selects the necessary tools, and generates the arguments needed to call those tools as its response.
You then execute the actual tool using the tool call message generated by the model, compose the result as a tool message, and send a second request; the model then generates a natural response to the user based on the tool execution results.
- The openai/gpt-oss-120b model does not support the tool calling feature.
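The execute-and-respond step of this loop can be sketched generically. The Python sketch below is a minimal illustration, with a hypothetical tool registry and a hand-built assistant message rather than real AIOS output; it shows how to dispatch each entry in tool_calls and build the tool messages to append before the follow-up request.

```python
import json

# Hypothetical registry mapping tool names to local implementations.
TOOL_REGISTRY = {
    "get_weather": lambda latitude, longitude: "14℃",
}

def run_tool_calls(assistant_message):
    """Execute every tool call in the model's message and return the
    tool messages to append before the follow-up request."""
    tool_messages = []
    for tool_call in assistant_message.get("tool_calls", []):
        fn = tool_call["function"]
        # The arguments arrive as a JSON string and must be parsed.
        args = json.loads(fn["arguments"])
        result = TOOL_REGISTRY[fn["name"]](**args)
        tool_messages.append({
            "role": "tool",
            "tool_call_id": tool_call["id"],
            "content": str(result),
        })
    return tool_messages

# Hand-built example of an assistant message with a tool call (illustrative only).
message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {"name": "get_weather",
                     "arguments": '{"latitude": 48.85, "longitude": 2.35}'},
    }],
}
print(run_tool_calls(message))
```

Dispatching by name from a registry keeps the second request identical regardless of which tool the model chose; the examples below inline this step for a single get_weather call.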
Request
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Define a function to get weather information.
# This function returns the current temperature in Celsius for the provided coordinates.
tools = [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current temperature for provided coordinates in celsius.",
"parameters": {
"type": "object",
"properties": {
"latitude": {"type": "number"},
"longitude": {"type": "number"}
},
"required": ["latitude", "longitude"],
"additionalProperties": False
},
"strict": True
}
}]
# Define the user message.
# The user is asking about today's weather in Paris.
messages = [{"role": "user", "content": "What is the weather like in Paris today?"}]
# Configure the request data.
# This includes the model ID to use, the messages list, and the tools list.
data = {
"model": model,
"messages": messages,
"tools": tools
}
# Send a POST request to the AIOS API's v1/chat/completions endpoint.
# This request instructs the model to process the user's question and determine the necessary tool calls.
response = requests.post(urljoin(aios_base_url, "v1/chat/completions"), json=data)
# Parse the response body in JSON format.
body = json.loads(response.text)
# Print the tool call information from the response generated by the AI model.
# This information indicates which tool the model should call.
print(body["choices"][0]["message"]["tool_calls"])
# Implementation of the weather function, always returns 14 degrees.
def get_weather(latitude, longitude):
return "14℃"
# Extract tool call information from the first response.
# This retrieves the tool call information requested by the model.
tool_call = body["choices"][0]["message"]["tool_calls"][0]
# Parse the arguments of the tool call from JSON string format to dict format.
args = json.loads(tool_call["function"]["arguments"])
# Call the actual function to get the result. (e.g., "14℃")
# At this step, the actual weather information lookup logic is executed.
result = get_weather(args["latitude"], args["longitude"]) # "14℃"
# Add the function call result as a **tool** message to the conversation context and call the model again,
# then the model generates an appropriate response using the function call result.
# Add the model's tool call message to messages to maintain the conversation context.
messages.append(body["choices"][0]["message"])
# Add the result of calling the actual function to messages.
# This allows the model to generate a final response based on the tool call result.
messages.append({
"role": "tool",
"tool_call_id": tool_call["id"],
"content": str(result)
})
# Configure the second request data.
# This includes the model ID to use and the updated messages list.
data = {
"model": model,
"messages": messages,
}
# Send a POST request to the AIOS API's v1/chat/completions endpoint.
# This request generates a final response based on the tool call result.
response_2 = requests.post(urljoin(aios_base_url, "v1/chat/completions"), json=data)
# Parse the response body in JSON format.
body = json.loads(response_2.text)
# Print the message generated by the AI in the second response.
# This is the final answer to the user's question.
print(body["choices"][0]["message"])
import json
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an OpenAI client.
# base_url points to the v1 endpoint of the AIOS API,
# and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Define a function to get weather information.
# This function returns the current temperature in Celsius for the provided coordinates.
tools = [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current temperature for provided coordinates in celsius.",
"parameters": {
"type": "object",
"properties": {
"latitude": {"type": "number"},
"longitude": {"type": "number"}
},
"required": ["latitude", "longitude"],
"additionalProperties": False
},
"strict": True
}
}]
# Define the user message.
# The user is asking about today's weather in Paris.
messages = [{"role": "user", "content": "What is the weather like in Paris today?"}]
# Generate a chat completion using the AIOS model.
# The model parameter specifies the model ID to use.
# The messages parameter is a list of messages containing the user message.
# The tools parameter provides metadata about the tools available to the model.
response = client.chat.completions.create(
model=model,
messages=messages,
tools=tools # Provides metadata about the tools available to the model.
)
# Print the tool call information from the response generated by the AI model.
# This information indicates which tool the model should call.
print(response.choices[0].message.tool_calls[0].model_dump())
# Implementation of the weather function, always returns 14 degrees.
def get_weather(latitude, longitude):
return "14℃"
# Extract tool call information from the first response.
# This retrieves the tool call information requested by the model.
tool_call = response.choices[0].message.tool_calls[0]
# Parse the tool call arguments in JSON format.
args = json.loads(tool_call.function.arguments)
# Call the actual function to get the result. (e.g., "14℃")
# At this step, the actual weather information lookup logic is executed.
result = get_weather(args["latitude"], args["longitude"]) # "14℃"
# Add the function call result as a **tool** message to the conversation context and call the model again,
# then the model generates an appropriate response using the function call result.
# Add the model's tool call message to messages to maintain the conversation context.
messages.append(response.choices[0].message)
# Add the result of calling the actual function to messages.
# This allows the model to generate a final response based on the tool call result.
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": str(result)
})
# Generate a second chat completion.
# This includes the model ID to use and the updated messages list.
# This request generates a final response based on the tool call result.
response_2 = client.chat.completions.create(
model=model,
messages=messages,
)
# Print the message generated by the AI in the second response.
# This is the final answer to the user's question.
print(response_2.choices[0].message.model_dump())
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Define a tool function to get weather information.
# This function returns the current temperature in Celsius for the provided coordinates.
@tool
def get_weather(latitude: float, longitude: float) -> str:
"""Get current temperature for provided coordinates in celsius."""
return "14℃"
# Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
# base_url points to the v1 endpoint of the AIOS API,
# and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
# The model parameter specifies the model ID to use.
chat_llm = ChatOpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
# Bind tools to the model.
# The get_weather function returns the current temperature in Celsius for the provided coordinates.
llm_with_tools = chat_llm.bind_tools([get_weather])
# Configure the list of chat messages.
# The user is asking about today's weather in Paris.
messages = [("human", "What is the weather like in Paris today?")]
# Pass the list of messages to the chat LLM to get a response.
# The invoke method returns the model's output.
# At this step, the model analyzes the user's question and determines the necessary tool calls.
response = llm_with_tools.invoke(messages)
# Print the tool call information from the response generated by the AI model.
# This information indicates which tool the model should call.
print(response.tool_calls)
# Add the model's tool call message to messages to maintain the conversation context.
# This allows the model to remember and connect previous conversation content.
messages.append(response)
# Call the actual tool function to get the result.
# At this step, the get_weather function is executed to return weather information.
tool_call = response.tool_calls[0]
tool_message = get_weather.invoke(tool_call)
# Add the tool call result to messages.
# This allows the model to generate a final response based on the tool call result.
messages.append(tool_message)
# Perform a second request to get the final answer.
# Now the model generates an appropriate response to the user based on the tool call result.
response2 = chat_llm.invoke(messages)
# Print the final AI model response.
# This is the final answer to the user's question.
print(response2.model_dump())
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Define a function to get weather information.
// This function returns the current temperature in Celsius for the provided coordinates.
const tools = [
{
type: "function",
function: {
name: "get_weather",
description:
"Get current temperature for provided coordinates in celsius.",
parameters: {
type: "object",
properties: {
latitude: { type: "number" },
longitude: { type: "number" },
},
required: ["latitude", "longitude"],
additionalProperties: false,
},
strict: true,
},
},
];
// Define the user message.
// The user is asking about today's weather in Paris.
const messages = [
{ role: "user", content: "What is the weather like in Paris today?" },
];
// Configure the request data.
// This includes the model ID to use, the messages list, and the tools list.
let data = {
model: model,
messages: messages,
tools: tools,
};
// Generate the AIOS API's v1/chat/completions endpoint URL.
let url = new URL("/v1/chat/completions", aios_base_url);
// Send a POST request to the AIOS API.
// This request instructs the model to process the user's question and determine the necessary tool calls.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
// Parse the response body in JSON format.
let body = await response.json();
// Print the tool call information from the response generated by the AI model.
// This information indicates which tool the model should call.
console.log(JSON.stringify(body.choices[0].message.tool_calls));
// Implementation of the weather function, always returns 14 degrees.
function getWeather(latitude, longitude) {
return "14℃";
}
// Extract tool call information from the first response.
// This retrieves the tool call information requested by the model.
const toolCall = body.choices[0].message.tool_calls[0];
// Parse the tool call arguments in JSON format.
// This extracts the parameters needed for the tool call.
const args = JSON.parse(toolCall.function.arguments);
// Call the actual function to get the result. (e.g., "14℃")
// At this step, the actual weather information lookup logic is executed.
const result = getWeather(args.latitude, args.longitude);
// Add the function call result as a **tool** message to the conversation context and call the model again,
// then the model generates an appropriate response using the function call result.
// Add the model's tool call message to messages to maintain the conversation context.
messages.push(body.choices[0].message);
// Add the result of calling the actual function to messages.
// This allows the model to generate a final response based on the tool call result.
messages.push({
role: "tool",
tool_call_id: toolCall.id,
content: String(result),
});
// Configure the second request data.
// This includes the model ID to use and the updated messages list.
// This request generates a final response based on the tool call result.
data = {
model: model,
messages: messages,
};
// Send another POST request to the AIOS API.
// This request generates a final response based on the tool call result.
const response2 = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
body = await response2.json();
// Print the message generated by the AI in the second response.
// This is the final answer to the user's question.
console.log(JSON.stringify(body.choices[0].message));
import OpenAI from "openai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Define a function to get weather information.
// This function returns the current temperature in Celsius for the provided coordinates.
const tools = [
{
type: "function",
function: {
name: "get_weather",
description:
"Get current temperature for provided coordinates in celsius.",
parameters: {
type: "object",
properties: {
latitude: { type: "number" },
longitude: { type: "number" },
},
required: ["latitude", "longitude"],
additionalProperties: false,
},
strict: true,
},
},
];
// Define the user message.
// The user is asking about today's weather in Paris.
const messages = [
{ role: "user", content: "What is the weather like in Paris today?" },
];
// Create an OpenAI client.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// baseURL points to the v1 endpoint of the AIOS API.
const client = new OpenAI({
apiKey: "EMPTY_KEY",
baseURL: new URL("v1", aios_base_url).href,
});
// Generate a chat completion using the AIOS model.
// The model parameter specifies the model ID to use.
// The messages parameter is a list of messages containing the user message.
// The tools parameter provides metadata about the tools available to the model.
const response = await client.chat.completions.create({
model: model,
messages: messages,
tools: tools,
});
// Print the tool call information from the response generated by the AI model.
// This information indicates which tool the model should call.
console.log(JSON.stringify(response.choices[0].message.tool_calls));
// Implementation of the weather function, always returns 14 degrees.
function getWeather(latitude, longitude) {
return "14℃";
}
// Extract tool call information from the first response.
// This retrieves the tool call information requested by the model.
const toolCall = response.choices[0].message.tool_calls[0];
// Parse the tool call arguments in JSON format.
// This extracts the parameters needed for the tool call.
const args = JSON.parse(toolCall.function.arguments);
// Call the actual function to get the result. (e.g., "14℃")
// At this step, the actual weather information lookup logic is executed.
const result = getWeather(args.latitude, args.longitude);
// Add the function call result as a **tool** message to the conversation context and call the model again,
// then the model generates an appropriate response using the function call result.
// Add the model's tool call message to messages to maintain the conversation context.
messages.push(response.choices[0].message);
// Add the result of calling the actual function to messages.
// This allows the model to generate a final response based on the tool call result.
messages.push({
role: "tool",
tool_call_id: toolCall.id,
content: String(result),
});
// Generate a second chat completion.
// This includes the model ID to use and the updated messages list.
// This request generates a final response based on the tool call result.
const response2 = await client.chat.completions.create({
model: model,
messages: messages,
});
// Print the message generated by the AI in the second response.
// This is the final answer to the user's question.
console.log(JSON.stringify(response2.choices[0].message));
import { HumanMessage } from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Define a tool function to get weather information.
// This function returns the current temperature in Celsius for the provided coordinates.
const getWeather = tool(
function ({ latitude, longitude }) {
/**
* Get current temperature for provided coordinates in celsius.
*/
return "14℃";
},
{
name: "get_weather",
description: "Get current temperature for provided coordinates in celsius.",
schema: z.object({
latitude: z.number(),
longitude: z.number(),
}),
}
);
// Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
// base_url points to the v1 endpoint of the AIOS API,
// and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
// The model parameter specifies the model ID to use.
const llm = new ChatOpenAI({
model: model,
apiKey: "EMPTY_KEY",
configuration: {
baseURL: new URL("v1", aios_base_url).href,
},
});
// Bind tools to the model.
// The getWeather function returns the current temperature in Celsius for the provided coordinates.
const llmWithTools = llm.bindTools([getWeather]);
// Configure the list of chat messages.
// The user is asking about today's weather in Paris.
const messages = [new HumanMessage("What is the weather like in Paris today?")];
// Pass the list of messages to the chat LLM to get a response.
// The invoke method returns the model's output.
// This request instructs the model to process the user's question and determine the necessary tool calls.
const response = await llmWithTools.invoke(messages);
// Print the tool call information from the response generated by the AI model.
// This information indicates which tool the model should call.
console.log(response.tool_calls);
// Add the model's tool call message to messages to maintain the conversation context.
// This allows the model to remember and connect previous conversation content.
messages.push(response);
// Call the actual tool function to get the result.
// At this step, the getWeather function is executed to return weather information.
const toolCall = response.tool_calls[0];
const toolMessage = await getWeather.invoke(toolCall);
// Add the tool call result to messages.
// This allows the model to generate a final response based on the tool call result.
messages.push(toolMessage);
// Perform a second request to get the final answer.
// Now the model generates an appropriate response to the user based on the tool call result.
const response2 = await llm.invoke(messages);
// Print the final AI model response.
console.log(response2.content);
package main
import (
"bytes"
"encoding/json"
"fmt"
"io"
"net/http"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define the Message structure.
// Role: Message role (user, assistant, tool, etc.)
// Content: Message content
// ToolCalls: Tool call information
// ToolCallId: Tool call identifier
type Message struct {
Role string `json:"role"`
Content string `json:"content,omitempty"`
ToolCalls []map[string]any `json:"tool_calls,omitempty"`
ToolCallId string `json:"tool_call_id,omitempty"`
}
// Define the POST request data structure.
// Model: Model ID to use
// Messages: List of messages
// Tools: List of available tools
// Stream: Whether to stream
type PostData struct {
Model string `json:"model"`
Messages []Message `json:"messages"`
Tools []map[string]any `json:"tools,omitempty"`
Stream bool `json:"stream,omitempty"`
}
// Define a function to get weather information.
// This function always returns 14 degrees (sample implementation).
func getWeather(latitude float32, longitude float32) string {
_ = fmt.Sprintf("latitude: %f, longitude: %f", latitude, longitude)
return "14℃"
}
func main() {
// Define the user message.
// The user is asking about today's weather in Paris.
messages := []Message{
{
Role: "user",
Content: "What is the weather like in Paris today?",
},
}
// Define a function to get weather information.
// This tool returns the current temperature in Celsius for the provided coordinates.
tools := []map[string]any{
{
"type": "function",
"function": map[string]any{
"name": "get_weather",
"description": "Get current temperature for provided coordinates in celsius.",
"parameters": map[string]any{
"type": "object",
"properties": map[string]any{
"latitude": map[string]string{"type": "number"},
"longitude": map[string]string{"type": "number"},
},
"required": []string{"latitude", "longitude"},
"additionalProperties": false,
},
"strict": true,
},
},
}
// Configure the request data.
// This includes the model ID to use, the messages list, and the tools list.
data := PostData{
Model: model,
Messages: messages,
Tools: tools,
}
// Serialize the request data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v1/chat/completions endpoint.
// This request instructs the model to process the user's question and determine the necessary tool calls.
response, err := http.Post(aiosBaseUrl+"/v1/chat/completions", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// Read the response body.
body, err := io.ReadAll(response.Body)
if err != nil {
panic(err)
}
// Parse the response body to map format.
var v map[string]interface{}
json.Unmarshal(body, &v)
// Extract message information from the first response.
choices := v["choices"].([]interface{})
choice := choices[0].(map[string]interface{})
message, err := json.MarshalIndent(choice["message"], "", " ")
if err != nil {
panic(err)
}
messageData := choice["message"].(map[string]interface{})
toolCalls := messageData["tool_calls"].([]interface{})
// Print the tool call information from the response generated by the AI model.
// This information indicates which tool the model should call.
toolCallJson, err := json.MarshalIndent(toolCalls, "", " ")
if err != nil {
panic(err)
}
fmt.Println(string(toolCallJson))
// Extract tool call information from the first response.
// This retrieves the tool call information requested by the model.
toolCall := toolCalls[0].(map[string]interface{})
function := toolCall["function"].(map[string]interface{})
// Parse the tool call arguments from JSON string format to map format.
// This extracts the parameters needed for the tool call.
var args map[string]float32
err = json.Unmarshal([]byte(function["arguments"].(string)), &args)
if err != nil {
panic(err)
}
// Call the actual function to get the result. (e.g., "14℃")
// At this step, the actual weather information lookup logic is executed.
result := getWeather(args["latitude"], args["longitude"])
// Convert the tool call result to a message.
var toolMessage Message
err = json.Unmarshal(message, &toolMessage)
if err != nil {
panic(err)
}
// Add the model's tool call message to messages to maintain the conversation context.
messages = append(messages, toolMessage)
// Add the result of calling the actual function to messages.
// This allows the model to generate a final response based on the tool call result.
messages = append(messages, Message{
Role: "tool",
ToolCallId: toolCall["id"].(string),
Content: string(result),
})
// Configure the second request data.
// This includes the model ID to use and the updated messages list.
// This request generates a final response based on the tool call result.
data = PostData{
Model: model,
Messages: messages,
}
jsonData, err = json.Marshal(data)
if err != nil {
panic(err)
}
// Send another POST request to the AIOS API.
// This request generates a final response based on the tool call result.
response2, err := http.Post(aiosBaseUrl+"/v1/chat/completions", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response2.Body.Close()
// Read the second response body.
body, err = io.ReadAll(response2.Body)
if err != nil {
panic(err)
}
// Parse the second response to JSON format.
json.Unmarshal(body, &v)
// Print the message generated by the AI in the second response.
// This is the final answer to the user's question.
choices = v["choices"].([]interface{})
choice = choices[0].(map[string]interface{})
message, err = json.MarshalIndent(choice["message"], "", " ")
if err != nil {
panic(err)
}
fmt.Println(string(message))
}
package main
import (
"context"
"encoding/json"
"fmt"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define a function to get weather information.
// This function always returns 14 degrees (sample implementation).
func getWeather(latitude float32, longitude float32) string {
_ = fmt.Sprintf("latitude: %f, longitude: %f", latitude, longitude)
return "14℃"
}
func main() {
// Create an OpenAI client.
// base_url points to the v1 endpoint of the AIOS API.
client := openai.NewClient(
option.WithBaseURL(aiosBaseUrl + "/v1"),
)
// Define the user message.
// The user is asking about today's weather in Paris.
messages := []openai.ChatCompletionMessageParamUnion{
openai.UserMessage("What is the weather like in Paris today?"),
}
// Generate a chat completion using the AIOS model.
// The model parameter specifies the model ID to use.
// The messages parameter is a list of messages containing the user message.
// The tools parameter provides metadata about the tools available to the model.
response, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
Model: model,
Messages: messages,
Tools: []openai.ChatCompletionToolParam{
{
Function: openai.FunctionDefinitionParam{
Name: "get_weather",
Description: openai.String("Get current temperature for provided coordinates in celsius."),
Parameters: openai.FunctionParameters{
"type": "object",
"properties": map[string]interface{}{
"latitude": map[string]string{
"type": "number",
},
"longitude": map[string]string{
"type": "number",
},
},
"required": []string{"latitude", "longitude"},
"additionalProperties": false,
},
Strict: openai.Bool(true),
},
},
},
})
if err != nil {
panic(err)
}
// Print the response generated by the AI model.
// This response includes tool call information.
fmt.Println([]string{response.Choices[0].Message.ToolCalls[0].RawJSON()})
// Extract tool call information from the first response.
// This retrieves the tool call information requested by the model.
var v map[string]float32
toolCall := response.Choices[0].Message.ToolCalls[0]
args := toolCall.Function.Arguments
// Parse the tool call arguments from JSON string format to map format.
// This extracts the parameters needed for the tool call.
err = json.Unmarshal([]byte(args), &v)
if err != nil {
panic(err)
}
// Call the actual function to get the result. (e.g., "14℃")
// At this step, the actual weather information lookup logic is executed.
result := getWeather(v["latitude"], v["longitude"])
// Add the function call result as a **tool** message to the conversation context and call the model again,
// then the model generates an appropriate response using the function call result.
// Add the model's tool call message to messages to maintain the conversation context.
messages = append(messages, response.Choices[0].Message.ToParam())
// Add the result of calling the actual function to messages.
// This allows the model to generate a final response based on the tool call result.
messages = append(messages, openai.ToolMessage(string(result), toolCall.ID))
// Generate a second chat completion.
// This includes the model ID to use and the updated messages list.
// This request generates a final response based on the tool call result.
response2, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
Model: model,
Messages: messages,
})
if err != nil {
panic(err)
}
// Print the message generated by the AI in the second response.
// This is the final answer to the user's question.
fmt.Println(response2.Choices[0].Message.RawJSON())
}
Response
In the first response, you can check which tool the model decided to call in message.tool_calls of choices.
The function field of each tool_calls entry shows that the get_weather function is called and which arguments are passed to execute it.
[
{
'id': 'chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
'type': 'function',
'function': {
'name': 'get_weather',
'arguments': '{"latitude": 48.8566, "longitude": 2.3522}'
}
}
]
The second request includes three messages in the messages list:
- The initial user message
- The tool-calling message generated by the model in the first response
- The tool message containing the result of executing the get_weather tool
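The structure of this second request can be sketched in Python as follows. This is a minimal illustration only; the tool call id and coordinates are the placeholder values from the example output above.

```python
# Minimal sketch of the three messages carried by the second request.
# The tool call id and coordinates are placeholder values from the example above.
tool_call_id = "chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
messages = [
    # 1. The initial user message
    {"role": "user", "content": "What is the weather like in Paris today?"},
    # 2. The tool-calling message generated by the model in the first response
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": tool_call_id,
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"latitude": 48.8566, "longitude": 2.3522}',
                },
            }
        ],
    },
    # 3. The tool message containing the result of executing the get_weather tool
    {"role": "tool", "tool_call_id": tool_call_id, "content": "14℃"},
]
```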
In the second response, the model generates a final response using all the content of the above messages.
{
'content': 'The current weather in Paris is 14℃.',
'refusal': None,
'role': 'assistant',
'annotations': None,
'audio': None,
'function_call': None,
'tool_calls': [],
'reasoning_content': 'We have user asking weather in Paris today. We called '
'get_weather function with coordinates and got "14℃" as '
'comment. We need to respond. Should incorporate info '
'and maybe note we are using approximate. Provide '
'answer.',
}
reasoning
Request
For models that support reasoning, you can check the reasoning content (the reasoning_content field of the response message) as follows.
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Configure the request data.
# In this example, the user is asked to compare which of two numbers is greater.
# "Think step by step" is a prompt that encourages the model to think through logical steps.
data = {
"model": model,
"messages": [
{"role": "user", "content": "Think step by step. 9.11 and 9.8, which is greater?"}
]
}
# Send a POST request to the AIOS API's v1/chat/completions endpoint.
# This request instructs the model to process the user's question.
response = requests.post(urljoin(aios_base_url, "v1/chat/completions"), json=data)
body = json.loads(response.text)
# Print the response generated by the AI model.
print(body["choices"][0]["message"])
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an OpenAI client.
# base_url points to the v1 endpoint of the AIOS API,
# and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Generate a chat completion using the AIOS model.
# The model parameter specifies the model ID to use.
# The messages parameter is a list of messages containing the user message.
# "Think step by step" is a prompt that encourages the model to think through logical steps.
response = client.chat.completions.create(
model=model,
messages=[
{"role": "user", "content": "Think step by step. 9.11 and 9.8, which is greater?"}
],
)
# Print the response generated by the AI model.
print(response.choices[0].message.model_dump())
from langchain_openai import ChatOpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
# base_url points to the v1 endpoint of the AIOS API,
# and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
# The model parameter specifies the model ID to use.
chat_llm = ChatOpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
# Configure the list of chat messages.
# The user asks which of two numbers is greater.
# "Think step by step" is a prompt that encourages the model to think through logical steps.
messages = [
("human", "Think step by step. 9.11 and 9.8, which is greater?"),
]
# Pass the list of messages to the chat LLM to get a response.
# The invoke method returns the model's output.
# This request instructs the model to process the user's question.
chat_completion = chat_llm.invoke(messages)
# Print the response generated by the AI model.
print(chat_completion.model_dump())
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Configure the request data.
// In this example, the user asks the model which of two numbers is greater.
// "Think step by step" is a prompt that encourages the model to think through logical steps.
const data = {
model: model,
messages: [
{
role: "user",
content: "Think step by step. 9.11 and 9.8, which is greater?",
},
],
};
// Generate the AIOS API's v1/chat/completions endpoint URL.
let url = new URL("/v1/chat/completions", aios_base_url);
// Send a POST request to the AIOS API.
// This request instructs the model to process the user's question.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
const body = await response.json();
// Print the response generated by the AI model.
console.log(body.choices[0].message);
import OpenAI from "openai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create an OpenAI client.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// baseURL points to the v1 endpoint of the AIOS API.
const client = new OpenAI({
apiKey: "EMPTY_KEY",
baseURL: new URL("v1", aios_base_url).href,
});
// Generate a chat completion using the AIOS model.
// The model parameter specifies the model ID to use.
// The messages parameter is a list of messages containing the user message.
// "Think step by step" is a prompt that encourages the model to think through logical steps.
const response = await client.chat.completions.create({
model: model,
messages: [
{
role: "user",
content: "Think step by step. 9.11 and 9.8, which is greater?",
},
],
});
// Print the response generated by the AI model.
console.log(response.choices[0].message);
package main
import (
"bytes"
"encoding/json"
"fmt"
"io"
"net/http"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define the Message structure.
// Role: Message role (user, assistant, etc.)
// Content: Message content
type Message struct {
Role string `json:"role"`
Content string `json:"content"`
}
// Define the POST request data structure.
// Model: Model ID to use
// Messages: List of messages
// Stream: Whether to stream
type PostData struct {
Model string `json:"model"`
Messages []Message `json:"messages"`
Stream bool `json:"stream,omitempty"`
}
func main() {
// Configure the request data.
// In this example, the user asks the model which of two numbers is greater.
// "Think step by step" is a prompt that encourages the model to think through logical steps.
data := PostData{
Model: model,
Messages: []Message{
{
Role: "user",
Content: "Think step by step. 9.11 and 9.8, which is greater?",
},
},
}
// Serialize the request data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v1/chat/completions endpoint.
response, err := http.Post(aiosBaseUrl+"/v1/chat/completions", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// Read the response body.
body, err := io.ReadAll(response.Body)
if err != nil {
panic(err)
}
// Parse the response body to JSON format.
// This converts the model's response received from the server into structured data.
var v map[string]interface{}
json.Unmarshal(body, &v)
choices := v["choices"].([]interface{})
choice := choices[0].(map[string]interface{})
message, err := json.MarshalIndent(choice["message"], "", " ")
if err != nil {
panic(err)
}
// Print the response generated by the AI model.
fmt.Println(string(message))
}
package main
import (
"context"
"fmt"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
func main() {
// Create an OpenAI client.
// base_url points to the v1 endpoint of the AIOS API.
client := openai.NewClient(
option.WithBaseURL(aiosBaseUrl + "/v1"),
)
// Generate a chat completion using the AIOS model.
// The model parameter specifies the model ID to use.
// The messages parameter is a list of messages containing the user message.
// "Think step by step" is a prompt that encourages the model to think through logical steps.
response, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
Model: model,
Messages: []openai.ChatCompletionMessageParamUnion{
openai.UserMessage("Think step by step. 9.11 and 9.8, which is greater?"),
},
})
if err != nil {
panic(err)
}
// Print the response generated by the AI model.
fmt.Println(response.Choices[0].Message.RawJSON())
}
Response
If you check the message field of choices, you can see reasoning_content in addition to content.
reasoning_content contains the tokens the model generated during its reasoning phase, before producing the final answer.
{
'annotations': None,
'audio': None,
'content': 'Sure! Let’s compare the two numbers step by step.\n'
'\n'
'1. **Identify the numbers** \n'
' - First number: **9.11** \n'
' - Second number: **9.8**\n'
'\n'
'2. **Look at the whole-number part** \n'
' Both numbers have the same whole-number part, **9**. So the '
'comparison will depend on the decimal part.\n'
'\n'
'3. **Compare the decimal parts** \n'
' - Decimal part of 9.11 = **0.11** \n'
' - Decimal part of 9.8 = **0.80** (since 9.8 = 9.80)\n'
'\n'
'4. **Determine which decimal part is larger** \n'
' - 0.80 is greater than 0.11.\n'
'\n'
'5. **Conclude** \n'
' Because the whole-number parts are equal and the decimal part '
'of 9.8 is larger, **9.8 is greater than 9.11**.',
'function_call': None,
'reasoning_content': 'User asks: "Think step by step. 9.11 and 9.8, which is '
'greater?" We need to compare numbers 9.11 and 9.8. '
'Value: 9.11 < 9.8, so 9.8 is greater. Provide '
'step-by-step reasoning. No policy conflict.',
'refusal': None,
'role': 'assistant',
'tool_calls': []
}
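A minimal sketch of separating the reasoning trace from the final answer, using an abbreviated, illustrative response body in place of a real v1/chat/completions response:

```python
# Abbreviated, illustrative response body; in practice use the parsed
# JSON returned by v1/chat/completions.
body = {
    "choices": [
        {
            "message": {
                "content": "9.8 is greater than 9.11.",
                "reasoning_content": "Compare decimal parts: 0.80 > 0.11.",
            }
        }
    ]
}

message = body["choices"][0]["message"]
# reasoning_content may be absent or None for models without reasoning support,
# so read it defensively.
reasoning = message.get("reasoning_content")
answer = message["content"]
if reasoning:
    print("reasoning:", reasoning)
print("answer:", answer)
```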
image to text
For models that support vision, you can input images as follows.
Input images for vision-capable models are subject to size and quantity limits.
For information about image input limits, please refer to Provided Models.
Request
You can input images as base64-encoded data URLs with a MIME type.
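The examples below hardcode the image/jpeg MIME type. If you need to handle multiple image formats, a small helper (hypothetical, not part of any AIOS SDK) can build the data URL by guessing the MIME type from the file extension:

```python
import base64
import mimetypes

# Hypothetical helper: builds a base64 data URL for an image file,
# guessing the MIME type from the extension instead of hardcoding image/jpeg.
def image_to_data_url(image_path: str) -> str:
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None:
        # Fall back to a generic type when the extension is unknown.
        mime_type = "application/octet-stream"
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be used directly as the image_url value in the requests below.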
import base64
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
image_path = "image/path.jpg"
# Define a function to Base64 encode an image.
# This converts the image to text format so it can be transmitted to the API.
def encode_image(image_path: str):
with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")
# Encode the image in Base64 format.
base64_image = encode_image(image_path)
# Configure the request data.
# In this example, the user asks a question about the image.
# The image is transmitted as a Base64-encoded string.
data = {
"model": model,
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "what's in this image?"},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base64_image}",
},
},
]
},
]
}
# Send a POST request to the AIOS API's v1/chat/completions endpoint.
# This request asks the model to analyze the image.
response = requests.post(urljoin(aios_base_url, "v1/chat/completions"), json=data)
body = json.loads(response.text)
# Print the response generated by the AI model.
# This response is the model's description of the image content.
print(body["choices"][0]["message"])
import base64
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an OpenAI client.
# base_url points to the v1 endpoint of the AIOS API,
# and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
image_path = "image/path.jpg"
# Define a function to Base64 encode an image.
# This converts the image to text format so it can be transmitted to the API.
def encode_image(image_path: str):
with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")
# Encode the image in Base64 format.
base64_image = encode_image(image_path)
# Generate a chat completion using the AIOS model.
# The model parameter specifies the model ID to use.
# The messages parameter is a list of messages containing the user message.
# In this example, the user asks a question about the image.
# The image is transmitted as a Base64-encoded string.
response = client.chat.completions.create(
model=model,
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "what's in this image?"},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base64_image}",
},
},
]
},
],
)
# Print the response generated by the AI model.
# This response is the model's description of the image content.
print(response.choices[0].message.model_dump())
import base64
from langchain_openai import ChatOpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
# base_url points to the v1 endpoint of the AIOS API,
# and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
# The model parameter specifies the model ID to use.
chat_llm = ChatOpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
image_path = "image/path.jpg"
# Define a function to Base64 encode an image.
# This converts the image to text format so it can be transmitted to the API.
def encode_image(image_path: str):
with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")
# Encode the image in Base64 format.
base64_image = encode_image(image_path)
# Configure the list of chat messages.
# In this example, the user asks a question about the image.
# The image is transmitted as a Base64-encoded string.
messages = [
{
"role": "user",
"content": [
{"type": "text", "text": "what's in this image?"},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base64_image}",
},
},
]
},
]
# Pass the list of messages to the chat LLM to get a response.
# The invoke method returns the model's output.
# This request asks the model to analyze the image.
chat_completion = chat_llm.invoke(messages)
# Print the response generated by the AI model.
# This response is the model's description of the image content.
print(chat_completion.model_dump())
import { readFile } from "fs/promises";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
const imagePath = "image/path.jpg";
// Define a function to convert an image file to Base64.
// This converts the image to text format so it can be transmitted to the API.
async function imageFileToBase64(imagePath) {
// Read file contents as buffer
const fileBuffer = await readFile(imagePath);
// Convert buffer to Base64 string
return fileBuffer.toString("base64");
}
// Convert the image file to Base64 format.
const base64Image = await imageFileToBase64(imagePath);
// Configure the request data.
// In this example, the user asks a question about the image.
// The image is transmitted as a Base64-encoded string.
const data = {
model: model,
messages: [
{
role: "user",
content: [
{ type: "text", text: "what's in this image?" },
{
type: "image_url",
image_url: {
url: `data:image/jpeg;base64,${base64Image}`,
},
},
],
},
],
};
// Generate the AIOS API's v1/chat/completions endpoint URL.
let url = new URL("/v1/chat/completions", aios_base_url);
// Send a POST request to the AIOS API.
// This request asks the model to analyze the image.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
const body = await response.json();
// Print the response generated by the AI model.
// This response is the model's description of the image content.
console.log(body.choices[0].message);
import OpenAI from "openai";
import { readFile } from "fs/promises";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
const imagePath = "image/path.jpg";
// Define a function to convert an image file to Base64.
// This converts the image to text format so it can be transmitted to the API.
async function imageFileToBase64(imagePath) {
// Read file contents as buffer
const fileBuffer = await readFile(imagePath);
// Convert buffer to Base64 string
return fileBuffer.toString("base64");
}
// Convert the image file to Base64 format.
const base64Image = await imageFileToBase64(imagePath);
// Create an OpenAI client.
// apiKey is the key required by AIOS, typically set to "EMPTY_KEY".
// baseURL points to the v1 endpoint of the AIOS API.
const client = new OpenAI({
apiKey: "EMPTY_KEY",
baseURL: new URL("v1", aios_base_url).href,
});
// Generate a chat completion using the AIOS model.
// The model parameter specifies the model ID to use.
// The messages parameter is a list of messages containing the user message.
// In this example, the user asks a question about the image.
// The image is transmitted as a Base64-encoded string.
const response = await client.chat.completions.create({
model: model,
messages: [
{
role: "user",
content: [
{ type: "text", text: "what's in this image?" },
{
type: "image_url",
image_url: {
url: `data:image/jpeg;base64,${base64Image}`,
},
},
],
},
],
});
// Print the response generated by the AI model.
// This response is the model's description of the image content.
console.log(response.choices[0].message);
import { HumanMessage } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";
import { readFile } from "fs/promises";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
const imagePath = "image/path.jpg";
// Define a function to convert an image file to Base64.
// This converts the image to text format so it can be transmitted to the API.
async function imageFileToBase64(imagePath) {
// Read file contents as buffer
const fileBuffer = await readFile(imagePath);
// Convert buffer to Base64 string
return fileBuffer.toString("base64");
}
// Convert the image file to Base64 format.
const base64Image = await imageFileToBase64(imagePath);
// Create a chat LLM (Large Language Model) instance using LangChain's ChatOpenAI class.
// base_url points to the v1 endpoint of the AIOS API,
// and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
// The model parameter specifies the model ID to use.
const llm = new ChatOpenAI({
model: model,
apiKey: "EMPTY_KEY",
configuration: {
baseURL: new URL("v1", aios_base_url).href,
},
});
// Configure the list of chat messages.
// In this example, the user asks a question about the image.
// The image is transmitted as a Base64-encoded string.
const messages = [
new HumanMessage({
content: [
{ type: "text", text: "what's in this image?" },
{
type: "image_url",
image_url: {
url: `data:image/jpeg;base64,${base64Image}`,
},
},
],
}),
];
// Pass the list of messages to the chat LLM to get a response.
// The invoke method returns the model's output.
// This request asks the model to analyze the image.
const response = await llm.invoke(messages);
// Print the response generated by the AI model.
// This response is the model's description of the image content.
console.log(response.content);
package main
import (
"bytes"
"encoding/base64"
"encoding/json"
"fmt"
"io"
"net/http"
"os"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
var imagePath = "image/path.jpg"
// Define the Message structure.
// Role: Message role (user, assistant, etc.)
// Content: Message content (including text and image URL)
type Message struct {
Role string `json:"role"`
Content []map[string]interface{} `json:"content"`
}
// Define the POST request data structure.
// Model: Model ID to use
// Messages: List of messages
// Stream: Whether to stream
type PostData struct {
Model string `json:"model"`
Messages []Message `json:"messages"`
Stream bool `json:"stream,omitempty"`
}
// Define a function to Base64 encode an image file.
// This converts the image to text format so it can be transmitted to the API.
func imageFileToBase64(imagePath string) (string, error) {
data, err := os.ReadFile(imagePath)
if err != nil {
return "", err
}
return base64.StdEncoding.EncodeToString(data), nil
}
func main() {
// Encode the image file in Base64 format.
base64Image, err := imageFileToBase64(imagePath)
if err != nil {
panic(err)
}
// Configure the request data.
// In this example, the user asks a question about the image.
// The image is transmitted as a Base64-encoded string.
data := PostData{
Model: model,
Messages: []Message{
{
Role: "user",
Content: []map[string]interface{}{
{
"type": "text",
"text": "what's in this image?",
},
{
"type": "image_url",
"image_url": map[string]string{
"url": fmt.Sprintf("data:image/jpeg;base64,%s", base64Image),
},
},
},
},
},
}
// Serialize the request data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v1/chat/completions endpoint.
// This request asks the model to analyze the image.
response, err := http.Post(aiosBaseUrl+"/v1/chat/completions", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// Read the response body.
body, err := io.ReadAll(response.Body)
if err != nil {
panic(err)
}
// Parse the response body to JSON format.
// This converts the model's response received from the server into structured data.
var v map[string]interface{}
json.Unmarshal(body, &v)
// Print the response generated by the AI model.
// This response is the model's description of the image content.
choices := v["choices"].([]interface{})
choice := choices[0].(map[string]interface{})
message, err := json.MarshalIndent(choice["message"], "", " ")
if err != nil {
panic(err)
}
fmt.Println(string(message))
}
package main
import (
"context"
"encoding/base64"
"fmt"
"os"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
var imagePath = "image/path.jpg"
// Define a function to Base64 encode an image file.
// This converts the image to text format so it can be transmitted to the API.
func imageFileToBase64(imagePath string) (string, error) {
data, err := os.ReadFile(imagePath)
if err != nil {
return "", err
}
return base64.StdEncoding.EncodeToString(data), nil
}
func main() {
// Encode the image file in Base64 format.
base64Image, err := imageFileToBase64(imagePath)
if err != nil {
panic(err)
}
// Create an OpenAI client.
// base_url points to the v1 endpoint of the AIOS API.
client := openai.NewClient(
option.WithBaseURL(aiosBaseUrl + "/v1"),
)
// Generate a chat completion using the AIOS model.
// The model parameter specifies the model ID to use.
// The messages parameter is a list of messages containing the user message.
// In this example, the user asks a question about the image.
// The image is transmitted as a Base64-encoded string.
response, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
Model: model,
Messages: []openai.ChatCompletionMessageParamUnion{
openai.UserMessage([]openai.ChatCompletionContentPartUnionParam{
{
OfText: &openai.ChatCompletionContentPartTextParam{
Text: "what's in this image?",
},
},
{
OfImageURL: &openai.ChatCompletionContentPartImageParam{
ImageURL: openai.ChatCompletionContentPartImageImageURLParam{
URL: fmt.Sprintf("data:image/jpeg;base64,%s", base64Image),
},
},
},
}),
},
})
if err != nil {
panic(err)
}
// Print the response generated by the AI model.
// This response is the model's description of the image content.
fmt.Println(response.Choices[0].Message.RawJSON())
}
Response
The model analyzes the image and generates a text description such as the following:
{
'annotations': None,
'audio': None,
'content': "Here's what's in the image:\n"
'\n'
'* **A Golden Retriever puppy:** The main focus is a cute, '
'fluffy golden retriever puppy lying on a patch of grass.\n'
'* **A bone:** The puppy is chewing on a pink bone.\n'
'* **Green grass:** The puppy is lying on a vibrant green lawn.\n'
'* **Background:** There’s a bit of foliage and some elements of '
'a garden or yard in the background, including a small shed and '
'some plants.\n'
'\n'
"It's a really heartwarming image!",
'function_call': None,
'reasoning_content': None,
'refusal': None,
'role': 'assistant',
'tool_calls': []
}
Embeddings API
The Embeddings API converts input text into a high-dimensional vector of a fixed dimension.
The generated vectors can be used for various natural language processing tasks such as text similarity, clustering, and search.
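Once texts are embedded, their semantic similarity can be scored directly on the returned vectors. The sketch below computes cosine similarity in plain Python on toy three-dimensional vectors standing in for real embedding output; it does not call the AIOS API.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings of two input texts.
v1 = [0.013, 0.057, -0.028]
v2 = [0.011, 0.060, -0.025]
print(cosine_similarity(v1, v2))  # close to 1.0 for similar vectors
```

In practice, the vectors would come from the /v1/embeddings responses shown in the examples that follow.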
Request
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Configure the data to pass to the model.
data = {
"model": model,
"input": "What is the capital of France?"
}
# Send a POST request to AIOS's /v1/embeddings API endpoint.
response = requests.post(urljoin(aios_base_url, "v1/embeddings"), json=data)
body = json.loads(response.text)
# Print the generated embedding vector from the response.
print(body["data"][0]["embedding"])
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an OpenAI client.
# base_url specifies the AIOS API endpoint,
# and api_key is set to a dummy value ("EMPTY_KEY").
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Call the OpenAI client's embeddings.create method to generate an embedding.
# Pass the input text and model ID to generate an embedding vector.
response = client.embeddings.create(
input="What is the capital of France?",
model=model
)
# Print the generated embedding vector.
print(response.data[0].embedding)
from langchain_together import TogetherEmbeddings
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create an embedding instance using the TogetherEmbeddings class.
# base_url specifies the AIOS API endpoint,
# api_key is set to a dummy value ("EMPTY_KEY").
# model specifies the embedding model to use.
embeddings = TogetherEmbeddings(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
# Generate an embedding vector for the input text.
# The embed_query method generates an embedding for a single sentence.
embedding = embeddings.embed_query("What is the capital of France?")
# Print the generated embedding vector.
print(embedding)
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Configure the data to pass to the model.
const data = {
model: model,
input: "What is the capital of France?"
};
// Generate the AIOS API's v1/embeddings endpoint URL.
let url = new URL("/v1/embeddings", aios_base_url);
// Send a POST request to AIOS's embeddings API endpoint.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
const body = await response.json();
// Print the generated embedding vector from the response.
console.log(body.data[0].embedding);
import OpenAI from "openai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create an OpenAI client.
// apiKey is set to a dummy value ("EMPTY_KEY"),
// and baseURL specifies the AIOS API endpoint.
const client = new OpenAI({
apiKey: "EMPTY_KEY",
baseURL: new URL("v1", aios_base_url).href,
});
// Call the OpenAI client's embeddings.create method to generate an embedding.
// Pass the input text and model ID to generate an embedding vector.
const response = await client.embeddings.create({
model: model,
input: "What is the capital of France?",
});
// Print the generated embedding vector.
console.log(response.data[0].embedding);
import { OpenAIEmbeddings } from "@langchain/openai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create an embedding instance using LangChain's OpenAIEmbeddings class.
// base_url points to the v1 endpoint of the AIOS API,
// and api_key is the key required by AIOS, typically set to "EMPTY_KEY".
// The model parameter specifies the model ID to use.
const embeddings = new OpenAIEmbeddings({
model: model,
apiKey: "EMPTY_KEY",
configuration: {
baseURL: new URL("v1", aios_base_url).href,
},
});
// Generate an embedding vector for the input text.
// The embedQuery method generates an embedding for a single sentence.
const response = await embeddings.embedQuery("What is the capital of France?");
// Print the generated embedding vector.
console.log(response);
package main
import (
"bytes"
"encoding/json"
"fmt"
"io"
"net/http"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define the data structure for the POST request.
// Model: Model ID to use
// Input: Input text to generate embedding for
type PostData struct {
Model string `json:"model"`
Input string `json:"input"`
}
func main() {
// Create the request data.
data := PostData{
Model: model,
Input: "What is the capital of France?",
}
// Marshal the data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v1/embeddings endpoint.
response, err := http.Post(aiosBaseUrl+"/v1/embeddings", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// Read the entire response body.
body, err := io.ReadAll(response.Body)
if err != nil {
panic(err)
}
// Unmarshal the response body to map format.
var v map[string]interface{}
json.Unmarshal(body, &v)
responseData := v["data"].([]interface{})
firstData := responseData[0].(map[string]interface{})
// Print the embedding vector of the first data in JSON format.
embedding, err := json.MarshalIndent(firstData["embedding"], "", " ")
if err != nil {
panic(err)
}
fmt.Println(string(embedding))
}
package main
import (
"context"
"fmt"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
func main() {
// Create an OpenAI client.
// Use option.WithBaseURL to set the v1 endpoint of the AIOS API.
client := openai.NewClient(
option.WithBaseURL(aiosBaseUrl + "/v1"),
)
// Generate an embedding using the AIOS model.
// Use openai.EmbeddingNewParams to set the model and input text.
// The input text is "What is the capital of France?".
completion, err := client.Embeddings.New(context.TODO(), openai.EmbeddingNewParams{
Model: model,
Input: openai.EmbeddingNewParamsInputUnion{
OfString: openai.String("What is the capital of France?"),
},
})
if err != nil {
panic(err)
}
// Print the generated embedding vector.
fmt.Println(completion.Data[0].Embedding)
}
Response
The embedding field of each data entry contains the input converted to vector form:
[
0.01319122314453125,
0.057220458984375,
-0.028533935546875,
-0.0008697509765625,
-0.01422119140625,
...omitted...
]
Rerank API
The Rerank API scores each given document's relevance to a query and ranks the documents accordingly.
It helps improve the performance of RAG (Retrieval-Augmented Generation) applications by prioritizing relevant documents.
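The ranked output can then be used to reorder the original document list before it is passed to the generation step of a RAG pipeline. A minimal sketch in plain Python, using assumed relevance scores for illustration (each result carries the index of a document in the request and its relevance score, the shape returned by the rerank endpoint):

```python
documents = [
    "The capital of France is Paris.",
    "France capital city is known for the Eiffel Tower.",
    "Paris is located in the north-central part of France.",
]
# Assumed scores for illustration only; a real call returns these values.
results = [
    {"index": 1, "relevance_score": 0.41},
    {"index": 0, "relevance_score": 0.98},
    {"index": 2, "relevance_score": 0.75},
]
# Sort results by descending relevance, then map indices back to documents.
ranked = [documents[r["index"]]
          for r in sorted(results, key=lambda r: r["relevance_score"], reverse=True)]
print(ranked[0])
```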
Request
import json
import requests
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Configure the request data.
# This includes the model ID to use, query, list of documents, and top N results.
data = {
"model": model,
"query": "What is the capital of France?",
"documents": [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France."
],
"top_n": 3
}
# Send a POST request to the AIOS API's v2/rerank endpoint.
# The query is compared against each document, and the documents are reordered by relevance.
response = requests.post(urljoin(aios_base_url, "v2/rerank"), json=data)
body = json.loads(response.text)
# Print the reranked results.
# The results are a list of documents sorted by their relevance score to the query.
print(body["results"])
import cohere
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create a Cohere client.
# api_key is the key required by AIOS, typically set to "EMPTY_KEY".
# base_url points to the base path of the AIOS API.
client = cohere.ClientV2("EMPTY_KEY", base_url=aios_base_url)
# Define the list of documents.
# These are the documents to search.
docs = [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France."
]
# Use the AIOS model to rerank documents.
# The model parameter specifies the model ID to use.
# The query parameter is the search query.
# The documents parameter is the list of documents to search.
# The top_n parameter returns the top N results.
response = client.rerank(
model=model,
query="What is the capital of France?",
documents=docs,
top_n=3,
)
# Print the reranked results.
# The results are a list of documents sorted by their relevance score to the query.
print([result.model_dump() for result in response.results])
from langchain_cohere.rerank import CohereRerank
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" # Enter the model ID for calling the AIOS model.
# Create a reranker instance using the CohereRerank class.
# base_url points to the base path of the AIOS API.
# cohere_api_key is the key required for API requests, typically set to "EMPTY_KEY".
# The model parameter specifies the model ID to use.
rerank = CohereRerank(
base_url=aios_base_url,
cohere_api_key="EMPTY_KEY",
model=model
)
# Define the list of documents.
# These are the documents to rearrange.
docs = [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France."
]
# Use the reranker to rearrange documents.
# The documents parameter is the list of documents to rearrange.
# The query parameter is the search query.
# The top_n parameter returns the top N results.
ranks = rerank.rerank(
documents=docs,
query="What is the capital of France?",
top_n=3
)
# Print the rearranged results.
# This result is a list of documents sorted by relevance score between the query and documents.
print(ranks)
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Configure the request data.
// This includes the model ID to use, query, list of documents, and top N results.
const data = {
model: model,
query: "What is the capital of France?",
documents: [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France.",
],
top_n: 3,
};
// Generate the AIOS API's v2/rerank endpoint URL.
let url = new URL("/v2/rerank", aios_base_url);
// Send a POST request to the AIOS API.
// This endpoint rearranges documents with high relevance by comparing the query and document list.
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(data),
});
const body = await response.json();
// Print the rearranged results.
// This result is a list of documents sorted by relevance score between the query and documents.
console.log(body.results);
import { CohereClientV2 } from "cohere-ai";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create a CohereClientV2 client.
// token is the key required for API requests, typically set to "EMPTY_KEY".
// environment points to the base path of the AIOS API.
const cohere = new CohereClientV2({
token: "EMPTY_KEY",
environment: aios_base_url,
});
// Define the list of documents.
// These are the documents to rearrange.
const docs = [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France.",
];
// Use the AIOS model to rearrange documents.
// The model parameter specifies the model ID to use.
// The query parameter is the search query.
// The documents parameter is the list of documents to rearrange.
// The topN parameter returns the top N results.
const response = await cohere.rerank({
model: model,
query: "What is the capital of France?",
documents: docs,
topN: 3,
});
// Print the rearranged results.
// This result is a list of documents sorted by relevance score between the query and documents.
console.log(response.results);
import { CohereClientV2 } from "cohere-ai";
import { CohereRerank } from "@langchain/cohere";
const aios_base_url = "<<aios endpoint-url>>"; // Enter the aios endpoint-url for calling the AIOS model.
const model = "<<model>>"; // Enter the model ID for calling the AIOS model.
// Create a CohereClientV2 client.
// token is the key required for API requests, typically set to "EMPTY_KEY".
// environment points to the base path of the AIOS API.
const cohere = new CohereClientV2({
token: "EMPTY_KEY",
environment: aios_base_url,
});
// Create a rearranger instance using the CohereRerank class.
// The model parameter specifies the model ID to use.
// The client parameter passes the CohereClientV2 instance created above.
const reranker = new CohereRerank({
model: model,
client: cohere,
});
// Define the list of documents.
// These are the documents to rearrange.
const docs = [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France.",
];
// Define the search query.
const query = "What is the capital of France?";
// Use the rerank method of the reranker to rearrange documents.
// The first argument is the list of documents to rearrange.
// The second argument is the search query.
const response = await reranker.rerank(docs, query);
// Print the rearranged results.
// This result is a list of documents sorted by relevance score between the query and documents.
console.log(response);
package main
import (
"bytes"
"encoding/json"
"fmt"
"io"
"net/http"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
// Define the data structure for the POST request.
// Model: Model ID to use
// Query: Search query
// Documents: List of documents to rearrange
// TopN: Return top N results
type PostData struct {
Model string `json:"model"`
Query string `json:"query"`
Documents []string `json:"documents"`
TopN int32 `json:"top_n"`
}
func main() {
// Create the request data.
// The query is "What is the capital of France?",
// and the document list consists of three sentences.
// TopN is set to 3 to return the top 3 results.
data := PostData{
Model: model,
Query: "What is the capital of France?",
Documents: []string{
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France.",
},
TopN: 3,
}
// Marshal the data to JSON format.
jsonData, err := json.Marshal(data)
if err != nil {
panic(err)
}
// Send a POST request to the AIOS API's v2/rerank endpoint.
// Compare the query and document list to rearrange documents with high relevance.
response, err := http.Post(aiosBaseUrl+"/v2/rerank", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
panic(err)
}
defer response.Body.Close()
// Read the entire response body.
body, err := io.ReadAll(response.Body)
if err != nil {
panic(err)
}
// Unmarshal the response body to map format.
var v map[string]interface{}
json.Unmarshal(body, &v)
// Print the rearranged results in JSON format.
// This result is a list of documents sorted by relevance score between the query and documents.
rerank, err := json.MarshalIndent(v["results"], "", " ")
if err != nil {
panic(err)
}
fmt.Println(string(rerank))
}
package main
import (
"context"
"fmt"
api "github.com/cohere-ai/cohere-go/v2"
client "github.com/cohere-ai/cohere-go/v2/client"
)
const (
aiosBaseUrl = "<<aios endpoint-url>>" // Enter the aios endpoint-url for calling the AIOS model.
model = "<<model>>" // Enter the model ID for calling the AIOS model.
)
func main() {
// Create a Cohere client.
// Use WithBaseURL to set the base path of the AIOS API.
co := client.NewClient(
client.WithBaseURL(aiosBaseUrl),
)
// Define the search query.
query := "What is the capital of France?"
// Define the list of documents.
// These are the documents to rearrange.
docs := []string{
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France.",
}
// Use the AIOS model to rearrange documents.
// Use &api.V2RerankRequest to set the model, query, and document list.
resp, err := co.V2.Rerank(
context.TODO(),
&api.V2RerankRequest{
Model: model,
Query: query,
Documents: docs,
},
)
if err != nil {
panic(err)
}
// Print the rearranged results.
// This result is a list of documents sorted by relevance score between the query and documents.
fmt.Println(resp.Results)
}
Response
The results field contains the documents sorted in descending order of relevance to the query.
[
{'document': {'text': 'The capital of France is Paris.'},
'index': 0,
'relevance_score': 0.9999659061431885},
{'document': {'text': 'France capital city is known for the Eiffel Tower.'},
'index': 1,
'relevance_score': 0.9663000106811523},
{'document': {'text': 'Paris is located in the north-central part of France.'},
'index': 2,
'relevance_score': 0.7127546668052673}
]
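Whichever client is used, the entries can be consumed the same way. As a minimal sketch, assuming the raw `results` list shown above (each entry carries an `index` pointing back into the original `documents` list and a `relevance_score`):

```python
# Example response body, matching the structure shown above.
results = [
    {"document": {"text": "The capital of France is Paris."},
     "index": 0, "relevance_score": 0.9999659061431885},
    {"document": {"text": "France capital city is known for the Eiffel Tower."},
     "index": 1, "relevance_score": 0.9663000106811523},
    {"document": {"text": "Paris is located in the north-central part of France."},
     "index": 2, "relevance_score": 0.7127546668052673},
]

# Entries are already sorted by relevance_score, so the first entry is the
# most relevant document; index refers to the position in the original
# documents list sent in the request.
best = max(results, key=lambda r: r["relevance_score"])
print(best["index"], best["document"]["text"])

# Keep only documents above a score threshold, e.g. to filter RAG context.
relevant = [r["document"]["text"] for r in results if r["relevance_score"] > 0.9]
```

The 0.9 threshold here is an illustrative value; an appropriate cutoff depends on the model and the documents being ranked.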
1.4 - Release Note
- The AIOS service has been officially launched.
- On Samsung Cloud Platform, you can create Virtual Server, GPU Server, and Kubernetes Engine resources and use LLMs on those resources.
1.5 - Licenses
AIOS Licenses
The license information for each model provided by AIOS is as follows.
| Model | License |
|---|---|
| openai/gpt-oss-120b | Apache 2.0 |
| Qwen/Qwen3-Coder-30B-A3B-Instruct | Apache 2.0 |
| Qwen/Qwen3-30B-A3B-Thinking-2507 | Apache 2.0 |
| meta-llama/Llama-4-Scout | llama4 |
| meta-llama/Llama-Guard-4-12B | llama4 |
| sds/bge-m3 | Samsung SDS |
| sds/bge-reranker-v2-m3 | Samsung SDS |
1.5.1 - Llama-4-Scout
LLAMA 4 COMMUNITY LICENSE AGREEMENT
Llama 4 Version Effective Date: April 5, 2025
“Agreement” means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein.
“Documentation” means the specifications, manuals and documentation accompanying Llama 4 distributed by Meta at https://www.llama.com/docs/overview.
“Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf.
“Llama 4” means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at https://www.llama.com/llama-downloads.
“Llama Materials” means, collectively, Meta’s proprietary Llama 4 and Documentation (and any portion thereof) made available under this Agreement.
“Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland).
By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement.
1. License Rights and Redistribution.
a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials.
b. Redistribution and Use.
i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display "Built with Llama" on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include "Llama" at the beginning of any such AI model name.
ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you.
iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a "Notice" text file distributed as a part of such copies: "Llama 4 is licensed under the Llama 4 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved."
iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at https://www.llama.com/llama4/use-policy), which is hereby incorporated by reference into this Agreement.
2. Additional Commercial Terms. If, on the Llama 4 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.
4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
5. Intellectual Property.
a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use "Llama" (the "Mark") solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at https://about.meta.com/brand/resources/meta/company-brand/). All goodwill arising out of your use of the Mark will inure to the benefit of Meta.
b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications.
c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 4 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials.
6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement.
7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement.
1.5.2 - Llama-Guard-4-12B
LLAMA 4 COMMUNITY LICENSE AGREEMENT
Llama 4 Version Effective Date: April 5, 2025
“Agreement” means the terms and conditions for use, reproduction, distribution and modification of the Llama Materials set forth herein.
“Documentation” means the specifications, manuals and documentation accompanying Llama 4 distributed by Meta at https://www.llama.com/docs/overview.
“Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf.
“Llama 4” means the foundational large language models and software and algorithms, including machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing distributed by Meta at https://www.llama.com/llama-downloads.
“Llama Materials” means, collectively, Meta’s proprietary Llama 4 and Documentation (and any portion thereof) made available under this Agreement.
“Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or, if you are an entity, your principal place of business is in the EEA or Switzerland) and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland).
By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement.
1. License Rights and Redistribution.
a. Grant of Rights. You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta’s intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials.
b. Redistribution and Use.
i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name.
ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you.
iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 4 is licensed under the Llama 4 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”
iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at https://llama.com/llama4/use-policy), which is hereby incorporated by reference into this Agreement.
2. Additional Commercial Terms. If, on the Llama 4 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.
4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
5. Intellectual Property.
a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials or as set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible at https://about.meta.com/brand/resources/meta/company-brand/). All goodwill arising out of your use of the Mark will inure to the benefit of Meta.
b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications.
c. If you institute litigation or other proceedings against Meta or any entity (including a cross- claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 4 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials.
6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement.
7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement.
1.5.3 - bge-m3
MIT License
Copyright (c) [year] [fullname]
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
1.5.4 - bge-reranker-v2-m3
Model Overview
A reranker fine-tuned from BGE Reranker to strengthen Korean search capability, trained on the public AI Hub datasets 016 (administration), 021 (books), and 151 (law/finance), plus 1.1 million general-knowledge Query-Passage pairs, to enhance Korean-based re-ranking.
- Model type: Reranker
- Main usage: Vector Search (RAG)
- Vocab size: 250,002
- Version info: v1.0.0
- Base model license: apache-2.0
Technical features
- Structure: based on XLMRobertaModel
- Max input tokens: 1,024 (the base model supports up to 8K, but fine-tuning was performed at 1,024)
- Size: ~568M parameters (2.27 GB, FP32)
Training data: AI Hub datasets 016 (administration), 021 (books), and 151 (law/finance), plus 1.1 million general-knowledge items, to strengthen Korean-based re-ranking capability
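In a RAG pipeline, a reranker of this kind acts as a second-stage scorer: the vector store returns candidate passages, and the cross-encoder scores each (query, passage) pair so the candidates can be reordered before being passed to the LLM. A minimal sketch of that flow is below; the `overlap_score` function is a hypothetical placeholder standing in for the actual bge-reranker-v2-m3 model call, which a real deployment would make against the model endpoint instead.

```python
from typing import Callable, List, Tuple

def rerank(query: str,
           passages: List[str],
           score_pairs: Callable[[List[Tuple[str, str]]], List[float]],
           top_k: int = 3) -> List[Tuple[str, float]]:
    """Second-stage reranking for RAG: score each (query, passage) pair
    with a cross-encoder and return the top_k passages by score."""
    scores = score_pairs([(query, p) for p in passages])
    ranked = sorted(zip(passages, scores), key=lambda x: x[1], reverse=True)
    return ranked[:top_k]

# Placeholder scorer for illustration only: token-overlap ratio.
# In practice this would be a call to the bge-reranker-v2-m3 model.
def overlap_score(pairs: List[Tuple[str, str]]) -> List[float]:
    out = []
    for q, p in pairs:
        q_tokens, p_tokens = set(q.lower().split()), set(p.lower().split())
        out.append(len(q_tokens & p_tokens) / max(len(q_tokens), 1))
    return out

if __name__ == "__main__":
    candidates = [
        "The capital of France is Paris.",
        "Rerankers score query-passage pairs.",
        "Paris is known for the Eiffel Tower.",
    ]
    for passage, s in rerank("capital of France", candidates, overlap_score, top_k=2):
        print(f"{s:.2f}  {passage}")
```

Because the cross-encoder sees the query and passage together, this second pass is slower than vector search but typically more accurate, which is why it is applied only to the small candidate set returned by the first-stage retrieval.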
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2023 The k8sgpt Authors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
1.5.5 - gpt-oss-120b
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
Definitions.
“License” shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.
“Licensor” shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.
“Legal Entity” shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, “control” means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
“You” (or “Your”) shall mean an individual or Legal Entity exercising permissions granted by this License.
“Source” form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
“Object” form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.
“Work” shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).
“Derivative Works” shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.
“Contribution” shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, “submitted” means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as “Not a Contribution.”
“Contributor” shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.
Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
(a) You must give any other recipients of the Work or Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
(d) If the Work includes a “NOTICE” text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.
You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2023 The k8sgpt Authors
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
1.5.6 - Qwen3-30B-A3B
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2020 The k8sgpt Authors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
1.5.7 - Qwen3-30B-A3B
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2023 The k8sgpt Authors
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
2 - CloudML
2.1 - Overview
Service Overview
CloudML is an integrated platform that supports the entire machine learning process, from data analysis to model development, training, validation, and deployment, in a cloud environment.
Features
- CloudML is designed so that users in various roles, such as analysts, machine learning engineers, and developers, can collaborate in one environment and easily design and operate machine learning workflows.
- CloudML provides a Python- and R-based analysis environment, so users with programming experience can use the platform flexibly and effectively. In particular, the generative-AI-based Copilot feature lets you write, refactor, and fix code and get function recommendations from natural-language input alone, improving both analysis productivity and accessibility.
- CloudML systematically supports each stage of analysis, including environment configuration, model development and serving, analysis automation, and visualization. By automating repetitive experiments and operations, it improves both productivity and model quality.
Service Composition Diagram
CloudML consists of an analysis environment, machine learning lifecycle management, automated analysis support, visualization, and a generative-AI-based Copilot feature. Through these components, users can perform the entire machine learning process in an integrated way.
Provided Features
CloudML provides the following features.
- Visual Modeling: Provides an intuitive drag-and-drop interface for building and deploying machine learning models without coding. You can easily manage the entire process from data import to model evaluation and deployment.
- Code-based Development: You can freely write and run code in Python, R, and other languages in a Jupyter Notebook environment, with powerful capabilities for advanced users and researchers.
- Workflow Automation: Efficiently automates complex machine learning workflows such as data preprocessing, model training, evaluation, and deployment.
- Experiment Management: You can train machine learning models with various parameter combinations and systematically manage and compare the results.
- Copilot: Provides a natural-language AI assistant that guides and automates the model development process. It supports tasks such as code generation, refactoring, error correction, and code explanation, improving productivity.
- Integrated Platform: All features are integrated within CloudML for convenient use.
- Scalability and Flexibility: Supports scaling computing resources and connecting to various data sources as needed.
Constraints
Before using CloudML, review the following constraints and reflect them in your service usage plan. CloudML runs in a Kubernetes-based environment, so appropriate cluster resource settings are required for stable operation.
- Application base resources: A minimum of 24 vCPU cores and 96 GB of memory is allocated by default for running the application.
- Analysis job resources: In addition to the base resources, analysis jobs require additional CPU or GPU resources, which should be sized to the analysis workload.
- Copilot (CPU-based): Running Copilot on CPU resources requires at least 16 vCPU cores and 10 GiB of memory. In this case, the CPU resources available for analysis jobs are reduced accordingly.
- Copilot (GPU-based): Copilot can also run on dedicated GPU resources.
- Supported LLM models: The LLM models currently applicable to Copilot are limited to Llama3.
Region-based provision status
CloudML is available in the following environments.
| Region | Availability |
|---|---|
| Korea West (kr-west1) | Provided |
| Korea East (kr-east1) | Provided |
| Korea South 1 (kr-south1) | Not provided |
| Korea South 2 (kr-south2) | Not provided |
| Korea South 3 (kr-south3) | Not provided |
Preceding Service
The following services must be configured before creating this service. For details, refer to the guide provided for each service and prepare in advance.
| Service Category | Service | Detailed Description |
|---|---|---|
| Container | Container Registry | A service that stores, manages, and shares container images |
| Container | Kubernetes Engine | Kubernetes container orchestration service |
| Networking | Load Balancer | A service that automatically distributes server traffic load |
2.2 - How-to guides
Create CloudML
You can create CloudML in the Samsung Cloud Platform Console by entering the required information and selecting detailed options.

To create CloudML, follow these steps.

1. Click All Services > AI/ML > CloudML. The CloudML Service Home page opens.
2. On the Service Home page, click the Create CloudML button. The CloudML creation page opens.
3. On the CloudML creation page, enter the information required for service creation and select detailed options.

In the Version Selection area, select the service version.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Version Selection | Required | Select the CloudML version |

Fig. CloudML Service Version Selection Items

In the SCP Kubernetes Engine Deployment area, select the options required for service creation.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Cluster Name | Required | Select a Kubernetes Engine cluster |

Fig. CloudML Service Cluster Selection Items

In the Service Information Input area, select the options required for service creation.

| Classification | Necessity | Detailed Description |
|---|---|---|
| CloudML Name | Required | Enter the service name |
| Description | Optional | Enter a service description |
| Domain Name | Required | Enter the domain name to use for the service (2-63 characters: lowercase English letters, numbers, and special characters) |
| Endpoint | Required | Select the endpoint to use for the service (Private or Public) |
| Copilot | Optional | Select whether to use Copilot; selecting it requires agreeing to the terms in a popup window. If the selected cluster has no LLM-dedicated GPU and the allocated LLM resources are insufficient, Copilot cannot be applied |
| Resource Information | Required | Displays resource information for the selected cluster |
| SCR Information | Required | Enter the SCR information to use for the service (Private Endpoint, Authentication Key, and Secret Key) |

Table. CloudML Service Information Input Items

In the Additional Information area, enter or select the necessary information.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Tag | Optional | Add tags (up to 50 per resource); click the Add Tag button and enter or select a Key and Value |

Table. CloudML Additional Information Input Items

4. In the Summary panel, review the details and estimated charges, then click the Complete button.
   - Once creation is complete, check the created resource on the CloudML list page.
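The domain-name rule in the Service Information Input table above can be pre-checked with a small validator. This is an illustrative sketch that assumes the allowed special character is the hyphen; the console's own validation is authoritative.

```python
import re

# Assumed rule: 2-63 characters of lowercase letters, digits, and hyphens.
# The hyphen is an assumption for "special characters"; verify in the console.
DOMAIN_RE = re.compile(r"^[a-z0-9-]{2,63}$")

def is_valid_domain_name(name: str) -> bool:
    """Return True if the name satisfies the assumed CloudML domain-name rule."""
    return DOMAIN_RE.fullmatch(name) is not None

print(is_valid_domain_name("my-cloudml-1"))  # True
print(is_valid_domain_name("A"))             # False: uppercase and too short
```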
Check CloudML details
You can view and modify the full resource list and detailed information of the CloudML service. The CloudML details page consists of Details, Tags, and Work History tabs.

To check CloudML details, follow these steps.

1. Click All Services > AI/ML > CloudML. The CloudML Service Home page opens.
2. On the Service Home page, click the resource (CloudML) whose details you want to check. The CloudML details page opens.
   - The CloudML details page displays CloudML status information and detailed information, and consists of Details, Tags, and Work History tabs.

| Division | Detailed Description |
|---|---|
| Service Status | CloudML status: Creating (being created), Deployed (created and operating normally), Updating (settings being updated), Terminating (being deleted), Error (an error occurred) |
| Connection Guide | Service connection guide, including the host information to register on the user's PC |
| Service Cancellation | Button to cancel the service |

Fig. CloudML Status Information and Additional Features
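The host registration described in the Connection Guide typically takes the form of a hosts-file entry. The IP address and domain below are placeholders; use the actual values shown in the Connection Guide on the details page.

```text
# /etc/hosts (Linux/macOS) or C:\Windows\System32\drivers\etc\hosts (Windows)
# Placeholder values: replace with the IP and domain from the Connection Guide.
198.51.100.10   my-cloudml.example.com
```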
Detailed Information
On the CloudML list page, you can check the detailed information of the selected resource and modify the information if necessary.
| Division | Detailed Description |
|---|---|
| Service | Service Name |
| Resource Type | Resource Type |
| SRN | Unique resource ID in Samsung Cloud Platform |
| Resource Name | Resource Title |
| Resource ID | Unique resource ID in the service |
| Creator | User who created the service |
| Creation Time | The time when the service was created |
| Editor | User who modified the service information |
| Modified Date | Date when service information was modified |
| Product Name | CloudML Name |
| Copilot | Whether to use Copilot |
| Description | Description of the service |
| Cluster Name | Selected Kubernetes Engine cluster name |
| Domain Name | Entered Service Domain Name |
| Version | Selected Service Version |
| Installation Node Information | Node information installed in the cluster |
| SCR Information | Entered SCR Information |
Tag
On the CloudML list page, you can check the tag information of the selected resource, and add, change, or delete it.
| Classification | Detailed Description |
|---|---|
| Tag List | Tag list |
Work History
You can check the job history of the selected resource on the CloudML list page.
| Classification | Detailed Description |
|---|---|
| Work History List | Resource change history |
Canceling CloudML Service
Users can cancel the CloudML service through the Samsung Cloud Platform Console.
To cancel CloudML, follow these steps.
- Click All Services > AI/ML > CloudML. The CloudML Service Home page opens.
- On the Service Home page, click the Service Cancellation button. A service cancellation confirmation window appears.
- In the confirmation window, enter the name of the CloudML to delete and click the Confirm button.
2.2.1 - Kubernetes Cluster Configuration
Configuring a Kubernetes Cluster
To apply for the CloudML service, a cluster dedicated to CloudML must be configured. A dedicated cluster is a Kubernetes Engine created with at least the required minimum specifications and a few necessary settings. Create the dedicated cluster before applying for the CloudML service.
- To create a cluster, refer to the Cluster Configuration guide.
- CloudML exposes an HTTPS endpoint on port 443. Select the public endpoint when creating a cluster.
Cluster Node and Storage Recommended Specifications
Cluster nodes can be added or modified after the cluster is created. The following are the recommended node and storage specifications for installing CloudML, based on 5 users.

| Division | Item | Role | Capacity |
|---|---|---|---|
| Cluster Node | Kubernetes Node Pool (Virtual Server) | Application execution | 24 cores / 96 GB |
| Cluster Node | Kubernetes Node Pool (Virtual Server) | Analysis execution | 8 cores / 32 GB x 2 EA |
| Storage | File Storage | Data storage | 1 TB |
If you need to change the number of nodes, add GPU nodes, or scale up resources, please request technical support.
- Technical Support Guide Page: https://www.samsungsds.com/kr/support/support_tech.html
- Technical support request email: brightics.cs@samsung.com
Adding Labels to Nodes
Add labels to nodes according to the roles listed in the recommended cluster node and storage specifications.
- To add labels via the node YAML, see the Editing Node YAML guide.
To add a label to a cluster node, follow these steps.
- Click All Services > Container > Kubernetes Engine. The Kubernetes Engine Service Home page opens.
- On the Service Home page, click the Node menu. The Node List page opens.
- On the Node List page, click the Gear button at the top left, select the cluster whose details you want to check, then click the Confirm button.
- Click the node whose details you want to check. The Node Details page opens.
- On the Node Details page, click the YAML tab.
- On the YAML tab, click the Edit button. The node editing window opens.
- In the node editing window, add the label matching the node's role and click the Save button. Refer to the following labels by node purpose.

| Division | Purpose-based Label |
|---|---|
| CPU Node | For app: node.kubernetes.io/nodetype: ml-app; for analysis: node.kubernetes.io/nodetype: ml-analytics |
| GPU Node | For analysis: node.kubernetes.io/nodetype: ml-analytics-gpu; for Copilot: node.kubernetes.io/nodetype: ml-gpu |

Table. Kubernetes node labels by purpose
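As a reference, the label from the table above goes under `metadata.labels` in the node YAML. A minimal excerpt, assuming a hypothetical node name:

```yaml
# Node YAML excerpt edited via the Kubernetes Engine console.
apiVersion: v1
kind: Node
metadata:
  name: ml-app-node-001          # placeholder node name
  labels:
    # Pick the value that matches the node's role:
    # ml-app / ml-analytics / ml-analytics-gpu / ml-gpu
    node.kubernetes.io/nodetype: ml-app
```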
2.3 - API Reference
2.4 - CLI Reference
2.5 - Release Note
CloudML
- Samsung Cloud Platform provides the CloudML service, which supports the entire machine learning process, from data analysis to model development, training, validation, and deployment, in a cloud environment.
3 - AI&MLOps Platform
3.1 - Overview
Service Overview
AI&MLOps Platform is a machine learning platform that automates the repetitive tasks across the entire pipeline of machine learning model development, training, and deployment. The service enables integrated management of training data, models, and operational data in a Kubernetes-based AI/MLOps environment.
AI&MLOps Platform is an open-source-based product offering the Kubeflow Mini service, which provides machine learning model development, training, tuning, and deployment features, and the Enterprise service, which adds Add-on features such as distributed training Job execution and monitoring.
Features
Cloud-Native MLOps Environment: AI&MLOps Platform provides a machine learning model development environment optimized for the cloud and integrates conveniently with various open-source tools based on Kubernetes.
Machine Learning Development and Operational Convenience: Provides a standardized environment that supports various machine learning frameworks such as TensorFlow, PyTorch, scikit-learn, and Keras. It automates the entire pipeline of model development, training, and deployment, making models easy to configure, create, and reuse.
Enhanced GPU Collaboration: With Bare Metal Server-based multi-node GPU and GPUDirect RDMA (Remote Direct Memory Access), job speeds for LLM (Large Language Model) and NLP (Natural Language Processing) workloads can be dramatically improved.
Service Composition Diagram
Provided Features
The AI&MLOps Platform provides the following functions.
ML Model Development Environment and Features
Notebook provision: Creates Jupyter Notebook and VS Code environments with ML frameworks (TensorFlow, PyTorch, etc.).
TensorBoard: Creates and manages TensorBoard servers (a tool for visualizing and analyzing the ML model training process).
Volumes: Use volumes to store datasets and models during ML model development, and attach a volume when creating a Jupyter Notebook.
Distributed Training Job Execution and Management for ML Models
Supports distributed training Job execution and monitoring, as well as inference service management and analysis. (Add-on)
Provides various features for Job Queue management, MLOps environment configuration, and more. (Add-on)
Provides efficient GPU utilization features such as the Job Scheduler (FIFO, Bin-packing, Gang-based), GPU Fraction, and GPU resource monitoring. (Add-on)
BM-based multi-node GPU and GPUDirect RDMA (Remote Direct Memory Access) significantly improve job speeds for LLM (Large Language Model) and NLP (Natural Language Processing) workloads. (Add-on)
ML Model Experiment Management and Pipeline
Provides ML pipeline experiment management through Experiments (KFP), and supports pipeline automation to execute ML tasks step by step.
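The step-by-step execution idea behind KFP pipelines can be illustrated with a dependency-free sketch. This is not the kfp SDK; the step names and toy logic are purely illustrative.

```python
from typing import Any, Callable

def run_pipeline(steps: list[tuple[str, Callable[[Any], Any]]], data: Any) -> Any:
    """Run each named step in order, feeding each step the previous step's output."""
    for name, step in steps:
        data = step(data)
        print(f"step '{name}' done")
    return data

# Illustrative stages: preprocess -> train -> evaluate
steps = [
    ("preprocess", lambda xs: [x / max(xs) for x in xs]),  # scale to [0, 1]
    ("train",      lambda xs: sum(xs) / len(xs)),          # toy "model": the mean
    ("evaluate",   lambda m: {"model_mean": m, "ok": 0 <= m <= 1}),
]
result = run_pipeline(steps, [2, 4, 6, 8])
```

In a real pipeline, each stage would be a container step managed by Experiments (KFP), with inputs and outputs passed as artifacts rather than in-process values.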
Component
Operating System Version
The operating systems supported by the AI&MLOps Platform are as follows.
| Operating System(OS) | Version |
|---|---|
| RHEL | RHEL 8.3 |
| Ubuntu | Ubuntu 18.04, Ubuntu 20.04, Ubuntu 22.04 |
Regional Provision Status
The AI&MLOps Platform is available in the following environments.

| Region | Availability |
|---|---|
| Korea West (kr-west1) | Provided |
| Korea East (kr-east1) | Provided |
| Korea South 1 (kr-south1) | Not provided |
| Korea South 2 (kr-south2) | Not provided |
| Korea South 3 (kr-south3) | Not provided |
Preceding service
The following services must be configured before creating this service. For details, refer to the guide provided for each service and prepare in advance.
| Service Category | Service | Detailed Description |
|---|---|---|
| Container | Kubernetes Engine | Kubernetes container orchestration service |
3.2 - How-to guides
Create AI&MLOps Platform
You can create the AI&MLOps Platform in the Samsung Cloud Platform Console by entering the required information and selecting detailed options.

To create an AI&MLOps Platform, follow these steps.

1. Click All Services > AI/ML > AI&MLOps Platform. The AI&MLOps Platform Service Home page opens.
2. On the Service Home page, click the Create AI&MLOps Platform button. The Create AI&MLOps Platform page opens.
3. On the Service Type Selection page, enter the information required for service creation and select detailed options.

In the Service Type and Version Selection area, select the service type and version.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Service Type | Required | The type of service chosen by the user (AI&MLOps Platform or Kubeflow Mini) |
| Service Type Version | Required | Version of the selected service; a list of available versions is provided |

Table. AI&MLOps Platform Service Type and Version Selection Items

In the Cluster Deployment Area Division area, select the options required for service creation.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Cluster Deployment Area | Required | Deploy to Kubernetes Engine: select an existing Kubernetes Engine. Deploy to New Cluster: create a Kubernetes Engine while creating the AI&MLOps Platform |

Table. AI&MLOps Platform Service Cluster Deployment Area Division Items

Reference: Depending on this cluster deployment setting, the configuration elements of the Service Information Input page differ.

4. On the Service Information Input page, enter the information required for service creation and select detailed options.
   - You can select the cluster deployment area.
   - For the Deploy to New Cluster setup, refer to the Deploy to New Cluster guide.
   - For the Deploy on SCP Kubernetes Engine setup, refer to the Deploy on SCP Kubernetes Engine guide.
   - For the Kubernetes cluster specifications required for installation, refer to the Kubernetes Cluster Specifications guide.
5. On the Creation Information Confirmation page, review the details and estimated charges, then click the Complete button.
   - Once creation is complete, check the created resource on the AI&MLOps Platform service list page.
AI&MLOps Platform detailed information check
You can view and modify the full resource list and detailed information of the AI&MLOps Platform service. The AI&MLOps Platform Service Details page consists of Details, Tags, and Work History tabs.

To check the detailed information of the AI&MLOps Platform service, follow these steps.

- Click All Services > AI/ML > AI&MLOps Platform. The AI&MLOps Platform Service Home page opens.
- On the Service Home page, click the AI&MLOps Platform menu. The AI&MLOps Platform service list page opens.
- On the service list page, click the resource whose details you want to view. The AI&MLOps Platform Service Details page opens.
- The Service Details page displays status information and additional feature information, and consists of Details, Tags, and Work History tabs.
Detailed Information
On the AI&MLOps Platform Service List page, you can check the detailed information of the selected resource and modify the information if necessary.
| Classification | Detailed Description |
|---|---|
| Service | Service Category |
| Resource Type | Service Name |
| SRN | Unique resource ID in Samsung Cloud Platform |
| Resource Name | Resource name |
| Resource ID | Unique resource ID in the service |
| Creator | User who created the service |
| Creation Time | The time when the service was created |
| Modifier | User who last modified the service information |
| Modified Date | Date when service information was modified |
| Dashboard Status | Dashboard Status Value |
| Service Name | Service Name |
| Admin Email Address | Administrator Email Address |
| Image Name | Service Image Name |
| Version | Image Version |
| Service Type | Deployed Service Type |
Tag
On the AI&MLOps Platform Service List page, you can check the tag information of the selected resource and add, change, or delete tags.
| Classification | Detailed Description |
|---|---|
| Tag List | Tag list |
Work History
You can check the work history of the selected resource on the AI&MLOps Platform Service List page.
| Classification | Detailed Description |
|---|---|
| Work History List | Resource change history |
AI&MLOps Platform connection
Before accessing the AI&MLOps Platform dashboard, the following pre-work is required.
Pre-work
To access the AI&MLOps Platform, you must set the relevant ports and IP addresses for access in the Security Group and Firewall (if using a firewall) in advance.
- Kubeflow Mini: port 31390 (Security Group inbound rule, VPC firewall)

To access a cluster Worker Node, you must set an inbound rule for port 22 in the Security Group and Firewall (if using a VPC firewall).
Logging into the Dashboard
To access the AI&MLOps Platform service, follow the procedure below.
- Click the All Services > AI/ML > AI&MLOps Platform menu. You are moved to the Service Home page of the AI&MLOps Platform service.
- On the Service Home page, click the AI&MLOps Platform menu. You are moved to the AI&MLOps Platform service list page.
- On the AI&MLOps Platform service list page, click the resource you want to access. You are moved to the AI&MLOps Platform Details page.
- On the AI&MLOps Platform Details page, click the Access Guide button. The Access Guide popup window opens.
- In the Access Guide popup window, click the URL link of the dashboard. You are moved to the corresponding dashboard page.
AI&MLOps Platform cancellation
You can reduce operating costs by canceling a service that is not in use. However, canceling a service may immediately stop the running service, so carefully consider the impact of stopping the service before proceeding with the cancellation.
To cancel the AI&MLOps Platform, follow the procedure below.
- Click the All Services > AI/ML > AI&MLOps Platform menu. You are moved to the Service Home page of the AI&MLOps Platform service.
- On the Service Home page, click the AI&MLOps Platform menu. You are moved to the AI&MLOps Platform service list page.
- On the AI&MLOps Platform service list page, click the resource you want to cancel. You are moved to the AI&MLOps Platform Details page.
- On the AI&MLOps Platform Details page, click the Cancel Service button. The Cancel Service popup window opens.
- To confirm, enter the service name and click Confirm.
- Once cancellation is complete, check that the resource has been removed on the AI&MLOps Platform service list page.
3.2.1 - Cluster Deployment
Cluster Deployment Area
On Samsung Cloud Platform, the service type selection step of AI&MLOps Platform creation provides two cluster deployment options.
- Deploy on SCP Kubernetes Engine
- [Deploy to a new cluster](#deploy-to-a-new-cluster)
Before proceeding with the cluster deployment task, please check the Kubernetes cluster specifications required for installation.
- Regardless of the selection of the cluster deployment area, the Kubernetes cluster specification must be checked in advance.
- Please refer to the Cluster Specification guide for detailed specification information.
Depending on the selection of the cluster deployment area, the installation content on the Service Information Input page of AI&MLOps Platform creation varies.
Deploying on SCP Kubernetes Engine
- Click the All Services > AI/ML > AI&MLOps Platform menu. You are moved to the Service Home page of AI&MLOps Platform.
- On the Service Home page, click the Create AI&MLOps Platform button. You are moved to the Create AI&MLOps Platform page.
- On the Service Type Selection page of AI&MLOps Platform creation, enter the information required for service creation and select detailed options.
  - Cluster Deployment: select the Deploy on SCP Kubernetes Engine option.
- On the Service Information Input page of AI&MLOps Platform creation, enter the information required for service creation and select detailed options.
- In the Service Information Input area, enter or look up the information required for service creation.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Service Name | Required | Enter the AI&MLOps Platform name<br>- The AI&MLOps Platform name cannot be duplicated within the project |
| Storage Class | Required | The Storage Class is registered automatically |
| Installation Node Information | Query | Confirm the node information of the selected Kubernetes Engine |
| Admin Email Address | Required | Enter the email address of the administrator (Admin) to use when logging in |
| Password | Required | Enter the password to use when logging in |
| Password Confirmation | Required | Re-enter the password to prevent password errors |

Table. AI&MLOps Platform Service Information Input Items

- In the Additional Information Input area, enter or select the information required for service creation.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Tag | Selection | Select a tag to add to the AI&MLOps Platform<br>- Clicking Add Tag creates and adds a new tag or adds an existing tag<br>- Up to 50 tags can be registered<br>- Newly added tags are applied after service creation is completed |

Table. AI&MLOps Platform Service Additional Information Input Items
Deploy to a new cluster
- Click the All Services > AI/ML > AI&MLOps Platform menu. You are moved to the Service Home page of AI&MLOps Platform.
- On the Service Home page, click the Create AI&MLOps Platform button. You are moved to the Create AI&MLOps Platform page.
- On the Service Type Selection page of AI&MLOps Platform creation, enter the information required for service creation and select detailed options.
  - Cluster Deployment: select the Deploy to a New Cluster option.
- On the Service Information Input page of AI&MLOps Platform creation, enter the information required for service creation and select detailed options.
- In the Service Information Input area, enter or look up the information required for service creation.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Service Name | Required | Enter the AI&MLOps Platform name<br>- The AI&MLOps Platform name cannot be duplicated within the project |
| Storage Class | Required | The Storage Class is registered automatically |
| Installation Node Information | Query | Confirm the node information of the selected Kubernetes Engine |
| Admin Email Address | Required | Enter the email address of the administrator (Admin) to use when logging in |
| Password | Required | Enter the password to use when logging in |
| Password Confirmation | Required | Re-enter the password to prevent password errors |

Table. AI&MLOps Platform Service Information Input Items

- In the Enter Kubernetes Engine Information area, enter or select the required information.
| Classification | Necessity | Detailed Description |
|---|---|---|
| Cluster Name | Required | Cluster name<br>- Starts with an English letter; English letters, numbers, and the special character (-) can be used<br>- Enter 3 to 30 characters |
| Control Plane Version > Kubernetes Version | Required | Select the Kubernetes version |
| Control Area Setting > Control Area Logging | Selection | Select whether to use control area logging<br>- Audit/Event logs of the cluster control area can be checked in Cloud Monitoring's Log Analysis<br>- 1GB of log storage for all services in the account is provided for free, and logs are deleted sequentially when exceeding 1GB<br>- For more information, refer to Cloud Monitoring > Log Analysis |
| Network Settings | Required | Network connection settings for the node pool<br>- VPC: Select a pre-created VPC<br>- Subnet: Select a general Subnet to use from the selected VPC's subnets<br>- Security Group: Click the Search button and select a Security Group from the Security Group Selection popup window<br>- Load Balancer: Provides `type: LoadBalancer` functionality in Kubernetes Service objects; select a load balancer on the same network; select whether to use it; cannot be changed after setting |
| File Storage Settings | Required | Select the file storage volume to be used in the cluster<br>- Default volume (NFS): Select File Storage through the Search button<br>- The default volume file storage only provides the NFS format |

Table. Kubernetes Engine Service Information Input Items
- In the Enter Node Pool Information area, enter or select the required information.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Node Pool Configuration | Required | Select node pool information<br>- Items marked with * are required and must be entered<br>- For the AI&MLOps Platform, the image capacity may keep increasing with use, so setting Block Storage to at least 200GB allows for smooth system configuration |

Table. AI&MLOps Platform Service Information Input Items

- A Windows OS node pool can only be created when an additional storage (CIFS) volume is in use in the cluster.
- Volume encryption for node pool Block Storage can only be set at the time of initial creation.
- Setting encryption may cause performance degradation of some features.
- If you use the node pool auto-scaling or auto-resizing feature, you can only enter the number of nodes, the minimum number of nodes, and the maximum number of nodes.
- In the Additional Information Input area, enter or select the necessary information.

| Classification | Necessity | Detailed Description |
|---|---|---|
| Tag | Selection | Select a tag to add to the AI&MLOps Platform<br>- Clicking Add Tag creates and adds a new tag or adds an existing tag<br>- Up to 50 tags can be registered<br>- Newly added tags are applied after service creation is completed |

Table. AI&MLOps Platform Service Information Input Items
Cluster Specifications
To use the AI&MLOps Platform, a Kubernetes Engine to install the AI&MLOps Platform is required. You can select an existing Kubernetes Engine or create a Kubernetes Engine when creating the AI&MLOps Platform.
The specifications of the Kubernetes cluster required for installation are as follows.
Node pool resource scale (composed of 2 or more nodes)
- AI&MLOps Platform : vCPU 32, Memory 128G or more
- Kubeflow Mini: vCPU 24, Memory 96G or more
Kubernetes version
- AI&MLOps Platform v1.9.1 (k8s v1.30)
- Kubeflow Mini v1.9.1 (k8s v1.30)
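The minimums above can be expressed as a quick sanity check before creating the cluster. A sketch in Python; the function name and the node-tuple format are illustrative, not part of the platform:

```python
def meets_min_spec(nodes, min_vcpu=32, min_mem_gb=128, min_nodes=2):
    """Check a node pool against the AI&MLOps Platform minimums.

    nodes: list of (vcpu, memory_gb) tuples, one per node.
    Defaults are the AI&MLOps Platform minimums (vCPU 32, Memory 128G,
    2 or more nodes); pass min_vcpu=24, min_mem_gb=96 for Kubeflow Mini.
    """
    total_vcpu = sum(v for v, _ in nodes)
    total_mem = sum(m for _, m in nodes)
    return len(nodes) >= min_nodes and total_vcpu >= min_vcpu and total_mem >= min_mem_gb

# Two 16-vCPU / 64 GB nodes meet the AI&MLOps Platform minimum
print(meets_min_spec([(16, 64), (16, 64)]))  # → True
# A single node fails the 2-node requirement even if it is large enough
print(meets_min_spec([(32, 128)]))           # → False
```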
3.2.2 - Kubeflow User Guide
Below is a guide on how to use Kubeflow after creation.
Adding Kubeflow Users
By default, Kubeflow has only the one Admin user account created on the initial setup screen.
To add users to the Kubeflow Dashboard, you need to change the Dex settings (Kubeflow’s authentication component).
- Dex is deployed in the auth namespace and its settings are stored in a configmap named dex.
The following is an example of the Dex configuration.
apiVersion: v1
kind: ConfigMap
metadata:
name: dex
namespace: auth
data:
config.yaml: |
issuer: http://dex.auth.svc.cluster.local:5556/dex
storage:
type: kubernetes
config:
inCluster: true
web:
http: 0.0.0.0:5556
logger:
level: "debug"
format: text
oauth2:
skipApprovalScreen: true
enablePasswordDB: true
staticPasswords:
- email: admin@kubeflow.org
hash: $2y$10$Yb9WVbn8pzVSM6fBgKdFae1Bh6Z.XTihi7bNu3sB6/h5bt1JuUOgq
username: admin
userID: 9cb67307-fd6d-4441-9b59-52acd78f4c9e
staticClients:
- id: kubeflow-oidc-authservice
redirectURIs: ["/login/oidc"]
name: 'Dex Login Application'
secret: pUBnBOY80SnXgjibTYM9ZWNzY2xreNGQok
If the enablePasswordDB value is true in the configuration, Dex saves the list of users defined in staticPasswords in the internal storage when the service starts. Therefore, you can add new users by adding new values to staticPasswords with email, hash, username, and userID.
The properties for adding users are defined as follows.
| Parameter | Description |
|---|---|
| email | A value in the standard email format |
| hash | The user password encrypted with the Bcrypt algorithm; a hash value created with the Bcrypt algorithm can be entered directly |
| username | Username |
| userID | A unique ID value |
You can edit the dex configmap using the following command on a node where kubectl is available.
$ kubectl edit configmap dex -n auth
staticPasswords:
- email: admin@kubeflow.org
hash: $2y$10$Yb9WVbn8pzVSM6fBgKdFae1Bh6Z.XTihi7bNu3sB6/h5bt1JuUOgq
username: admin
userID: 9cb67307-fd6d-4441-9b59-52acd78f4c9e
- email: sds@samsung.com
hash: $2y$12$0g5.y86jnrt0v6In5NRCZ.YVuvrAUQ6j/RJYO3rV.kNulaDALOKfq
username: sds
userID: 8961d517-3498-4148-90c9-7e442ee91154
The staticPasswords value in the configmap is reflected when the Dex service starts, so you need to restart the Dex service using the following command.
kubectl rollout restart deployment dex -n auth
Try logging in with the new user information.
You should see that you are logged in successfully and can create a new namespace (profile).
The above content was written with reference to the Kubeflow official website. For more information, please refer to Kubeflow Profiles.
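Besides creating the namespace from the dashboard, a namespace (profile) for a user can also be created explicitly with a Kubeflow Profile resource. A sketch using the user added above; the profile name is illustrative:

```yaml
apiVersion: kubeflow.org/v1
kind: Profile
metadata:
  # Illustrative profile name; it becomes the user's namespace
  name: sds-profile
spec:
  owner:
    kind: User
    # Must match the email the user logs in with via Dex
    name: sds@samsung.com
```

Apply it with `kubectl apply -f profile.yaml` on a node where kubectl is available.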
Using Custom Images in Kubeflow Jupyter Notebook
To use a custom image in Kubeflow Notebook Controller, which manages the Notebook life cycle, you need to meet certain requirements.
Kubeflow assumes that Jupyter starts automatically when the notebook image runs. Therefore, you need to set the default command to start Jupyter in the container image.
The following is an example of what you need to include in your Dockerfile.
ENV NB_PREFIX /
CMD ["sh","-c", "jupyter notebook --notebook-dir=/home/${NB_USER} --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]
The above items are explained as follows.
| Parameter | Description |
|---|---|
| --notebook-dir=/home/jovyan | Set the working directory |
| --ip=0.0.0.0 | Allow Jupyter Notebook to listen on all IPs |
| --allow-root | Allow running Jupyter Notebook as root |
| --port=8888 | Set the port |
| --NotebookApp.token='' --NotebookApp.password='' | Disable Jupyter authentication |
| --NotebookApp.allow_origin='*' | Allow all origins |
| --NotebookApp.base_url=NB_PREFIX | Set the base URL |
You can create a custom image by referencing the Dockerfile used to create the TensorFlow notebook image.
- Refer to https://github.com/kubeflow/kubeflow/blob/v1.2.0/components/tensorflow-notebook-image/Dockerfile.
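Putting the requirements together, a custom notebook image Dockerfile might look like the following sketch; the base image and the extra package are illustrative assumptions, not a prescribed configuration:

```dockerfile
# Illustrative base image; any image with Jupyter installed can be used
FROM jupyter/base-notebook:latest

# Example extra package baked into the custom image
RUN pip install --no-cache-dir tensorflow

# NB_PREFIX is injected by the Notebook Controller at runtime; default to /
ENV NB_PREFIX /

# Jupyter must listen on port 8888 with authentication disabled, as required above
EXPOSE 8888
CMD ["sh", "-c", "jupyter notebook --notebook-dir=/home/${NB_USER} --ip=0.0.0.0 --no-browser --allow-root --port=8888 --NotebookApp.token='' --NotebookApp.password='' --NotebookApp.allow_origin='*' --NotebookApp.base_url=${NB_PREFIX}"]
```

Build and push the image to a registry your cluster can reach, then use that address as the Custom Image when creating the Notebook Server.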
Click the +NEW SERVER button on the Notebook Servers page.
If you have created a custom image, check Custom Image on the Kubeflow Notebook Server screen and enter the Custom Image address to create a new Notebook Server.
The above content was written with reference to the Kubeflow official website.
- For more information, please refer to the Kubeflow official website’s Kubeflow Notebooks > Container Images documentation.
3.3 - API Reference
3.4 - CLI Reference
3.5 - Release Note
AI&MLOps Platform
- AI&MLOps Platform open source version has been upgraded.
- Kubeflow 1.9
- The AI&MLOps Platform service, which automates the repetitive tasks of the entire pipeline of development, learning, and deployment of machine learning models, has been released.
- Provides a machine learning platform service based on Kubernetes.














