SDK Reference Overview
AIOS models are compatible with OpenAI’s API, so they can also be used with OpenAI’s SDK. The following is a list of OpenAI- and Cohere-compatible APIs supported by the Samsung Cloud Platform AIOS service.
| API Name | API | Detailed Description | Supported SDK |
|---|---|---|---|
| Text Completion API | /v1/completions | Generates a natural sentence that follows the given input string. | OpenAI SDK |
| Conversation Completion API | /v1/chat/completions | Generates a response that follows the conversation content. | OpenAI SDK |
| Embeddings API | /v1/embeddings | Converts text into a high-dimensional vector (embedding) that can be used for various natural language processing (NLP) tasks such as text similarity calculation, clustering, and search. | OpenAI SDK |
| Rerank API | /v2/rerank | Applies an embedding model or a cross-encoder model to predict the relevance between a single query and each item in a document list. | Cohere SDK |
- The SDK Reference guide is based on a Virtual Server environment with Python installed.
- The actual execution may differ from the example in terms of token count and message content.
OpenAI SDK
Installing the openai Package
Install the OpenAI package.
pip install openai
Text Completion API
The Text Completion API generates a natural sentence that follows the given input string.
/v1/completions
Request
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.completions.create(
model=model,
prompt="Hi"
)
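Note that urljoin only appends "v1" cleanly when the base URL ends with a trailing slash; without one, the last path segment of the base is replaced. A quick check of the standard-library behavior (the hostnames here are illustrative):

```python
from urllib.parse import urljoin

# With a trailing slash, "v1" is appended to the existing path.
print(urljoin("https://example.com/serving/", "v1"))  # https://example.com/serving/v1

# Without a trailing slash, the last path segment is replaced.
print(urljoin("https://example.com/serving", "v1"))   # https://example.com/v1
```

Make sure the aios endpoint-url ends with a slash if it contains a path component.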
Response
The text field in choices contains the model’s response.
Completion(
id='cmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
CompletionChoice(
finish_reason='length',
index=0,
logprobs=None,
text=' future president of the United States, I hope you’re doing well. As a',
stop_reason=None,
prompt_logprobs=None
)
],
created=1750000000,
model='<<model>>',
object='text_completion',
...omitted...
)
Stream Request
With stream, you can receive the response token by token as the model generates it, rather than receiving the entire answer at once.
Request
Set the stream parameter value to True.
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.completions.create(
model=model,
prompt="Hi",
stream=True
)
# Receive the response as the model generates tokens.
for chunk in response:
    print(chunk)
Response
A response chunk is returned for each generated token, and each token can be checked in the text field of choices.
Completion(
id='cmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
CompletionChoice(
finish_reason=None,
index=0,
logprobs=None,
text='.',
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='text_completion',
system_fingerprint=None,
usage=None
)
Completion(..., choices=[CompletionChoice(..., text=' I', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text="'m", ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' looking', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' for', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' a', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' way', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' to', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' check', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' if', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' a', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' specific', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' process', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' is', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' running', ...)], ...)
Completion(..., choices=[CompletionChoice(..., text=' on', ...)], ...)
Completion(..., choices=[], ...,
usage=CompletionUsage(
completion_tokens=16,
prompt_tokens=2,
total_tokens=18,
completion_tokens_details=None,
prompt_tokens_details=None
)
)
Conversation Completion API
The Conversation Completion API takes an ordered list of messages as input and responds with a message appropriate as the next turn in the current context.
/v1/chat/completions
Request
For text-only messages, you can make the call as follows:
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hi"}
]
)
Response
The message field in choices contains the model’s answer.
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content='Hello. How can I assist you today?',
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[],
reasoning_content=None
),
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=10,
prompt_tokens=42,
total_tokens=52,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None
)
Stream Request
Instead of waiting for the model to generate the entire answer and receiving it at once, you can use stream to receive and process the response for each token as the model generates it.
Request
Enter True as the value of the stream parameter.
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.chat.completions.create(
model=model,
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hi"}
],
stream=True
)
# You can receive a response each time the model generates a token.
for chunk in response:
    print(chunk)
Response
A response chunk is returned for each generated token, and each token can be checked in the content field of delta within choices.
ChatCompletionChunk(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
delta=ChoiceDelta(
content='',
function_call=None,
refusal=None,
role='assistant',
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion.chunk',
service_tier=None,
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='It', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content="'s", ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' nice', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' to', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='meet', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' you', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='.', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' Is', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' there', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' something', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' I', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' can', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' help', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' you', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' with', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' or', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' would', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' you', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' like', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' to', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content=' chat', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='?', ...), ...)], ...)
ChatCompletionChunk(..., choices=[Choice(delta=ChoiceDelta(content='', ...), ...)], ...)
ChatCompletionChunk(..., choices=[], ...,
usage=CompletionUsage(
completion_tokens=23,
prompt_tokens=42,
total_tokens=65,
completion_tokens_details=None,
prompt_tokens_details=None
)
)
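When streaming, the full answer can be reconstructed by concatenating the delta.content of each chunk, skipping chunks with an empty choices list (such as the final usage-only chunk). A minimal sketch using plain dictionaries to stand in for the chunk objects (the chunk values here are illustrative):

```python
# Simulated stream chunks; the final usage-only chunk has an empty choices list.
chunks = [
    {"choices": [{"delta": {"content": ""}}]},
    {"choices": [{"delta": {"content": "It's nice"}}]},
    {"choices": [{"delta": {"content": " to meet you."}}]},
    {"choices": []},  # carries usage only
]

answer = ""
for chunk in chunks:
    if chunk["choices"]:
        # Guard against None or empty content in a chunk.
        answer += chunk["choices"][0]["delta"].get("content") or ""

print(answer)  # It's nice to meet you.
```

With the real SDK objects, you would read chunk.choices[0].delta.content instead of using dictionary access.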
Tool Calling
Tool calling provides an interface to external tools defined outside the model, allowing the model to generate responses that invoke the tool suited to the current context.
Using tool calls, you can define metadata for the functions the model can execute and let the model use them when generating answers.
Request
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # AIOS model call endpoint URL
model = "<<model>>" # AIOS model ID
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
# Function to get weather information
tools = [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current temperature for provided coordinates in celsius.",
"parameters": {
"type": "object",
"properties": {
"latitude": {"type": "number"},
"longitude": {"type": "number"}
},
"required": ["latitude", "longitude"],
"additionalProperties": False
},
"strict": True
}
}]
messages = [{"role": "user", "content": "What is the weather like in Paris today?"}]
response = client.chat.completions.create(
model=model,
messages=messages,
tools=tools  # Inform the model of the metadata of the tools that can be used.
)
Response
The message.tool_calls field in choices shows how the model decided to invoke the tool.
In the following example, you can see that the function in tool_calls uses the get_weather function and which arguments should be passed to it.
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='tool_calls',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content=None,
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[
ChatCompletionMessageToolCall(
id='chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
function=Function(
arguments='{"latitude": 48.8566, "longitude": 2.3522}',
name='get_weather'
),
type='function'
)
],
reasoning_content=None
),
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=19,
prompt_tokens=194,
total_tokens=213,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None
)
Tool Message
After adding the function’s return value to the conversation as a tool message and calling the model again, the model can generate an answer that uses the result.
Request
Using function.arguments from tool_calls in the response data, you can call the actual function.
import json
# example function, always responds with 14 degrees.
def get_weather(latitude, longitude):
    return "14℃"
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = get_weather(args["latitude"], args["longitude"]) # "14℃"
After adding the function’s result to the conversation context as a tool message and calling the model again,
the model can generate an appropriate answer using the function’s result.
# Add the model's tool call message to messages
messages.append(response.choices[0].message)
# Add the result of the actual function call to messages
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": str(result)
})
response_2 = client.chat.completions.create(
model=model,
messages=messages,
# tools=tools
)
Response
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content='The current weather in Paris is 14℃.',
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[],
reasoning_content=None
),
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=11,
prompt_tokens=74,
total_tokens=85,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None
)
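The single-tool flow above can be generalized: when several tools are defined, the function name in the tool call can be dispatched through a lookup table. A minimal self-contained sketch (the get_weather stub and the argument values repeat the example above; TOOL_REGISTRY is a hypothetical name):

```python
import json

def get_weather(latitude, longitude):
    # Stub from the example above: always returns 14 degrees.
    return "14℃"

# Map tool names to the Python functions that implement them.
TOOL_REGISTRY = {"get_weather": get_weather}

# Values as the model might emit them in tool_calls (illustrative).
name = "get_weather"
arguments = '{"latitude": 48.8566, "longitude": 2.3522}'

# Parse the JSON arguments and call the matching function.
result = TOOL_REGISTRY[name](**json.loads(arguments))
print(result)  # 14℃
```

In a real application, name and arguments come from tool_call.function.name and tool_call.function.arguments in the response.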
Reasoning
Request
Reasoning is supported by models that provide a reasoning value, and the content can be checked as follows:
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.chat.completions.create(
model=model,
messages=[
{"role": "user", "content": "9.11 and 9.8, which is greater?"}
],
)
Response
In the message field of choices, you can check the content and also reasoning_content, which contains the reasoning tokens.
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content='''
To determine whether 9.11 or 9.8 is larger, we compare the decimal parts since both numbers have the same whole number part (9).
1. Convert both numbers to the same decimal places for easier comparison:
- 9.11 remains as is.
- 9.8 can be written as 9.80.
2. Compare the tenths place:
- The tenths place of 9.11 is 1.
- The tenths place of 9.80 is 8.
3. Since 8 (from 9.80) is greater than 1 (from 9.11), 9.80 (or 9.8) is larger.
4. Verification by subtraction:
- Subtracting 9.11 from 9.8 gives \(9.80 - 9.11 = 0.69\), which is positive, confirming 9.8 is larger.
Thus, the larger number is \(\boxed{9.8}\).
''',
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[],
reasoning_content="""Okay, so I need to figure out whether 9.11 is bigger than 9.8 or vice versa.
Hmm, let me think. Both numbers start with 9, so the whole number part is the same.
That means the difference must be in the decimal parts.
First, I remember that when comparing decimals, you look at the digits one by one after the decimal point.
The first digit after the decimal is the tenths place, then hundredths, and so on.
Since both numbers have 9 in the units place, I can focus on the decimal parts: 0.11 versus 0.08.
Wait, let me write them out to be clear. 9.11 is 9 and 11 hundredths, and 9.8 is 9 and 8 tenths.
Wait a second, maybe I should convert them to the same decimal places to make it easier.
So, 9.8 is the same as 9.80, right? Because adding a zero at the end of a decimal doesn't change its value.
That way, both numbers have two decimal places. Now comparing 9.11 and 9.80.
Looking at the tenths place first: 9.11 has 1 tenth, and 9.80 has 8 tenths.
Since 8 is more than 1, that means 9.80 is larger than 9.11. Wait, but hold on, the original numbers were 9.11 and 9.8. Did I make a mistake here?
Let me double-check. 9.8 is indeed equal to 9.80.
So when comparing tenths, 8 tenths (from 9.80) is definitely more than 1 tenth (from 9.11). So that would mean 9.8 is bigger.
But wait, maybe I confused the decimal places. Let me think again.
The first number after the decimal is tenths, so in 9.11, the first digit is 1 (tenths place) and the second is 1 (hundredths).
In 9.8, the first digit is 8 (tenths place), and there's no hundredths, which is equivalent to 0.
So comparing the tenths place first: 1 vs 8. Since 8 is greater than 1, 9.8 is larger.
But why did I initially think maybe 9.11 was bigger? Maybe because 11 is a two-digit number?
But no, in decimals, each position is a different place value.
The tenths place is more significant than the hundredths. So even though 11 hundredths is 0.11, that's still less than 0.8 (which is 8 tenths).
Another way to think about it: Convert both to fractions.
9.11 is 9 + 11/100, and 9.8 is 9 + 8/10.
Converting 8/10 to hundredths, that's 80/100.
So 9.8 is 9 + 80/100, whereas 9.11 is 9 + 11/100. Comparing the fractions: 80/100 is more than 11/100, so 9.8 is bigger.
Alternatively, maybe subtract them to see the difference. 9.8 minus 9.11 is 0.69, which is positive, so 9.8 is larger.
Wait, let me do the subtraction step by step.
9.8 minus 9.11:
First, write them aligned by decimal:
9.80
-9.11
-------
0.69
Yes, so the result is positive, so 9.8 is bigger.
Alternatively, if I had to do it without converting, maybe I can think in terms of money.
If these were dollars, 9.11 is $9.11 and 9.8 is $9.80.
Clearly, $9.80 is more than $9.11.
Hmm, so all these methods point to 9.8 being larger.
Maybe my initial confusion was because I saw 11 as a two, but
...omitted...
**Final Answer**
The number 9.8 is larger than 9.11. This is because when comparing the decimal parts, 0.8 (from 9.8) is greater than 0.11 (from 9.11).
Specifically, 9.8 can be written as 9.80, and comparing the tenths place (8 vs. 1) shows that 9.8 is larger.
The difference between them is 0.69, confirming that 9.8 is indeed the larger number.
**Final Answer**
\\boxed{9.8}"""
),
stop_reason=None
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=4167,
prompt_tokens=27,
total_tokens=4194,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None,
kv_transfer_params=None
)
### Image to Text
For models that support **vision**, you can input an image as follows.

<div class="scp-textbox scp-textbox-type-error">
<div class="scp-textbox-title">Note</div>
<div class="scp-textbox-contents">
<p>For models that support <strong>vision</strong>, there are limitations on the size and number of input images.</p>
<p>Please refer to <a href="/en/userguide/ai_ml/aios/overview/#provided-models">Provided Models</a> for more information on image input limitations.</p>
</div>
</div>
#### Request
You can input an image with **MIME type** and **base64**.
```python
import base64
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # AIOS endpoint-url for model calls
model = "<<model>>" # Model ID for AIOS model calls
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
image_path = "image/path.jpg"
def encode_image(image_path: str):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")
base64_image = encode_image(image_path)
response = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "what's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}",
                    },
                },
            ]
        },
    ],
)
```
Response
The model analyzes the image and generates a text description.
ChatCompletion(
id='chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=None,
message=ChatCompletionMessage(
content="""Here's what's in the image:
* **A golden retriever puppy:** The main subject is a light-colored golden retriever puppy lying on green grass.
* **A bone:** The puppy is holding a large bone in its paws and appears to be enjoying chewing on it.
* **Grass:** The puppy is lying on a well-maintained lawn.
* **Vegetation:** Behind the puppy, there are some shrubs and other greenery.
* **Outdoor setting:** The scene is outdoors, likely a backyard.""",
refusal=None,
role='assistant',
annotations=None,
audio=None,
function_call=None,
tool_calls=[],
reasoning_content=None
),
stop_reason=106
)
],
created=1750000000,
model='<<model>>',
object='chat.completion',
service_tier=None,
system_fingerprint=None,
usage=CompletionUsage(
completion_tokens=114,
prompt_tokens=276,
total_tokens=390,
completion_tokens_details=None,
prompt_tokens_details=None
),
prompt_logprobs=None,
kv_transfer_params=None
)
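The request above hardcodes image/jpeg as the MIME type of the data URL; for other formats the prefix should match the file. The standard-library mimetypes module can build it from the file extension (a sketch; the file names and bytes are illustrative):

```python
import base64
import mimetypes

def to_data_url(image_path: str, data: bytes) -> str:
    # Guess the MIME type from the file extension, defaulting to JPEG.
    mime, _ = mimetypes.guess_type(image_path)
    return f"data:{mime or 'image/jpeg'};base64,{base64.b64encode(data).decode('utf-8')}"

print(to_data_url("photo.png", b"\x89PNG")[:21])  # data:image/png;base64
```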
Embeddings API
Embeddings converts input text into a high-dimensional vector of a fixed dimension. The generated vector can be used for various natural language processing tasks such as text similarity, clustering, and search.
/v1/embeddings
Request
from openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # AIOS endpoint-url for model calls
model = "<<model>>" # Model ID for AIOS model calls
client = OpenAI(base_url=urljoin(aios_base_url, "v1"), api_key="EMPTY_KEY")
response = client.embeddings.create(
input="What is the capital of France?",
model=model
)
Response
The data field contains the input converted into vector form.
CreateEmbeddingResponse(
data=[
Embedding(
embedding=[
0.01319122314453125,
0.057220458984375,
-0.028533935546875,
-0.0008697509765625,
-0.01422119140625,
...omitted...
],
index=0,
object='embedding'
)
],
model='<<model>>',
object='list',
usage=Usage(
prompt_tokens=9,
total_tokens=9,
completion_tokens=0,
prompt_tokens_details=None
),
id='embd-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
created=1750000000
)
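The returned embedding vectors can be compared with cosine similarity for tasks such as semantic search. A minimal sketch in plain Python (the short vectors here are illustrative, not real embeddings):

```python
import math

def cosine_similarity(a, b):
    # Dot product of the vectors divided by the product of their norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

In practice, the arguments would be the embedding lists returned in data by the Embeddings API.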
Cohere SDK
The Rerank API is compatible with the Cohere SDK.
Installing the Cohere Package
The Cohere SDK can be used by installing the Cohere package.
pip install cohere
Rerank API
Rerank calculates the relevance between a given query and documents and ranks them. It can help improve the performance of RAG (Retrieval-Augmented Generation) applications by moving the most relevant documents to the front.
/v2/rerank
Request
import cohere
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
client = cohere.ClientV2("EMPTY_KEY", base_url=aios_base_url)
docs = [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France."
]
response = client.rerank(
model=model,
query="What is the capital of France?",
documents=docs,
top_n=3,
)
Response
In results, you can check the documents sorted in order of relevance to the query.
V2RerankResponse(
id='rerank-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
results=[
V2RerankResponseResultsItem(
document=V2RerankResponseResultsItemDocument(
text='The capital of France is Paris.'
),
index=0,
relevance_score=1.0
),
V2RerankResponseResultsItem(
document=V2RerankResponseResultsItemDocument(
text='France capital city is known for the Eiffel Tower.'
),
index=1,
relevance_score=1.0
),
V2RerankResponseResultsItem(
document=V2RerankResponseResultsItemDocument(
text='Paris is located in the north-central part of France.'
),
index=2,
relevance_score=0.982421875
)
],
meta=None,
model='<<model>>',
usage={'total_tokens': 62}
)
Langchain SDK
Because the LangChain SDK is built on top of the OpenAI and Cohere SDKs, the LangChain SDK can also be used.
Installing the langchain Package
The Langchain SDK can be used with the AIOS model after installing the langchain package.
pip install langchain langchain-openai langchain-cohere langchain-together
The langchain-openai package can be used to utilize the text completion API and conversation completion API.
langchain_openai.OpenAI
When the text completion model (langchain_openai.OpenAI) is invoked, the result value is generated as text.
Request
from langchain_openai import OpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
llm = OpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
llm.invoke("Can you introduce yourself in 5 words?")
Response
"""Hi, I'm a fun artist!
...omitted..."""
langchain_openai.ChatOpenAI
When the conversation completion model (langchain_openai.ChatOpenAI) is invoked, the result is returned as an AIMessage object.
Request
from langchain_openai import ChatOpenAI
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model calls.
model = "<<model>>" # Enter the model ID for AIOS model calls.
chat_llm = ChatOpenAI(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
chat_completion = chat_llm.invoke("Can you introduce yourself in 5 words?")
chat_completion.pretty_print()
Response
================================== Ai Message ==================================
I am an AI assistant.
Embeddings
Embedding models can be used through packages such as langchain-together and langchain-fireworks.
Request
from langchain_together import TogetherEmbeddings
from urllib.parse import urljoin
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model invocation.
model = "<<model>>" # Enter the model ID for AIOS model invocation.
embedding = TogetherEmbeddings(
base_url=urljoin(aios_base_url, "v1"),
api_key="EMPTY_KEY",
model=model
)
embedding.embed_query("What is the capital of France?")
Response
[
0.01319122314453125,
0.057220458984375,
-0.028533935546875,
-0.0008697509765625,
-0.01422119140625,
...omitted...
]
Rerank
Rerank models can be used through CohereRerank from langchain-cohere.
Request
from langchain_cohere.rerank import CohereRerank
aios_base_url = "<<aios endpoint-url>>" # Enter the aios endpoint-url for AIOS model invocation.
model = "<<model>>" # Enter the model ID for AIOS model invocation.
rerank = CohereRerank(
base_url=aios_base_url,
cohere_api_key="EMPTY_KEY",
model=model
)
docs = [
"The capital of France is Paris.",
"France capital city is known for the Eiffel Tower.",
"Paris is located in the north-central part of France."
]
rerank.rerank(
documents=docs,
query="What is the capital of France?",
top_n=3
)
Response
[
{'index': 0, 'relevance_score': 1.0},
{'index': 1, 'relevance_score': 1.0},
{'index': 2, 'relevance_score': 0.982421875}
]
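The index and relevance_score fields can be used to put the original documents back in ranked order; a minimal sketch using the result shape shown above:

```python
docs = [
    "The capital of France is Paris.",
    "France capital city is known for the Eiffel Tower.",
    "Paris is located in the north-central part of France.",
]
results = [
    {'index': 0, 'relevance_score': 1.0},
    {'index': 1, 'relevance_score': 1.0},
    {'index': 2, 'relevance_score': 0.982421875},
]

# Sort by score (highest first) and map each index back to its document.
ranked = [docs[r['index']]
          for r in sorted(results, key=lambda r: r['relevance_score'], reverse=True)]
print(ranked[0])  # The capital of France is Paris.
```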