Tutorial

1: Chat Playground
2: RAG
3: Autogen

This section provides tutorials for trying out AIOS.

Category	Description
Chat Playground	How to build and use a web-based Playground. For details, see Chat Playground.
RAG	Building a RAG-based PR review assistant chatbot. For details, see RAG.
Autogen	Building an agent application with Autogen. For details, see Autogen.

Table. AIOS Tutorial list

1 - Chat Playground

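Goal

This tutorial shows how to use Streamlit in the SCP for Enterprise environment to build and use a web-based Playground for easily trying out the APIs of the AI models that AIOS provides.

Environment

To follow this tutorial, the following environment must be ready.

System environment

Python 3.10+
pip

Required packages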

pip install streamlit

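Code block. Installing the streamlit package

Note

Streamlit
An open-source Python web application framework well suited to visualizing and sharing data science, machine learning, and data analysis results. It lets you build a web interface quickly with a few lines of code, without in-depth web development knowledge.

Implementation

Preliminary check

In the environment where the application will run, use curl to check that the model can be called. For AIOS_LLM_Private_Endpoint, refer to the LLM usage guide.

Example: {AIOS LLM private endpoint}/{API}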

curl -H "Content-Type: application/json" \
-d '{"model": "meta-llama/Llama-3.3-70B-Instruct"
, "prompt" : "Hello, I am jihye, who are you" 
, "temperature": 0 
, "max_tokens": 100
, "stream": false}' -L AIOS_LLM_Private_Endpoint

Code block. Example of calling a model with curl

You can confirm that the text field of choices contains the model's answer.

{"id":"cmpl-4ac698a99c014d758300a3ec5583d73b","object":"text_completion","created":1750140201,"model":"meta-llama/Llama-3.3-70B-Instruct","choices":[{"index":0,"text":"?\nI am a Korean student who is studying English.\nI am interested in learning about different cultures and making friends from around the world.\nI like to watch movies, listen to music, and read books in my free time.\nI am looking forward to chatting with you and learning more about your culture and way of life.\nNice to meet you, jihye! I'm happy to chat with you and learn more about Korean culture. What kind of movies, music, and books do you enjoy? Do","logprobs":null,"finish_reason":"length","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":11,"total_tokens":111,"completion_tokens":100}}
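
The same check can be scripted. The following is a minimal sketch and not part of the tutorial code: the helper names build_completion_payload and extract_completion_text are hypothetical, and the field names are taken from the curl request and response shown above. The sample string is an abbreviated response in the same shape.

```python
import json


def build_completion_payload(model: str, prompt: str, max_tokens: int = 100) -> dict:
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "prompt": prompt,
        "temperature": 0,
        "max_tokens": max_tokens,
        "stream": False,
    }


def extract_completion_text(response_body: str) -> str:
    """Return choices[0].text from a text_completion response body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["text"]


# Abbreviated response in the same shape as the output above.
sample = (
    '{"object": "text_completion", "choices": [{"index": 0, '
    '"text": "?\\nI am a Korean student", "finish_reason": "length"}]}'
)
print(extract_completion_text(sample))
```

In the response above, finish_reason is "length" and completion_tokens equals max_tokens (100), which indicates that the answer was cut off at the token limit.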

Project structure

chat-playground
├── app.py          # main Streamlit web app file
├── endpoints.json  # call types of the AIOS models
├── img
│   └── aios.png
└── models.json     # list of AIOS models

Chat Playground code

Note

The models.json and endpoints.json files must exist and be properly formatted. See the code below.
BASE_URL in the code must be changed to the AIOS LLM Private Endpoint address; refer to the LLM usage guide.
This Playground is designed around single-shot requests: the user provides input, presses the button to send one request, and checks the result. This allows quick testing and response checking without complex session management.
The Model, Type, Temperature, and Max Tokens parameters in the sidebar are built with st.sidebar, and you can freely extend or modify them as needed.
An image (file) uploaded with st.file_uploader() exists as a temporary BytesIO object in server memory and is not automatically saved to disk.
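
Because the uploaded image lives only in memory, app.py reads its bytes and sends them to the model as a base64 data URL, always labeled image/jpeg even when a PNG is uploaded. The sketch below shows one way to pick the MIME type from the file name instead. The helper to_data_url is hypothetical and not part of the tutorial code; in Streamlit the file name would come from the uploaded file's name attribute.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode image bytes as a data URL, guessing the MIME type from the file name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = "image/jpeg"  # fallback: the type app.py hard-codes
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# The four bytes below are only the start of a PNG signature, used as stand-in data.
print(to_data_url(b"\x89PNG", "diagram.png"))
```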

app.py

The main Streamlit web app file. For BASE_URL (AIOS_LLM_Private_Endpoint), refer to the LLM usage guide.

import streamlit as st
import base64
import json
import requests
from urllib.parse import urljoin

BASE_URL = "AIOS_LLM_Private_Endpoint"

# ===== Page settings =====
st.set_page_config(page_title="AIOS Chat Playground", layout="wide")
st.title("🤖 AIOS Chat Playground")

# ===== Common functions =====
def load_models():
    with open("models.json", "r") as f:
        return json.load(f)

def load_endpoints():
    with open("endpoints.json", "r") as f:
        return json.load(f)

models = load_models()
endpoints_config = load_endpoints()

# ===== Sidebar settings =====
st.sidebar.title('Hello!')
st.sidebar.image("img/aios.png")
st.sidebar.header("⚙️ Setting")
model = st.sidebar.selectbox("Model", models)
endpoint_labels = [ep["label"] for ep in endpoints_config]
endpoint_label = st.sidebar.selectbox("Type", endpoint_labels)
selected_endpoint = next(ep for ep in endpoints_config if ep["label"] == endpoint_label)

temperature = st.sidebar.slider("🔥 Temperature", 0.0, 1.0, 0.7)
max_tokens = st.sidebar.number_input("🧮 Max Tokens", min_value=1, max_value=5000, value=100)

base_url = BASE_URL
path = selected_endpoint["path"]
endpoint_type = selected_endpoint["type"]
api_style = selected_endpoint.get("style", "openai")  # openai or cohere

# ===== Input UI =====
prompt = ""
docs = []
image_base64 = None

if endpoint_type == "image":
    prompt = st.text_area("✍️ Enter your question:", "Explain this image.")
    uploaded_image = st.file_uploader("🖼️ Upload an image", type=["png", "jpg", "jpeg"])
    if uploaded_image:
        st.image(uploaded_image, caption="Uploaded image", use_container_width=True)
        image_bytes = uploaded_image.read()
        image_base64 = base64.b64encode(image_bytes).decode("utf-8")

elif endpoint_type == "rerank":
    prompt = st.text_area("✍️ Enter your query:", "What is the capital of France?")
    raw_docs = st.text_area("📄 Documents (one per line)", "The capital of France is Paris.\nFrance capital city is known for the Eiffel Tower.\nParis is located in the north-central part of France.")
    docs = raw_docs.strip().splitlines()

elif endpoint_type == "reasoning":
    prompt = st.text_area("✍️ Enter prompt:", "9.11 and 9.8, which is greater?")

elif endpoint_type == "embedding":
    prompt = st.text_area("✍️ Enter prompt:", "What is the capital of France?")

else:
    prompt = st.text_area("✍️ Enter prompt:", "Hello, who are you?")
    uploaded_image = st.file_uploader("🖼️ Upload an image (Optional)", type=["png", "jpg", "jpeg"])
    if uploaded_image:
        image_bytes = uploaded_image.read()
        image_base64 = base64.b64encode(image_bytes).decode("utf-8")

# ===== Invoke button =====
if st.button("🚀 Invoke model"):
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer EMPTY_KEY"
    }

    try:
        if endpoint_type == "chat":
            url = urljoin(base_url, "v1/chat/completions")
            payload = {
                "model": model,
                "messages": [
                    {"role": "system", "content": "You are a helpful assistant."},
                    {"role": "user", "content": prompt}
                ],
                "temperature": temperature,
                "max_tokens": max_tokens
            }

        elif endpoint_type == "completion":
            url = urljoin(base_url, "v1/completions")
            payload = {
                "model": model,
                "prompt": prompt,
                "temperature": temperature,
                "max_tokens": max_tokens
            }

        elif endpoint_type == "embedding":
            url = urljoin(base_url, "v1/embeddings")
            payload = {
                "model": model,
                "input": prompt
            }

        elif endpoint_type == "reasoning":
            url = urljoin(BASE_URL, "v1/chat/completions")
            payload = {
                "model": model,
                "messages": [
                    {"role": "user", "content": prompt}
                ],
                "temperature": temperature,
                "max_tokens": max_tokens
            }

        elif endpoint_type == "image":
            url = urljoin(base_url, "v1/chat/completions")
            if not image_base64:
                st.warning("🖼️ Upload an image")
                st.stop()

            payload = {
                "model": model,
                "messages": [
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt},
                            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_base64}"}}
                        ]
                    }
                ]
            }

        elif endpoint_type == "rerank":
            url = urljoin(base_url, "v2/rerank")
            payload = {
                "model": model,
                "query": prompt,
                "documents": docs,
                "top_n": len(docs)
            }

        else:
            st.error("❌ Unknown endpoint type")
            st.stop()

        st.expander("📤 Request payload").code(json.dumps(payload, indent=2), language="json")
        response = requests.post(url, headers=headers, json=payload)
        response.raise_for_status()
        res = response.json()

        # ===== Response parsing =====
        if endpoint_type == "chat" or endpoint_type == "image":
            output = res["choices"][0]["message"]["content"]

        elif endpoint_type == "completion":
            output = res["choices"][0]["text"]

        elif endpoint_type == "embedding":
            vec = res["data"][0]["embedding"]
            output = f"🔢 Vector dimensions: {len(vec)}"
            st.expander("📐 Vector preview").code(vec[:20])

        elif endpoint_type == "rerank":
            results = res["results"]
            output = "\n\n".join(
                [f"{i+1}. {r['document']['text']} (score: {r['relevance_score']:.3f})" for i, r in enumerate(results)]
            )

        elif endpoint_type == "reasoning":
            message = res.get("choices", [{}])[0].get("message", {})
            reasoning = message.get("reasoning_content", "❌ No reasoning_content")
            content = message.get("content", "❌ No content")
            output = f"""📘 <b>response:</b><br>{content}<br><br>🧠 <b>Reasoning:</b><br>{reasoning}"""

        st.success("✅ Model response:")
        st.markdown(f"<div style='padding:1rem;background:#f0f0f0;border-radius:8px'>{output}</div>", unsafe_allow_html=True)

        st.expander("📦 View full response").json(res)

    except requests.RequestException as e:
        st.error("❌ Request failed")
        st.code(str(e))

Code block. app.py
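
One detail to watch when setting BASE_URL: app.py builds each request URL with urljoin and a relative path such as "v1/chat/completions" (the path value read from endpoints.json is assigned but not used). If the base URL contains a path segment and does not end with a slash, urljoin replaces that last segment. The sketch below illustrates this with a hypothetical host name; join_endpoint is an illustrative helper, not part of the tutorial code.

```python
from urllib.parse import urljoin


def join_endpoint(base_url: str, path: str) -> str:
    """Append path to base_url without dropping any part of the base path."""
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


# Hypothetical host name, for illustration only.
print(urljoin("http://aios.example/llm", "v1/chat/completions"))
# -> http://aios.example/v1/chat/completions  (the "llm" segment is lost)
print(join_endpoint("http://aios.example/llm", "/v1/chat/completions"))
# -> http://aios.example/llm/v1/chat/completions
```

If the endpoint is a bare host with no path, both forms give the same result, so this only matters when the private endpoint includes a path prefix.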

models.json

The list of AIOS models. Refer to the LLM usage guide and set the models you want to use.

[
  "meta-llama/Llama-3.3-70B-Instruct",
  "qwen/Qwen3-30B-A3B",
  "qwen/QwQ-32B",
  "google/gemma-3-27b-it",
  "meta-llama/Llama-4-Scout",
  "meta-llama/Llama-Guard-4-12B",
  "sds/bge-m3",
  "sds/bge-reranker-v2-m3"
]

Code block. models.json

endpoints.json

Defines the call types of the AIOS models. The input screen and the displayed result differ by type.

[
  {
    "label": "Chat Model",
    "path": "/v1/chat/completions",
    "type": "chat"
  },
  {
    "label": "Completion Model",
    "path": "/v1/completions",
    "type": "completion"
  },
  {
    "label": "Embedding Model",
    "path": "/v1/embeddings",
    "type": "embedding"
  },
  {
    "label": "Image Chat Model",
    "path": "/v1/chat/completions",
    "type": "image"
  },
  {
    "label": "Rerank Model",
    "path": "/v2/rerank",
    "type": "rerank"
  },
  {
    "label": "Reasoning Model",
    "path": "/v1/chat/completions",
    "type": "reasoning"
  }
]

Code block. endpoints.json
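
Since app.py assumes that every entry in endpoints.json has label, path, and type keys, a malformed file fails only when the sidebar is rendered. A small pre-check such as the following sketch can catch that earlier. The function validate_endpoints is hypothetical and not part of the tutorial code; the set of known types mirrors the branches handled in app.py.

```python
REQUIRED_KEYS = {"label", "path", "type"}
KNOWN_TYPES = {"chat", "completion", "embedding", "image", "rerank", "reasoning"}


def validate_endpoints(entries: list) -> list:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
        elif entry["type"] not in KNOWN_TYPES:
            problems.append(f"entry {i}: unknown type {entry['type']!r}")
    return problems


config = [
    {"label": "Chat Model", "path": "/v1/chat/completions", "type": "chat"},
    {"label": "Rerank Model", "path": "/v2/rerank"},  # deliberately missing "type"
]
for problem in validate_endpoints(config):
    print(problem)
```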

How to use the Playground

This document covers two ways to run the Playground.

Running on a Virtual Server

1. Run Streamlit on the Virtual Server

streamlit run app.py --server.port 8501 --server.address 0.0.0.0

Code block. Running Streamlit

You can now view your Streamlit app in your browser.
 
URL: http://0.0.0.0:8501

In a browser, open http://{your_server_ip}:8501, or set up SSH tunneling to the server and open http://localhost:8501. See below for SSH tunneling.

2. Connect to the Virtual Server from a local PC through a tunnel (when accessing via http://localhost:8501)

ssh -i {your_pemkey.pem} -L 8501:localhost:8501 ubuntu@{your_server_ip}

Code block. Tunneling from a local PC

Running on SCP Kubernetes Engine

1. Start the Deployment and Service
Apply the following YAML to start the Deployment and Service. A container image that packages the code and Python libraries is provided for running the Chat Playground tutorial.

Note

Image address: aios-zcavifox.scr.private.kr-west1.e.samsungsdscloud.com/tutorial/chat-playground:v1.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: streamlit-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: streamlit
  template:
    metadata:
      labels:
        app: streamlit
    spec:
      containers:
        - name: streamlit-app
          image: aios-zcavifox.scr.private.kr-west1.e.samsungsdscloud.com/tutorial/chat-playground:v1.0
          ports:
            - containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
  name: streamlit-service
spec:
  type: NodePort
  selector:
    app: streamlit
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8501
      nodePort: 30081

Code block. run.yaml

kubectl apply -f run.yaml

Code block. Starting the Deployment and Service

$ kubectl get pod
NAME                                   READY   STATUS    RESTARTS   AGE
streamlit-deployment-8bfcd5959-6xpx9   1/1     Running   0          17s

$ kubectl logs streamlit-deployment-8bfcd5959-6xpx9
 
Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.
 
 
  You can now view your Streamlit app in your browser.
 
  URL: http://0.0.0.0:8501
 
$ kubectl get svc
NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
kubernetes          ClusterIP   172.20.0.1      <none>        443/TCP        46h
streamlit-service   NodePort    172.20.95.192   <none>        80:30081/TCP   130m

In a browser, open http://{worker_node_ip}:30081, or set up SSH tunneling and open http://localhost:8501. See below for SSH tunneling.

2. Connect to the worker node from a local PC through a tunnel (when accessing via http://localhost:8501)

ssh -i {your_pemkey.pem} -L 8501:{worker_node_ip}:30081 ubuntu@{worker_node_ip}

Code block. Tunneling from a local PC to the worker node

3. Connect to the worker node from a local PC through a tunnel via a relay server (when accessing via http://localhost:8501)

ssh -i {your_pemkey.pem} -L 8501:{worker_node_ip}:30081 ubuntu@{your_server_ip}

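Code block. Tunneling from a local PC to the worker node via a relay server

Usage examples

Main screen layout

No.	Item	Description
1	Model	The list of callable models set in the models.json file.
2	Endpoint type	The model call format set in the endpoints.json file. Select the one that matches the model.
3	Temperature	A parameter that controls the "randomness" or "creativity" of the model output. In this tutorial the range is 0.00 to 1.00. 0.0: only the most probable token is selected → accurate, consistent responses with little diversity. 0.7: moderate randomness → a balance of creativity and consistency. 1.0: high randomness → diverse, creative responses with possible variation in quality.
4	Max Tokens	A parameter that limits output length by setting the maximum number of tokens that can be generated in the response text. In this tutorial the range is 1 to 5000.
5	Input area	The way input (prompt, image, and so on) is received depends on the endpoint type. Chat, Completion, Embedding, Reasoning: plain text input. Image: text + image upload. Rerank: query + document list (in this tutorial, each line of text is treated as one document).

Table. Main screen layout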
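
As a rough illustration of what the Temperature setting does, the sketch below applies temperature scaling to a softmax over candidate-token scores. This is a general textbook sketch, not the AIOS serving implementation; the function name and the scores are made up for the example.

```python
import math


def softmax_with_temperature(logits: list, temperature: float) -> list:
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the highest score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.0, 0.7, 1.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

At 0.0 the top-scoring token gets all the probability, and as the temperature rises toward 1.0 the probabilities spread out, which is why responses become more varied.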

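Calling a Chat model

Calling an Image model

Calling a Reasoning model

Wrap-up

We hope this tutorial has shown you how to build and use a Playground UI for easily testing the various AI model APIs that AIOS provides. You can customize it flexibly to match the models and endpoint structure that your actual service requires.

Reference links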

https://docs.streamlit.io/

2 - RAG

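Goal

Using the AI models provided by AIOS, this tutorial vectorizes Git logs, PR descriptions, review comments, and similar data, and builds a RAG-based PR review assistant chatbot on top of them.

Note

RAG
RAG (Retrieval-Augmented Generation) is a natural language processing technique in which a large language model (LLM), before generating a response, retrieves relevant information from a trusted external knowledge base or database and then generates the answer based on what it retrieved. A conventional LLM relies only on its training data, so it is limited in reflecting recent information or domain-specific knowledge. RAG compensates for this limitation: for a user question, it first finds related documents or data with methods such as vector search and then uses that information to generate a more accurate, context-appropriate answer.

Environment

To follow this tutorial, the following environment must be ready.

System environment

Python 3.10+
pip

Required packages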

pip install streamlit
pip install opensearch-py

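Code block. Installing the streamlit and opensearch packages

Prerequisites

A user knowledge base or database

Note

In this tutorial, OpenSearch was set up inside a VM and used as the vector database.
You can use your own existing store or the SCP Search Engine product.

System architecture

This shows the overall flow in which GitHub PR data is collected to build a RAG-based QA system, and AIOS models are used for embedding and response generation.

RAG Flow

Collect PR data from the Git repository and generate pr_dataset.jsonl
Clean the text so that it is suitable as RAG input → rag_ready.jsonl
Generate vectors with the AIOS Embedding model and save them to rag_embedded.jsonl
Upload the vector file to OpenSearch so that it becomes searchable

RAG QA Application Flow

Embed the user's query (for example, "Analyze this PR.") to turn it into a search query
Retrieve related documents through a k-NN search in OpenSearch or by calling the AIOS Embedding model (score API)
Build a prompt from the retrieved documents and send it to the AIOS Chat model
Generate the response and output the final result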
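
The retrieval and prompt-building steps above can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions: the three-dimensional vectors are made up and stand in for the output of the AIOS Embedding model, the cosine-similarity ranking stands in for the k-NN search that OpenSearch would perform, and the function names and prompt wording are hypothetical.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list, docs: list, k: int = 2) -> list:
    """Return the k documents whose vectors are most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question: str, hits: list) -> str:
    """Place the retrieved texts in front of the question."""
    context = "\n---\n".join(d["text"] for d in hits)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# Toy documents with made-up vectors.
docs = [
    {"text": "Update ROADMAP.md", "vector": [0.9, 0.1, 0.0]},
    {"text": "Add Kubeflow 1.9 release roadmap", "vector": [0.8, 0.2, 0.1]},
    {"text": "Unrelated change", "vector": [0.0, 0.1, 0.9]},
]
hits = retrieve([1.0, 0.0, 0.0], docs, k=2)
print(build_prompt("Analyze this PR.", hits))
```

The resulting prompt string is what would be sent as the user message to the Chat model in the third step.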

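Implementation

Note

This tutorial uses the kubeflow project GitHub repository.
The vector database data was built as a one-off. For a real service you can customize it, for example with real-time synchronization.

Project structure

rag-tutorial
├── app.py                                  # main Streamlit web app file
├── generate_pr_dateset_from_branch.py      # 1. Collect GitHub PR data
├── generate_rag_data_from_pr_dataset.py    # 2. Build RAG input text (summarize and clean the text to suit RAG input)
├── embed_prs.py                            # 3. Build RAG input text (generate vectors with the AIOS Embedding model)
└── upload_rag_documnets.py                 # 4. Upload to OpenSearch

Collecting GitHub PR data

Collect PR data from the Git repository and generate pr_dataset.jsonl.

Note

Run the code below inside the git directory.
No data is collected if there is no additional PR merge history, or if PRs were merged by rebase or squash-merge so that no regular merge commit was created.
During collection, the diff of each commit is limited to 3,000 characters. When building a real system, additional chunking appropriate to the length and structure of the content is needed for efficient retrieval and response generation.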
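
As one possible starting point for the chunking mentioned above, the sketch below splits a long text into fixed-size character windows that overlap, so that content cut at a boundary still appears whole in a neighboring chunk. The function chunk_text and the default sizes are assumptions for illustration; the tutorial code itself only truncates each diff.

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list:
    """Split text into windows of at most `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", size=4, overlap=2))
# -> ['abcd', 'cdef', 'efgh', 'ghij']
```

For diffs, splitting on file or hunk boundaries would usually preserve more meaning than fixed character windows; the fixed-size version is shown only because it is the simplest to reason about.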

$ git branch
* (HEAD detached at v1.9.1)
  master

$ python3 generate_pr_dateset_from_branch.py
🔍 Searching for merged PRs...
✅ Generated pr_dataset.jsonl with 43 merged PRs.

$ head -n 1 pr_dataset.jsonl | jq
{
  "merge_sha": "167e162ef7dffc033ddc82e55b0a108db27fc340",
  "author": "Ricardo Martinelli de Oliveira",
  "date": "Tue Mar 5 11:46:36 2024 -0300",
  "title": "Merge pull request #7461 from rimolive/kf-1.9",
  "pr_id": null,
  "commits": [
    {
      "sha": "68e4d10bbf976bb89810b4e16e8b765a2a0e68b7",
      "author": "Ricardo Martinelli de Oliveira",
      "message": "Update ROADMAP.md",
      "date": "Mon Feb 19 18:51:40 2024 -0300",
      "files": [
        "ROADMAP.md"
      ],
      "diff": "commit 68e4d10bbf976bb89810b4e16e8b765a2a0e68b7\nAuthor: Ricardo Martinelli de Oliveira <rmartine@redhat.com>\nDate:   Mon Feb 19 18:51:40 2024 -0300\n\n    Update ROADMAP.md\n    \n    Co-authored-by: Tommy Li <Tommy.chaoping.li@ibm.com>\n\ndiff --git a/ROADMAP.md b/ROADMAP.md\nindex 35021954..cfd39558 100644\n--- a/ROADMAP.md\n+++ b/ROADMAP.md\n@@ -8,7 +8,7 @@ The Kubeflow Community plans to deliver its v1.9 release in Jul 2024 per this [t\n * CNCF Transition\n * LLM APIs\n * New component: Model Registry\n-* Kubeflow Pipelines and kfp-tekton merged in a single GitHub repository\n+* Kubeflow Pipelines and kfp-tekton V2 merged in a single GitHub repository\n \n ### Detailed features, bug fixes and enhancements are identified in the Working Group Roadmaps and Tracking Issues:\n * [Training Operators](https://github.com/kubeflow/training-operator/issues/1994)"
    },
    {
      "sha": "5c3404782fa2700f8547b37132ff7ab2d1ed99fe",
      "author": "Ricardo M. Oliveira",
      "message": "Add Kubeflow 1.9 release roadmap",
      "date": "Mon Feb 5 14:43:45 2024 -0300",
      "files": [
        "ROADMAP.md"
      ],
      "diff": "commit 5c3404782fa2700f8547b37132ff7ab2d1ed99fe\nAuthor: Ricardo M. Oliveira <rmartine@redhat.com>\nDate:   Mon Feb 5 14:43:45 2024 -0300\n\n    Add Kubeflow 1.9 release roadmap\n    \n    Signed-off-by: Ricardo M. Oliveira <rmartine@redhat.com>\n\ndiff --git a/ROADMAP.md b/ROADMAP.md\nindex de3c8951..35021954 100644\n--- a/ROADMAP.md\n+++ b/ROADMAP.md\n@@ -1,6 +1,26 @@\n # Kubeflow Roadmap\n \n-## Kubeflow 1.8 Release, Planned for release: Oct 2023\n+## Kubeflow 1.9 Release, Planned for release: Jul 2024\n+The Kubeflow Community plans to deliver its v1.9 release in Jul 2024 per this [timeline](https://github.com/kubeflow/community/blob/master/releases/release-1.9/README.md#timeline). The high level deliverables are tracked in the [v1.9 Release](https://github.com/orgs/kubeflow/projects/61) Github project board. The v1.9 release process will be managed by the v1.9 [release team](https://github.com/kubeflow/community/blob/master/releases/release-1.9/release-team.md) using the best practices in the [Release Handbook](https://github.com/kubeflow/community/blob/master/releases/handbook.md).\n+\n+### Themes\n+* Kubernetes 1.29 support\n+* CNCF Transition\n+* LLM APIs\n+* New component: Model Registry\n+* Kubeflow Pipelines and kfp-tekton merged in a single GitHub repository\n+\n+### Detailed features, bug fixes and enhancements are identified in the Working Group Roadmaps and Tracking Issues:\n+* [Training Operators](https://github.com/kubeflow/training-operator/issues/1994)\n+* [KServe](https://github.com/orgs/kserve/projects/12)\n+* [Katib](https://github.com/kubeflow/katib/issues/2255)\n+* [Kubeflow Pipelines](https://github.com/kubeflow/pipelines/issues/10402)\n+* [Notebooks](https://github.com/kubeflow/kubeflow/issues/7459)\n+* [Manifests](https://github.com/kubeflow/manifests/issues/2592)\n+* [Security](https://github.com/kubeflow/manifests/issues/2598)\n+* [Model Registry](https://github.com/kubeflow/model-registry/issues/3)\n+\n+## Kubeflow 1.8 
Release, Delivered: Nov 2023\n The Kubeflow Community plans to deliver its v1.8 release in Oct 2023 per this [timeline](https://github.com/kubeflow/community/tree/master/releases/release-1.8#timeline). The high level deliverables are tracked in the [v1.8 Release](https://github.com/orgs/kubeflow/projects/58/) Github project board. The v1.8 release process will be managed by the v1.8 [release team](https://github.com/kubeflow/community/blob/a956b3f6f15c49f928e37eaafec40d7f73ee1d5b/releases/release-team.md) using the best practices in the [Release Handbook](https://github.com/kubeflow/community/blob/master/releases/handbook.md).\n \n ### Themes"
    }
  ]
}

generate_pr_dateset_from_branch.py

import subprocess
import json

def run(cmd):
    return subprocess.check_output(cmd, shell=True, text=True).strip()

def extract_pr_commits(merge_sha):
    try:
        parent1 = run(f"git rev-parse {merge_sha}^1")
        parent2 = run(f"git rev-parse {merge_sha}^2")
    except subprocess.CalledProcessError:
        return []

    try:
        lines = run(f"git log {parent1}..{parent2} --pretty=format:'%H|%an|%s|%ad'").splitlines()
    except subprocess.CalledProcessError:
        return []

    commits = []
    for line in lines:
        try:
            sha, author, msg, date = line.split("|", 3)
            files = run(f"git show --pretty=format:'' --name-only {sha}").splitlines()
            diff = run(f"git show {sha}")
            commits.append({
                "sha": sha,
                "author": author,
                "message": msg,
                "date": date,
                "files": files,
                "diff": diff[:3000]  # truncate overly long diffs
            })
        except:
            continue
    return commits

def extract_pr_id(title):
    if "# " in title:
        try:
            return title.split("#")[1].split()[0]
        except:
            return None
    return None

output = []

print("🔍 Searching for merged PRs...")
log_lines = run("git log --merges --pretty=format:'%H|%an|%ad|%s'").splitlines()

for line in log_lines:
    try:
        merge_sha, author, date, title = line.split("|", 3)
    except ValueError:
        continue

    commits = extract_pr_commits(merge_sha)
    if not commits:
        continue

    pr_doc = {
        "merge_sha": merge_sha,
        "author": author,
        "date": date,
        "title": title,
        "pr_id": extract_pr_id(title),
        "commits": commits
    }

    output.append(pr_doc)

with open("pr_dataset.jsonl", "w") as f:
    for item in output:
        f.write(json.dumps(item, ensure_ascii=False) + "\n")

print(f"✅ Generated pr_dataset.jsonl with {len(output)} merged PRs.")

Code block. generate_pr_dateset_from_branch.py
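The extract_pr_id function in the script parses a title only when it contains "# " (a hash followed by a space), so GitHub-style merge titles such as "Merge pull request #7461 from ..." end up with a null pr_id. If a numeric PR ID is needed, one possible variant is sketched below. The helper name extract_pr_id_regex and the pattern are assumptions for illustration and are not part of the original script.

```python
import re

# Hypothetical helper (not in the original script): find the first
# "#<digits>" token in a merge-commit title.
_PR_ID_PATTERN = re.compile(r"#(\d+)")


def extract_pr_id_regex(title):
    """Return the PR number as a string, or None when the title has none."""
    match = _PR_ID_PATTERN.search(title)
    return match.group(1) if match else None


print(extract_pr_id_regex("Merge pull request #7461 from rimolive/kf-1.9"))  # 7461
```

Swapping this in would change the pr_id values written to pr_dataset.jsonl, so the sample output shown in this tutorial would differ accordingly.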

Build Text for RAG Input

Summarize and clean the PR data into text suitable for RAG input, then generate vectors with the AIOS Embedding model.

$ python3 generate_rag_data_from_pr_dataset.py
✅ RAG용 텍스트 생성 완료 → rag_ready.jsonl
$ head -n 1 rag_ready.jsonl | jq
{
  "pr_id": null,
  "title": "Merge pull request #7461 from rimolive/kf-1.9",
  "text": "PR 제목: Merge pull request #7461 from rimolive/kf-1.9\n병합자: Ricardo Martinelli de Oliveira / 날짜: Tue Mar 5 11:46:36 2024 -0300\n커밋 요약:\n- Ricardo Martinelli de Oliveira (Mon Feb 19 18:51:40 2024 -0300): Update ROADMAP.md\n  변경 파일: ROADMAP.md\n  변경사항:\ncommit 68e4d10bbf976bb89810b4e16e8b765a2a0e68b7\nAuthor: Ricardo Martinelli de Oliveira <rmartine@redhat.com>\nDate:   Mon Feb 19 18:51:40 2024 -0300\n\n    Update ROADMAP.md\n    \n    Co-authored-by: Tommy Li <Tommy.chaoping.li@ibm.com>\n\ndiff --git a/ROADMAP.md b/ROADMAP.md\nindex 35021954..cfd39558 100644\n--- a/ROADMAP.md\n+++ b/ROADMAP.md\n@@ -8,7 +8,7 @@ The Kubeflow Community plans to deliver its v1.9 release in Jul 2024 per this [t\n * CNCF Transition\n * LLM APIs\n * New component: Model Registry\n-* Kubeflow Pipelines and kfp-tekton merged in a single GitHub repository\n+* Kubeflow Pipelines and kfp-tekton V2 merged in a single GitHub repository\n \n ### Detailed features, bug fixes and enhancements are identified in the Working Group Roadmaps and Tracking Issues:\n * [Training Operators](https://github.com/kubeflow/training-operator/issues/1994)\n- Ricardo M. Oliveira (Mon Feb 5 14:43:45 2024 -0300): Add Kubeflow 1.9 release roadmap\n  변경 파일: ROADMAP.md\n  변경사항:\ncommit 5c3404782fa2700f8547b37132ff7ab2d1ed99fe\nAuthor: Ricardo M. Oliveira <rmartine@redhat.com>\nDate:   Mon Feb 5 14:43:45 2024 -0300\n\n    Add Kubeflow 1.9 release roadmap\n    \n    Signed-off-by: Ricardo M. Oliveira <rmartine@redhat.com>\n\ndiff --git a/ROADMAP.md b/ROADMAP.md\nindex de3c8951..35021954 100644\n--- a/ROADMAP.md\n+++ b/ROADMAP.md\n@@ -1,6 +1,26 @@\n # Kubeflow Roadmap\n \n-## Kubeflow 1.8 Release, Planned for release: Oct 2023\n+## Kubeflow 1.9 Release, Planned for release: Jul 2024\n+The Kubeflow Community plans to deliver its v1.9 release in Jul 2024 per this [timeline](https://github.com/kubeflow/community/blob/master/releases/release-1.9/README.md#timeline). 
The high level deliverables are tracked in the [v1.9 Release](https://github.com/orgs/kubeflow/projects/61) Github project board. The v1.9 release process will be managed by the v1.9 [release team](https://github.com/kubeflow/community/blob/master/releases/release-1.9/release-team.md) using the best practices in the [Rele"
}

$ python3 embed_prs.py
✅ Line 1: embedded
✅ Line 2: embedded
✅ Line 3: embedded
✅ Line 4: embedded
✅ Line 5: embedded
✅ Line 6: embedded
✅ Line 7: embedded
✅ Line 8: embedded
✅ Line 9: embedded
✅ Line 10: embedded
... (omitted) ...

generate_rag_data_from_pr_dataset.py

import json

def build_text(pr):
    lines = []
    lines.append(f"PR 제목: {pr['title']}")
    lines.append(f"병합자: {pr['author']} / 날짜: {pr['date']}")
    lines.append("커밋 요약:")
    for c in pr["commits"]:
        lines.append(f"- {c['author']} ({c['date']}): {c['message']}")
        if c["files"]:
            lines.append(f"  변경 파일: {', '.join(c['files'])}")
        lines.append("  변경사항:")
        lines.append(c["diff"][:1000])  # truncate long diffs to 1,000 characters
    return "\n".join(lines)

with open("pr_dataset.jsonl", encoding="utf-8") as fin, \
     open("rag_ready.jsonl", "w", encoding="utf-8") as fout:
    for line in fin:
        pr = json.loads(line)
        text = build_text(pr)
        out = {
            "pr_id": pr.get("pr_id"),
            "title": pr.get("title"),
            "text": text
        }
        fout.write(json.dumps(out, ensure_ascii=False) + "\n")

print("✅ RAG용 텍스트 생성 완료 → rag_ready.jsonl")

Code block. generate_rag_data_from_pr_dataset.py
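The script cuts each diff at a fixed character count, which can split a diff line in the middle. A minimal sketch of an alternative that cuts at the last complete line is shown below. The function name truncate_at_line_boundary is hypothetical and the approach is only one option, not what the tutorial's script does.

```python
def truncate_at_line_boundary(text, max_chars):
    """Cut text to at most max_chars characters without splitting a line."""
    if len(text) <= max_chars:
        return text
    # Look for the last newline that still leaves the result within the limit.
    cut = text.rfind("\n", 0, max_chars + 1)
    if cut == -1:
        # No newline within the limit: fall back to a hard cut.
        return text[:max_chars]
    return text[:cut]


sample = "abc\ndef\nghi"
print(repr(truncate_at_line_boundary(sample, 8)))  # 'abc\ndef'
```

In build_text, this could stand in for the c["diff"][:1000] slice if whole diff lines matter more than a fixed length.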

embed_prs.py

Note

For the values of AIOS_LLM_Private_Endpoint (EMBEDDING_API_URL in the code) and MODEL_ID (model in the code), refer to the LLM usage guide. You can enter them as in the following examples.
- EMBEDDING_API_URL = "{AIOS LLM private endpoint}/{API}"
- "model": "{model ID}"

import json
import requests
import time

EMBEDDING_API_URL = "AIOS_LLM_Private_Endpoint"
HEADERS = {"Content-Type": "application/json"}

def get_embedding(text):
    payload = {
        "model": "MODEL_ID",
        "input": text,
        "stream": False
    }

    try:
        response = requests.post(EMBEDDING_API_URL, headers=HEADERS, json=payload)
        if response.status_code == 200:
            result = response.json()
            return result["data"][0]["embedding"]
        else:
            print(f"❌ Failed with status {response.status_code}: {response.text}")
            return None
    except Exception as e:
        print(f"⚠️ Error calling embedding API: {e}")
        return None

def main():
    with open("rag_ready.jsonl", "r", encoding="utf-8") as fin, \
         open("rag_embedded.jsonl", "w", encoding="utf-8") as fout:

        for i, line in enumerate(fin, start=1):
            try:
                item = json.loads(line)
                text = item.get("text", "").strip()
                if not text:
                    print(f"⚠️ Line {i}: empty text, skipping")
                    continue

                embedding = get_embedding(text)
                if embedding is None:
                    print(f"⚠️ Line {i}: embedding failed, skipping")
                    continue

                item["embedding"] = embedding
                fout.write(json.dumps(item, ensure_ascii=False) + "\n")
                print(f"✅ Line {i}: embedded")

                time.sleep(0.2)  # optional: rate limiting
            except Exception as e:
                print(f"❌ Line {i}: error - {e}")
                continue

if __name__ == "__main__":
    main()

Code block. embed_prs.py
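In embed_prs.py, get_embedding returns None on any failure and the line is then skipped. If transient failures are expected, a retry wrapper is one option. The sketch below is an assumption for illustration: call_with_retry, its parameters, and the doubling delay are not part of the original script.

```python
import time


def call_with_retry(func, *args, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call func(*args); when it returns None, wait and retry with a doubled delay."""
    delay = base_delay
    for attempt in range(1, retries + 1):
        result = func(*args)
        if result is not None:
            return result
        if attempt < retries:
            sleep(delay)  # back off before the next attempt
            delay *= 2
    return None  # every attempt failed
```

With this helper, the call in main could become embedding = call_with_retry(get_embedding, text), leaving the rest of the loop unchanged.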

Upload to OpenSearch

Upload the vector file to OpenSearch to make it searchable.

Note

This tutorial sets up OpenSearch inside the VM and calls it at http://localhost:9200. If you use your own vector database, change the URL accordingly.

# Create an index named "kubeflow-pr-rag-index" in OpenSearch
$ curl -X PUT "http://localhost:9200/kubeflow-pr-rag-index" \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "index": {
        "knn": true
      }
    },
    "mappings": {
      "properties": {
        "title": { "type": "text" },
        "text":  { "type": "text" },
        "embedding": {
          "type": "knn_vector",
          "dimension": 1024,
          "method": {
            "name": "hnsw",
            "space_type": "cosinesimil",
            "engine": "nmslib"
          }
        }
      }
    }
  }'
{"acknowledged":true,"shards_acknowledged":true,"index":"kubeflow-pr-rag-index"}

$ python3 upload_rag_documnets.py
✅ Uploaded document pr-1
✅ Uploaded document pr-2
✅ Uploaded document pr-3
✅ Uploaded document pr-4
✅ Uploaded document pr-5
✅ Uploaded document pr-6
✅ Uploaded document pr-7
✅ Uploaded document pr-8
✅ Uploaded document pr-9
✅ Uploaded document pr-10
... (omitted) ...
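The index mapping above sets space_type to cosinesimil, so nearest neighbors are chosen by cosine similarity between embedding vectors. As a rough illustration of that measure only, here is a plain Python sketch. How OpenSearch converts it into a relevance score is not covered here, and the function name is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal vectors)
```

Because the measure depends only on direction, vectors that differ only in length score as identical, which is why the dimension check (1024 in this tutorial) matters more than vector magnitude.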

upload_rag_documnets.py

import json
from opensearchpy import OpenSearch

# OpenSearch connection settings
client = OpenSearch(
    hosts=[{"host": "localhost", "port": 9200}],
    use_ssl=False,
    verify_certs=False
)

index_name = "kubeflow-pr-rag-index"

with open("rag_embedded.jsonl", "r", encoding="utf-8") as f:
    for i, line in enumerate(f, 1):
        try:
            doc = json.loads(line)
            title = doc.get("title", "")
            text = doc.get("text", "")
            embedding = doc.get("embedding", [])

            if not embedding or len(embedding) != 1024:
                print(f"⚠️  Line {i}: Invalid embedding length, skipping.")
                continue

            body = {
                "title": title,
                "text": text,
                "embedding": embedding
            }

            doc_id = f"pr-{i}"
            client.index(index=index_name, id=doc_id, body=body)
            print(f"✅ Uploaded document {doc_id}")
        except Exception as e:
            print(f"❌ Line {i}: Failed to upload due to {e}")

Code block. upload_rag_documnets.py
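The script above sends one index request per document. For larger datasets, a bulk upload may be worth considering. The sketch below only builds the action dictionaries in the _index / _id / _source shape; build_bulk_actions is a hypothetical name, and the helpers.bulk call in the trailing comment is an assumed usage of opensearch-py that is not executed here.

```python
import json


def build_bulk_actions(lines, index_name, dim=1024):
    """Yield one bulk action per JSONL line; skip lines with a bad embedding."""
    for i, line in enumerate(lines, 1):
        doc = json.loads(line)
        embedding = doc.get("embedding", [])
        if len(embedding) != dim:
            continue  # same validity rule as the per-document script
        yield {
            "_index": index_name,
            "_id": f"pr-{i}",
            "_source": {
                "title": doc.get("title", ""),
                "text": doc.get("text", ""),
                "embedding": embedding,
            },
        }


# Assumed usage with opensearch-py (not executed in this sketch):
#   from opensearchpy import helpers
#   with open("rag_embedded.jsonl", encoding="utf-8") as f:
#       helpers.bulk(client, build_bulk_actions(f, "kubeflow-pr-rag-index"))
```

Document IDs follow the same pr-{line number} scheme as the original script, so re-running either variant overwrites the same documents.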

Verify in OpenSearch Dashboards

As shown in the figure below, you can view the data for kubeflow-pr-rag-index in OpenSearch Dashboards. Each document consists of title, text, and embedding.

Note

Registering an index pattern in OpenSearch Dashboards
Left menu → Dashboards Management → Index patterns → click Create index pattern

Build the RAG QA Application

The application embeds the user's question to turn it into a search query, retrieves the related documents, and returns the final answer through the AIOS Chat model.

Note

This code supports two similarity search methods: OpenSearch k-NN (k-Nearest Neighbors) search, and calling the Score API of the AIOS Embedding model to compute which documents are most similar to the input. You can choose either method; this tutorial uses the AIOS Score API approach.
- OpenSearch k-NN search: docs = search_similar_docs(query_vec, K)
- AIOS Embedding model (Score API): docs = search_similar_docs_with_score(question, K)
For EMBEDDING_API_URL, LLM_API_URL, SCORE_API_URL, MODEL_EMBEDDING, and MODEL_CHAT in the code, refer to the LLM usage guide and enter the API and model you want to use. You can enter them as in the following examples.
- EMBEDDING_API_URL = "{AIOS LLM private endpoint}/{API}"
- MODEL_EMBEDDING = "{model ID}"
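Both search methods end by keeping the K best-scoring documents. The ranking step on its own can be sketched as follows, assuming a list of documents and a parallel list of scores. The helper name top_k_by_score is hypothetical, and heapq.nlargest is used here as an alternative to sorting the full list.

```python
import heapq


def top_k_by_score(docs, scores, k):
    """Return the k documents with the highest scores, best first."""
    if len(docs) != len(scores):
        raise ValueError("docs and scores must have the same length")
    # Rank (score, index) pairs so the documents themselves are never compared.
    ranked = heapq.nlargest(k, zip(scores, range(len(docs))))
    return [docs[i] for _, i in ranked]


docs = [{"title": "a"}, {"title": "b"}, {"title": "c"}]
print(top_k_by_score(docs, [0.2, 0.9, 0.5], 2))  # [{'title': 'b'}, {'title': 'c'}]
```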

app.py

import streamlit as st
import requests
from opensearchpy import OpenSearch

# Settings
def get_opensearch_client():
    return OpenSearch(
        hosts=[{"host": "localhost", "port": 9200}],
        use_ssl=False,
        verify_certs=False
    )

EMBEDDING_API_URL = "YOUR_EMBEDDING_API_URL"
LLM_API_URL = "YOUR_LLM_API_URL"
SCORE_API_URL = "YOUR_SCORE_API_URL"
MODEL_EMBEDDING = "YOUR_MODEL_EMBEDDING"
MODEL_CHAT = "YOUR_MODEL_CHAT"
INDEX_NAME = "kubeflow-pr-rag-index"
VECTOR_DIM = 1024
K = 3

# Embedding generation function
def embed_text(text):
    res = requests.post(
        EMBEDDING_API_URL,
        headers={"Content-Type": "application/json"},
        json={"model": MODEL_EMBEDDING, "input": text, "stream": False}
    )
    res.raise_for_status()  # surface HTTP errors before reading the response body
    return res.json()["data"][0]["embedding"]

# Fetch all documents (OpenSearch)
def fetch_all_docs():
    client = get_opensearch_client()
    res = client.search(
        index=INDEX_NAME,
        body={
            "size": 1000,  # set as needed (if this is too small, the scroll API can be used)
            "query": {"match_all": {}}
        }
    )
    return [doc["_source"] for doc in res["hits"]["hits"]]

# Compute similarity scores for two lists of sentences
def score_text_pairs(text_1, text_2):
    payload = {
        "model": MODEL_EMBEDDING,
        "encoding_format": "float",
        "text_1": text_1,
        "text_2": text_2
    }
    headers = {
        "accept": "application/json",
        "Content-Type": "application/json"
    }

    response = requests.post(SCORE_API_URL, headers=headers, json=payload)
    response.raise_for_status()

    # Extract only the similarity scores
    scores = [item["score"] for item in response.json()["data"]]
    return scores

# Select similar documents (score-based Top-K)
def search_similar_docs_with_score(query, k):
    all_docs = fetch_all_docs()
    doc_texts = [doc["text"] for doc in all_docs]
    queries = [query] * len(doc_texts)
    scores = score_text_pairs(queries, doc_texts)

    # Sort by score, highest first
    scored_docs = sorted(zip(all_docs, scores), key=lambda x: x[1], reverse=True)
    top_docs = [doc for doc, score in scored_docs[:k]]
    return top_docs

# k-NN search function
def search_similar_docs(query_vector, k):
    client = get_opensearch_client()
    res = client.search(
        index=INDEX_NAME,
        body={
            "size": k,
            "query": {
                "knn": {
                    "embedding": {
                        "vector": query_vector,
                        "k": k
                    }
                }
            }
        }
    )
    return [doc["_source"] for doc in res["hits"]["hits"]]

# Build the prompt
def build_prompt(docs, question):
    context_blocks = []
    for i, doc in enumerate(docs):
        context_blocks.append(f"[문서 {i+1}]\n{doc['text']}")
    context = "\n\n".join(context_blocks)
    return f"""다음은 Kubeflow 프로젝트에서 유사한 PR 문서들입니다:

{context}

사용자 질문: {question}

위 내용을 참고하여 질문에 대해 자연어로 답변해 주세요. 가능한 문서 번호를 인용해서 설명해주세요."""

# LLM call function
def call_llm(prompt):
    res = requests.post(
        LLM_API_URL,
        headers={"Content-Type": "application/json"},
        json={
            "model": MODEL_CHAT,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False
        }
    )
    res.raise_for_status()  # surface HTTP errors before reading the response body
    return res.json()["choices"][0]["message"]["content"]

# Start of the Streamlit UI
st.set_page_config(page_title="RAG QA", layout="wide")
st.title("📘 RAG-based PR Summary Chatbot")

question = st.text_input("Enter your question:", "Please summarize the PR the Add Kubeflow 1.9 release roadmap.")

if st.button("Searching and generating response"):
    with st.spinner("Generating embeddings..."):
        query_vec = embed_text(question)

    with st.spinner("Searching for similar documents in OpenSearch..."):
        #docs = search_similar_docs(query_vec, K)
        docs = search_similar_docs_with_score(question, K)

    with st.spinner("Constructing prompt and invoking LLM..."):
        prompt = build_prompt(docs, question)
        answer = call_llm(prompt)

    st.markdown("### 🤖 LLM response")
    st.write(answer)

    st.markdown("---")
    st.markdown("### 🔍 Highlighted PR document")
    for i, doc in enumerate(docs):
        with st.expander(f"문서 {i+1}: {doc['title']}"):
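            # Simple highlight of the first word of the question (skipped when the question is empty)
            words = question.split()
            highlighted = doc['text'].replace(words[0], f"**{words[0]}**") if words else doc['text']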
            st.markdown(highlighted)

Code block. app.py
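In app.py, search_similar_docs_with_score sends every fetched document to the Score API in a single request. If the request size becomes a concern, scoring in batches is one option. The sketch below is an assumption for illustration: batched, score_in_batches, and the batch size of 32 are hypothetical, and score_fn stands for a function with the same signature as score_text_pairs.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def score_in_batches(query, doc_texts, score_fn, batch_size=32):
    """Score the query against doc_texts in batches; the result keeps the input order."""
    scores = []
    for chunk in batched(doc_texts, batch_size):
        # One call per batch, pairing the query with each text in the chunk.
        scores.extend(score_fn([query] * len(chunk), chunk))
    return scores
```

Inside search_similar_docs_with_score, the single call could then be written as scores = score_in_batches(query, doc_texts, score_text_pairs).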

How to Use the RAG QA Chatbot UI

Run the Application

1. Run Streamlit on the VM

streamlit run app.py --server.port 8501 --server.address 0.0.0.0

Code block. Running Streamlit

You can now view your Streamlit app in your browser.
 
URL: http://0.0.0.0:8501

In a browser, open http://{your_server_ip}:8501, or set up SSH tunneling to the server and open http://localhost:8501. See below for SSH tunneling.

2. Connect to the VM through a tunnel from your local PC (when accessing via http://localhost:8501)

ssh -i {your_pemkey.pem} -L 8501:localhost:8501 ubuntu@{your_server_ip}

Code block. Tunneling from a local PC
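Before opening the browser, it can help to confirm that the forwarded port actually accepts connections on the local PC. A minimal check using only the standard library is sketched below. The function name is_port_open and the one-second timeout are assumptions for illustration.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("tunnel reachable:", is_port_open("localhost", 8501))
```

A True result only shows that something is listening on the port; it does not confirm that the listener is the Streamlit app.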

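Usage Example

Ask for a summary of the Add Kubeflow 1.9 release roadmap PR in the Kubeflow project's Git repository.

The following is the information for that PR in the Kubeflow project.

Conclusion

In this tutorial, we used the AI models provided by AIOS to vectorize Git PR data and combined OpenSearch-based vector search with LLM responses to build a PR review assistant chatbot. This enables question answering grounded in past PR history, which can improve the efficiency and quality of developers' code reviews. The system can be extended and customized for your environment in the following ways.

Replace the vector database: Besides OpenSearch, you can use the SCP Search Engine product or connect your own vector database.
Integrate real-time data collection: By integrating GitHub webhooks or the GitLab API, you can collect PR creation and update events in real time and index them automatically.
Enhance the conversational UI: Beyond Streamlit, you can extend to other interfaces such as a Slack bot or an in-house messenger.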
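As a starting point for the real-time collection idea, the sketch below maps a GitHub pull_request webhook payload to the PR document shape used in pr_dataset.jsonl. It is an assumption-laden illustration: pr_event_to_doc is a hypothetical helper, only merged PRs are kept, and commit details are left empty because they would need a separate lookup.

```python
def pr_event_to_doc(payload):
    """Convert a merged-PR webhook payload into a PR document, or return None."""
    pr = payload.get("pull_request") or {}
    # A merged PR arrives as a "closed" action with merged set to true.
    if payload.get("action") != "closed" or not pr.get("merged"):
        return None
    return {
        "merge_sha": pr.get("merge_commit_sha"),
        "author": (pr.get("user") or {}).get("login"),
        "date": pr.get("merged_at"),
        "title": pr.get("title"),
        "pr_id": str(pr["number"]) if "number" in pr else None,
        "commits": [],  # not part of this sketch; would require fetching commits separately
    }
```

A document produced this way could then go through the same text-building, embedding, and upload steps as the batch pipeline.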

Building on this tutorial, try building an AIOS-based collaboration assistant suited to your actual service needs.

Reference Links

https://opensearch.org/
https://github.com/kubeflow/kubeflow

3 - Autogen

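Goal

Create an Autogen AI agent application using the AI models provided by AIOS.

Note

Autogen
Autogen is an open-source framework that makes it easy to build and manage LLM-based multi-agent collaboration and event-driven automation workflows.

Environment

To follow this tutorial, the following environment must be prepared.

System Requirements

Python 3.10+
pip

Required Packages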

pip install autogen-agentchat==0.6.1 "autogen-ext[openai,mcp]==0.6.1" mcp-server-time==0.6.2

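Code block. Installing the autogen and MCP server packages

System Architecture

The following shows the overall flow of the multi-agent architecture and of the agent architecture that uses MCP.

Travel Planning Agent Flow

The user asks for a 3-day travel plan for Nepal.
The group chat manager coordinates the execution order of the registered agents (travel planning, local information, travel language tips, and overall summary).
Each agent carries out its part of the task according to its role, collaborating with the others.
Once the final travel plan is produced, it is delivered to the user.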
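The turn-taking in this flow can be illustrated without any framework. The sketch below is plain Python, not Autogen code: run_round_robin, the (name, respond) agent pairs, and the max_turns limit are all assumptions made for illustration.

```python
from itertools import cycle


def run_round_robin(agents, task, stop_word="TERMINATE", max_turns=20):
    """Let agents reply in registration order until one mentions stop_word.

    agents is a list of (name, respond) pairs; respond receives the message
    history and returns that agent's reply as a string.
    """
    history = [("user", task)]
    for turn, (name, respond) in enumerate(cycle(agents), start=1):
        reply = respond(history)
        history.append((name, reply))
        if stop_word in reply or turn >= max_turns:
            break
    return history


agents = [
    ("planner_agent", lambda h: "Draft itinerary"),
    ("local_agent", lambda h: "Local activities"),
    ("language_agent", lambda h: "Language tips"),
    ("travel_summary_agent", lambda h: "Final plan. TERMINATE"),
]
for speaker, message in run_round_robin(agents, "Plan a 3 day trip to Nepal."):
    print(f"{speaker}: {message}")
```

Each agent sees the full history, which is what lets the summary agent integrate the earlier suggestions before ending the conversation.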

MCP Flow

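Note

MCP
MCP (Model Context Protocol) is an open standard protocol that coordinates interactions between models and external data or tools.

An MCP server is a server that implements this protocol; it uses tool metadata to relay and execute function calls.

The user asks for the current time in Korea.
The request to the model includes the metadata of the tool that can fetch the current time, obtained from the mcp_server_time server.
The model generates a tool calls message that calls the get_current_time function.
The get_current_time function is executed through the MCP server and its result is passed back to the model in a follow-up request; the model then generates the final response, which is delivered to the user.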
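The step in which a tool call is executed and its result is returned can be sketched in plain Python. This is a simplified illustration, not the MCP protocol or the mcp_server_time implementation: the fixed UTC offset table, the timezone_name parameter, and execute_tool_call are all hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets in hours, used so the sketch needs no tz database.
UTC_OFFSETS = {"Asia/Seoul": 9, "UTC": 0}


def get_current_time(timezone_name, now=None):
    """Return the current time in the given zone as a small dict."""
    offset = UTC_OFFSETS[timezone_name]
    now = now or datetime.now(timezone.utc)
    local = now.astimezone(timezone(timedelta(hours=offset)))
    return {"timezone": timezone_name, "datetime": local.isoformat(timespec="seconds")}


# Registry that plays the role of the tool server in this sketch.
TOOLS = {"get_current_time": get_current_time}


def execute_tool_call(tool_call):
    """Run a tool call of the form {"name": ..., "arguments": "<JSON string>"}."""
    func = TOOLS[tool_call["name"]]
    kwargs = json.loads(tool_call["arguments"])
    # The JSON result is what would be sent back to the model as the tool output.
    return json.dumps(func(**kwargs))


call = {"name": "get_current_time", "arguments": '{"timezone_name": "Asia/Seoul"}'}
print(execute_tool_call(call))
```

The model never runs the function itself; it only names the tool and supplies arguments, and the surrounding application performs the call.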

Implementation

Travel Planning Agent

Note

For the values of AIOS_LLM_Private_Endpoint (AIOS_BASE_URL in the code) and MODEL_ID (MODEL in the code), refer to the LLM usage guide.

autogen_travel_planning.py

from urllib.parse import urljoin

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_core.models import ModelFamily


# Set the API URL and model name used to access the model.
AIOS_BASE_URL = "AIOS_LLM_Private_Endpoint"
MODEL = "MODEL_ID"

# Create the model client with OpenAIChatCompletionClient.
model_client = OpenAIChatCompletionClient(
    model=MODEL,
    base_url=urljoin(AIOS_BASE_URL, "v1"),
    api_key="EMPTY_KEY",
    model_info={
        # Set to True if the model supports image input.
        "vision": False,
        # Set to True if the model supports function calling.
        "function_calling": True,
        # Set to True if the model supports JSON output.
        "json_output": True,
        # If ModelFamily does not list the model you want to use, use UNKNOWN.
        # "family": ModelFamily.UNKNOWN,
        "family": ModelFamily.LLAMA_3_3_70B,
        # Set to True if the model supports structured output.
        "structured_output": True,
    },
)

# Create several agents.
# The agents handle trip planning, local activity recommendations, language tips, and summarizing the travel plan.
planner_agent = AssistantAgent(
    "planner_agent",
    model_client=model_client,
    description="A helpful assistant that can plan trips.",
    system_message=("You are a helpful assistant that can suggest a travel plan "
                    "for a user based on their request."),
)

local_agent = AssistantAgent(
    "local_agent",
    model_client=model_client,
    description="A local assistant that can suggest local activities or places to visit.",
    system_message=("You are a helpful assistant that can suggest authentic and "
                    "interesting local activities or places to visit for a user "
                    "and can utilize any context information provided."),
)

language_agent = AssistantAgent(
    "language_agent",
    model_client=model_client,
    description="A helpful assistant that can provide language tips for a given destination.",
    system_message=("You are a helpful assistant that can review travel plans, "
                    "providing feedback on important/critical tips about how best to address "
                    "language or communication challenges for the given destination. "
                    "If the plan already includes language tips, "
                    "you can mention that the plan is satisfactory, with rationale."),
)

travel_summary_agent = AssistantAgent(
    "travel_summary_agent",
    model_client=model_client,
    description="A helpful assistant that can summarize the travel plan.",
    system_message=("You are a helpful assistant that can take in all of the suggestions "
                    "and advice from the other agents and provide a detailed final travel plan. "
                    "You must ensure that the final plan is integrated and complete. "
                    "YOUR FINAL RESPONSE MUST BE THE COMPLETE PLAN. "
                    "When the plan is complete and all perspectives are integrated, "
                    "you can respond with TERMINATE."),
)

# Group the agents into a RoundRobinGroupChat.
# RoundRobinGroupChat has the agents take turns in the order they were registered.
# Within the group, the agents interact to put together the travel plan.
# TextMentionTermination ends the group chat when the text "TERMINATE" is mentioned.
termination = TextMentionTermination("TERMINATE")
group_chat = RoundRobinGroupChat(
    [planner_agent, local_agent, language_agent, travel_summary_agent],
    termination_condition=termination,
)

async def main():
    """Run the group chat and produce the travel plan."""
    # Run the group chat with the task "Plan a 3 day trip to Nepal."
    # Console prints the streamed result.
    await Console(group_chat.run_stream(task="Plan a 3 day trip to Nepal."))
    await model_client.close()


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

from urllib.parse import urljoin

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_core.models import ModelFamily


# 모델 접근을 위한 API URL과 모델 이름을 설정합니다.
AIOS_BASE_URL = "AIOS_LLM_Private_Endpoint"
MODEL = "MODEL_ID"

# OpenAIChatCompletionClient를 사용하여 모델 클라이언트를 생성합니다.
model_client = OpenAIChatCompletionClient(
    model=MODEL,
    base_url=urljoin(AIOS_BASE_URL, "v1"),
    api_key="EMPTY_KEY",
    model_info={
        # 이미지를 지원하는 경우 True로 설정합니다.
        "vision": False,
        # 함수 호출을 지원하는 경우 True로 설정합니다.
        "function_calling": True,
        # JSON 출력을 지원하는 경우 True로 설정합니다.
        "json_output": True,
        # 사용하고자 하는 모델이 ModelFamily에서 제공하지 않는 경우 UNKNOWN을 사용합니다.
        # "family": ModelFamily.UNKNOWN,
        "family": ModelFamily.LLAMA_3_3_70B,
        # 구조화된 출력을 지원하는 경우 True로 설정합니다.
        "structured_output": True,
    },
)

# 여러 에이전트를 생성합니다.
# 각 에이전트는 여행 계획, 지역 활동 추천, 언어 팁 제공, 여행 계획 요약 등의 역할을 수행합니다.
planner_agent = AssistantAgent(
    "planner_agent",
    model_client=model_client,
    description="A helpful assistant that can plan trips.",
    system_message=("You are a helpful assistant that can suggest a travel plan "
                    "for a user based on their request."),
)

local_agent = AssistantAgent(
    "local_agent",
    model_client=model_client,
    description="A local assistant that can suggest local activities or places to visit.",
    system_message=("You are a helpful assistant that can suggest authentic and "
                    "interesting local activities or places to visit for a user "
                    "and can utilize any context information provided."),
)

language_agent = AssistantAgent(
    "language_agent",
    model_client=model_client,
    description="A helpful assistant that can provide language tips for a given destination.",
    system_message=("You are a helpful assistant that can review travel plans, "
                    "providing feedback on important/critical tips about how best to address "
                    "language or communication challenges for the given destination. "
                    "If the plan already includes language tips, "
                    "you can mention that the plan is satisfactory, with rationale."),
)

travel_summary_agent = AssistantAgent(
    "travel_summary_agent",
    model_client=model_client,
    description="A helpful assistant that can summarize the travel plan.",
    system_message=("You are a helpful assistant that can take in all of the suggestions "
                    "and advice from the other agents and provide a detailed final travel plan. "
                    "You must ensure that the final plan is integrated and complete. "
                    "YOUR FINAL RESPONSE MUST BE THE COMPLETE PLAN. "
                    "When the plan is complete and all perspectives are integrated, "
                    "you can respond with TERMINATE."),
)

# Group the agents into a RoundRobinGroupChat.
# RoundRobinGroupChat lets the agents take turns in the order in which they were registered.
# Within this group, the agents interact with one another to build the travel plan.
# TextMentionTermination is the termination condition: the group chat ends when the text "TERMINATE" is mentioned.
termination = TextMentionTermination("TERMINATE")
group_chat = RoundRobinGroupChat(
    [planner_agent, local_agent, language_agent, travel_summary_agent],
    termination_condition=termination,
)

async def main():
    """Main function: runs the group chat to build a travel plan."""
    # Run the group chat to build the travel plan.
    # The user requests the task "Plan a 3 day trip to Nepal."
    # Console prints the results.
    await Console(group_chat.run_stream(task="Plan a 3 day trip to Nepal."))
    await model_client.close()


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

Code block. autogen_travel_planning.py
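
To make the turn-taking concrete, the following is a minimal sketch in plain Python of what a round-robin group with a text-mention stop condition does conceptually. It does not use autogen and is not the library's implementation: `run_round_robin`, the lambda-based agents, and the `max_turns` safeguard are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for an agent: a name plus a function that maps
# the conversation so far to the agent's next message.
Agent = Tuple[str, Callable[[List[str]], str]]


def run_round_robin(agents: List[Agent], task: str,
                    stop_word: str = "TERMINATE",
                    max_turns: int = 20) -> List[Tuple[str, str]]:
    """Call agents in registration order until one mentions stop_word."""
    transcript: List[Tuple[str, str]] = [("user", task)]
    for turn in range(max_turns):
        # The modulo makes the turn order wrap around to the first agent.
        name, reply_fn = agents[turn % len(agents)]
        message = reply_fn([text for _, text in transcript])
        transcript.append((name, message))
        # Stop as soon as the stop word appears anywhere in the message.
        if stop_word in message:
            break
    return transcript


# Stub agents that return fixed text instead of calling a model.
agents: List[Agent] = [
    ("planner_agent", lambda history: "Draft plan"),
    ("local_agent", lambda history: "Local tips"),
    ("language_agent", lambda history: "Language tips"),
    ("travel_summary_agent", lambda history: "Final plan. TERMINATE"),
]

transcript = run_round_robin(agents, "Plan a 3 day trip to Nepal.")
for speaker, text in transcript:
    print(f"{speaker}: {text}")
```

Code block. Round-robin turn-taking sketch (illustrative)

Because the stop word is matched as a substring, the chat ends on the first message that contains it, regardless of where in the message it appears.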

Running the file with python shows several agents working together on a single task, each performing its own role.

python autogen_travel_planning.py

Code block. Running the autogen travel planning agents

Execution result

---------- TextMessage (user) ----------
Plan a 3 day trip to Nepal.
---------- TextMessage (planner_agent) ----------
Nepal! A country with a rich cultural heritage, breathtaking natural beauty, and warm hospitality. Here's a suggested 3-day itinerary for your trip to Nepal:

**Day 1: Arrival in Kathmandu and Exploration of the City**

* Arrive at Tribhuvan International Airport in Kathmandu, the capital city of Nepal.
* Check-in to your hotel and freshen up.
* Visit the famous **Boudhanath Stupa**, one of the largest Buddhist stupas in the world.
* Explore the **Thamel** area, a popular tourist hub known for its narrow streets, shops, and restaurants.
* In the evening, enjoy a traditional Nepali dinner and watch a cultural performance at a local restaurant.

**Day 2: Kathmandu Valley Tour**

* Start the day with a visit to the **Pashupatinath Temple**, a sacred Hindu temple dedicated to Lord Shiva.
* Next, head to the **Kathmandu Durbar Square**, a UNESCO World Heritage Site and the former royal palace of the Malla kings.
* Visit the **Swayambhunath Stupa**, also known as the Monkey Temple, which offers stunning views of the city.
* In the afternoon, take a short drive to the **Patan City**, known for its rich cultural heritage and traditional crafts.
* Explore the **Patan Durbar Square** and visit the **Krishna Temple**, a beautiful example of Nepali architecture.

**Day 3: Bhaktapur and Nagarkot**

* Drive to **Bhaktapur**, a medieval town and a UNESCO World Heritage Site (approximately 1 hour).
* Explore the **Bhaktapur Durbar Square**, which features stunning architecture, temples, and palaces.
* Visit the **Pottery Square**, where you can see traditional pottery-making techniques.
* In the afternoon, drive to **Nagarkot**, a scenic hill station with breathtaking views of the Himalayas (approximately 1.5 hours).
* Watch the sunset over the Himalayas and enjoy the peaceful atmosphere.

**Additional Tips:**

* Make sure to try some local Nepali cuisine, such as momos, dal bhat, and gorkhali lamb.
* Bargain while shopping in the markets, as it's a common practice in Nepal.
* Respect local customs and traditions, especially when visiting temples and cultural sites.
* Stay hydrated and bring sunscreen, as the sun can be strong in Nepal.

**Accommodation:**

Kathmandu has a wide range of accommodation options, from budget-friendly guesthouses to luxury hotels. Some popular areas to stay include Thamel, Lazimpat, and Boudha.

**Transportation:**

You can hire a taxi or a private vehicle for the day to travel between destinations. Alternatively, you can use public transportation, such as buses or microbuses, which are affordable and convenient.

**Budget:**

The budget for a 3-day trip to Nepal can vary depending on your accommodation choices, transportation, and activities. However, here's a rough estimate:

* Accommodation: $20-50 per night
* Transportation: $10-20 per day
* Food: $10-20 per meal
* Activities: $10-20 per person

Total estimated budget for 3 days: $200-500 per person

I hope this helps, and you have a wonderful trip to Nepal!
---------- TextMessage (local_agent) ----------
Your 3-day itinerary for Nepal is well-planned and covers many of the country's cultural and natural highlights. Here are a few additional suggestions and tips to enhance your trip:

**Day 1:**

* After visiting the Boudhanath Stupa, consider exploring the surrounding streets, which are filled with Tibetan shops, restaurants, and monasteries.
* In the Thamel area, be sure to try some of the local street food, such as momos or sel roti.
* For dinner, consider trying a traditional Nepali restaurant, such as the Kathmandu Guest House or the Northfield Cafe.

**Day 2:**

* At the Pashupatinath Temple, be respectful of the Hindu rituals and customs. You can also take a stroll along the Bagmati River, which runs through the temple complex.
* At the Kathmandu Durbar Square, consider hiring a guide to provide more insight into the history and significance of the temples and palaces.
* In the afternoon, visit the Patan Museum, which showcases the art and culture of the Kathmandu Valley.

**Day 3:**

* In Bhaktapur, be sure to try some of the local pottery and handicrafts. You can also visit the Bhaktapur National Art Gallery, which features traditional Nepali art.
* At Nagarkot, consider taking a short hike to the nearby villages, which offer stunning views of the Himalayas.
* For sunset, find a spot with a clear view of the mountains, and enjoy the peaceful atmosphere.

**Additional Tips:**

* Nepal is a relatively conservative country, so dress modestly and respect local customs.
* Try to learn some basic Nepali phrases, such as "namaste" (hello) and "dhanyabaad" (thank you).
* Be prepared for crowds and chaos in the cities, especially in Thamel and Kathmandu Durbar Square.
* Consider purchasing a local SIM card or portable Wi-Fi hotspot to stay connected during your trip.

**Accommodation:**

* Consider staying in a hotel or guesthouse that is centrally located and has good reviews.
* Look for accommodations that offer amenities such as free Wi-Fi, hot water, and a restaurant or cafe.

**Transportation:**

* Consider hiring a private vehicle or taxi for the day, as this will give you more flexibility and convenience.
* Be sure to negotiate the price and agree on the itinerary before setting off.

**Budget:**

* Be prepared for variable prices and exchange rates, and have some local currency (Nepali rupees) on hand.
* Consider budgeting extra for unexpected expenses, such as transportation or food.

Overall, your itinerary provides a good balance of culture, history, and natural beauty, and with these additional tips and suggestions, you'll be well-prepared for an unforgettable trip to Nepal!
---------- TextMessage (language_agent) ----------
Your 3-day itinerary for Nepal is well-planned and covers many of the country's cultural and natural highlights. The additional suggestions and tips you provided are excellent and will help enhance the trip experience.

One aspect that is well-covered in your plan is the cultural and historical significance of the destinations. You have included a mix of temples, stupas, and cultural sites, which will give visitors a good understanding of Nepal's rich heritage.

Regarding language and communication challenges, your tip to "try to learn some basic Nepali phrases, such as 'namaste' (hello) and 'dhanyabaad' (thank you)" is excellent. This will help visitors show respect for the local culture and people, and can also facilitate interactions with locals.

Additionally, your suggestion to "consider purchasing a local SIM card or portable Wi-Fi hotspot to stay connected during your trip" is practical and will help visitors stay in touch with family and friends back home, as well as navigate the local area.

Your plan is satisfactory, and with the additional tips and suggestions, visitors will be well-prepared for an unforgettable trip to Nepal. The itinerary provides a good balance of culture, history, and natural beauty, and the tips on language, communication, and logistics will help ensure a smooth and enjoyable journey.

Overall, your plan is well-thought-out, and the additional suggestions and tips will help visitors make the most of their trip to Nepal. Well done!

However, one minor suggestion I might make is to consider including a few phrases in the local language for emergency situations, such as "where is the hospital?" or "how do I get to the airport?" This can help visitors in case of an unexpected situation, and can also give them more confidence when navigating unfamiliar areas.

But overall, your plan is excellent, and with these minor suggestions, it can be even more comprehensive and helpful for visitors to Nepal.
---------- TextMessage (travel_summary_agent) ----------
TERMINATE

Here is the complete and integrated 3-day travel plan to Nepal:

**Day 1: Arrival in Kathmandu and Exploration of the City**

* Arrive at Tribhuvan International Airport in Kathmandu, the capital city of Nepal.
* Check-in to your hotel and freshen up.
* Visit the famous **Boudhanath Stupa**, one of the largest Buddhist stupas in the world.
* Explore the surrounding streets, which are filled with Tibetan shops, restaurants, and monasteries.
* Explore the **Thamel** area, a popular tourist hub known for its narrow streets, shops, and restaurants. Be sure to try some of the local street food, such as momos or sel roti.
* In the evening, enjoy a traditional Nepali dinner and watch a cultural performance at a local restaurant, such as the Kathmandu Guest House or the Northfield Cafe.

**Day 2: Kathmandu Valley Tour**

* Start the day with a visit to the **Pashupatinath Temple**, a sacred Hindu temple dedicated to Lord Shiva. Be respectful of the Hindu rituals and customs, and take a stroll along the Bagmati River, which runs through the temple complex.
* Next, head to the **Kathmandu Durbar Square**, a UNESCO World Heritage Site and the former royal palace of the Malla kings. Consider hiring a guide to provide more insight into the history and significance of the temples and palaces.
* Visit the **Swayambhunath Stupa**, also known as the Monkey Temple, which offers stunning views of the city.
* In the afternoon, visit the **Patan City**, known for its rich cultural heritage and traditional crafts. Explore the **Patan Durbar Square** and visit the **Krishna Temple**, a beautiful example of Nepali architecture. Also, visit the Patan Museum, which showcases the art and culture of the Kathmandu Valley.

**Day 3: Bhaktapur and Nagarkot**

* Drive to **Bhaktapur**, a medieval town and a UNESCO World Heritage Site (approximately 1 hour). Explore the **Bhaktapur Durbar Square**, which features stunning architecture, temples, and palaces. Be sure to try some of the local pottery and handicrafts, and visit the Bhaktapur National Art Gallery, which features traditional Nepali art.
* In the afternoon, drive to **Nagarkot**, a scenic hill station with breathtaking views of the Himalayas (approximately 1.5 hours). Consider taking a short hike to the nearby villages, which offer stunning views of the Himalayas. Find a spot with a clear view of the mountains, and enjoy the peaceful atmosphere during sunset.

**Additional Tips:**

* Make sure to try some local Nepali cuisine, such as momos, dal bhat, and gorkhali lamb.
* Bargain while shopping in the markets, as it's a common practice in Nepal.
* Respect local customs and traditions, especially when visiting temples and cultural sites.
* Stay hydrated and bring sunscreen, as the sun can be strong in Nepal.
* Dress modestly and respect local customs, as Nepal is a relatively conservative country.
* Try to learn some basic Nepali phrases, such as "namaste" (hello), "dhanyabaad" (thank you), "where is the hospital?" and "how do I get to the airport?".
* Consider purchasing a local SIM card or portable Wi-Fi hotspot to stay connected during your trip.
* Be prepared for crowds and chaos in the cities, especially in Thamel and Kathmandu Durbar Square.

**Accommodation:**

* Consider staying in a hotel or guesthouse that is centrally located and has good reviews.
* Look for accommodations that offer amenities such as free Wi-Fi, hot water, and a restaurant or cafe.

**Transportation:**

* Consider hiring a private vehicle or taxi for the day, as this will give you more flexibility and convenience.
* Be sure to negotiate the price and agree on the itinerary before setting off.

**Budget:**

* The budget for a 3-day trip to Nepal can vary depending on your accommodation choices, transportation, and activities. However, here's a rough estimate:
        + Accommodation: $20-50 per night
        + Transportation: $10-20 per day
        + Food: $10-20 per meal
        + Activities: $10-20 per person
* Total estimated budget for 3 days: $200-500 per person
* Be prepared for variable prices and exchange rates, and have some local currency (Nepali rupees) on hand.
* Consider budgeting extra for unexpected expenses, such as transportation or food.

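Summary of each agent's conversation

Agent	Conversation summary
planner_agent	Proposes a 3-day itinerary for Nepal. Day 1: arrival in Kathmandu and city exploration. Day 2: Kathmandu Valley tour. Day 3: Bhaktapur and Nagarkot. Additional tips: respect local customs, try local food, choose a means of transportation, and so on.
local_agent	Adds suggestions and tips based on planner_agent's 3-day itinerary. Day 1: explore the area around Boudhanath Stupa. Day 2: respect Hindu rituals at Pashupatinath Temple. Day 3: try the pottery and handicrafts of Bhaktapur. Additional tips: respect local customs, learn basic Nepali, use local facilities, and so on.
language_agent	Reviews the itinerary and offers further suggestions: learn basic Nepali, use local facilities, and prepare phrases for emergency situations.
travel_summary_agent	Summarizes the overall 3-day travel plan. Day 1: arrival in Kathmandu and city exploration. Day 2: Kathmandu Valley tour. Day 3: Bhaktapur and Nagarkot. Additional tips: respect local customs, try local food, choose a means of transportation, and so on.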

Agent using MCP

Note

For AIOS_LLM_Private_Endpoint (the value of AIOS_BASE_URL in the code) and MODEL_ID (the value of MODEL), see the LLM usage guide.

autogen_mcp.py

from urllib.parse import urljoin

from autogen_core.models import ModelFamily
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import McpWorkbench, StdioServerParams
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.ui import Console

# Set the API URL and model name used to access the model.
AIOS_BASE_URL = "AIOS_LLM_Private_Endpoint"
MODEL = "MODEL_ID"

# Create the model client with OpenAIChatCompletionClient.
model_client = OpenAIChatCompletionClient(
    model=MODEL,
    base_url=urljoin(AIOS_BASE_URL, "v1"),
    api_key="EMPTY_KEY",
    model_info={
        # Set to True if the model supports images.
        "vision": False,
        # Set to True if the model supports function calling.
        "function_calling": True,
        # Set to True if the model supports JSON output.
        "json_output": True,
        # Use UNKNOWN if ModelFamily does not provide the model you want to use.
        # "family": ModelFamily.UNKNOWN,
        "family": ModelFamily.LLAMA_3_3_70B,
        # Set to True if the model supports structured output.
        "structured_output": True,
    }
)

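# Set the MCP server parameters.
# mcp_server_time is an MCP server implemented in Python. It provides
# get_current_time, which returns the current time, and convert_time, which converts a time between time zones.
# The --local-timezone argument sets the local time zone that the MCP server uses when checking the time.
# For example, setting it to "Asia/Seoul" makes the server report times in the Korean time zone.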
mcp_server_params = StdioServerParams(
    command="python",
    args=["-m", "mcp_server_time", "--local-timezone", "Asia/Seoul"],
)

async def main():
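    """Main function: runs an agent that checks the time through the MCP workbench."""
    # Create and run an agent that uses the MCP workbench to check the time.
    # The agent performs the task "What time is it now in South Korea?"
    # Console prints the results.
    # While the MCP workbench is running, the agent checks the time
    # and streams the results.
    # When the MCP workbench shuts down, the agent stops as well.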
    async with McpWorkbench(mcp_server_params) as workbench:
        time_agent = AssistantAgent(
            "time_assistant",
            model_client=model_client,
            workbench=workbench,
            reflect_on_tool_use=True,
        )
        await Console(time_agent.run_stream(task="What time is it now in South Korea?"))
    await model_client.close()

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

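Code block. autogen_mcp.py

Running the file with python shows the agent fetching the tool metadata from the MCP server and then calling the model. When the model generates a tool calls message, the get_current_time function is executed to look up the current time.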

python autogen_mcp.py

Code block. Running the autogen MCP agent
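
The request-and-dispatch loop that the agent performs can be sketched in plain Python. This is only an illustration under stated assumptions, not how autogen or the MCP server is implemented: `fake_model`, `run_agent`, the `TOOLS` registry, and the local `get_current_time` stand-in are hypothetical, and a fixed UTC+9 offset replaces real IANA time zone lookup so that the sketch needs no time zone database.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+9 offset stands in for "Asia/Seoul".
KST = timezone(timedelta(hours=9))


def get_current_time(timezone_name: str) -> dict:
    """Hypothetical local stand-in for the MCP server's get_current_time tool."""
    if timezone_name != "Asia/Seoul":
        raise ValueError(f"unsupported timezone in this sketch: {timezone_name}")
    now = datetime.now(KST).replace(microsecond=0)
    return {"timezone": timezone_name, "datetime": now.isoformat(), "is_dst": False}


# Step 1: the registry plays the role of the tool list advertised by the server.
TOOLS = {"get_current_time": lambda args: get_current_time(args["timezone"])}


def fake_model(messages: list) -> dict:
    """Hypothetical model stub: requests a tool first, then answers in text."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"tool_call": {"name": "get_current_time",
                              "arguments": json.dumps({"timezone": "Asia/Seoul"})}}
    result = json.loads(tool_results[-1]["content"])
    return {"content": f"The current time in South Korea is {result['datetime']}."}


def run_agent(task: str) -> str:
    messages = [{"role": "user", "content": task}]
    while True:
        response = fake_model(messages)           # Step 2: call the model
        call = response.get("tool_call")
        if call is None:
            return response["content"]            # Step 5: final text answer
        args = json.loads(call["arguments"])      # arguments arrive as a JSON string
        result = TOOLS[call["name"]](args)        # Step 3: execute the requested tool
        # Step 4: hand the structured result back to the model.
        messages.append({"role": "tool", "content": json.dumps(result)})


answer = run_agent("What time is it now in South Korea?")
print(answer)
```

Code block. Tool call dispatch sketch (illustrative)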

Execution result

# TextMessage (user): the input message from the user
---------- TextMessage (user) ----------
What time is it now in South Korea?
# Look up the metadata of the tools available on the MCP server
INFO:mcp.server.lowlevel.server:Processing request of type ListToolsRequest
...omitted...
INFO:autogen_core.events:{
  # Metadata of the tools available on the MCP server
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_time",
        "description": "Get current time in a specific timezones",
        "parameters": {
          "type": "object",
          "properties": {
            "timezone": {
              "type": "string",
              "description": "IANA timezone name (e.g., 'America/New_York', 'Europe/London'). Use 'Asia/Seoul' as local timezone if no timezone provided by the user."
            }
          },
          "required": [
            "timezone"
          ],
          "additionalProperties": false
        },
        "strict": false
      }
    },
    {
      "type": "function",
      "function": {
        "name": "convert_time",
        "description": "Convert time between timezones",
        "parameters": {
          "type": "object",
          "properties": {
            "source_timezone": {
              "type": "string",
              "description": "Source IANA timezone name (e.g., 'America/New_York', 'Europe/London'). Use 'Asia/Seoul' as local timezone if no source timezone provided by the user."
            },
            "time": {
              "type": "string",
              "description": "Time to convert in 24-hour format (HH:MM)"
            },
            "target_timezone": {
              "type": "string",
              "description": "Target IANA timezone name (e.g., 'Asia/Tokyo', 'America/San_Francisco'). Use 'Asia/Seoul' as local timezone if no target timezone provided by the user."
            }
          },
          "required": [
            "source_timezone",
            "time",
            "target_timezone"
          ],
          "additionalProperties": false
        },
        "strict": false
      }
    }
  ],
  "type": "LLMCall",
  # Input messages
  "messages": [
    {
      "content": "You are a helpful AI assistant. Solve tasks using your tools. Reply with TERMINATE when the task has been completed.",
      "role": "system"
    },
    {
      "role": "user",
      "name": "user",
      "content": "What time is it now in South Korea?"
    }
  ],
  # Model response
  "response": {
    "id": "chatcmpl-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "choices": [
      {
        "finish_reason": "tool_calls",
        "index": 0,
        "logprobs": null,
        "message": {
          "content": null,
          "refusal": null,
          "role": "assistant",
          "annotations": null,
          "audio": null,
          "function_call": null,
          "tool_calls": [
            {
              "id": "chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
              "function": {
                "arguments": "{\"timezone\": \"Asia/Seoul\"}",
                "name": "get_current_time"
              },
              "type": "function"
            }
          ],
          "reasoning_content": null
        },
        "stop_reason": 128008
      }
    ],
    "created": 1751278737,
    "model": "MODEL_ID",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
      "completion_tokens": 21,
      "prompt_tokens": 508,
      "total_tokens": 529,
      "completion_tokens_details": null,
      "prompt_tokens_details": null
    },
    "prompt_logprobs": null
  },
  "prompt_tokens": 508,
  "completion_tokens": 21,
  "agent_id": null
}
# ToolCallRequestEvent: a tool call message is received from the model
---------- ToolCallRequestEvent (time_assistant) ----------
[FunctionCall(id='chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', arguments='{"timezone": "Asia/Seoul"}', name='get_current_time')]
INFO:mcp.server.lowlevel.server:Processing request of type ListToolsRequest
# Execute the function named in the tool call message through the MCP server
INFO:mcp.server.lowlevel.server:Processing request of type CallToolRequest
# ToolCallExecutionEvent: the function's execution result is passed to the model
---------- ToolCallExecutionEvent (time_assistant) ----------
[FunctionExecutionResult(content='{\n  "timezone": "Asia/Seoul",\n  "datetime": "2025-06-30T19:18:58+09:00",\n  "is_dst": false\n}', name='get_current_time', call_id='chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx', is_error=False)]
...omitted...
# TextMessage (time_assistant): the final answer generated by the model
---------- TextMessage (time_assistant) ----------
The current time in South Korea is 19:18:58 KST.
TERMINATE

Log analysis of the MCP server time lookup

This section analyzes the log above, which shows how a time lookup is carried out through an MCP (Model Context Protocol) server.

Request information

Item	Value
User request	What time is it now in South Korea?
Request time	2025-06-30 19:18:58 KST
Processing method	MCP server tool call

Available tools

Tool	Description	Parameters	Default
`get_current_time`	Gets the current time in a specific time zone	`timezone` (IANA time zone name)	`Asia/Seoul`
`convert_time`	Converts a time between time zones	`source_timezone`, `time`, `target_timezone`	`Asia/Seoul`
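
A tool's parameter schema can be used to check a tool call before it is executed. The following is a minimal sketch, not something the tutorial code or autogen is confirmed to do in this way: `check_arguments` is a hypothetical helper, and the schema is the `get_current_time` entry from the log above with its description text omitted.

```python
import json

# Parameter schema of get_current_time, taken from the tool metadata in the log
# (description strings omitted for brevity).
GET_CURRENT_TIME_SCHEMA = {
    "type": "object",
    "properties": {"timezone": {"type": "string"}},
    "required": ["timezone"],
    "additionalProperties": False,
}


def check_arguments(schema: dict, arguments_json: str) -> list:
    """Return a list of problems; an empty list means the call looks valid."""
    args = json.loads(arguments_json)
    problems = []
    # Every name in "required" must be present.
    for name in schema["required"]:
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    # With additionalProperties set to false, unknown names are rejected.
    if schema.get("additionalProperties") is False:
        for name in args:
            if name not in schema["properties"]:
                problems.append(f"unexpected parameter: {name}")
    # Only the "string" type is checked in this sketch.
    for name, value in args.items():
        expected = schema["properties"].get(name, {}).get("type")
        if expected == "string" and not isinstance(value, str):
            problems.append(f"parameter {name} should be a string")
    return problems


valid = check_arguments(GET_CURRENT_TIME_SCHEMA, '{"timezone": "Asia/Seoul"}')
invalid = check_arguments(GET_CURRENT_TIME_SCHEMA, '{"tz": "Asia/Seoul"}')
print(valid)    # []
print(invalid)  # ['missing required parameter: timezone', 'unexpected parameter: tz']
```

Code block. Tool argument check sketch (illustrative)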

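Processing steps

Step	Action	Details
1	Tool metadata lookup	Checks the list of tools available on the MCP server
2	AI model response	Requests a call to `get_current_time` with the `Asia/Seoul` time zone
3	Function execution	The MCP server runs the time lookup tool
4	Result return	Provides the time information as structured JSON
5	Final answer	Delivers the time to the user in a readable form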

Function call details

Item	Value
Function name	`get_current_time`
Arguments	`{"timezone": "Asia/Seoul"}`
Call ID	`chatcmpl-tool-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`
Type	`function`

Execution result

Field	Value	Description
`timezone`	`Asia/Seoul`	Time zone
`datetime`	`2025-06-30T19:18:58+09:00`	Time in ISO 8601 format
`is_dst`	`false`	Whether daylight saving time applies
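
The tool result is a JSON string, so its fields can be read with the standard library. The sketch below parses the content shown in the ToolCallExecutionEvent above; the variable names are chosen for illustration only.

```python
import json
from datetime import datetime

# Tool result content as it appears in the ToolCallExecutionEvent in the log.
content = '{\n  "timezone": "Asia/Seoul",\n  "datetime": "2025-06-30T19:18:58+09:00",\n  "is_dst": false\n}'

result = json.loads(content)
# fromisoformat understands the ISO 8601 string, including the +09:00 offset.
moment = datetime.fromisoformat(result["datetime"])

print(result["timezone"])            # Asia/Seoul
print(moment.strftime("%H:%M:%S"))   # 19:18:58
print(moment.utcoffset())            # 9:00:00
print(result["is_dst"])              # False
```

Code block. Parsing the tool result (illustrative)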

Final response

Item	Value
Response message	The current time in South Korea is 19:18:58 KST.
Completion marker	TERMINATE
Response time	19:18:58 KST

Usage metrics

Metric	Value
Prompt tokens	508
Completion tokens	21
Total tokens	529
Processing time	Immediate (real time)
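
The token figures come from the `usage` block of the model response shown in the log, and the total is the sum of the prompt and completion tokens. A small check, using the values copied from that response:

```python
# Usage block copied from the model response in the log above.
usage = {"completion_tokens": 21, "prompt_tokens": 508, "total_tokens": 529}

computed_total = usage["prompt_tokens"] + usage["completion_tokens"]
print(computed_total)                            # 529
print(computed_total == usage["total_tokens"])   # True
```

Code block. Checking the token totals (illustrative)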

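Key features

Feature	Description
Use of the MCP protocol	Smooth integration with external tools
Korean time zone by default	Uses `Asia/Seoul` as the default value
Structured response	Returns clear data in JSON format
Automatic completion marker	Signals that the task is complete with `TERMINATE`
Real-time information	Looks up the accurate current time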

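Technical significance

This is an example of a modern architecture in which an AI assistant works with external systems to provide real-time information. Through MCP, the AI model can access a variety of external tools and services, which makes more practical and dynamic responses possible.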

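Conclusion

In this tutorial, we used the AI models provided by AIOS together with autogen to implement two applications: one in which multiple agents plan a travel itinerary, and an agent application that uses external tools through an MCP server. Along the way, we saw that several agents, each with its own perspective, can approach a problem from multiple angles and can make use of external tools. The system can be extended and customized to fit your environment in the following ways.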

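Controlling the agent flow: Various techniques are available for choosing which agent handles the next step. You can fix the order of the agents for reliable results, or let the AI model select the agent for more flexible processing. You can also use an event-based approach so that multiple agents work in parallel.
Adopting other MCP servers: Many MCP servers besides mcp_server_time have already been implemented. With them, the AI model can flexibly use a variety of external tools, which makes it possible to build useful applications.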
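
As a rough illustration of the parallel option, the sketch below runs several stub agents concurrently. It uses plain `asyncio.gather` rather than autogen's event-based mechanisms, so it is a simpler substitute and not the approach the library itself takes; `run_agent` and `run_in_parallel` are hypothetical names, and the stub only simulates the latency of a model call.

```python
import asyncio
from typing import List


async def run_agent(name: str, task: str) -> str:
    """Hypothetical agent stub; a real agent would await a model call here."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return f"{name}: suggestion for '{task}'"


async def run_in_parallel(names: List[str], task: str) -> List[str]:
    # gather runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(run_agent(name, task) for name in names))


results = asyncio.run(run_in_parallel(
    ["planner_agent", "local_agent", "language_agent"],
    "Plan a 3 day trip to Nepal.",
))
for line in results:
    print(line)
```

Code block. Parallel agent sketch (illustrative)

Unlike the round-robin order used earlier, the agents here do not see one another's output, so a final step would still be needed to merge their suggestions.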

Building on this tutorial, we encourage you to build an AIOS-based collaboration assistant that fits the purpose of your actual service.

Reference links

https://microsoft.github.io/autogen
https://modelcontextprotocol.io/
https://github.com/modelcontextprotocol/servers