Goals

  • Build your own AI agent using Friendli Serverless Endpoints and Gradio in fewer than 50 lines of code 🤖
  • Use tool calling to make your agent even smarter 🤩
  • Share your AI agent with the world and gather feedback 🌎

Gradio is the fastest way to demo your model with a friendly web interface.

Getting Started

  1. Head to https://suite.friendli.ai, and create an account.
  2. Grab a FRIENDLI_TOKEN to use Friendli Serverless Endpoints within an agent.

🚀 Step 1. Prerequisite

Install dependencies.

pip install openai gradio

🚀 Step 2. Launch your agent

Build your own AI agent using Friendli Serverless Endpoints and Gradio.

  • Gradio provides a ChatInterface that implements a chatbot UI running the chat_function.
    • More information about the chat_function(message, history)

      The function should accept two parameters: a string input message and a list of two-element lists of the form [[user_message, bot_message], …] representing the chat history, and return a string response. It may also be a generator that yields partial responses for streaming, as in the code below.

  • Implement the chat_function using Friendli Serverless Endpoints.
    • Here, we use the meta-llama-3.1-70b-instruct model.
    • Feel free to explore the other available models as well.

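Before wiring in the API call, the history-to-messages conversion that the chat function performs can be sketched on its own (history_to_messages is an illustrative helper name, not part of Gradio or Friendli):

```python
# Gradio passes history as [[user_message, bot_message], ...]; the
# OpenAI-compatible API expects a flat list of role/content dicts.
def history_to_messages(message, history):
    messages = []
    for user_msg, bot_msg in history:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": bot_msg})
    messages.append({"role": "user", "content": message})
    return messages

msgs = history_to_messages("And Italy?", [["Capital of France?", "Paris."]])
print(msgs)
```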
from openai import OpenAI
import gradio as gr

friendli_client = OpenAI(
    base_url="https://inference.friendli.ai/v1",
    api_key="YOUR FRIENDLI TOKEN"
)

def chat_function(message, history):
    messages = []
    for user, chatbot in history:
        messages.append({"role": "user", "content": user})
        messages.append({"role": "assistant", "content": chatbot})
    messages.append({"role": "user", "content": message})

    stream = friendli_client.chat.completions.create(
        model="meta-llama-3.1-70b-instruct",
        messages=messages,
        stream=True
    )
    res = ""
    for chunk in stream:
        res += chunk.choices[0].delta.content or ""
        yield res

css = """
.gradio-container {
    max-width: 800px !important;
    margin-top: 100px !important;
}

.pending {
    display: none !important;
}

.sm {
    box-shadow: none !important;
}

#component-2 {
    height: 400px !important;
}
"""

with gr.Blocks(theme=gr.themes.Soft(), css=css) as friendli_agent:
    gr.ChatInterface(chat_function)

friendli_agent.launch()
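The streaming loop above relies on ChatInterface accepting a generator function: yielding the accumulated string after every chunk lets the UI redraw the partial response as it grows. The pattern in isolation, with stand-in delta strings instead of real API chunks:

```python
# Accumulate streamed deltas and yield the growing string each time,
# mirroring the loop inside chat_function.
def stream_response(deltas):
    res = ""
    for delta in deltas:
        res += delta or ""  # a delta can be None (e.g. the final chunk)
        yield res

partials = list(stream_response(["Hel", "lo", None, "!"]))
print(partials[-1])
```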

🚀 Step 3. Tool Calling (Advanced)

Use tool calling to make your agent even smarter! As an example, we will show you how to make your agent search the web before answering.

  • Change the base_url to https://inference.friendli.ai/tools/v1
  • Add the tools parameter when calling the chat completions API

from openai import OpenAI
import gradio as gr

friendli_client = OpenAI(
    base_url="https://inference.friendli.ai/tools/v1",
    api_key="YOUR FRIENDLI TOKEN"
)

def chat_function(message, history):
    messages = []
    for user, chatbot in history:
        messages.append({"role": "user", "content": user})
        messages.append({"role": "assistant", "content": chatbot})
    messages.append({"role": "user", "content": message})

    stream = friendli_client.chat.completions.create(
        model="meta-llama-3.1-70b-instruct",
        messages=messages,
        stream=True,
        tools=[{"type": "web:search"}],
    )
    res = ""
    for chunk in stream:
        if chunk.choices is None:
            yield "Waiting for tool response..."
        else:
            res += chunk.choices[0].delta.content or ""
            yield res

css = """
.gradio-container {
    max-width: 800px !important;
    margin-top: 100px !important;
}

.pending {
    display: none !important;
}

.sm {
    box-shadow: none !important;
}

#component-2 {
    height: 400px !important;
}
"""
with gr.Blocks(theme=gr.themes.Soft(), css=css) as agent:
    gr.ChatInterface(chat_function)
agent.launch()
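The `chunk.choices is None` guard above matters because, while a built-in tool is running, the stream can contain chunks that carry no choices. The rendering logic can be exercised with stand-in values (render_stream is an illustrative helper, not a Friendli API):

```python
# Simulate rendering a tool-calling stream: None stands in for a chunk
# whose choices is None, strings stand in for content deltas.
def render_stream(chunk_choices):
    res = ""
    for choices in chunk_choices:  # each item stands in for chunk.choices
        if choices is None:
            yield "Waiting for tool response..."
        else:
            res += choices or ""
            yield res

frames = list(render_stream(["The answer is ", None, "42."]))
print(frames)
```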

Here is the list of available built-in tools (beta). Feel free to build your agent using the tools below.

  • math:calculator (tool for calculating arithmetic operations)
  • math:statistics (tool for analyzing statistical data)
  • math:calendar (tool for handling date-related data)
  • web:search (tool for retrieving data through web search)
  • web:url (tool for extracting data from a given website)
  • code:python_interpreter (tool for writing and executing Python code)
  • file:text (tool for extracting text data from a given file)
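Each entry passed to the tools parameter follows the same {"type": "<tool-name>"} shape used for web:search above, so enabling several built-in tools is just a matter of listing them:

```python
# Build a tools list for the chat completions call from tool names
# taken from the built-in tools list above.
enabled = ["web:search", "math:calculator", "code:python_interpreter"]
tools = [{"type": name} for name in enabled]
print(tools)
```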

🚀 Step 4. Deploy your agent

For a temporary public deployment, change the last line of the code.

agent.launch(share=True)

For a permanent deployment, you can use Hugging Face Spaces!
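A minimal Gradio Space commonly consists of two files; the layout below is a sketch of that convention, not an official spec, so check the current Spaces documentation:

```
your-space/
├── app.py            # the agent code from Step 2 or Step 3
└── requirements.txt  # lists the dependencies: openai, gradio
```

Remember not to hard-code your FRIENDLI_TOKEN in a public Space; store it as a Space secret and read it from an environment variable instead.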