Integrations

Friendli integrates with LangChain, LiteLLM, LlamaIndex, and MongoDB to streamline the deployment of compound GenAI applications. LangChain and LlamaIndex help you build tool-calling AI agents and Retrieval-Augmented Generation (RAG) pipelines. MongoDB supports these agentic systems by providing memory through its vector database, while LiteLLM adds load balancing and evaluation.

Get a quick overview of Friendli Serverless Endpoints’ integrations and learn more through the linked resources.

LangChain

LangChain is a framework for developing applications powered by large language models (LLMs). To use Friendli Serverless Endpoints for LLM inference in LangChain, first prepare a Friendli Token.
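The samples on this page do not pass the token in code, so it is presumably read from the environment. The sketch below checks for it before you run anything else. It assumes the token is exported as `FRIENDLI_TOKEN` (an assumption; confirm the variable name for your client version), and `missing_env_vars` is a hypothetical helper, not part of any Friendli package.

```python
import os


def missing_env_vars(*names: str) -> list[str]:
    """Return the names of environment variables that are unset or blank."""
    return [name for name in names if not os.environ.get(name, "").strip()]


# Assumption: the Friendli Token is exported as FRIENDLI_TOKEN.
missing = missing_env_vars("FRIENDLI_TOKEN")
if missing:
    print("Set these environment variables first: " + ", ".join(missing))
```

The same helper accepts several names at once, which is convenient for samples that need more than one credential.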

To install the required packages, run:

pip install langchain langchain-community friendli-client

Here’s a streaming chat sample to get started with LangChain and FriendliAI:

from langchain_community.chat_models.friendli import ChatFriendli

llm = ChatFriendli(model="meta-llama-3.1-70b-instruct")

for chunk in llm.stream("Tell me a funny joke."):
    print(chunk.content, end="", flush=True)

Output:

Here's one:
Why couldn't the bicycle stand up by itself?
(Wait for it...)
Because it was two-tired!
Hope that brought a smile to your face!

Resources

MongoDB

MongoDB Atlas is a developer data platform offering vector stores and vector search for compound GenAI applications, usable through both LangChain and LlamaIndex. To use Friendli Serverless Endpoints for LLM inference with MongoDB, first prepare a Friendli Token. The sample below also uses OpenAI embeddings, so an OpenAI API key is required.

To install the required packages, run:

pip install pymongo friendli-client langchain langchain-mongodb langchain-community pypdf langchain-openai tiktoken

Here’s a RAG sample to get started with MongoDB and FriendliAI using LangChain:

# Note: You can find a detailed explanation of this code in the blog post below.
from pymongo import MongoClient
from langchain_mongodb.vectorstores import MongoDBAtlasVectorSearch
from langchain_community.chat_models.friendli import ChatFriendli
from langchain_community.document_loaders import PyPDFLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Fill in your Cluster URI here.
MONGODB_ATLAS_CLUSTER_URI = "{YOUR CLUSTER URI}"

client = MongoClient(MONGODB_ATLAS_CLUSTER_URI)

# Fill in your DB information here.
DB_NAME = "{YOUR DB NAME}"
COLLECTION_NAME = "{YOUR COLLECTION NAME}"
ATLAS_VECTOR_SEARCH_INDEX_NAME = "{YOUR INDEX NAME}"

MONGODB_COLLECTION = client[DB_NAME][COLLECTION_NAME]

# Fill in your PDF link here.
loader = PyPDFLoader("{YOUR PDF DOCUMENT LINK}")
data = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
docs = text_splitter.split_documents(data)

vector_store = MongoDBAtlasVectorSearch.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(disallowed_special=()),
    collection=MONGODB_COLLECTION,
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
)
retriever = vector_store.as_retriever()

llm = ChatFriendli(model="meta-llama-3.1-70b-instruct")

prompt = PromptTemplate.from_template(
    """
    Use the following pieces of context to answer the question.
    {context}
    Question: {question}
    Helpful Answer:
    """
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Input your user query here.
rag_chain.invoke("{Sample Query Texts}")
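To make the data flow of `rag_chain` easier to follow, here is a plain-Python sketch of the same steps without LangChain, MongoDB, or a model call. `fake_retriever` and `fake_llm` are hypothetical stand-ins, and the chunks are plain strings rather than documents with a `page_content` attribute; the sketch only illustrates the order of operations.

```python
PROMPT = (
    "Use the following pieces of context to answer the question.\n"
    "{context}\n"
    "Question: {question}\n"
    "Helpful Answer:"
)


def fake_retriever(query: str) -> list[str]:
    """Hypothetical stand-in for vector search: always returns fixed chunks."""
    return ["Chunk A about the topic.", "Chunk B with more detail."]


def fake_llm(prompt_text: str) -> str:
    """Hypothetical stand-in for the chat model: reports the prompt size."""
    return f"(answer based on a {len(prompt_text)}-character prompt)"


def format_docs(docs: list[str]) -> str:
    """Join retrieved chunks with blank lines (chunks are plain strings here)."""
    return "\n\n".join(docs)


def build_prompt(query: str) -> str:
    # The dict step sends the same query to both branches.
    inputs = {
        "context": format_docs(fake_retriever(query)),  # retriever | format_docs
        "question": query,  # RunnablePassthrough() forwards the input unchanged
    }
    return PROMPT.format(**inputs)


def run_rag(query: str) -> str:
    # Prompt -> model -> plain string (the role StrOutputParser plays in the chain).
    return fake_llm(build_prompt(query))


print(run_rag("What is the document about?"))
```

In the real chain, each `|` passes the output of one step to the next, so swapping the retriever or the model changes one stage without touching the others.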

Resources

LlamaIndex

LlamaIndex is a data framework designed to connect LLMs to custom data sources. To use Friendli Serverless Endpoints for LLM inference in LlamaIndex, first prepare a Friendli Token. You also need an OpenAI API key to access the OpenAI embedding API.

To install the required packages, run:

pip install llama-index-llms-friendli llama-index

Here’s a RAG streaming chat sample to get started with LlamaIndex and FriendliAI:

from llama_index.llms.friendli import Friendli
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex

Settings.llm = Friendli()

# Assuming a directory named 'data_folder' stores your pdf file.
documents = SimpleDirectoryReader('data_folder').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True)

# Input your user query here.
response = query_engine.query("{Sample Query Texts}")
response.print_response_stream()
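Conceptually, a vector index stores an embedding for each chunk of your documents and, at query time, returns the chunks whose embeddings are closest to the query embedding. The toy sketch below shows that idea with cosine similarity. The two-dimensional vectors are made up for illustration, `top_k` is a hypothetical helper, and this is not how LlamaIndex implements `VectorStoreIndex` internally.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Made-up 2-D "embeddings"; real embedding vectors have many more dimensions.
chunks = [
    ("pricing details", [1.0, 0.0]),
    ("setup guide", [0.0, 1.0]),
    ("pricing FAQ", [0.9, 0.1]),
]

print(top_k([1.0, 0.0], chunks, k=2))  # ['pricing details', 'pricing FAQ']
```

The retrieved chunks are then placed into the prompt that is sent to the LLM, which is the step the query engine handles for you.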

Resources

LiteLLM

LiteLLM is a versatile platform offering access to 100+ LLMs in the OpenAI API format. To use Friendli Serverless Endpoints for LLM inference in LiteLLM, first prepare a Friendli Token.

To install the required package, run:

pip install litellm

Here’s a streaming chat sample to get started with LiteLLM and FriendliAI:

from litellm import completion

response = completion(
    # Simply change the model ID to use different LLM inference models & engines.
    model="friendliai/meta-llama-3-70b-instruct",
    messages=[
        {"role": "user", "content": "Hello from LiteLLM"}
    ],
    stream=True,
)

for chunk in response:
    # The delta content can be None (for example, on the final chunk), so fall back to "".
    print(chunk.choices[0].delta.content or "", end="", flush=True)

Output:

Hello from an AI! It's great to meet you, LiteLLM! How's your day going so far?
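If you want the full reply as one string instead of printing it piece by piece, you can accumulate the streamed deltas. In OpenAI-format streams a chunk's `delta.content` can be `None`, so the loop should skip those pieces. The sketch below uses stand-in objects built with `SimpleNamespace` so it runs without a network call; `make_chunk` and `collect_stream` are hypothetical helpers, not part of LiteLLM.

```python
from types import SimpleNamespace


def make_chunk(content):
    """Build a stand-in object shaped like an OpenAI-format streaming chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def collect_stream(chunks) -> str:
    """Join the text pieces of a stream, skipping None or empty deltas."""
    parts = []
    for chunk in chunks:
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return "".join(parts)


# Stand-in stream; a real one would come from completion(..., stream=True).
fake_stream = [
    make_chunk("Hello"),
    make_chunk(" from"),
    make_chunk(" an AI!"),
    make_chunk(None),
]

print(collect_stream(fake_stream))  # Hello from an AI!
```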
