Friendli Serverless Endpoints offer a range of models tailored to various tasks.

Text Generation Models

Text generation models are available through the completions and chat completions APIs and are priced per token. The following table lists the price for each text generation model:

| Model Code | Price per Token |
| --- | --- |
| meta-llama-3-70b-instruct | $0.8 / 1M tokens |
| mistral-7b-instruct-v0-2 | $0.13 / 1M tokens |
| mixtral-8x7b-instruct-v0-1 | $0.4 / 1M tokens |
| gemma-7b-it | $0.13 / 1M tokens |

The term "token" refers to an individual unit of text processed by the model, such as a word or a fragment of a word.
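To make the per-token arithmetic concrete, here is a minimal Python sketch that estimates cost from a token count using the prices listed above. The function name `estimate_text_cost` and the `billed_tokens` parameter are hypothetical and are not part of any Friendli SDK. This page does not say whether prompt tokens, generated tokens, or both are billed, so the sketch takes a single count of billed tokens and leaves that decision to the caller.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the pricing table on this page.
PRICE_PER_MILLION_TOKENS = {
    "meta-llama-3-70b-instruct": Decimal("0.8"),
    "mistral-7b-instruct-v0-2": Decimal("0.13"),
    "mixtral-8x7b-instruct-v0-1": Decimal("0.4"),
    "gemma-7b-it": Decimal("0.13"),
}


def estimate_text_cost(model_code: str, billed_tokens: int) -> Decimal:
    """Return the estimated cost in USD for `billed_tokens` tokens.

    Hypothetical helper: which tokens count as billed is an assumption
    the caller must settle; this function only does the multiplication.
    """
    if billed_tokens < 0:
        raise ValueError("billed_tokens must be non-negative")
    price = PRICE_PER_MILLION_TOKENS[model_code]  # KeyError for unknown models
    return price * billed_tokens / Decimal(1_000_000)


if __name__ == "__main__":
    cost = estimate_text_cost("meta-llama-3-70b-instruct", 250_000)
    print(f"Estimated cost: ${cost}")
```

For example, 250,000 tokens on meta-llama-3-70b-instruct come to $0.20 under this calculation. `Decimal` is used rather than `float` so that small per-token amounts do not pick up binary rounding error.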

Image Generation Models

Image generation models are available through the text-to-image generation API and are priced per inference step. For Stable Diffusion models, an inference step corresponds to one denoising step. The following table lists the price for each image generation model:

| Model Code | Price per Step |
| --- | --- |
| stable-diffusion-v1-5 | $0.0005 / 10 steps |
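The per-step price can be turned into a per-request estimate with a short Python sketch. The helper `estimate_image_cost` and its parameters are hypothetical names chosen for illustration. The sketch also assumes that billing scales linearly with the number of steps (no rounding up to multiples of 10) and that each generated image is billed separately; neither point is stated on this page.

```python
from decimal import Decimal

# Price in USD per 10 inference steps, copied from the pricing table on this page.
PRICE_PER_10_STEPS = {
    "stable-diffusion-v1-5": Decimal("0.0005"),
}


def estimate_image_cost(
    model_code: str, num_inference_steps: int, num_images: int = 1
) -> Decimal:
    """Return the estimated cost in USD for one text-to-image request.

    Assumptions (not confirmed by the pricing page): cost is linear in
    the step count, and every image in the request is billed on its own.
    """
    if num_inference_steps < 0 or num_images < 0:
        raise ValueError("step and image counts must be non-negative")
    price_per_step = PRICE_PER_10_STEPS[model_code] / 10
    return price_per_step * num_inference_steps * num_images


if __name__ == "__main__":
    cost = estimate_image_cost("stable-diffusion-v1-5", num_inference_steps=50)
    print(f"Estimated cost: ${cost}")
```

Under these assumptions, a single image generated with 50 denoising steps costs $0.0025, and four such images cost $0.01.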