QuickStart: Friendli Dedicated Endpoints

1. Log In or Sign Up

  • If you have an account, log in using your preferred SSO or email/password combination.
  • If you're new to FriendliAI, create an account for free.

2. Access Friendli Dedicated Endpoints

  • On your dashboard, find the "Friendli Dedicated Endpoints" section.
  • If you are not authorized, ask your team admin to grant you access to Friendli Dedicated Endpoints in the team settings.

3. Select Your Project

  • Either create a new project, or choose from your existing projects for your workload.

4. Prepare Your Model

  • Choose a model to serve from HuggingFace, or upload your custom model to our cloud.

5. Deploy Your Endpoint

  • Deploy your endpoint using the model you prepared in step 4 and an instance equipped with your desired GPU specification.
  • You can also configure your replicas and the max-batch-size for your endpoint.

6. Generate Responses

  • You can generate responses in two ways: through the playground or through the endpoint address.
  • To try out your custom model, use the ChatGPT-like interface on the playground tab.

  • For general use, send queries to your model through our API at the endpoint address shown on the endpoint information tab.

Generating Responses Through the Endpoint URL

Refer to this guide for general instructions on personal access tokens.

# Send an inference request to a running Friendli Dedicated Endpoint using a `curl` command.

# $ENDPOINT_ID is spliced in outside the single quotes so that the shell expands it.
$ curl -X POST https://inference.friendli.ai/dedicated/v1/completions \
  -H "Authorization: Bearer $FRIENDLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "'"$ENDPOINT_ID"'", "prompt": "Python is a popular",
       "min_tokens": 20, "max_tokens": 30,
       "top_k": 32, "top_p": 0.8, "n": 3, "no_repeat_ngram": 3,
       "ngram_repetition_penalty": 1.75}'
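
The same request can also be assembled from Python. The following is a minimal sketch using only the standard library, not an official client: the URL, the bearer-token header, and the payload field names are taken from the curl command above, while the helper names `build_completion_request` and `send_completion_request` are hypothetical and introduced here for illustration. The response is assumed to be JSON.

```python
import json
import urllib.request

# URL copied from the curl example above.
COMPLETIONS_URL = "https://inference.friendli.ai/dedicated/v1/completions"


def build_completion_request(token, endpoint_id, prompt, **sampling):
    """Build (but do not send) a POST request for the completions API.

    `sampling` carries optional fields such as max_tokens or top_p,
    passed through to the JSON body unchanged.
    """
    payload = {"model": endpoint_id, "prompt": prompt, **sampling}
    return urllib.request.Request(
        COMPLETIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_completion_request(request, timeout=60):
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

To use it, read `FRIENDLI_TOKEN` and `ENDPOINT_ID` from the environment (for example with `os.environ`), pass them to `build_completion_request` together with the prompt and sampling options, and hand the result to `send_completion_request`. Keeping construction separate from sending makes it possible to inspect the payload without a running endpoint.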

For a more detailed walkthrough, refer to our tutorials on using HuggingFace models and W&B models.