Embedding & Reranker Quickstart

This quickstart walks you through generating your first embeddings and reranking your first documents with Infron.

Embedding quickstart

Basic Request

To generate embeddings, send a POST request to /embeddings with your text input and chosen model:

import requests

response = requests.post(
  "https://llm.onerouter.pro/v1/embeddings",
  headers={
    "Authorization": f"Bearer {{API_KEY_REF}}",
    "Content-Type": "application/json",
  },
  json={
    "model": "{{MODEL}}",
    "input": "The quick brown fox jumps over the lazy dog"
  }
)

data = response.json()
embedding = data["data"][0]["embedding"]
print(f"Embedding dimension: {len(embedding)}")

Batch Processing

You can generate embeddings for multiple texts in a single request by passing an array of strings:
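A minimal sketch of a batched request, assuming the endpoint accepts a list of strings under "input" as in OpenAI-compatible embedding APIs (the exact limits on batch size are not specified here):

```python
API_KEY = "{{API_KEY_REF}}"


def build_embedding_payload(model, texts):
    """Build the JSON body for a batched /embeddings request.

    Assumes the endpoint accepts a list of strings under "input",
    following the common OpenAI-compatible convention.
    """
    return {"model": model, "input": texts}


payload = build_embedding_payload(
    "{{MODEL}}",
    [
        "The quick brown fox jumps over the lazy dog",
        "Pack my box with five dozen liquor jugs",
    ],
)

if "{{" not in API_KEY:  # replace the placeholder with your key to send the request
    import requests

    response = requests.post(
        "https://llm.onerouter.pro/v1/embeddings",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=payload,
    )
    # One embedding is returned per input string, in the same order.
    for item in response.json()["data"]:
        print(f"text {item['index']}: {len(item['embedding'])} dimensions")
```

Each entry in the response's `data` array carries an `index` matching the position of the corresponding input string.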

Here's a complete example of building a semantic search system using embeddings:
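The sketch below embeds a query alongside a small corpus and ranks the corpus by cosine similarity, the standard technique for semantic search. The corpus strings are hypothetical, and the response shape is assumed to match the basic example above:

```python
import math

API_KEY = "{{API_KEY_REF}}"
BASE_URL = "https://llm.onerouter.pro/v1"


def embed(texts):
    """Embed a batch of texts via the /embeddings endpoint."""
    import requests  # imported here so the similarity helpers run without the dependency

    response = requests.post(
        f"{BASE_URL}/embeddings",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json={"model": "{{MODEL}}", "input": texts},
    )
    response.raise_for_status()
    return [item["embedding"] for item in response.json()["data"]]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def search(query, documents):
    """Rank documents by similarity to the query, most similar first."""
    vectors = embed([query] + documents)
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(doc, cosine_similarity(query_vec, vec)) for doc, vec in zip(documents, doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if "{{" not in API_KEY:  # replace the placeholder with your key to run the search
    corpus = [
        "Embeddings map text to vectors that capture meaning.",
        "The capital of France is Paris.",
        "Rerankers reorder candidate documents by relevance.",
    ]
    for doc, score in search("How do vector embeddings represent text?", corpus):
        print(f"{score:.3f}  {doc}")
```

For larger corpora you would embed the documents once, store the vectors, and embed only the query at search time.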

Reranker quickstart

In the example below, we use the Rerank API endpoint to order a list of documents from most to least relevant to the query "What is the capital of the United States?".

Example with Texts

Request

In this example, the documents being passed in are a list of strings:
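A sketch of the request, assuming a `/rerank` endpoint path and the widely used Cohere-style field names (`query`, `documents`, `top_n`); the document strings are illustrative. Check your provider's API reference for the exact shape:

```python
API_KEY = "{{API_KEY_REF}}"

query = "What is the capital of the United States?"
documents = [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
    "Washington, D.C. is the capital of the United States.",
    "Capital punishment has existed in the United States since before it was a country.",
]

# top_n limits how many of the reranked documents are returned.
payload = {"model": "{{MODEL}}", "query": query, "documents": documents, "top_n": 3}

if "{{" not in API_KEY:  # replace the placeholder with your key to send the request
    import requests

    response = requests.post(
        "https://llm.onerouter.pro/v1/rerank",  # assumed endpoint path
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=payload,
    )
    # Each result typically carries the original document index and a relevance score.
    for result in response.json()["results"]:
        print(result["index"], result["relevance_score"])
```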

Response

Example with Structured Data

If your documents contain structured data, for best performance we recommend formatting them as YAML strings.

Request

In the documents parameter, we pass in a list of YAML strings representing the structured data.
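A minimal sketch of preparing such a request; `to_yaml` is a hypothetical helper that handles only flat dictionaries (use PyYAML's `yaml.safe_dump` for nested data), and the email records are illustrative:

```python
def to_yaml(record):
    """Format a flat dict as a YAML string.

    Hypothetical helper sufficient for simple key-value records;
    use PyYAML's yaml.safe_dump for nested or non-trivial data.
    """
    return "\n".join(f"{key}: {value}" for key, value in record.items())


emails = [  # hypothetical structured records
    {"from": "hr@example.com", "subject": "Health benefits enrollment", "date": "2024-09-01"},
    {"from": "it@example.com", "subject": "Password reset policy", "date": "2024-09-03"},
]

# Each structured record becomes one YAML string in the documents list.
documents = [to_yaml(email) for email in emails]

payload = {
    "model": "{{MODEL}}",
    "query": "When does benefits enrollment open?",
    "documents": documents,
}

print(documents[0])
```

Formatting each record as a single YAML string keeps field names visible to the reranker while still passing plain strings in the `documents` parameter.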

Response
