Embed with LLM
Embed plain text using Cosmos.
About LLM Embedding
The LLM Embed Service allows you to generate dense vector representations of text. These embeddings can be used for various purposes, such as semantic search, text clustering, and machine learning.
The Cosmos API currently supports two embedding models.
Embedding Model | Developer | Vector size |
---|---|---|
ada-v2 | OpenAI | 1536 |
text-embedding-3-large | OpenAI | 3072 |
⚠️ Warning:
All embedding models produce a vector representation of the input text, but they use different algorithms and their embedding dimensions may differ. As a result, embeddings obtained from different models are not interchangeable in RAG operations or other processes.
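To make the warning concrete, here is a minimal sketch of one way to avoid mixing embeddings: store each vector together with the name of the model that produced it, and check both before comparing. The `TaggedEmbedding` class and `assert_same_model` function are hypothetical helpers written for this illustration; they are not part of the Cosmos client.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TaggedEmbedding:
    """An embedding stored with the model that produced it (hypothetical helper)."""

    model: str
    vector: Sequence[float]


def assert_same_model(a: TaggedEmbedding, b: TaggedEmbedding) -> None:
    """Raise ValueError if the two embeddings should not be compared."""
    if a.model != b.model:
        raise ValueError(
            f"Cannot compare embeddings from {a.model!r} and {b.model!r}"
        )
    if len(a.vector) != len(b.vector):
        raise ValueError("Embeddings have different vector sizes")


# Toy 3-value vectors for illustration; real ada-v2 vectors have 1536 values.
first = TaggedEmbedding(model="ada-v2", vector=[0.1, 0.2, 0.3])
second = TaggedEmbedding(model="ada-v2", vector=[0.3, 0.2, 0.1])
other = TaggedEmbedding(model="text-embedding-3-large", vector=[0.1, 0.2, 0.3])

assert_same_model(first, second)  # same model and size: passes silently
```

Calling `assert_same_model(first, other)` raises `ValueError`, even though the toy vectors happen to have the same length. Matching length alone does not make two embeddings comparable, which is why the model name is checked first.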
Step-by-step Tutorial
In this guide:
- Section 1: Prerequisites.
- Section 2: Setup Cosmos Python Client.
- Section 3: Embed text.
- Section 4: Handle Errors.
1. Prerequisites
Before you begin, ensure you have:
- An active CosmosPlatform account
- An API key from the API keys dashboard
2. Setup Cosmos Python Client
The Cosmos Python client provides a convenient way to perform API requests.
2.1. Install Cosmos Python Client:
Install the Cosmos Python client with pip:
pip install delos-cosmos
2.2. Authenticate Requests:
Initialize the client with your API key:
from cosmos import CosmosClient

# Replace the placeholder with your own API key (it must be a quoted string).
cosmos_client = CosmosClient(api_key="your-cosmos-api-key")
2.3. Call API:
You can now call any Cosmos endpoint. For example, call the /health endpoint to check that your API key is valid and that the client services are available:
response = cosmos_client.status_health()
print(response)
3. Embed text
Here is an example of an LLM embedding request using the Python client (/llm/embed endpoint):
from cosmos import CosmosClient

cosmos_client = CosmosClient(api_key="your-cosmos-api-key")

# Embed a short text with the ada-v2 model.
response = cosmos_client.llm_embed(text="Hello, World!", model="ada-v2")
print(response)
The text that you send is split into blocks so that each one fits the context size of the embedding model. Each embedding is a vector of floating-point numbers that supports similarity search and other operations.
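As an illustration of similarity search over embeddings, the following sketch ranks two documents against a query by cosine similarity. It uses only the Python standard library and toy 3-dimensional vectors in place of real embeddings. The `cosine_similarity` function and the document names are assumptions made for this example, not part of the Cosmos API.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings returned by the service.
query = [1.0, 0.0, 0.0]
documents = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
}

# Sort document names from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
print(ranked)  # ['doc_a', 'doc_b']
```

Cosine similarity is a common choice because it compares the direction of two vectors and ignores their magnitude. Other measures, such as the dot product or Euclidean distance, may suit other use cases.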
Each block is identified by an id (0, 1, 2, ...) and contains the text that was embedded together. The same block structure is kept and returned in the embedded_texts field of the embedding response. This is an example of a successful response:
{
  "request_id": "efb4d2f7-6a69-46b6-bb5b-615ae7fe66c7",
  "response_id": "e9e850a3-9b45-4c67-a846-cf88c4640eb9",
  "status_code": 200,
  "status": "success",
  "message": "Text successfully embedded.",
  "data": {
    "embedded_texts": [
      {
        "id": "0",
        "text": "Hello, World!",
        "embedding": [
          -0.041947972,
          -0.010223089,
          -0.011844206,
          ... (more values)
        ]
      }
    ]
  },
  "timestamp": "2024-11-20T15:25:32.290344Z"
}
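The sketch below shows one way to collect the vectors from such a response, keyed by block id. It assumes that the response is available as a Python dictionary and that embedded_texts is a sequence of block objects. The value returned by the client may be wrapped differently, so treat `extract_embeddings` as a hypothetical helper. The sample embedding is truncated to the three values shown above.

```python
from typing import Dict, List


def extract_embeddings(response: dict) -> Dict[str, List[float]]:
    """Map each block id to its embedding vector (hypothetical helper)."""
    blocks = response["data"]["embedded_texts"]
    return {block["id"]: block["embedding"] for block in blocks}


# Sample data shaped like the documented response, with a truncated vector.
sample_response = {
    "status_code": 200,
    "status": "success",
    "data": {
        "embedded_texts": [
            {
                "id": "0",
                "text": "Hello, World!",
                "embedding": [-0.041947972, -0.010223089, -0.011844206],
            }
        ]
    },
}

vectors = extract_embeddings(sample_response)
print(sorted(vectors))    # ['0']
print(len(vectors["0"]))  # 3
```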
3.1. Parameters:
The parameters for the embedding request are:
Parameter | Description | Example |
---|---|---|
text | The text to be embedded | "What is the capital of France?" |
model | Choice of the embedding model | ada-v2 or text-embedding-3-large |
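Checking the two parameters before sending a request can catch mistakes early. The following is a minimal sketch of such a check. The `build_embed_kwargs` helper is hypothetical, the model names are copied from the supported-models table, and rejecting blank text on the client side is an assumption rather than documented behavior.

```python
from typing import Dict

# Model names copied from the supported-models table.
SUPPORTED_MODELS = ("ada-v2", "text-embedding-3-large")


def build_embed_kwargs(text: str, model: str) -> Dict[str, str]:
    """Validate the request parameters and return them as keyword arguments."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("No text provided for embedding.")
    if model not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported model {model!r}; expected one of {SUPPORTED_MODELS}"
        )
    return {"text": text, "model": model}


kwargs = build_embed_kwargs("What is the capital of France?", "ada-v2")
print(kwargs)
```

The returned dictionary can then be passed to the client as `cosmos_client.llm_embed(**kwargs)`.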
4. Handle Errors
Common errors include:
- Missing API key
- No text provided
Example error response:
{
  "status_code": 400,
  "status": "error",
  "message": "Validation error",
  "error": {
    "error_code": "400",
    "error_message": "No text provided for embedding."
  }
}
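A response like this one can be turned into a Python exception so that failures are not silently ignored. The sketch below assumes the response is available as a dictionary with the fields shown above. Whether the client returns error responses or raises its own exceptions is not covered here, so `CosmosAPIError` and `raise_for_error` are hypothetical names introduced for this example.

```python
class CosmosAPIError(Exception):
    """Raised when a response reports an error (hypothetical, not a client class)."""

    def __init__(self, status_code, message, detail):
        super().__init__(f"{status_code} {message}: {detail}")
        self.status_code = status_code
        self.message = message
        self.detail = detail


def raise_for_error(response: dict) -> dict:
    """Return the response unchanged, or raise CosmosAPIError if it is an error."""
    if response.get("status") == "error":
        error = response.get("error", {})
        raise CosmosAPIError(
            response.get("status_code"),
            response.get("message", ""),
            error.get("error_message", ""),
        )
    return response


# Sample data copied from the example error response.
error_response = {
    "status_code": 400,
    "status": "error",
    "message": "Validation error",
    "error": {
        "error_code": "400",
        "error_message": "No text provided for embedding.",
    },
}

try:
    raise_for_error(error_response)
except CosmosAPIError as exc:
    print(exc)  # 400 Validation error: No text provided for embedding.
```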