LLMInspect API User Guide
About This Guide
Welcome to the LLMInspect API User Guide. This guide is designed to help users of all levels interact seamlessly with various language models using the LLMInspect API. With LLMInspect API, you can connect to multiple LLM providers, including OpenAI, Gemini, and your locally deployed InspectGPT (Local LLM). It is compatible with the OpenAI API format, making it easy for users familiar with OpenAI's API to get started quickly.
Key Features
- Multi-LLM Support: Connect to multiple language model providers such as OpenAI, Gemini, and Local LLMs like InspectGPT.
- OpenAI API Compatibility: LLMInspect follows the OpenAI Chat Completions format, allowing for a smooth transition if you're already using OpenAI's API.
- Supports Chat Completions and Image Generation: Interact with language models for chat completions and generate images using supported models.
By default, all requests sent through the LLMInspect API are directed to OpenAI. To interact with Gemini or InspectGPT, you can specify the desired provider by using the appropriate headers, which will be explained in detail later in this guide.
Authentication
To use the LLMInspect API, proper authentication is required. Your API access can be authenticated in two different ways, described below:
1. Using Your Own Subscription Key
You can use your own subscription key issued by public model providers (e.g., OpenAI, Gemini, etc.). Include the key in the HTTP request headers using the following format:
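Since LLMInspect follows the OpenAI API format, provider keys are typically passed as a Bearer token; the exact header shape below is an assumption based on that convention:

```
Authorization: Bearer {YOUR_PROVIDER_API_KEY}
```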
Obtaining an OpenAI API Key
To obtain an OpenAI API key, visit the OpenAI API Keys page. The OpenAI API key typically has the following format:
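OpenAI keys begin with an `sk-` prefix; the placeholder below is illustrative only (the exact length and structure vary by account type):

```
sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```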
Obtaining a Gemini API Key
To obtain a Gemini API key, visit the Gemini API documentation. The Gemini API key usually has the following format:
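Gemini keys are standard Google API keys and generally start with an `AIza` prefix; the placeholder below is illustrative only:

```
AIzaxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```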
Local LLM Key
For accessing a local LLM like InspectGPT, you may need a specific key depending on your deployment configuration. Please contact your system administrator for details on obtaining your local LLM key and its format.
2. Using LLMInspect API Token
Alternatively, you can use an API token issued by the LLMInspect authentication service to access both public and private model providers. Include the token in the HTTP request headers using the following format:
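Following the Bearer-token convention used in the curl examples later in this guide, the header looks like:

```
Authorization: Bearer {YOUR_ACCESS_TOKEN}
```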
Using the LLMInspect API token allows for secure interaction with the API across various models without needing individual keys from each provider.
For Admin: Obtaining LLMInspect API Token
Admins can generate LLMInspect API tokens and distribute them to employees, giving them seamless access to the API across all models.
Use the following `curl` command to request a token, replacing the placeholder values with your organization's credentials:
curl -X POST "https://your_domain/realms/InspectChat/protocol/openid-connect/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "client_id=your_client_id" \
-d "client_secret=your_client_secret" \
-d "username=your_username" \
-d "password=your_password" \
-d "grant_type=password"
On success, the server returns a JSON response containing an `access_token`, along with other important fields:
{
"access_token": "x.x.x",
"expires_in": 300,
"refresh_expires_in": 1800,
"refresh_token": "x.x.x",
"token_type": "Bearer",
"scope": "profile email"
}
Explanation of Key Fields
- `access_token`: The main token used for authenticating API requests.
- `expires_in`: The duration (in seconds) until the `access_token` expires. In this example, the token is valid for 300 seconds (5 minutes).
- `refresh_expires_in`: The duration (in seconds) until the `refresh_token` expires, allowing token renewal without reauthentication.
- `refresh_token`: Used to renew the `access_token`, avoiding the need for a full reauthentication.
- `token_type`: Indicates the type of token, generally `Bearer`.
- `scope`: Lists the authorized scopes for this token, such as `profile` and `email` access.
Note for Admins: The `refresh_token` can be used to renew the `access_token` before expiration.
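Client-side, the relative `expires_in` values are easiest to work with as absolute timestamps. A minimal Python sketch of this bookkeeping (the function names are illustrative, not part of the LLMInspect API):

```python
import time
from typing import Optional


def parse_token_response(response: dict, now: Optional[float] = None) -> dict:
    """Extract tokens from the token-endpoint JSON and compute
    absolute expiry times from the relative expires_in fields."""
    now = time.time() if now is None else now
    return {
        "access_token": response["access_token"],
        "refresh_token": response["refresh_token"],
        "access_expires_at": now + response["expires_in"],
        "refresh_expires_at": now + response["refresh_expires_in"],
    }


def needs_refresh(token: dict, now: Optional[float] = None,
                  margin: float = 30.0) -> bool:
    """True when the access token is within `margin` seconds of expiry,
    i.e. it is time to use the refresh_token."""
    now = time.time() if now is None else now
    return now >= token["access_expires_at"] - margin
```

With the sample response above (`expires_in: 300`), a token parsed at time `t` would report `needs_refresh` as true from `t + 270` seconds onward.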
API Usage
To perform requests with the LLMInspect API, use the base URL `https://{llminspect_domain}/v1`, replacing `{llminspect_domain}` with your deployment's domain.
By default, all requests are routed to OpenAI. You can, however, switch between different models by changing the headers or model key. Ensure that you specify the correct model in the request body, as shown in the following example.
Accessing Private / Local LLMs
To access locally deployed models (e.g. EUNOMATIX’s InspectGPT), set the `X-Client-Id` header to `InspectGPT`. Use the appropriate model and key based on your local deployment. For example, if you are using Mistral as your local LLM, the model would be `mistral-tiny`.
Example header for local LLM requests:
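A plausible header set, assuming a Bearer token as in the other examples in this guide (the exact key depends on your deployment):

```
Authorization: Bearer {YOUR_LOCAL_LLM_KEY}
X-Client-Id: InspectGPT
```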
Accessing OpenAI Models
All requests to LLMInspect are routed to OpenAI by default, but you can be explicit by setting the `X-Client-Id` header to `OpenAI`.
You can interact with any text-compatible OpenAI model such as:
- `gpt-4o`
- `gpt-4o-mini`
- `gpt-4-turbo`
- `gpt-4`
- `gpt-3.5-turbo`
Here's an example of a request body for making a chat completion request:
{
"messages": [
{
"role": "user",
"content": "Hi, how are you?"
}
],
"stream": true,
"model": "gpt-3.5-turbo",
"temperature": 0.5,
"presence_penalty": 0,
"frequency_penalty": 0,
"top_p": 1
}
- Stream Mode: If `stream` is set to `true`, the API will return the response in chunks.
- Non-Stream Mode: If `stream` is set to `false`, you will receive the full response at once.
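In stream mode the API follows the OpenAI server-sent-events convention: each chunk arrives as a `data: {...}` line carrying an incremental `delta`, and the stream ends with a `data: [DONE]` sentinel. A minimal sketch of accumulating the streamed text, assuming that chunk format:

```python
import json


def collect_stream(lines):
    """Accumulate assistant text from OpenAI-style SSE chunk lines."""
    text = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        # Each chunk carries an incremental "delta" with optional content.
        delta = chunk["choices"][0]["delta"]
        text.append(delta.get("content", ""))
    return "".join(text)
```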
Note: For more information on request formatting, you can refer to the OpenAI API documentation.
Accessing OpenAI Image Models
All requests to LLMInspect are routed to OpenAI by default, but you can be explicit by setting the `X-Client-Id` header to `OpenAI`.
To create images, make requests to the `https://{llminspect_domain}/v1/images/generations` endpoint.
You can access OpenAI-supported image models, such as:
- `dall-e-3`
- `dall-e-2`
Here's an example of a request body for making an image generation request:
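The body mirrors the one used in the curl example later in this guide:

```json
{
  "model": "dall-e-3",
  "prompt": "A cute baby sea otter",
  "n": 1,
  "size": "1024x1024"
}
```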
- Size: The size of the generated images. Must be one of `256x256`, `512x512`, or `1024x1024` for `dall-e-2`, and one of `1024x1024`, `1792x1024`, or `1024x1792` for `dall-e-3` models.
Note: For more information on request formatting, you can refer to the OpenAI Image Generation API Docs.
Accessing Google Gemini Models
To interact with Gemini models, modify the request by setting the `X-Client-Id` header to `Gemini` and use the correct Gemini model in the request body. If you are using a model-specific key, update the `Authorization` header accordingly.
Example header for Gemini requests:
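A plausible header set, following the Bearer-token convention used elsewhere in this guide:

```
Authorization: Bearer {YOUR_GEMINI_KEY_OR_ACCESS_TOKEN}
X-Client-Id: Gemini
```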
Supported Gemini models include:
- `gemini-1.5-flash`
- `gemini-1.5-flash-8b`
- `gemini-1.5-pro`
- `gemini-1.0-pro`
Note: Vision models are not yet supported via the API.
Example Usage
Generating images with Dall-E 3
curl -X POST "https://{llminspect_domain}/v1/images/generations" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {YOUR_ACCESS_TOKEN}" \
-H "X-Client-Id: OpenAI" \
-d '{
"model": "dall-e-3",
"prompt": "A cute baby sea otter",
"n": 1,
"size": "1024x1024"
}'
Explanation:
- `https://{llminspect_domain}/v1/images/generations`: Replace `{llminspect_domain}` with the actual domain for your LLMInspect API.
- Authorization Header: Replace `{YOUR_ACCESS_TOKEN}` with your valid OpenAI key or LLMInspect API token for API authentication.
- X-Client-Id Header: Specifies OpenAI as the provider.
- Request Body: The JSON body contains the `model`, `prompt`, `n`, and `size` fields for image generation.
This command will send a request to generate one 1024x1024 image based on the given prompt. The response will provide you with the `url` of the image.
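The same request can be assembled in Python. This sketch only builds the URL, headers, and payload; actually sending it (e.g. with `requests.post`) is left commented out, since the domain and token here are placeholders:

```python
def build_image_request(domain: str, token: str, prompt: str,
                        model: str = "dall-e-3",
                        size: str = "1024x1024", n: int = 1):
    """Assemble the URL, headers, and JSON payload for an
    LLMInspect image-generation request."""
    url = f"https://{domain}/v1/images/generations"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
        "X-Client-Id": "OpenAI",  # route the request to OpenAI
    }
    payload = {"model": model, "prompt": prompt, "n": n, "size": size}
    # To actually send the request:
    # import requests
    # response = requests.post(url, headers=headers, json=payload)
    return url, headers, payload
```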
Generating Code With GPT-4o
For generating code with the `gpt-4o` model via the LLMInspect API, here's a `curl` command. It includes the `X-Client-Id` header to specify OpenAI, the Bearer token for authorization, and a structured prompt aligned for code generation.
curl -X POST "https://{llminspect_domain}/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {YOUR_ACCESS_TOKEN}" \
-H "X-Client-Id: OpenAI" \
-d '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are an AI coding assistant. Generate clean, efficient, and well-commented code as requested by the user."
},
{
"role": "user",
"content": "Write a Python function to find the factorial of a number using recursion."
}
],
"temperature": 0.3,
"presence_penalty": 0,
"frequency_penalty": 0,
"top_p": 1
}'
Explanation:
- `https://{llminspect_domain}/v1/chat/completions`: Replace `{llminspect_domain}` with the actual domain for your LLMInspect API.
- Authorization Header: Replace `{YOUR_ACCESS_TOKEN}` with your OpenAI key or LLMInspect API token.
- X-Client-Id Header: Specifies OpenAI as the provider.
- Request Body:
  - System Message: Sets the context for the assistant to produce code-oriented responses.
  - User Message: Specifies the task, here asking for a Python function using recursion to calculate factorials.
  - Model Parameters: `temperature`, `presence_penalty`, `frequency_penalty`, and `top_p` values are set to generate a balanced, consistent response.
This command will send a request to generate Python code based on the user's prompt, ensuring clarity and efficiency in the generated code.
API Access and LLMInspect Guardrails
LLMInspect offers powerful guardrails to ensure safe and secure interactions with language models. Based on the configurations set by your admin, your API requests will be subject to these guardrails. If a request violates any of them, it may be blocked.
Supported Guardrails
- DetectSecrets: This guardrail identifies and prevents the sharing of sensitive information, such as secrets or API keys, within a request.
- DetectPII: Blocks the transmission of personally identifiable information (PII).
- Sentiment Analysis: Monitors the sentiment or tone of messages and can flag or block negative or harmful content.
- DetectUnusualPrompt: Flags any unusual or potentially harmful prompt requests that could lead to dangerous or unwanted outputs from the models.
Important: These guardrails are configured by your organization’s admin, and certain requests may be blocked based on these settings. For more information on the guardrails, see the Security Guide.
REFERENCE
- OpenAI API Key Management: OpenAI API Keys page
- Gemini API Key Documentation: Gemini API documentation
- OpenAI Chat API Reference: OpenAI Chat API Documentation
- OpenAI Image Generation API Reference: OpenAI Image API Documentation