API Usage
To perform requests with the LLMInspect API, use the following base URL:
https://{llminspect_domain}/v1
By default, all requests are routed to OpenAI. You can, however, switch between different models by changing the headers or model key. Ensure that you specify the correct model in the request body, as shown in the following example.
Accessing Private / Local LLMs
To access locally deployed models (e.g. EUNOMATIX’s InspectGPT), set the X-Client-Id header to InspectGPT. Use the appropriate model and key based on your local deployment. For example, if you are using Mistral as your local LLM, the model would be mistral-tiny.
Example header for local LLM requests:
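A minimal sketch in Python of the headers for a local-LLM request (the Bearer token scheme follows the curl examples later in this guide; {YOUR_ACCESS_TOKEN} is a placeholder):

```python
# Headers for routing a request to a locally deployed model (InspectGPT).
# {YOUR_ACCESS_TOKEN} is a placeholder for your LLMInspect API token.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer {YOUR_ACCESS_TOKEN}",
    "X-Client-Id": "InspectGPT",
}
print(headers["X-Client-Id"])  # InspectGPT
```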
Accessing OpenAI Models
By default, all requests to LLMInspect are routed to OpenAI, but to be explicit you can set the X-Client-Id header to OpenAI.
You can interact with any text-compatible OpenAI model such as:
gpt-4o
gpt-4o-mini
gpt-4-turbo
gpt-4
gpt-3.5-turbo
Here's an example of a request body for making a chat completion request:
{
  "messages": [
    {
      "role": "user",
      "content": "Hi, how are you?"
    }
  ],
  "stream": true,
  "model": "gpt-3.5-turbo",
  "temperature": 0.5,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "top_p": 1
}
- Stream Mode: If stream is set to true, the API will return the response in chunks.
- Non-Stream Mode: If stream is set to false, you will receive the full response at once.
Note: For more information on request formatting, you can refer to the OpenAI API documentation.
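In stream mode, an OpenAI-compatible endpoint typically emits Server-Sent Events of the form "data: {...}". The helper below is a sketch of how such chunks could be reassembled into the full message; the exact chunk format is an assumption based on the OpenAI chat-completions streaming convention:

```python
import json

# Hypothetical helper: concatenate the delta content from a list of SSE lines
# as emitted by an OpenAI-compatible streaming endpoint.
def collect_stream_content(sse_lines):
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Example chunks, shaped like an OpenAI-style streaming response.
chunks = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print(collect_stream_content(chunks))  # Hello, world
```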
Accessing OpenAI Image Models
By default, all requests to LLMInspect are routed to OpenAI, but to be explicit you can set the X-Client-Id header to OpenAI.
To create images, send a request to the /v1/images/generations endpoint.
You can access the OpenAI-supported image models, such as:
dall-e-3
dall-e-2
Here's an example of a request body for making an image generation request:
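For instance, a minimal body (using the same values as the curl example later in this guide):

```json
{
  "model": "dall-e-3",
  "prompt": "A cute baby sea otter",
  "n": 1,
  "size": "1024x1024"
}
```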
- Size: The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2. Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 models.
Note: For more information on request formatting, you can refer to the OpenAI Image Generation API Docs.
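The size constraint above can be checked client-side before sending a request. This is a sketch of such a validator (the helper name and error handling are illustrative, not part of the API):

```python
# Allowed sizes per image model, as listed above.
ALLOWED_SIZES = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
}

def validate_size(model, size):
    """Raise ValueError if `size` is not valid for `model`; return True otherwise."""
    allowed = ALLOWED_SIZES.get(model)
    if allowed is None:
        raise ValueError(f"unknown image model: {model}")
    if size not in allowed:
        raise ValueError(f"{size} is not a valid size for {model}")
    return True

print(validate_size("dall-e-3", "1792x1024"))  # True
```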
Accessing Google Gemini Models
To interact with Gemini models, modify the request by setting the X-Client-Id header to Gemini and use the correct Gemini model in the request body. If you are using a model-specific key, update the Authorization header accordingly.
Example header for Gemini requests:
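A minimal sketch in Python of the headers for a Gemini request ({YOUR_ACCESS_TOKEN} is a placeholder; replace it with your LLMInspect token or model-specific key):

```python
# Headers for routing a request to Google Gemini via LLMInspect.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer {YOUR_ACCESS_TOKEN}",
    "X-Client-Id": "Gemini",
}
print(headers["X-Client-Id"])  # Gemini
```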
Supported Gemini models include:
gemini-1.5-flash
gemini-1.5-flash-8b
gemini-1.5-pro
gemini-1.0-pro
Note: Vision models are not yet supported via the API.
Example Usage
Generating images with DALL·E 3
curl -X POST "https://{llminspect_domain}/v1/images/generations" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {YOUR_ACCESS_TOKEN}" \
-H "X-Client-Id: OpenAI" \
-d '{
  "model": "dall-e-3",
  "prompt": "A cute baby sea otter",
  "n": 1,
  "size": "1024x1024"
}'
Explanation:
- Endpoint: In https://{llminspect_domain}/v1/images/generations, replace {llminspect_domain} with the actual domain for your LLMInspect API.
- Authorization Header: Replace {YOUR_ACCESS_TOKEN} with your valid OpenAI Key or LLMInspect API Token for API authentication.
- X-Client-Id Header: Specifies OpenAI as the provider.
- Request Body: The JSON body contains the model, prompt, n, and size fields for image generation.
This command will send a request to generate one 1024x1024 image based on the given prompt. The response will provide you with the url of the image.
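The same request can be issued from Python. This is a sketch: the domain and token are placeholders, and the use of the third-party requests library is an assumption (any HTTP client works), so the network call is left commented out:

```python
import json

# Same payload and headers as the curl example above.
payload = {
    "model": "dall-e-3",
    "prompt": "A cute baby sea otter",
    "n": 1,
    "size": "1024x1024",
}
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer {YOUR_ACCESS_TOKEN}",  # placeholder token
    "X-Client-Id": "OpenAI",
}

# Uncomment to send the request (requires the `requests` package and real values):
# import requests
# resp = requests.post("https://{llminspect_domain}/v1/images/generations",
#                      headers=headers, data=json.dumps(payload))
# print(resp.json()["data"][0]["url"])

print(json.dumps(payload, indent=2))
```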
Generating Code With GPT-4o
To generate code with the gpt-4o model via the LLMInspect API, here's a curl command. It includes the X-Client-Id header to specify OpenAI, the Bearer token for authorization, and a structured prompt aligned for code generation.
curl -X POST "https://{llminspect_domain}/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {YOUR_ACCESS_TOKEN}" \
-H "X-Client-Id: OpenAI" \
-d '{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI coding assistant. Generate clean, efficient, and well-commented code as requested by the user."
    },
    {
      "role": "user",
      "content": "Write a Python function to find the factorial of a number using recursion."
    }
  ],
  "temperature": 0.3,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "top_p": 1
}'
Explanation:
- Endpoint: In https://{llminspect_domain}/v1/chat/completions, replace {llminspect_domain} with the actual domain for your LLMInspect API.
- Authorization Header: Replace {YOUR_ACCESS_TOKEN} with your OpenAI Key or LLMInspect API Token.
- X-Client-Id Header: Specifies OpenAI as the provider.
- Request Body:
  - System Message: Sets the context for the assistant to produce code-oriented responses.
  - User Message: Specifies the task, here asking for a Python function using recursion to calculate factorials.
  - Model Parameters: temperature, presence_penalty, frequency_penalty, and top_p are set to generate a balanced, consistent response.
This command will send a request to generate Python code based on the user's prompt, ensuring clarity and efficiency in the generated code.
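For reference, the code generated in response to the prompt above would typically resemble the following (the exact wording of the model's output will of course vary):

```python
def factorial(n: int) -> int:
    """Return n! computed recursively."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    if n <= 1:  # base case: 0! == 1! == 1
        return 1
    return n * factorial(n - 1)

print(factorial(5))  # 120
```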
API Access and LLMInspect Guardrails
LLMInspect offers powerful guardrails to ensure safe and secure interactions with language models. Based on the configurations set by your admin, your API requests will be subject to these guardrails. If a request violates any of them, it may be blocked.
Supported Guardrails
- DetectSecrets: This guardrail identifies and prevents the sharing of sensitive information, such as secrets or API keys, within a request.
- DetectPII: Blocks the transmission of personally identifiable information (PII).
- Sentiment Analysis: Monitors the sentiment or tone of messages and can flag or block negative or harmful content.
- DetectUnusualPrompt: Flags any unusual or potentially harmful prompt requests that could lead to dangerous or unwanted outputs from the models.
Important: These guardrails are configured by your organization’s admin, and certain requests may be blocked based on these settings. For more information on the guardrails you can access the Security Guide.