API Usage
To perform requests with the LLMInspect API, use the following base URL:
https://{llminspect_domain}/v1
By default, all requests are routed to OpenAI. You can, however, switch between different models by changing the headers or model key. Ensure that you specify the correct model in the request body, as shown in the following example.
Accessing Private / Local LLMs
To access locally deployed models (e.g. EUNOMATIX’s InspectGPT), set the X-Client-Id header to InspectGPT. Use the appropriate model and key based on your local deployment. For example, if you are using Mistral as your local LLM, the model would be mistral-tiny.
Example header for local LLM requests:
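A minimal sketch in Python of the headers for a local-LLM request (the Bearer token scheme follows the curl examples later in this guide; {YOUR_ACCESS_TOKEN} is a placeholder):

```python
# Headers for routing a request to a locally deployed model (InspectGPT).
# {YOUR_ACCESS_TOKEN} is a placeholder for your LLMInspect API token.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer {YOUR_ACCESS_TOKEN}",
    "X-Client-Id": "InspectGPT",
}
print(headers["X-Client-Id"])  # InspectGPT
```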
Accessing OpenAI Models
By default, all requests to LLMInspect are routed to OpenAI, but to be explicit you can set the X-Client-Id header to OpenAI.
You can interact with any text-compatible OpenAI model such as:
gpt-4o
gpt-4o-mini
gpt-4-turbo
gpt-4
gpt-3.5-turbo
Here's an example of a request body for making a chat completion request:
{
  "messages": [
    {
      "role": "user",
      "content": "Hi, how are you?"
    }
  ],
  "stream": true,
  "model": "gpt-3.5-turbo",
  "temperature": 0.5,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "top_p": 1
}
- Stream Mode: If stream is set to true, the API will return the response in chunks.
- Non-Stream Mode: If stream is set to false, you will receive the full response at once.
Note: For more information on request formatting, you can refer to the OpenAI API documentation.
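In stream mode, an OpenAI-compatible endpoint typically emits Server-Sent Events of the form "data: {...}". The helper below is a sketch of how such chunks could be reassembled into the full message; the exact chunk format is an assumption based on the OpenAI chat-completions streaming convention:

```python
import json

# Hypothetical helper: concatenate the delta content from a list of SSE lines
# as emitted by an OpenAI-compatible streaming endpoint.
def collect_stream_content(sse_lines):
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Example chunks, shaped like an OpenAI-style streaming response.
chunks = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print(collect_stream_content(chunks))  # Hello, world
```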
Accessing OpenAI Image Models
By default, all requests to LLMInspect are routed to OpenAI, but to be explicit you can set the X-Client-Id header to OpenAI.
To create images, send a request to the /v1/images/generations endpoint.
You can access the OpenAI-supported image models, such as:
dall-e-3
dall-e-2
Here's an example of a request body for making an image generation request:
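For instance, a minimal body (using the same values as the curl example later in this guide):

```json
{
  "model": "dall-e-3",
  "prompt": "A cute baby sea otter",
  "n": 1,
  "size": "1024x1024"
}
```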
- Size: The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2. Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 models.
Note: For more information on request formatting, you can refer to the OpenAI Image Generation API Docs.
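The size constraint above can be checked client-side before sending a request. This is a sketch of such a validator (the helper name and error handling are illustrative, not part of the API):

```python
# Allowed sizes per image model, as listed above.
ALLOWED_SIZES = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
}

def validate_size(model, size):
    """Raise ValueError if `size` is not valid for `model`; return True otherwise."""
    allowed = ALLOWED_SIZES.get(model)
    if allowed is None:
        raise ValueError(f"unknown image model: {model}")
    if size not in allowed:
        raise ValueError(f"{size} is not a valid size for {model}")
    return True

print(validate_size("dall-e-3", "1792x1024"))  # True
```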
Accessing Google Gemini Models
To interact with Gemini models, modify the request by setting the X-Client-Id header to Gemini and use the correct Gemini model in the request body. If you are using a model-specific key, update the Authorization header accordingly.
Example header for Gemini requests:
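A minimal sketch in Python of the headers for a Gemini request ({YOUR_ACCESS_TOKEN} is a placeholder; replace it with your LLMInspect token or model-specific key):

```python
# Headers for routing a request to Google Gemini via LLMInspect.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer {YOUR_ACCESS_TOKEN}",
    "X-Client-Id": "Gemini",
}
print(headers["X-Client-Id"])  # Gemini
```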
Supported Gemini models include:
gemini-1.5-flash
gemini-1.5-flash-8b
gemini-1.5-pro
gemini-1.0-pro
Note: Vision models are not yet supported via the API.
Example Usage
Generating images with DALL·E 3
curl -X POST "https://{llminspect_domain}/v1/images/generations" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {YOUR_ACCESS_TOKEN}" \
-H "X-Client-Id: OpenAI" \
-d '{
  "model": "dall-e-3",
  "prompt": "A cute baby sea otter",
  "n": 1,
  "size": "1024x1024"
}'
Explanation:
- Endpoint: In https://{llminspect_domain}/v1/images/generations, replace {llminspect_domain} with the actual domain for your LLMInspect API.
- Authorization Header: Replace {YOUR_ACCESS_TOKEN} with your valid OpenAI Key or LLMInspect API Token for API authentication.
- X-Client-Id Header: Specifies OpenAI as the provider.
- Request Body: The JSON body contains the model, prompt, n, and size fields for image generation.
This command will send a request to generate one 1024x1024 image based on the given prompt. The response will provide you with the url of the image.
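The same request can be issued from Python. This is a sketch: the domain and token are placeholders, and the use of the third-party requests library is an assumption (any HTTP client works), so the network call is left commented out:

```python
import json

# Same payload and headers as the curl example above.
payload = {
    "model": "dall-e-3",
    "prompt": "A cute baby sea otter",
    "n": 1,
    "size": "1024x1024",
}
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer {YOUR_ACCESS_TOKEN}",  # placeholder token
    "X-Client-Id": "OpenAI",
}

# Uncomment to send the request (requires the `requests` package and real values):
# import requests
# resp = requests.post("https://{llminspect_domain}/v1/images/generations",
#                      headers=headers, data=json.dumps(payload))
# print(resp.json()["data"][0]["url"])

print(json.dumps(payload, indent=2))
```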
Generating Code With GPT-4o
To generate code with the gpt-4o model via the LLMInspect API, here's a curl command. It includes the X-Client-Id header to specify OpenAI, the Bearer token for authorization, and a structured prompt aligned for code generation.
curl -X POST "https://{llminspect_domain}/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {YOUR_ACCESS_TOKEN}" \
-H "X-Client-Id: OpenAI" \
-d '{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI coding assistant. Generate clean, efficient, and well-commented code as requested by the user."
    },
    {
      "role": "user",
      "content": "Write a Python function to find the factorial of a number using recursion."
    }
  ],
  "temperature": 0.3,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "top_p": 1
}'
Explanation:
- Endpoint: In https://{llminspect_domain}/v1/chat/completions, replace {llminspect_domain} with the actual domain for your LLMInspect API.
- Authorization Header: Replace {YOUR_ACCESS_TOKEN} with your OpenAI Key or LLMInspect API Token.
- X-Client-Id Header: Specifies OpenAI as the provider.
- Request Body:
  - System Message: Sets the context for the assistant to produce code-oriented responses.
  - User Message: Specifies the task, here asking for a Python function using recursion to calculate factorials.
  - Model Parameters: temperature, presence_penalty, frequency_penalty, and top_p are set to generate a balanced, consistent response.
This command will send a request to generate Python code based on the user's prompt, ensuring clarity and efficiency in the generated code.
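For reference, the code generated in response to the prompt above would typically resemble the following (the exact wording of the model's output will of course vary):

```python
def factorial(n: int) -> int:
    """Return n! computed recursively."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    if n <= 1:  # base case: 0! == 1! == 1
        return 1
    return n * factorial(n - 1)

print(factorial(5))  # 120
```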
API Access and LLMInspect Guardrails
LLMInspect offers powerful guardrails to ensure safe and secure interactions with language models. Based on the configurations set by your admin, your API requests will be subject to these guardrails. If a request violates any of them, it may be blocked.
Supported Guardrails
- DetectSecrets: This guardrail identifies and prevents the sharing of sensitive information, such as secrets or API keys, within a request.
- DetectPII: Blocks the transmission of personally identifiable information (PII).
- Sentiment Analysis: Monitors the sentiment or tone of messages and can flag or block negative or harmful content.
- DetectUnusualPrompt: Flags any unusual or potentially harmful prompt requests that could lead to dangerous or unwanted outputs from the models.
Important: These guardrails are configured by your organization’s admin, and certain requests may be blocked based on these settings. For more information on the guardrails you can access the Security Guide.