LLMInspect On-Premise Documentation

Welcome to the comprehensive documentation for LLMInspect On-Premise deployment. This documentation provides everything you need to deploy, configure, and manage your own secure AI platform within your organization's infrastructure.

🚀 Getting Started - Admin Guide

System administrators start here for deployment and infrastructure management.

Installation Guide - Complete deployment instructions and prerequisites
Installation Parameters - Configuration options and environment variables
Accessing LLMInspect - Service endpoints and default credentials
User Management - Keycloak configuration and user administration
Daily Operations - Maintenance, monitoring, and troubleshooting
Important Considerations - Security best practices and backup procedures

💬 Chat User Guide

End-users learn how to interact with AI models through the web interface.

Getting Started - Login process and system requirements
InspectChat GUI - Interface navigation and features
Chat Examples - Practical usage scenarios and best practices

🖥️ Desktop User Guide

Native desktop application setup and configuration.

Installation Options - Linux, macOS, and Windows applications
Backend Configuration - Connecting to your on-premise server
Troubleshooting - Common issues and solutions

🔌 API User Guide

Developers integrate LLMInspect into applications and workflows.

API Overview - Multi-LLM support and OpenAI compatibility
Authentication - API tokens and access management
API Usage - Endpoints, examples, and model switching
Reference - External documentation and resources

Platform Architecture

Your on-premise LLMInspect deployment includes these core components:

Component	Default Port	Purpose
Keycloak	4116	Authentication and user management
Grafana	5116	Monitoring and observability dashboard
InspectChat	6116	Web-based chat interface
Admin Panel	7116	Security configuration and GuardRails
API Gateway	8116	REST API for programmatic access

What is LLMInspect On-Premise?

LLMInspect On-Premise is a self-hosted AI safety and observability platform that provides:

🏢 Complete Data Control - All data remains within your infrastructure
🤖 Multi-Model AI Access - Support for OpenAI, Google Gemini, and local LLMs
🛡️ Enterprise Security - Advanced GuardRails and content filtering
📊 Full Observability - Comprehensive monitoring and audit capabilities
🔧 Flexible Deployment - Docker-based containerized architecture
🔐 Identity Integration - LDAP/Active Directory support via Keycloak

Deployment Requirements

Minimum System Requirements

CPU: 8+ cores
RAM: 32GB minimum
Storage: 100GB SSD
Network: HTTPS with valid certificates
Container Runtime: Docker Engine 20.10.x+ with Docker Compose 2.x+

Prerequisites

Valid SSL/TLS certificates (or auto-generated self-signed for testing)
Network access to required external services (if using cloud LLMs)
Local LLM deployment with at least 70B parameters for private AI

Common Deployment Journeys

For System Administrators

Review Prerequisites - Ensure system requirements are met
Deploy Platform - Run the installation script
Configure SSL - Set up secure communications
Create Users - Add team members
Set Up Monitoring - Configure health checks

For End Users

Access InspectChat - Log in to web interface
Select AI Model - Choose appropriate LLM
Start Chatting - Begin AI interactions
Upload Documents - Analyze files with AI
Generate Images - Create visual content

For Developers

Understand API - Learn OpenAI-compatible endpoints
Configure Authentication - Set up API access
Test Integration - Make first API calls
Switch Models - Access different LLM providers
Implement GuardRails - Understand security controls

Key Deployment Features

🔒 Self-Hosted Security

Complete Data Sovereignty - No data leaves your infrastructure
Custom GuardRails - Tailored content filtering and safety policies
Enterprise Authentication - Integration with existing identity systems
Audit Compliance - Complete interaction logging and reporting

🎯 AI Model Flexibility

Public Cloud Models - OpenAI GPT-4, Google Gemini integration
Private Local Models - Your own LLM deployments (InspectGPT)
Hybrid Configurations - Mix of public and private model access
Model Switching - Easy provider changes via headers or interface

📊 Comprehensive Monitoring

Real-time Dashboards - Grafana-powered observability
Usage Analytics - Track adoption, performance, and costs
Security Monitoring - GuardRail violations and threat detection
Performance Metrics - Response times and system health

Installation Quick Start

Basic Installation

# Extract deployment package
unzip llminspect-onpremise.zip
cd llminspect

# Run installer with auto-generated certificates
./llminspect-cli -p /full/path/to/llminspect -o install

Production Installation with Custom SSL

# Install with your own certificates
./llminspect-cli -p /full/path/to/llminspect -o install \
    --cert-file /path/to/certificate.crt \
    --cert-key /path/to/private.key \
    --domain your-domain.com

Access Your Deployment

After successful installation, access these services:

Service	URL	Default Credentials
InspectChat	`https://your-domain:6116`	`llminspect` / `llminspect_passw0rd`
Admin Panel	`https://your-domain:7116`	`llminspect` / `llminspect_passw0rd`
Grafana	`https://your-domain:5116`	`llminspect` / `llminspect_passw0rd`
Keycloak Admin	`https://your-domain:4116/admin`	`admin` / [your-set-password]
API Gateway	`https://your-domain:8116/v1/`	Bearer token required

⚠️ Security Note: Change default passwords immediately after first login.

Quick Reference by User Type

👨‍💼 System Administrators:

Installation Guide - Deploy the platform
User Management - Manage team access
Daily Operations - Maintain the system
Security Best Practices - Protect your deployment

👤 End Users:

Getting Started - First login and setup
Chat Interface - Use the web application
Desktop App - Install native applications

👨‍💻 Developers:

API Overview - Understand the API architecture
Authentication - Get API access
Usage Examples - Implementation examples

📊 Analysts & Compliance:

Monitoring Dashboard - System health
Audit Capabilities - Compliance reporting
Usage Analytics - Adoption insights

Most Common Tasks

Task	Documentation	Estimated Time
Complete platform installation	Installation Guide	30-60 minutes
First AI conversation	Text Generation	2 minutes
Set up desktop application	Desktop Setup	5 minutes
Make first API call	API Usage	10 minutes
Add new user account	User Management	3 minutes
Upload and analyze document	Document Upload	5 minutes
Configure LDAP integration	LDAP Setup	15 minutes

Support & Troubleshooting

Self-Service Resources

System Logs: Check Docker container logs for issues
Health Checks: Use built-in monitoring endpoints
Configuration Validation: Verify environment variables and certificates

Common Issues

Installation Failures: Use the clean operation before retrying
Authentication Problems: Check Keycloak configuration and user credentials
API Connection Issues: Verify network connectivity and port accessibility
Desktop App Issues: Confirm backend URL configuration and server status

Security Considerations

Data Protection

Encryption in Transit: HTTPS for all communications
Encryption at Rest: Database and file system protection
Access Control: Role-based permissions via Keycloak
Audit Logging: Complete interaction tracking

Network Security

Firewall Configuration: Restrict access to necessary ports only
Certificate Management: Regular SSL/TLS certificate renewal
VPN Integration: Secure remote access capabilities
Network Segmentation: Isolate AI workloads from other systems

Next Steps

Choose your deployment path:

🔧 I'm setting up the platform → Start with Admin Guide

💬 I want to use the chat interface → Go to Chat User Guide

🖥️ I need the desktop application → Check Desktop User Guide

🔌 I'm integrating via API → Visit API User Guide

🔍 I need specific information → Use the navigation menu or search function above