LLMInspect On-Premise Documentation
Welcome to the comprehensive documentation for LLMInspect On-Premise deployment. This documentation provides everything you need to deploy, configure, and manage your own secure AI platform within your organization's infrastructure.
Quick Navigation
🚀 Getting Started - Admin Guide
System administrators start here for deployment and infrastructure management.
- Installation Guide - Complete deployment instructions and prerequisites
- Installation Parameters - Configuration options and environment variables
- Accessing LLMInspect - Service endpoints and default credentials
- User Management - Keycloak configuration and user administration
- Daily Operations - Maintenance, monitoring, and troubleshooting
- Important Considerations - Security best practices and backup procedures
💬 Chat User Guide
End-users learn how to interact with AI models through the web interface.
- Getting Started - Login process and system requirements
- InspectChat GUI - Interface navigation and features
- Chat Examples - Practical usage scenarios and best practices
🖥️ Desktop User Guide
Native desktop application setup and configuration.
- Installation Options - Linux, macOS, and Windows applications
- Backend Configuration - Connecting to your on-premise server
- Troubleshooting - Common issues and solutions
🔌 API User Guide
Developers integrate LLMInspect into applications and workflows.
- API Overview - Multi-LLM support and OpenAI compatibility
- Authentication - API tokens and access management
- API Usage - Endpoints, examples, and model switching
- Reference - External documentation and resources
Platform Architecture
Your on-premise LLMInspect deployment includes these core components:
Component | Default Port | Purpose |
---|---|---|
Keycloak | 4116 | Authentication and user management |
Grafana | 5116 | Monitoring and observability dashboard |
InspectChat | 6116 | Web-based chat interface |
Admin Panel | 7116 | Security configuration and GuardRails |
API Gateway | 8116 | REST API for programmatic access |
What is LLMInspect On-Premise?
LLMInspect On-Premise is a self-hosted AI safety and observability platform that provides:
- 🏢 Complete Data Control - All data remains within your infrastructure
- 🤖 Multi-Model AI Access - Support for OpenAI, Google Gemini, and local LLMs
- 🛡️ Enterprise Security - Advanced GuardRails and content filtering
- 📊 Full Observability - Comprehensive monitoring and audit capabilities
- 🔧 Flexible Deployment - Docker-based containerized architecture
- 🔐 Identity Integration - LDAP/Active Directory support via Keycloak
Deployment Requirements
Minimum System Requirements
- CPU: 8+ cores
- RAM: 32GB minimum
- Storage: 100GB SSD
- Network: HTTPS with valid certificates
- Container Runtime: Docker Engine 20.10.x+ with Docker Compose 2.x+
Prerequisites
- Valid SSL/TLS certificates (or auto-generated self-signed for testing)
- Network access to required external services (if using cloud LLMs)
- Local LLM deployment with at least 70B parameters for private AI
Common Deployment Journeys
For System Administrators
- Review Prerequisites - Ensure system requirements are met
- Deploy Platform - Run the installation script
- Configure SSL - Set up secure communications
- Create Users - Add team members
- Set Up Monitoring - Configure health checks
For End Users
- Access InspectChat - Log in to web interface
- Select AI Model - Choose appropriate LLM
- Start Chatting - Begin AI interactions
- Upload Documents - Analyze files with AI
- Generate Images - Create visual content
For Developers
- Understand API - Learn OpenAI-compatible endpoints
- Configure Authentication - Set up API access
- Test Integration - Make first API calls
- Switch Models - Access different LLM providers
- Implement GuardRails - Understand security controls
Key Deployment Features
🔒 Self-Hosted Security
- Complete Data Sovereignty - No data leaves your infrastructure
- Custom GuardRails - Tailored content filtering and safety policies
- Enterprise Authentication - Integration with existing identity systems
- Audit Compliance - Complete interaction logging and reporting
🎯 AI Model Flexibility
- Public Cloud Models - OpenAI GPT-4, Google Gemini integration
- Private Local Models - Your own LLM deployments (InspectGPT)
- Hybrid Configurations - Mix of public and private model access
- Model Switching - Easy provider changes via headers or interface
📊 Comprehensive Monitoring
- Real-time Dashboards - Grafana-powered observability
- Usage Analytics - Track adoption, performance, and costs
- Security Monitoring - GuardRail violations and threat detection
- Performance Metrics - Response times and system health
Installation Quick Start
Basic Installation
# Extract deployment package
unzip llminspect-onpremise.zip
cd llminspect
# Run installer with auto-generated certificates
./llminspect-cli -p /full/path/to/llminspect -o install
Production Installation with Custom SSL
# Install with your own certificates
./llminspect-cli -p /full/path/to/llminspect -o install \
--cert-file /path/to/certificate.crt \
--cert-key /path/to/private.key \
--domain your-domain.com
Access Your Deployment
After successful installation, access these services:
Service | URL | Default Credentials |
---|---|---|
InspectChat | https://your-domain:6116 |
llminspect / llminspect_passw0rd |
Admin Panel | https://your-domain:7116 |
llminspect / llminspect_passw0rd |
Grafana | https://your-domain:5116 |
llminspect / llminspect_passw0rd |
Keycloak Admin | https://your-domain:4116/admin |
admin / [your-set-password] |
API Gateway | https://your-domain:8116/v1/ |
Bearer token required |
⚠️ Security Note: Change default passwords immediately after first login.
Quick Reference by User Type
👨💼 System Administrators:
- Installation Guide - Deploy the platform
- User Management - Manage team access
- Daily Operations - Maintain the system
- Security Best Practices - Protect your deployment
👤 End Users:
- Getting Started - First login and setup
- Chat Interface - Use the web application
- Desktop App - Install native applications
👨💻 Developers:
- API Overview - Understand the API architecture
- Authentication - Get API access
- Usage Examples - Implementation examples
📊 Analysts & Compliance:
- Monitoring Dashboard - System health
- Audit Capabilities - Compliance reporting
- Usage Analytics - Adoption insights
Most Common Tasks
Task | Documentation | Estimated Time |
---|---|---|
Complete platform installation | Installation Guide | 30-60 minutes |
First AI conversation | Text Generation | 2 minutes |
Set up desktop application | Desktop Setup | 5 minutes |
Make first API call | API Usage | 10 minutes |
Add new user account | User Management | 3 minutes |
Upload and analyze document | Document Upload | 5 minutes |
Configure LDAP integration | LDAP Setup | 15 minutes |
Support & Troubleshooting
Self-Service Resources
- System Logs: Check Docker container logs for issues
- Health Checks: Use built-in monitoring endpoints
- Configuration Validation: Verify environment variables and certificates
Common Issues
- Installation Failures: Use the clean operation before retrying
- Authentication Problems: Check Keycloak configuration and user credentials
- API Connection Issues: Verify network connectivity and port accessibility
- Desktop App Issues: Confirm backend URL configuration and server status
Security Considerations
Data Protection
- Encryption in Transit: HTTPS for all communications
- Encryption at Rest: Database and file system protection
- Access Control: Role-based permissions via Keycloak
- Audit Logging: Complete interaction tracking
Network Security
- Firewall Configuration: Restrict access to necessary ports only
- Certificate Management: Regular SSL/TLS certificate renewal
- VPN Integration: Secure remote access capabilities
- Network Segmentation: Isolate AI workloads from other systems
Next Steps
Choose your deployment path:
🔧 I'm setting up the platform → Start with Admin Guide
💬 I want to use the chat interface → Go to Chat User Guide
🖥️ I need the desktop application → Check Desktop User Guide
🔌 I'm integrating via API → Visit API User Guide
🔍 I need specific information → Use the navigation menu or search function above