Security Guide

About This Guide

This guide covers iGuard, InspectChat's safeguard system for protecting your organization's data and securing AI interactions. It explains each security measure and how it protects your communications.

Understanding iGuard

iGuard is InspectChat's built-in security system that acts as a protective layer between users and AI models. It automatically scans all communications for potential security risks, sensitive information, and policy violations.

Key Functions

Real-time message scanning
Automatic detection of sensitive information
Policy enforcement
Compliance monitoring
Immediate blocking of risky communications

Security Validations

1. Detect Secrets

Purpose: Prevents accidental sharing of sensitive credentials and keys.

What It Detects:

API Keys
Authentication Tokens
Passwords
SSH Keys
Database Connection Strings

Example Block Message: When the safeguard system detects secrets, the following message is displayed to the user: "🔐 -> Potential sensitive information detected: Please ensure you're not sharing any confidential data, passwords, or access keys."

Figure 1: Secret Detection Block Message
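
iGuard's actual detection rules are not documented in this guide. As a rough illustration of how a secrets check of this kind can work, the following minimal sketch matches a message against a few regular expressions. The pattern names, the patterns themselves, and the `find_secrets` helper are all assumptions made for this example.

```python
import re

# Illustrative patterns only; these are assumptions, not iGuard's actual rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |OPENSSH |EC )?PRIVATE KEY-----"),
    "db_connection_string": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb)://[^\s:@/]+:[^\s@/]+@\S+"
    ),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_secrets(message: str) -> list[str]:
    """Return the names of all patterns that match the message."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(message)]
```

A non-empty result would correspond to a failed validation. A production detector would need many more patterns than this sketch includes.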

2. Detect PII (Personally Identifiable Information)

Purpose: Protects personal and sensitive information from exposure.

What It Detects:

Personal Information

📝 Social Security Numbers (SSN)
💳 Credit Card Numbers
📧 Email Addresses
📱 Phone Numbers (International formats)
🏠 Physical Addresses
🛂 Passport Numbers
🚗 Driver's License Numbers
📅 Birth Dates
👤 Person Names (First, Middle, Last)

Financial Information

🏦 Bank Account Numbers
💰 IBAN Codes
💵 SWIFT Codes
💳 CVV Numbers

Medical Information

🏥 Medical License Numbers
📋 Medical Record Numbers

Location Information

📍 IP Addresses
📫 ZIP/Postal Codes
🌍 GPS Coordinates
🏢 Location Identifiers

Government Identifiers

🪪 National ID Numbers
🏛️ Government Official Numbers

Digital Identifiers

💻 MAC Addresses
🌐 URLs containing personal info
📱 Device IDs
🔑 Cryptocurrency Addresses

Professional Information

👔 Employee Numbers
🏢 Corporate Email Patterns

Cultural Identifiers

🌍 Nationality
🗣️ Ethnicity
⛪ Religious Identifiers

Example Block Message: When the safeguard system detects personally identifiable information, the following message is displayed to the user: "🔒 -> Personal information detected: For your privacy and security please avoid sharing sensitive information."

Example Warn Message:

3. Sentiment Analysis

Purpose: Maintains professional communication standards and prevents harmful content.

Monitors For:

Hostile Language
Inappropriate Content
Unprofessional Tone
Harassment
Discriminatory Language

Threshold Settings:

Low Risk (0.3): Minor unprofessional language
Medium Risk (0.6): Concerning tone or content
High Risk (0.8): Severe violations
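
To make the threshold values concrete, here is a minimal sketch of how a risk score between 0 and 1 could be mapped to the levels listed above and compared with a configured threshold. It assumes that each listed value is the lower bound of its level and that a message fails once its score reaches the threshold. The guide does not confirm either assumption, and the function names are hypothetical.

```python
# Risk levels from the Threshold Settings list, highest first.
RISK_LEVELS = [("high", 0.8), ("medium", 0.6), ("low", 0.3)]


def risk_level(score: float) -> str:
    """Map a risk score in [0, 1] to a named level (assumed semantics)."""
    for name, minimum in RISK_LEVELS:
        if score >= minimum:
            return name
    return "none"


def validation_fails(score: float, threshold: float) -> bool:
    """Assumption: a message fails once its score reaches the threshold."""
    return score >= threshold
```

Under these assumptions, a threshold of 0.5 would let low-risk messages scoring below 0.5 through and fail everything at or above it.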

Example Block Message

4. Unusual Prompt Detection

Purpose: Identifies potentially harmful or suspicious requests.

Monitors For:

Code Injection Attempts
Prompt Engineering Attacks
System Command Requests
Policy Violation Attempts

Example Block Message

5. DetectSafeUnsafePrompt

Purpose: Blocks prompts that attempt to request unsafe or inappropriate content across multiple models, ensuring adherence to ethical and legal guidelines.

Capabilities:
DetectSafeUnsafePrompt identifies and blocks unsafe or harmful prompts across 13 categories, for both text-based and image-based inputs. It ensures that communication and content generation comply with organizational policies, ethical standards, and regulatory requirements.

Example Block Message: "⚠️ Safety check failed: Your request contains potentially harmful, unsafe, or inappropriate content."

Text Prompt Example:
Image Prompt Example:

Note: For DetectSafeUnsafePrompt to function correctly, ensure that Llama Guard is deployed and its URL is correctly set in the .env file. Add the following line to your .env file:
LLAMA_GUARD_URL=http://localhost:8888
Adjust the URL as needed based on your deployment configuration. For detailed deployment instructions, please refer to the Llama Guard Deployment Guide.
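
A missing or malformed URL is easier to diagnose when it is caught at startup. The sketch below shows one way to read and sanity-check the setting in Python. It assumes the .env file has already been loaded into the process environment, since Python does not load .env files on its own. The `llama_guard_url` helper is hypothetical and not part of iGuard.

```python
import os
from urllib.parse import urlparse


def llama_guard_url(env=os.environ) -> str:
    """Return LLAMA_GUARD_URL, raising a clear error if it is missing or malformed."""
    url = env.get("LLAMA_GUARD_URL", "").strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise RuntimeError(
            "LLAMA_GUARD_URL must be set to an http(s) URL, e.g. http://localhost:8888"
        )
    # Drop a trailing slash so endpoint paths can be appended consistently.
    return url.rstrip("/")
```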

Configuring Safeguards

Administrators can customize iGuard settings:

Enable/Disable Validations: Control which checks are active.
Set Thresholds: Adjust sensitivity levels.
On Fail Actions: Define system responses (block or warn).

These settings live in declarative configuration files and take effect in real time, with no system restart, so administrators can adapt immediately to new policies or threats.
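
This guide does not describe how iGuard picks up configuration changes. One common way to apply file changes without a restart is to re-read the file whenever its modification time changes, as in the sketch below. The `ReloadingConfig` class is hypothetical, and `parse` stands in for whatever parser the deployment uses.

```python
import os


class ReloadingConfig:
    """Re-read a file whenever its modification time changes."""

    def __init__(self, path: str, parse=lambda text: text):
        self.path = path
        self.parse = parse  # e.g. a YAML parser's load function
        self._mtime = None
        self._value = None

    def get(self):
        """Return the parsed file contents, reloading if the file changed."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as handle:
                self._value = self.parse(handle.read())
            self._mtime = mtime
        return self._value
```

Calling `get()` before each validation run keeps the cost to one `stat` call when the file is unchanged.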

Declarative Configuration for iGuard

iGuard's configuration is defined in a human-readable YAML file. Changes to the file are applied in real time, so administrators can adjust validations on the fly.

Here's an example of how the configuration can be set:

validations:
  - name: DetectSecrets
    enabled: True
    models:
      - OpenAI
      - Gemini
    on_fail: block

  - name: DetectPII
    enabled: True
    models:
      - OpenAI
      - Gemini
    on_fail: block
    mode: permissive

  - name: Sentiment
    enabled: True
    models:
      - OpenAI
      - Gemini
    on_fail: block
    threshold: 0.5

  - name: DetectUnusualPrompt
    enabled: True
    models:
      - OpenAI
      - Gemini
    on_fail: block

  - name: DetectSafeUnsafePrompt
    enabled: True
    models:
      - OpenAI
      - Gemini
    on_fail: block
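
To illustrate how the `enabled`, `models`, and `on_fail` fields might be interpreted, the sketch below selects the validations that apply to a given model. It works on a plain dictionary shaped like the parsed result of the example above, so it needs no YAML library. The `active_validations` helper and its error handling are assumptions, not iGuard's actual implementation.

```python
# Mirrors what a YAML parser would return for part of the example configuration.
CONFIG = {
    "validations": [
        {"name": "DetectSecrets", "enabled": True,
         "models": ["OpenAI", "Gemini"], "on_fail": "block"},
        {"name": "Sentiment", "enabled": True,
         "models": ["OpenAI", "Gemini"], "on_fail": "block", "threshold": 0.5},
    ]
}

ALLOWED_ON_FAIL = {"block", "warn"}


def active_validations(config: dict, model: str) -> list[dict]:
    """Return enabled validations that apply to `model`, rejecting unknown on_fail values."""
    selected = []
    for entry in config.get("validations", []):
        if entry.get("on_fail") not in ALLOWED_ON_FAIL:
            raise ValueError(f"{entry.get('name')}: on_fail must be 'block' or 'warn'")
        if entry.get("enabled") and model in entry.get("models", []):
            selected.append(entry)
    return selected
```

Validating `on_fail` while selecting entries surfaces a typo in the configuration file at the point the file is read.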

Response Actions

The `on_fail` Parameter

Determines how the system responds when a validation fails:

Block: Stops the request and notifies the user.
Warn: Allows the request but issues a warning.

Block Mode

Immediately stops the message
Displays an error message to the user
Logs the incident

Warn Mode

Allows the message to proceed
Shows a warning message to the user
Logs the warning
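
The two modes can be summarized in a small dispatch function. This is a minimal sketch of the behavior described above, not iGuard's actual code. The `Outcome` type, the `handle_failure` function, and the logger name are hypothetical.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("iguard.sketch")  # hypothetical logger name


@dataclass
class Outcome:
    forwarded: bool    # True if the message continues to the AI model
    user_message: str  # text shown to the user


def handle_failure(validation: str, on_fail: str, user_message: str) -> Outcome:
    """Apply the block or warn behavior for a failed validation."""
    if on_fail == "block":
        logger.error("blocked by %s", validation)
        return Outcome(forwarded=False, user_message=user_message)
    if on_fail == "warn":
        logger.warning("warning from %s", validation)
        return Outcome(forwarded=True, user_message=user_message)
    raise ValueError(f"unknown on_fail value: {on_fail!r}")
```

In both modes the user sees a message and the event is logged. The only difference is whether the request is forwarded.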