LlamaStack Kubernetes Operator¶
The LlamaStack Kubernetes Operator provides a simple and efficient way to deploy and manage LlamaStack distributions in Kubernetes clusters.
Overview¶
LlamaStack is a comprehensive framework for building AI applications with Large Language Models (LLMs). This Kubernetes operator simplifies the deployment and management of LlamaStack distributions, providing:
- Easy Deployment: Deploy LlamaStack with a single Kubernetes resource
- Scalability: Automatically scale LlamaStack instances based on demand
- Storage Management: Persistent storage for models and data
- Configuration Management: Flexible configuration options for different use cases
- Monitoring: Built-in observability and health checks
Quick Start¶
Get started with the LlamaStack Operator in just a few steps:
- Install the Operator
- Deploy a LlamaStack Instance
- Apply the Configuration
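As a rough illustration of the second and third steps, a LlamaStack instance is described by a single `LlamaStackDistribution` resource (the kind shown in the Architecture diagram). The manifest below is a minimal sketch, not an authoritative schema: the API group and version (`llamastack.io/v1alpha1`), the field names under `spec`, the `ollama` distribution name, the port, and the storage values are all assumptions made for illustration, so check them against the API Reference for the operator version you install.

```yaml
# Sketch of a LlamaStackDistribution resource.
# apiVersion, field names, and values are assumptions; verify them
# against the operator's API Reference before use.
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: llamastack-sample        # hypothetical resource name
spec:
  replicas: 1                    # number of LlamaStack pods
  server:
    distribution:
      name: ollama               # assumed distribution name
    containerSpec:
      port: 8321                 # assumed server port
    storage:
      size: "20Gi"               # assumed size of the persistent volume
      mountPath: "/home/lls/.lls"  # assumed mount path inside the pod
```

Assuming the manifest is saved as `llamastack.yaml` (a hypothetical file name), `kubectl apply -f llamastack.yaml` submits it to the cluster, after which the operator is expected to create the Deployment, Service, and storage resources for it.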
Key Features¶
🚀 Simple Deployment¶
Deploy LlamaStack distributions with minimal configuration using Kubernetes-native resources.
📈 Auto-scaling¶
Automatically scale your LlamaStack instances based on resource utilization and demand.
💾 Persistent Storage¶
Built-in support for persistent storage to maintain models, cache, and application data.
🔧 Flexible Configuration¶
Support for multiple LlamaStack distributions and custom container images.
📊 Observability¶
Integrated monitoring, logging, and health checks for production deployments.
🔒 Security¶
Security best practices with RBAC, network policies, and secure defaults.
Architecture¶
graph TD
    A[LlamaStackDistribution CRD] --> B[Operator Controller]
    B --> C[Deployment]
    B --> D[Service]
    B --> E[ConfigMap]
    B --> F[PersistentVolumeClaim]
    C --> G[LlamaStack Pod 1]
    C --> H[LlamaStack Pod 2]
    C --> I[LlamaStack Pod N]
    G --> J[Storage Volume]
    H --> J
    I --> J
    D --> K[Load Balancer]
    K --> L[External Access]
Use Cases¶
Development and Testing¶
- Quick setup for development environments
- Testing different LlamaStack configurations
- Prototyping AI applications
Production Deployments¶
- Scalable LlamaStack deployments
- High availability configurations
- Enterprise-grade security and monitoring
Multi-tenant Environments¶
- Isolated LlamaStack instances per team
- Resource quotas and limits
- Namespace-based separation
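One way to combine namespace-based separation with resource quotas is to give each team its own namespace and attach a standard Kubernetes `ResourceQuota` to it. The sketch below uses only core Kubernetes resources; the namespace name `team-a`, the quota name, and all the numeric limits are hypothetical values chosen for illustration, and nothing here is specific to the operator.

```yaml
# Hypothetical per-team namespace; the name is illustrative.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
# Caps the total resources that all pods and volume claims in the
# team-a namespace may request. All values are example figures.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 32Gi
    limits.cpu: "16"
    limits.memory: 64Gi
    persistentvolumeclaims: "4"   # maximum number of PVCs in the namespace
    requests.storage: 100Gi       # total storage requested across all PVCs
```

A LlamaStack instance created in `team-a` would then count against this quota. Note that once a quota covers CPU or memory, Kubernetes rejects pods in that namespace that do not declare the corresponding requests and limits, so the instance's pods need them set (or a `LimitRange` supplying defaults).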
Getting Started¶
Ready to get started? Check out our comprehensive guides:
- Installation Guide - Install the operator in your cluster
- Quick Start Tutorial - Deploy your first LlamaStack instance
- Configuration Guide - Learn about configuration options
Documentation¶
- API Reference - Complete API documentation
- How-to Guides - Task-oriented guides
- Examples - Real-world configuration examples
- Contributing - Development and contribution guide
Community¶
- GitHub: llamastack/llama-stack-k8s-operator
- Issues: Report bugs and request features
- Discussions: Community discussions
License¶
This project is licensed under the Apache License 2.0. See the LICENSE file for details.