LlamaStack Kubernetes Operator¶
The LlamaStack Kubernetes Operator provides a simple and efficient way to deploy and manage LlamaStack distributions in Kubernetes clusters.
Overview¶
LlamaStack is a comprehensive framework for building AI applications with Large Language Models (LLMs). This Kubernetes operator simplifies the deployment and management of LlamaStack distributions, providing:
- Easy Deployment: Deploy LlamaStack with a single Kubernetes resource
- Scalability: Automatically scale LlamaStack instances based on demand
- Storage Management: Persistent storage for models and data
- Configuration Management: Flexible configuration options for different use cases
- Monitoring: Built-in observability and health checks
Quick Start¶
Get started with the LlamaStack Operator in three steps:
- Install the Operator
- Deploy a LlamaStack Instance
- Apply the Configuration
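As a rough illustration of the second and third steps, the sketch below builds a LlamaStackDistribution manifest in Python and prints it as JSON, which `kubectl apply -f -` accepts as readily as YAML. The `apiVersion` value, the field names under `spec`, the distribution name `starter`, and the instance name `my-llamastack` are all assumptions made for this example; the API Reference is the authority on the actual schema. Installing the operator itself is covered in the Installation Guide and is not shown here.

```python
import json


def build_distribution(name, namespace="default", distribution="starter",
                       replicas=1, storage_size="20Gi"):
    """Return a LlamaStackDistribution manifest as a plain dict.

    The apiVersion and every field under `spec` are assumptions made for
    illustration; check them against the API Reference before use.
    """
    return {
        "apiVersion": "llamastack.io/v1alpha1",  # assumed group/version
        "kind": "LlamaStackDistribution",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": replicas,  # assumed field
            "server": {
                "distribution": {"name": distribution},  # assumed field
                "storage": {"size": storage_size},  # assumed field
            },
        },
    }


if __name__ == "__main__":
    # Serialize the manifest so it can be piped to `kubectl apply -f -`.
    print(json.dumps(build_distribution("my-llamastack"), indent=2))
```

Saved as a hypothetical `make_manifest.py`, the script could be used as `python make_manifest.py | kubectl apply -f -` once the operator is installed.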
Key Features¶
🚀 Simple Deployment¶
Deploy LlamaStack distributions with minimal configuration using Kubernetes-native resources.
📈 Auto-scaling¶
Automatically scale your LlamaStack instances based on resource utilization and demand.
💾 Persistent Storage¶
Built-in support for persistent storage to maintain models, cache, and application data.
🔧 Flexible Configuration¶
Support for multiple LlamaStack distributions and custom container images.
📊 Observability¶
Integrated monitoring, logging, and health checks for production deployments.
🔒 Security¶
Security best practices with RBAC, network policies, and secure defaults.
Architecture¶
graph TD
A[LlamaStackDistribution CRD] --> B[Operator Controller]
B --> C[Deployment]
B --> D[Service]
B --> E[ConfigMap]
B --> F[PersistentVolumeClaim]
C --> G[LlamaStack Pod 1]
C --> H[LlamaStack Pod 2]
C --> I[LlamaStack Pod N]
G --> J[Storage Volume]
H --> J
I --> J
D --> K[Load Balancer]
K --> L[External Access]
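The fan-out in the diagram, where one custom resource leads to a Deployment, a Service, a ConfigMap, and a PersistentVolumeClaim, can be modeled in a few lines of Python. This is a conceptual sketch only, not the operator's code: it assumes the controller follows the usual Kubernetes reconciliation pattern (compare desired resources with existing ones and create what is missing), and the child naming scheme is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildResource:
    """One Kubernetes object owned by a LlamaStackDistribution."""
    kind: str
    name: str


def children_for(cr_name):
    """List the resources the diagram shows for one custom resource.

    The "-service", "-config", and "-pvc" suffixes are hypothetical.
    """
    return [
        ChildResource("Deployment", cr_name),
        ChildResource("Service", f"{cr_name}-service"),
        ChildResource("ConfigMap", f"{cr_name}-config"),
        ChildResource("PersistentVolumeClaim", f"{cr_name}-pvc"),
    ]


def reconcile(cr_name, existing):
    """Return the desired children that are absent from `existing`.

    `existing` is a set of ChildResource objects already in the cluster.
    """
    return [c for c in children_for(cr_name) if c not in existing]


if __name__ == "__main__":
    # Pretend only the Deployment exists so far.
    already_there = {ChildResource("Deployment", "my-llamastack")}
    for missing in reconcile("my-llamastack", already_there):
        print(f"would create {missing.kind}/{missing.name}")
```

Running the reconcile step again after the missing objects exist returns an empty list, which is the idempotence that makes this pattern safe to repeat.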
Use Cases¶
Development and Testing¶
- Quick setup for development environments
- Testing different LlamaStack configurations
- Prototyping AI applications
Production Deployments¶
- Scalable LlamaStack deployments
- High availability configurations
- Enterprise-grade security and monitoring
Multi-tenant Environments¶
- Isolated LlamaStack instances per team
- Resource quotas and limits
- Namespace-based separation
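Namespace separation and quotas rely on core Kubernetes objects rather than anything specific to the operator. The sketch below generates a `Namespace` and a `ResourceQuota` for one team and wraps them in a `v1` `List` that `kubectl apply -f -` can consume. The `llamastack-<team>` naming convention, the quota name, and the default limits are illustrative assumptions, not recommendations from this project.

```python
import json


def tenant_manifests(team, cpu="8", memory="32Gi", pvcs="4"):
    """Build a Namespace and a ResourceQuota for one team.

    The namespace naming convention and default limits are hypothetical.
    """
    namespace = f"llamastack-{team}"
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            {
                "apiVersion": "v1",
                "kind": "Namespace",
                "metadata": {"name": namespace},
            },
            {
                "apiVersion": "v1",
                "kind": "ResourceQuota",
                "metadata": {"name": "llamastack-quota", "namespace": namespace},
                "spec": {
                    "hard": {
                        # Caps on the sum of requests across the namespace.
                        "requests.cpu": cpu,
                        "requests.memory": memory,
                        # Cap on the number of PersistentVolumeClaims.
                        "persistentvolumeclaims": pvcs,
                    }
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(tenant_manifests("research"), indent=2))
```

Once a quota on `requests.cpu` or `requests.memory` is active, pods in that namespace generally need explicit resource requests (or a `LimitRange` that supplies defaults), otherwise the API server may reject them.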
Getting Started¶
Ready to get started? Check out our comprehensive guides:
- Installation Guide: Install the operator in your cluster
- Quick Start Tutorial: Deploy your first LlamaStack instance
- Configuration Guide: Learn about configuration options
Documentation¶
- API Reference: Complete API documentation
- How-to Guides: Task-oriented guides
- Examples: Real-world configuration examples
- Contributing: Development and contribution guide
Community¶
- GitHub: llamastack/llama-stack-k8s-operator
- Issues: Report bugs and request features
- Discussions: Community discussions
License¶
This project is licensed under the Apache License 2.0. See the LICENSE file for details.