# Understanding LlamaStack Distributions
This guide explains the different ways to deploy LlamaStack using the Kubernetes operator, focusing on the distinction between Supported Distributions and Bring-Your-Own (BYO) Distributions.
## Distribution Types Overview
The LlamaStack Kubernetes Operator supports two main approaches for deploying LlamaStack:
### 🎯 Supported Distributions (Recommended)

Pre-configured, tested distributions maintained by the LlamaStack team with specific provider integrations.

### 🛠️ Bring-Your-Own (BYO) Distributions

Custom container images that you build and maintain yourself.
## Supported Distributions

### What are Supported Distributions?

Supported distributions are pre-built, tested container images that include:

- ✅ Specific provider integrations (Ollama, vLLM, NVIDIA, etc.)
- ✅ Optimized configurations for each provider
- ✅ Tested compatibility with the operator
- ✅ Regular updates and security patches
- ✅ Documentation and examples

### Available Pre-Built Distributions

#### Self-Hosted Distributions
These require you to provide the underlying infrastructure:

| Distribution | Provider | Use Case | Infrastructure Required |
|---|---|---|---|
| `ollama` | Ollama | Local inference with Ollama server | Ollama server |
| `vllm-gpu` | vLLM | High-performance GPU inference | GPU infrastructure |
| `tgi` | Text Generation Inference | Hugging Face TGI | TGI server setup |
| `nvidia` | NVIDIA | NVIDIA AI services | NVIDIA infrastructure |
| `remote-vllm` | Remote vLLM | Remote vLLM server | External vLLM server |
| `open-benchmark` | Benchmarking | Performance testing | Testing infrastructure |
#### External API Distributions
These require only credentials for the external service:

| Distribution | Provider | Use Case | Requirements |
|---|---|---|---|
| `hf-endpoint` | Hugging Face | Hugging Face Inference Endpoints | HF API key |
| `hf-serverless` | Hugging Face | Hugging Face Serverless | HF API key |
| `bedrock` | AWS Bedrock | AWS Bedrock models | AWS credentials |
| `together` | Together AI | Together AI API | Together API key |
| `fireworks` | Fireworks AI | Fireworks AI API | Fireworks API key |
| `cerebras` | Cerebras | Cerebras inference | Cerebras API key |
| `sambanova` | SambaNova | SambaNova inference | SambaNova API key |
| `watsonx` | IBM watsonx | IBM watsonx models | IBM API key |
| `passthrough` | Generic | API passthrough | Target API access |
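
For example, here is a minimal sketch of an external-API deployment using the `together` distribution. The `TOGETHER_API_KEY` variable name and the `together-credentials` Secret are assumptions for illustration; check the provider documentation for the exact variable the distribution expects.

```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: together-llamastack
spec:
  replicas: 1
  server:
    distribution:
      name: "together"
    containerSpec:
      port: 8321
      env:
        # Assumed variable name; verify against the together provider docs.
        - name: TOGETHER_API_KEY
          valueFrom:
            secretKeyRef:
              name: together-credentials  # hypothetical Secret you create
              key: api-key
```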
### Using Supported Distributions

#### Basic Syntax

```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: my-distribution
spec:
  server:
    distribution:
      name: "distribution-name"  # Use a supported distribution name
    # ... other configuration
```
#### Example: Ollama Distribution

```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: ollama-llamastack
spec:
  replicas: 1
  server:
    distribution:
      name: "ollama"
    containerSpec:
      port: 8321
      resources:
        requests:
          cpu: "1"
          memory: "2Gi"
        limits:
          cpu: "2"
          memory: "4Gi"
      env:
        - name: OLLAMA_URL
          value: "http://ollama-server-service.ollama-dist.svc.cluster.local:11434"
    storage:
      size: "20Gi"
```
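The `OLLAMA_URL` above points at an in-cluster Service. Here is a minimal sketch of a Service that would match that URL; the selector label is an assumption about how your Ollama pods are labeled:

```yaml
# Hypothetical Service backing the OLLAMA_URL in the example above.
apiVersion: v1
kind: Service
metadata:
  name: ollama-server-service
  namespace: ollama-dist
spec:
  selector:
    app: ollama-server  # assumed label on your Ollama pods
  ports:
    - port: 11434       # Ollama's default API port
      targetPort: 11434
```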
#### Example: vLLM GPU Distribution

```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: vllm-gpu-llamastack
spec:
  replicas: 1
  server:
    distribution:
      name: "vllm-gpu"
    containerSpec:
      port: 8321
      resources:
        requests:
          cpu: "2"
          memory: "8Gi"
          nvidia.com/gpu: "1"
        limits:
          cpu: "4"
          memory: "16Gi"
          nvidia.com/gpu: "1"
      env:
        - name: MODEL_NAME
          value: "meta-llama/Llama-2-7b-chat-hf"
        - name: TENSOR_PARALLEL_SIZE
          value: "1"
    storage:
      size: "50Gi"
```
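Note that `meta-llama/Llama-2-7b-chat-hf` is a gated model on Hugging Face, so a deployment that downloads weights at startup will typically also need a Hugging Face token. Here is a hedged sketch of the extra `env` entry (the `HF_TOKEN` name and `hf-credentials` Secret are assumptions, not confirmed operator behavior):

```yaml
      env:
        # HF_TOKEN is an assumption: vLLM and the Hugging Face hub client
        # commonly read this variable when pulling gated models.
        - name: HF_TOKEN
          valueFrom:
            secretKeyRef:
              name: hf-credentials  # hypothetical Secret you create
              key: token
```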
### Benefits of Supported Distributions

- 🚀 Quick Setup: No need to build custom images
- 🔒 Security: Regular security updates from the LlamaStack team
- 📚 Documentation: Comprehensive guides and examples
- 🧪 Tested: Thoroughly tested with the operator
- 🔧 Optimized: Pre-configured for optimal performance
- 🌐 Support: Community and official support available

## Bring-Your-Own (BYO) Distributions

### What are BYO Distributions?

BYO distributions allow you to use custom container images that you build and maintain:

- 🛠️ Custom integrations not available in supported distributions
- 🎨 Specialized configurations for your use case
- 🔧 Custom dependencies and libraries
- 📦 Private or proprietary model integrations
### Using BYO Distributions

#### Basic Syntax

```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: my-custom-distribution
spec:
  server:
    distribution:
      image: "your-registry.com/custom-llamastack:tag"  # Use a custom image
    # ... other configuration
```
#### Example: Custom Image

```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: custom-llamastack
spec:
  replicas: 1
  server:
    distribution:
      image: "myregistry.com/custom-llamastack:v1.0.0"
    containerSpec:
      port: 8321
      resources:
        requests:
          cpu: "2"
          memory: "4Gi"
        limits:
          cpu: "4"
          memory: "8Gi"
      env:
        - name: CUSTOM_CONFIG_PATH
          value: "/app/config/custom.yaml"
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              name: custom-credentials
              key: api-key
    storage:
      size: "100Gi"
    podOverrides:
      volumes:
        - name: custom-config
          configMap:
            name: custom-llamastack-config
      volumeMounts:
        - name: custom-config
          mountPath: "/app/config"
          readOnly: true
```
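The example above references a ConfigMap and a Secret that you create yourself. A minimal sketch with placeholder contents:

```yaml
# Hypothetical ConfigMap mounted at /app/config via podOverrides above.
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-llamastack-config
data:
  custom.yaml: |
    # your custom LlamaStack configuration goes here
---
# Hypothetical Secret backing the API_KEY env var above.
apiVersion: v1
kind: Secret
metadata:
  name: custom-credentials
type: Opaque
stringData:
  api-key: "replace-with-your-key"
```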
### Building Custom Images

#### Example Dockerfile

```dockerfile
# Start from a supported distribution or base image
FROM llamastack/llamastack:latest

# Add your custom dependencies
RUN pip install custom-package-1 custom-package-2

# Copy custom configuration
COPY custom-config/ /app/config/

# Copy custom code
COPY src/ /app/src/

# Set custom environment variables
ENV CUSTOM_SETTING=value

# Override the entrypoint if needed
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```
#### Building and Pushing

```bash
# Build your custom image
docker build -t myregistry.com/custom-llamastack:v1.0.0 .

# Push to your registry
docker push myregistry.com/custom-llamastack:v1.0.0
```
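If you push to a private registry, the pods the operator creates will need pull credentials. One common Kubernetes approach, shown here as an assumption rather than documented operator behavior, is to attach a pull secret to the ServiceAccount the pods run under:

```yaml
# Hypothetical: lets pods using this ServiceAccount pull from the registry.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default  # or the ServiceAccount your LlamaStack pods use
imagePullSecrets:
  - name: myregistry-credentials  # a docker-registry Secret you create
```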
### BYO Distribution Considerations

#### Advantages

- 🎯 Full Control: Complete customization of the stack
- 🔧 Custom Integrations: Add proprietary or specialized providers
- 📦 Private Models: Include private or fine-tuned models
- ⚡ Optimizations: Custom performance optimizations

#### Responsibilities

- 🔒 Security: You maintain security updates
- 🧪 Testing: You test compatibility with the operator
- 📚 Documentation: You document your custom setup
- 🌐 Support: Limited community support for custom images
- 🔄 Updates: You manage updates and compatibility
## Key Differences Summary

| Aspect | Supported Distributions | BYO Distributions |
|---|---|---|
| Setup Complexity | ✅ Simple (just specify a name) | 🔧 Complex (build & maintain an image) |
| Maintenance | ✅ Handled by the LlamaStack team | ❌ Your responsibility |
| Security Updates | ✅ Automatic | ❌ Manual |
| Documentation | ✅ Comprehensive | ❌ You create it |
| Support | ✅ Community + official | ⚠️ Limited |
| Customization | ⚠️ Limited to configuration | ✅ Full control |
| Testing | ✅ Pre-tested | ❌ You test |
| Time to Deploy | ✅ Minutes | ⏱️ Hours/days |
## Choosing the Right Approach

### Use Supported Distributions When:

- ✅ Your use case matches available providers (Ollama, vLLM, etc.)
- ✅ You want quick setup and deployment
- ✅ You prefer maintained and tested solutions
- ✅ You need community support
- ✅ Security and updates are important

### Use BYO Distributions When:

- 🛠️ You need custom provider integrations
- 🔧 You have specialized requirements
- 📦 You use proprietary or private models
- ⚡ You need specific performance optimizations
- 🎯 You have the expertise to maintain custom images
## Migration Between Approaches

### From Supported to BYO

```yaml
# Before (supported)
spec:
  server:
    distribution:
      name: "ollama"

# After (BYO)
spec:
  server:
    distribution:
      image: "myregistry.com/custom-ollama:v1.0.0"
```

### From BYO to Supported

```yaml
# Before (BYO)
spec:
  server:
    distribution:
      image: "myregistry.com/custom-vllm:v1.0.0"

# After (supported)
spec:
  server:
    distribution:
      name: "vllm-gpu"
```
## Best Practices

### For Supported Distributions

- Start Simple: Begin with a basic configuration
- Use Environment Variables: Configure via the `env` section
- Monitor Resources: Set appropriate resource limits
- Check Documentation: Review provider-specific guides

### For BYO Distributions

- Base on Supported Images: Start from `llamastack/llamastack:latest`
- Document Everything: Maintain clear documentation
- Test Thoroughly: Test with the operator before production
- Version Control: Tag and version your custom images
- Security Scanning: Regularly scan for vulnerabilities
## Next Steps
- Configuration Reference - Detailed configuration options
- Basic Deployment - Simple deployment examples
- Production Setup - Production-ready configurations
- Custom Images Guide - Building custom images
- Troubleshooting - Common issues and solutions