Configure Storage¶
Learn how to configure persistent storage for your LlamaStack distributions.
Storage Overview¶
LlamaStack distributions can use persistent storage for:
- Model files and weights
- Configuration data
- Logs and metrics
- User data and sessions
Basic Storage Configuration¶
Default Storage¶
By default, LlamaStack uses ephemeral storage:
```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: basic-llamastack
spec:
  image: llamastack/llamastack:latest
  # No storage configuration = ephemeral storage
```
Persistent Storage¶
Enable persistent storage:
```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: persistent-llamastack
spec:
  image: llamastack/llamastack:latest
  storage:
    size: "50Gi"
    storageClass: "standard"
    accessMode: "ReadWriteOnce"
```
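After applying the manifest, the operator should provision a PersistentVolumeClaim for the distribution. A quick way to confirm it bound (the PVC name is whatever the operator generates in your namespace):

```bash
# List claims and confirm STATUS is Bound
kubectl get pvc -n <namespace>
```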
Storage Classes¶
Available Storage Classes¶
Common storage classes and their use cases:
| Storage Class | Performance | Use Case |
|---|---|---|
| `standard` | Standard | General purpose |
| `fast-ssd` | High | Model inference |
| `slow-hdd` | Low | Archival storage |
Custom Storage Class¶
Create a custom storage class for LlamaStack:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: llamastack-storage
# gp3 with explicit iops/throughput requires the EBS CSI driver,
# not the legacy in-tree kubernetes.io/aws-ebs provisioner
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
allowVolumeExpansion: true
```
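To use the new class, point the distribution's storage block at it. A minimal sketch, reusing the schema from the examples above:

```yaml
spec:
  storage:
    size: "50Gi"
    storageClass: "llamastack-storage"
```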
Advanced Storage Configurations¶
Multiple Volumes¶
Configure separate volumes for different purposes:
```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: multi-volume-llamastack
spec:
  image: llamastack/llamastack:latest
  storage:
    models:
      size: "100Gi"
      storageClass: "fast-ssd"
      mountPath: "/models"
    data:
      size: "50Gi"
      storageClass: "standard"
      mountPath: "/data"
    logs:
      size: "10Gi"
      storageClass: "standard"
      mountPath: "/logs"
```
Shared Storage¶
Configure shared storage across replicas:
```yaml
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: shared-storage-llamastack
spec:
  image: llamastack/llamastack:latest
  replicas: 3
  storage:
    size: "200Gi"
    storageClass: "nfs"
    accessMode: "ReadWriteMany"  # Allows multiple pods to mount
```
Storage Optimization¶
Performance Tuning¶
Optimize storage for model inference. Loading model weights is dominated by large sequential reads, so put model volumes on the fastest class available.
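A minimal sketch, assuming the AWS EBS CSI driver: a gp3 class with raised IOPS and throughput for model volumes (the name and values are illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: llamastack-fast  # illustrative name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "16000"       # gp3 maximum
  throughput: "1000"  # MiB/s, gp3 maximum
volumeBindingMode: WaitForFirstConsumer  # provision in the pod's zone
allowVolumeExpansion: true
```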
Cost Optimization¶
Use tiered storage for cost efficiency:
```yaml
spec:
  storage:
    hot:
      size: "50Gi"
      storageClass: "fast-ssd"
      mountPath: "/models/active"
    warm:
      size: "200Gi"
      storageClass: "standard"
      mountPath: "/models/cache"
    cold:
      size: "1Ti"
      storageClass: "slow-hdd"
      mountPath: "/models/archive"
```
Backup and Recovery¶
Automated Backups¶
Configure automated backups:
```yaml
spec:
  backup:
    enabled: true
    schedule: "0 2 * * *"  # Daily at 2 AM
    retention: "30d"
    destination: "s3://my-backup-bucket"
```
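How the operator runs these backups is an implementation detail. If it materializes them as a CronJob (an assumption; verify against your operator version), you can watch the runs directly:

```bash
# Look for a backup CronJob and its recent runs
kubectl get cronjobs,jobs -n <namespace>
```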
Manual Backup¶
Create manual backups:
```bash
# Create a snapshot of the distribution's PVC; kubectl has no
# "create volumesnapshot" subcommand, so apply a manifest (requires
# the external-snapshotter CRDs and a default VolumeSnapshotClass)
kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: llamastack-backup
spec:
  source:
    persistentVolumeClaimName: llamastack-storage
EOF

# Restore from snapshot
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: llamastack-restored
spec:
  dataSource:
    name: llamastack-backup
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
EOF
```
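If the restored PVC stays Pending, check that the snapshot actually finished cutting:

```bash
# Should print "true" once the snapshot is usable
kubectl get volumesnapshot llamastack-backup -o jsonpath='{.status.readyToUse}'
```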
Monitoring Storage¶
Storage Metrics¶
Monitor storage usage:
```bash
# Check PVC status
kubectl get pvc

# Check storage usage inside the pod
kubectl exec -it <pod-name> -- df -h

# Check container resource usage (kubectl top reports CPU/memory, not disk I/O)
kubectl top pods --containers
```
Alerts¶
Set up storage alerts:
```yaml
# Prometheus alert for high storage usage
- alert: HighStorageUsage
  expr: kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes > 0.8
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "High storage usage on {{ $labels.persistentvolumeclaim }}"
```
Troubleshooting¶
Common Issues¶
PVC Stuck in Pending:
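This usually means no storage class could satisfy the claim; the PVC's events will say why:

```bash
# Read the provisioning events on the claim
kubectl describe pvc <pvc-name>

# Confirm the requested storage class actually exists
kubectl get storageclass
```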
Out of Space:
```bash
# Expand the volume (requires allowVolumeExpansion: true on the storage class)
kubectl patch pvc <pvc-name> -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'
```
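If the patch is rejected, confirm the class allows expansion:

```bash
# Should print "true"; if not, the class does not support expansion
kubectl get storageclass <storage-class> -o jsonpath='{.allowVolumeExpansion}'
```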
Performance Issues:
```bash
# Check I/O wait inside the pod (requires iostat/sysstat in the image)
kubectl exec -it <pod-name> -- iostat -x 1

# Check storage class parameters
kubectl describe storageclass <storage-class>
```