
SaaS, Private Cloud, or On-Prem.

Container-portable from day one: your data residency, your control model.

Overview

LuMay ships as Docker containers. The same images run in every deployment model - LuMay-managed SaaS, your cloud subscription, your private VPC, or your on-prem Kubernetes cluster. Governance, observability, and the agent runtime are identical across all models. The only thing that changes is who manages the infrastructure.

This matters for regulated enterprises. Your compliance team shouldn't have to assess a different security architecture depending on where the product is deployed. With LuMay, the same RBAC, RLS, audit logging, and PII controls that apply in the SaaS model apply identically in the on-prem model - because they're embedded in the application, not in the infrastructure configuration.

Deployment choice does not change the core

[Diagram: four deployment targets, each running the same four services: Management API, Voice Agent, Data Services, Observability.]

LuMay SaaS (Managed)
Customer Cloud (BYOC)
Private Cloud (VPC)
On-Prem K8s (Data Center)

Every model runs the same four services; only the hosting environment differs.

Deployment Models

| Model | Who manages infra | Where data lives | Best for |
| --- | --- | --- | --- |
| LuMay SaaS | LuMay | LuMay's Azure tenancy | Fast start; teams that don't want to manage infrastructure |
| Customer cloud (BYOC) | LuMay manages apps; you own the subscription | Your Azure or AWS subscription | Data-residency requirements; shared responsibility model |
| Private cloud / VPC | Your ops team or LuMay | Your VPC with network isolation | Regulated industries with strict network controls |
| On-prem Kubernetes | Your ops team | Your data centre | Full air-gap capability; defence, government, high-security environments |
| Hybrid | Split between LuMay and customer | Management plane in cloud; data plane on-prem | Enterprises with mixed environments and evolving data strategies |

Build And Delivery Stack

| Component | Technology |
| --- | --- |
| Container build | Docker multi-stage builds (digest-pinned base images, non-root user) |
| Container registry | Azure Container Registry (ACR) - used across all deployment models |
| Default runtime | Azure Container Apps (ACA) with auto-scaling - used for SaaS and BYOC |
| Alternative runtime | Any Kubernetes distribution for private cloud, on-prem, and hybrid |
| CI/CD | GitHub Actions with GitVersion for semantic versioning |
| Database migrations | Alembic - run post-deploy as a separate step, not baked into container startup |
| Storage | Persistent volume claims for call recordings and voice samples |
| Observability | OpenTelemetry Collector → Grafana Tempo (traces), Prometheus (metrics), Loki (logs) |
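
To illustrate the "migrations as a separate post-deploy step" row, here is a minimal Python sketch of a release script that runs a deploy command first and only then invokes `alembic upgrade head`. The `release` function, the `runner` parameter, and the deploy command are hypothetical names chosen for illustration and are not part of LuMay's actual pipeline; the only real CLI call assumed is Alembic's standard `alembic upgrade head`.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes a command and returns its exit code.
Runner = Callable[[Sequence[str]], int]

# Standard Alembic CLI invocation that upgrades to the latest revision.
MIGRATE_CMD = ["alembic", "upgrade", "head"]


def run_command(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def release(deploy_cmd: Sequence[str], runner: Runner = run_command) -> List[List[str]]:
    """Deploy first, then migrate as a separate step (hypothetical helper).

    Migrations are never attempted if the deploy step fails, and they are
    not part of container startup.
    """
    completed: List[List[str]] = []
    for step in (list(deploy_cmd), list(MIGRATE_CMD)):
        if runner(step) != 0:
            raise RuntimeError(f"step failed: {' '.join(step)}")
        completed.append(step)
    return completed
```

Keeping the migration outside container startup means that scaling out to more replicas does not trigger concurrent migration attempts; each new replica simply starts against the schema that the release step already upgraded.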

Scaling

| Service | Min replicas | Max replicas | Scale trigger |
| --- | --- | --- | --- |
| Management API | 2 | 10 | CPU > 70% or p95 request latency > 500 ms |
| Voice Agent Engine | 1 | 20 | Concurrent WebSocket connections (active_calls_total metric) |

The Management API maintains a minimum of 2 replicas for availability. The Voice Agent Engine maintains a minimum of 1 replica to prevent cold-start latency on inbound calls - scale-to-zero is disabled for the voice service.
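
The scaling rules can be read as a small decision function. The sketch below is illustrative only: the replica bounds and triggers come from the table, but the one-replica-at-a-time step and the `calls_per_replica` target are assumptions, since the page does not state how many concurrent calls one Voice Agent replica handles or how the autoscaler steps.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleRule:
    min_replicas: int
    max_replicas: int


# Bounds taken from the scaling table.
MANAGEMENT_API = ScaleRule(min_replicas=2, max_replicas=10)
VOICE_AGENT = ScaleRule(min_replicas=1, max_replicas=20)


def clamp(rule: ScaleRule, desired: int) -> int:
    """Keep the desired replica count within the rule's bounds."""
    return max(rule.min_replicas, min(rule.max_replicas, desired))


def management_api_replicas(current: int, cpu_percent: float, p95_latency_ms: float) -> int:
    """Add one replica when either trigger fires (step size is an assumption)."""
    triggered = cpu_percent > 70 or p95_latency_ms > 500
    return clamp(MANAGEMENT_API, current + 1 if triggered else current)


def voice_agent_replicas(active_calls: int, calls_per_replica: int = 25) -> int:
    """Size the voice service from concurrent calls.

    calls_per_replica is a hypothetical target, not a documented LuMay value.
    The lower bound of 1 reflects that scale-to-zero is disabled.
    """
    return clamp(VOICE_AGENT, math.ceil(active_calls / calls_per_replica))
```

For example, `voice_agent_replicas(0)` still returns 1 because the minimum replica count keeps one instance warm for inbound calls, and `management_api_replicas(10, 95, 900)` stays at 10 because the maximum caps scale-out.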

Which deployment model fits your environment?

Book a 30-minute deployment review. We'll assess your data residency requirements, network controls, and compliance constraints, and recommend the right model.
