Open WebUI is an open source AI chat platform comparable to LibreChat, which we have extensively discussed on our blog and integrated on behalf of clients. Where LibreChat integrates with virtually any well-known remote or local AI service on the market, Open WebUI focuses on integration with Ollama — one of the easiest ways to run and serve AI models locally on your own server or cluster. In fact, Open WebUI was formerly known as Ollama WebUI, but changed its name because it is developed by a team separate from Ollama.
There is growing interest in deploying local AI stacks such as Open WebUI + Ollama on Kubernetes, both to scale beyond a single server (a use case that Docker Compose handles easily on one machine) and to consolidate the infrastructure with other apps that might share the local inference cluster.
This article describes a basic, production-ready Open WebUI + Ollama deployment on AKS, Azure's managed Kubernetes service. A few Kubernetes resource types and objects pertaining to cert-manager and ingress-nginx must be added to the bare cluster in order to access the Open WebUI service securely.
You need the Azure CLI, kubectl, and Helm installed on your local machine to complete this deployment. This tutorial assumes that you already have an Azure subscription, an AKS cluster with at least one node in the nodePool, and a DNS host with an API for programmatically updating the records in your domain's zone. Use the az login, az configure, and az aks install-cli commands to set the default resource group and cluster name, and authenticate your kubectl command to the AKS cluster by creating a kubeconfig at ~/.kube/config.
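As a point of reference, the CLI setup typically looks like the following — a minimal sketch in which the resource group and cluster name are placeholders you substitute with your own (az aks get-credentials is the command that writes the kubeconfig entry):

$ az login
$ az configure --defaults group=<resource group>
$ az aks install-cli
$ az aks get-credentials --resource-group <resource group> --name <cluster name>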
- We will use cert-manager to issue a wildcard certificate valid for all subdomains of the second-level domain (SLD), renewed automatically by the Kubernetes ClusterIssuer and Certificate objects through a DNS challenge at _acme-challenge.example.com before each 90-day expiration. In this example, the domain is delegated to Cloudflare DNS and the ACME issuer is Let's Encrypt.
- The NGINX Ingress Controller, represented by the Kubernetes IngressClass and Ingress objects, reverse proxies HTTPS requests to the Open WebUI Service through the default Azure load balancer named “kubernetes” that was provisioned with the AKS cluster. As the Azure load balancer only routes TCP traffic at L4 of the OSI model, TLS must be terminated at the NGINX ingress rather than at the load balancer.
- There is also the option of using the Azure Application Gateway Ingress Controller (AGIC), which routes application layer traffic at L7 and terminates TLS using either 1) a certificate bundle uploaded in PFX format, or 2) a certificate obtained from an ACME issuer through a managed app that uses workload identity to modify Azure DNS with the DNS Zone Contributor role.
- The cost of using an Azure Application Gateway (AGW) is considerably higher than reusing the existing Azure Load Balancer with NGINX ingress, as there is a $0.20/hr (about $146/mo) charge for the AGW in addition to capacity units and outbound data transfer. Microsoft has introduced a Basic SKU for the Application Gateway at $0.0225/hr, but it is still in preview (i.e. not available in all regions), and it is not entirely clear whether the Basic SKU is compatible with AGIC for an AKS cluster. There is also the added complexity of setting up an additional Entra Workload ID or service principal, which AGIC requires in order to access Azure Resource Manager (ARM) and manage the AGW.
Create Kubernetes Deployments and Services (ClusterIP and NodePort)
First, deploy Ollama for model serving, followed by the Open WebUI frontend, to the Kubernetes cluster using Deployment and Service objects. For production, it is of course best practice to expose a Pod only through an Ingress backed by a ClusterIP Service, so that it is publicly accessible only over HTTPS. To illustrate how the services work during testing, however, we will also expose the Pod on the Kubernetes node's external IP through a NodePort Service.
This example also assumes that you have enabled the option to assign a public IP address to your Kubernetes node(s) in the nodePool at the time of deploying the AKS cluster.
We need to define two PersistentVolumeClaims (PVCs), the Kubernetes resource equivalent to Docker volumes for persisting data beyond the lifecycle of a container — in this case, a Pod. One PVC, ollama-pvc, backs the /root/.ollama path storing downloaded model weights inside the ollama Pod; the other, open-webui-pvc, backs the /app/backend/data path storing app data inside the Open WebUI Pod.
We use the managed-csi storageClass that is pre-configured on AKS clusters to create a 32Gi Azure managed disk for each Deployment. It is a Kubernetes CSI driver provided by Microsoft that works similarly to REX-Ray, a storage plugin by Dell Technologies for Docker Swarm that is no longer actively developed. Note that Azure will provision the smallest disk size that meets (or exceeds) the amount of storage you request in your Deployment — the managed-csi storageClass uses a Standard SSD with LRS (locally redundant storage). In this case, as our deployment definition calls for 32Gi, it will use an E4 disk size.
If you wish to use other SSD types, such as the Premium SSD or Premium SSD v2 SKUs, for higher performance or more durability with ZRS (zone-redundant storage), you can use the pre-configured storageClasses such as managed-csi-premium listed by the kubectl get storageclasses command, or create your own custom storageClass as sketched below.
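For illustration, a custom storageClass for zone-redundant Premium SSDs might look roughly like the following sketch — the name managed-premium-zrs is our own choice rather than a built-in AKS class, and the parameters assume the Azure Disk CSI driver (disk.csi.azure.com) that ships with AKS:

# storageclass-premium-zrs.yaml (illustrative sketch)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium-zrs   # hypothetical name, not a built-in AKS class
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_ZRS        # zone-redundant Premium SSD
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer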
As the accessMode is ReadWriteOnce (RWO), an Azure Disk created this way can only be mounted to one Pod at a time. If you require shared storage that can be simultaneously mounted and written to by multiple Pods (running across different nodes as a Deployment or StatefulSet), then the Azure Files CSI driver, which provides a managed file share supporting ReadWriteMany (RWX), may be a more suitable solution — see the example claim below.
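A PVC for such a shared volume could look roughly like this — the claim name is hypothetical, and azurefile-csi is one of the file-share storageClasses pre-configured on AKS:

# shared-pvc.yaml (illustrative sketch of an RWX claim backed by Azure Files)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data-pvc       # hypothetical name
spec:
  accessModes:
    - ReadWriteMany           # multiple Pods can mount and write simultaneously
  storageClassName: azurefile-csi
  resources:
    requests:
      storage: 32Gi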
Create the following Kubernetes YAML files on your local machine and apply them to your AKS cluster using the kubectl apply -f <file name> command.
ollama.yaml
# ollama.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: ollama
    spec:
      priorityClassName: system-node-critical
      containers:
        - name: ollama
          image: ollama/ollama:latest
          ports:
            - containerPort: 11434
          volumeMounts: # <--- Add volume mounts
            - name: ollama-data
              mountPath: /root/.ollama
      volumes: # <--- Add volumes
        - name: ollama-data
          persistentVolumeClaim:
            claimName: ollama-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 32Gi # adjust the storage size as needed
  storageClassName: managed-csi # reference the StorageClass
---
apiVersion: v1
kind: Service
metadata:
  name: ollama-service
spec:
  type: ClusterIP
  selector:
    app: ollama
  ports:
    - name: http
      port: 11434
      targetPort: 11434
      protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: ollama-service-node-port
spec:
  type: NodePort
  selector:
    app: ollama
  ports:
    - name: http
      port: 11434
      targetPort: 11434
      nodePort: 31434
      protocol: TCP
openwebui.yaml
# openwebui.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: open-webui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: open-webui
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: open-webui
    spec:
      priorityClassName: system-node-critical
      containers:
        - name: open-webui
          image: ghcr.io/open-webui/open-webui:main
          ports:
            - containerPort: 8080
          env:
            - name: OLLAMA_BASE_URL
              value: http://ollama-service.default.svc.cluster.local:11434
          volumeMounts:
            - name: open-webui-data
              mountPath: /app/backend/data
      volumes:
        - name: open-webui-data
          persistentVolumeClaim:
            claimName: open-webui-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: open-webui-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 32Gi # adjust the storage size as needed
  storageClassName: managed-csi # reference the StorageClass
---
apiVersion: v1
kind: Service
metadata:
  name: open-webui
spec:
  type: NodePort
  selector:
    app: open-webui
  ports:
    - name: http
      port: 3000
      targetPort: 8080
      nodePort: 30080
After creating the Deployments and Services, you can pull your desired models either through the ollama run command inside the Pod, or from the Settings > Models section of Open WebUI. For the latter, first create the initial user account, then set the connection to Ollama by specifying http://ollama-service.default.svc.cluster.local:11434 as the Ollama Base URL under Connections.
You should now be able to access the Open WebUI dashboard (insecurely) in a web browser at your Kubernetes node's public IP followed by the node port number (30080), provided that inbound traffic to the node port range (30000-32767) is allowed by your AKS cluster's network security group (NSG) — for example, http://52.179.xxx.xx:30080. To get the public IP of your Kubernetes node, you can run:
$ kubectl get nodes -o wide
NAME                                STATUS   ROLES   AGE   VERSION   INTERNAL-IP   EXTERNAL-IP     OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-agentpool-60568521-vmss000000   Ready    agent   10d   v1.28.9   10.224.0.4    52.179.xxx.xx   Ubuntu 22.04.4 LTS   5.15.0-1060-azure   containerd://1.7.15-1
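If the node port range is blocked, a rule along the following lines can open it in the NSG that AKS auto-creates in the node resource group (MC_*) — the rule name and priority below are placeholders of our own choosing:

$ az network nsg rule create \
  --resource-group <node resource group> \
  --nsg-name <aks nsg name> \
  --name AllowNodePortsInbound \
  --priority 300 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --destination-port-ranges 30000-32767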

To pull your desired model by executing a command inside the Ollama Pod, use the following kubectl commands to get the name of the running Pod and exec into it. You can find a list of available models in the Ollama library. If the Kubernetes node running your Ollama Pod is a CPU-only VM size, then tinyllama (1.1B), phi (2.7B), or llama3 (8B) could be good initial models to try. If your Kubernetes node runs on a VM size with a GPU, then you can try practically any model, including larger models such as llama3 with 70B parameters or more.
$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
ollama-69dd5867c4-6tc9g       1/1     Running   0          4d9h
open-webui-85cfd5b9f6-p868t   1/1     Running   0          4d9h

$ kubectl exec -it <pod name> -- ollama run <model name>
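If you only want to download the weights without starting an interactive chat session, ollama pull can be used instead of ollama run — for example, with the Pod name from the output above:

$ kubectl exec ollama-69dd5867c4-6tc9g -- ollama pull tinyllama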
Install cert-manager on AKS using Helm & issue the Let’s Encrypt wildcard certificate
The next step is to install cert-manager using Helm, then create the ClusterIssuer and Certificate objects to issue a wildcard cert from Let’s Encrypt for accessing Open WebUI. Without HTTPS enabled, your Open WebUI login credentials and chat contents could be intercepted. It is therefore recommended to create the initial Open WebUI account only after the Ingress has been set up, or to change the Open WebUI password immediately after doing so.
Because the dns-01 challenge for issuing a wildcard cert requires the ClusterIssuer to automatically create a TXT record at _acme-challenge.example.com each time the certificate is issued or renewed, we must also generate a Cloudflare API token and store it in a Kubernetes Secret in the same namespace as cert-manager.
Deploy the Helm chart for cert-manager in a new namespace for cert-manager as follows:
$ helm repo add jetstack https://charts.jetstack.io
$ helm repo update
$ helm upgrade cert-manager jetstack/cert-manager \
  --install \
  --create-namespace \
  --wait \
  --namespace cert-manager \
  --set installCRDs=true
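Before moving on, it is worth confirming that the cert-manager controller, cainjector, and webhook Pods have reached the Running state in the new namespace:

$ kubectl get pods --namespace cert-manager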
Log in to the Cloudflare dashboard and generate an API token with the Zone.DNS.Edit and Zone.DNS.Read permissions. It is important to distinguish between an API token, which allows setting “fine-grained” permissions, and an API key, which provides global access to the entire account.

Create a Kubernetes Secret called cloudflare-api-token-secret in the cert-manager Namespace.
# cf-api-token.yaml
apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-token-secret
  namespace: cert-manager
type: Opaque
stringData:
  api-token: <API token>
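If you prefer not to write the token into a YAML file, an equivalent approach is a kubectl one-liner that creates the same Secret:

$ kubectl create secret generic cloudflare-api-token-secret \
  --namespace cert-manager \
  --from-literal=api-token=<API token>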
Then, create the ClusterIssuer by applying the following YAML file. The difference between an Issuer (as described in the cert-manager docs) and a ClusterIssuer is that all namespaces can use a ClusterIssuer to issue a Certificate, but an Issuer can only be referenced within the same namespace.
# clusterissuer-letsencrypt-cf.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-cf
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <email address>
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-cf
    solvers:
      - selector: {}
        dns01:
          cloudflare:
            email: <email address>
            apiTokenSecretRef:
              name: cloudflare-api-token-secret
              key: api-token
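You can confirm that the ACME account was registered and the ClusterIssuer reports Ready before requesting a certificate:

$ kubectl describe clusterissuer letsencrypt-cf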
Finally, issue the Let’s Encrypt wildcard certificate using cert-manager by applying this YAML to create the Certificate object.
# certificate-cf-prod.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: <cert name>
  namespace: default
spec:
  secretName: <cert name>-tls-prod
  issuerRef:
    name: letsencrypt-cf
    kind: ClusterIssuer
  dnsNames:
    - '*.<example.com>'
You can check the issuance status by using the cmctl status certificate <cert name> command, if you have cmctl installed.
You should see output similar to the below if the ClusterIssuer was able to access the Cloudflare API token through the Secret, create a TXT record with the validation string provided by Let’s Encrypt’s ACME service, and successfully generate the certificate, storing it in a Secret in the default namespace called <cert name>-tls-prod.
$ cmctl status certificate autoize-net
Name: autoize-net
Namespace: default
Created at: 2024-05-16T19:29:25+02:00
Conditions:
  Ready: True, Reason: Ready, Message: Certificate is up to date and has not expired
DNS Names:
- *.autoize.net
Events:  <none>
Issuer:
  Name: letsencrypt-cf
  Kind: ClusterIssuer
  Conditions:
    Ready: True, Reason: ACMEAccountRegistered, Message: The ACME account was registered with the ACME server
  Events:  <none>
Secret:
  Name: autoize-net-tls-prod
  Issuer Country: US
  Issuer Organisation: Let's Encrypt
  Issuer Common Name: R3
  Key Usage: Digital Signature, Key Encipherment
  Extended Key Usages: Server Authentication, Client Authentication
  Public Key Algorithm: RSA
  Signature Algorithm: SHA256-RSA
  Subject Key ID: REDACTED
  Authority Key ID: REDACTED
  Serial Number: REDACTED
  Events:  <none>
Not Before: 2024-05-16T23:24:57+02:00
Not After: 2024-08-14T23:24:56+02:00
Renewal Time: 2024-07-15T23:24:56+02:00
No CertificateRequest found for this Certificate
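If you do not have cmctl installed, the same information can be gathered with kubectl — for example:

$ kubectl get certificate <cert name> --namespace default
$ kubectl describe certificate <cert name> --namespace default
$ kubectl get secret <cert name>-tls-prod --namespace default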
Install the NGINX Ingress Controller using Helm & create an Ingress to Open WebUI
The last step before Open WebUI can be made publicly available over a secure connection is to install the NGINX Ingress Controller and create the Ingress object in Kubernetes.
All major browsers (Chrome, Firefox) require a site to be properly secured with an HTTPS connection before granting access to features such as the microphone, which Open WebUI’s speech-to-text feature needs to transcribe voice prompts for the AI model.

To install the NGINX Ingress Controller, deploy the provided Helm chart to the AKS cluster from your local machine with the Helm CLI installed.
$ export NAMESPACE=ingress-basic
$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
$ helm repo update
$ helm install ingress-nginx ingress-nginx/ingress-nginx \
  --create-namespace \
  --namespace $NAMESPACE \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz \
  --set controller.service.externalTrafficPolicy=Local
As part of the Helm chart, a Service of type LoadBalancer will be deployed in your AKS cluster, and an external IP address will be assigned through the “kubernetes” Azure load balancer. You may see <pending> while the IP is being assigned. To check the status, run this command:
$ kubectl get services --namespace ingress-basic -o wide -w ingress-nginx-controller
NAME                       TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                      AGE     SELECTOR
ingress-nginx-controller   LoadBalancer   10.0.253.25   51.8.xx.xx    80:31720/TCP,443:31762/TCP   2d17h   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
From your domain’s DNS host (in this case, Cloudflare), create an A record pointing a subdomain (with the “orange cloud” CDN proxy disabled) to the EXTERNAL-IP associated with the LoadBalancer. For the example deployment, we are using owui.autoize.net.
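If you prefer to script this step, the record can also be created through the Cloudflare API — a sketch, assuming an API token with DNS edit permission on the zone, and with the zone ID, subdomain, and IP as placeholders:

$ curl -X POST "https://api.cloudflare.com/client/v4/zones/<zone id>/dns_records" \
  -H "Authorization: Bearer <API token>" \
  -H "Content-Type: application/json" \
  --data '{"type":"A","name":"owui","content":"<EXTERNAL-IP>","proxied":false}'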
Finally, apply this YAML to create an Ingress rule that routes port 443 on the LoadBalancer’s external IP to port 3000 of the open-webui Service (which in turn targets port 8080 inside the Pod). Be sure to replace <example.com> with your actual domain name and <cert name> with the name previously specified when creating the Certificate resource.
# owui-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: owui-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: owui.<example.com>
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: open-webui
                port:
                  number: 3000
  tls:
    - hosts:
        - owui.<example.com>
      secretName: <cert name>-tls-prod
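Once the Ingress is created and DNS has propagated, you can sanity-check the setup — the Ingress should show the load balancer's external IP under ADDRESS, and the site should respond over HTTPS with the Let's Encrypt wildcard certificate:

$ kubectl get ingress owui-ingress
$ curl -I https://owui.<example.com>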