Using Azure Kubernetes Service with Grafana and Prometheus
Kubernetes is a popular choice for building cloud-native applications because it is scalable and designed for high availability. It also has many other advantages, such as:
- Automated deployment and scaling: Kubernetes can automatically deploy and scale applications based on demand. This helps your applications keep up with changes in load.
- Self-healing: Kubernetes can automatically restart or replace unhealthy containers. This helps to keep your applications available.
- Load balancing: Kubernetes can load balance traffic across multiple containers, ensuring that no single container is overloaded.
- Storage orchestration: Kubernetes can automatically mount storage volumes to containers, making it easy to manage persistent data.
- Security: Kubernetes provides a number of security features, such as RBAC (role-based access control) and network policies.
- Multi-cloud support: Kubernetes runs on all major cloud platforms as well as on-premises, making it a good choice for organizations that want to be cloud-agnostic.
Understand the tooling
Prometheus is an open-source monitoring system that was originally created by SoundCloud. It is a popular choice for monitoring cloud-native applications because it is scalable, reliable, and easy to use.
Prometheus collects metrics from various sources, such as applications, services, and infrastructure components. It stores these metrics in a time series database, which allows Prometheus to track the historical performance of your systems.
Prometheus uses an exporter architecture. Exporters are small, lightweight programs that collect metrics from a source and expose them over HTTP in a text format that Prometheus can read.
Prometheus works on a pull model: once it discovers a new exporter, it scrapes metrics from that source at a regular interval and stores them in its database. Prometheus can also be configured to scrape metrics from specific endpoints.
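To make the exposition format concrete, the sketch below prints a small sample of the Prometheus text format and filters out the metadata lines. The metric name and label values are invented for illustration; a real exporter serves similar text over HTTP, conventionally at a /metrics path.

```shell
# Illustrative sample only: the metric name and labels are invented.
sample_metrics() {
  cat <<'EOF'
# HELP http_requests_total Total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="200"} 3
EOF
}

# Lines starting with '#' are metadata (HELP/TYPE); the remaining lines
# are samples in the form: name{labels} value
sample_metrics | grep -v '^#'
```

Each remaining line is one time series sample that Prometheus would store with the timestamp of the scrape.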
Grafana is a visualization tool that can be used to view the metrics collected by Prometheus. Grafana can create charts, graphs, and dashboards to help you understand the data collected by Prometheus.
Together, Prometheus and Grafana form a powerful monitoring system that can help you keep your systems running smoothly.
Here are some additional benefits of using Prometheus and Grafana:
- Scalability: Prometheus is designed to be scalable, so it can be used to monitor large and complex systems.
- Reliability: Prometheus is a reliable system that has been used by some of the world’s largest companies.
- Ease of use: Prometheus is easy to use and configure.
- Open source: Prometheus is an open-source project, so it is free to use and modify.
Prerequisites
We will create a Kubernetes cluster using Azure Kubernetes Service (AKS). You will need the following:
- Azure account
- Azure CLI
- kubectl
- Helm
You can install the latest version of kubectl using the Azure CLI, or install it manually if you prefer. Helm is installed separately, following the instructions on the Helm website.
Install the CLI tools on your local machine, since you will need to forward a local port to access both the Prometheus and Grafana web interfaces.
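Before continuing, it can help to confirm that the tools are on your PATH. The following is a minimal sketch; the function name missing_tools is made up for this example, and the check only tests for presence, not for versions.

```shell
# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools az kubectl helm)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Not found on PATH:" $missing
else
  echo "az, kubectl and helm are all installed"
fi
```

If kubectl is reported missing, the az aks install-cli step in the next section installs it.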
Create an Azure Kubernetes Service (AKS) Cluster
Sign in to the Azure CLI by running the login command.
az login
Install or update kubectl.
az aks install-cli
Create two bash/zsh variables which we will use in subsequent commands. You may change the syntax below if you are using another shell.
RESOURCE_GROUP=aks-prometheus
AKS_NAME=aks1
Create a resource group. We have chosen to create this in the eastus Azure region.
az group create --name $RESOURCE_GROUP --location eastus
Create a new AKS cluster using the az aks create command. Here we create a 3-node cluster using the B-series Burstable VM type which is cost-effective and suitable for small test/dev workloads such as this.
az aks create --resource-group $RESOURCE_GROUP \
--name $AKS_NAME \
--node-count 3 \
--node-vm-size Standard_B2s \
--generate-ssh-keys
This will take a few minutes to complete.
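If you want to confirm that provisioning has finished, one option is to query the cluster's provisioning state. This is a sketch rather than a required step; the function name aks_state is hypothetical, and the defaults simply mirror the variables set earlier.

```shell
# Defaults mirror the variables set earlier; override them if yours differ.
: "${RESOURCE_GROUP:=aks-prometheus}"
: "${AKS_NAME:=aks1}"

# Print the cluster's provisioning state, e.g. "Succeeded" once it is ready.
aks_state() {
  az aks show \
    --resource-group "$RESOURCE_GROUP" \
    --name "$AKS_NAME" \
    --query provisioningState \
    --output tsv
}
```

Run aks_state after the create command returns, or from a second terminal while it is still in progress.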
Download the credentials for the cluster we have just created and merge them into your kubeconfig so that kubectl can authenticate to it.
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $AKS_NAME
We can now access our Kubernetes cluster with kubectl. Use kubectl to see the nodes we have just created.
kubectl get nodes
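Nodes can take a short while to report Ready. As an optional sketch, kubectl wait can block until every node is Ready; the function name and the 300s default timeout are arbitrary choices for this example.

```shell
# Block until every node reports the Ready condition, or the timeout expires.
# The first argument overrides the default timeout of 300 seconds.
wait_for_nodes() {
  kubectl wait --for=condition=Ready nodes --all --timeout="${1:-300s}"
}
```

Calling wait_for_nodes 60s would wait for at most one minute.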
Install Grafana and Prometheus
Prometheus can be installed either with Helm or by deploying the Prometheus Operator manually, step by step. We’ll use the kube-prometheus-stack Helm chart because it’s quick and easy, and it installs Grafana as well.
Add the prometheus-community chart repository to Helm and update the local chart index.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Install the Helm chart into a namespace called monitoring, which will be created automatically.
helm install prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace
When the install finishes, Helm prints a command for checking the status of the deployed pods.
kubectl --namespace monitoring get pods -l "release=prometheus"
Make sure the pods are all “Running” before you continue. In the unlikely event that they do not reach the running state, you will need to troubleshoot them.
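As a starting point for troubleshooting, the sketch below picks out pods that are not running and describes each one. Both function names are hypothetical; the filter relies on STATUS being the third column of kubectl get pods --no-headers output.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the names
# of pods whose STATUS column (third field) is neither Running nor Completed.
pods_not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# List problem pods in the monitoring namespace, then describe each one.
inspect_problem_pods() {
  kubectl --namespace monitoring get pods --no-headers | pods_not_running |
    while read -r pod; do
      kubectl --namespace monitoring describe pod "$pod"
    done
}
```

The Events section at the end of each describe output is usually the quickest place to look for scheduling or image pull problems.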
Explore the Prometheus and Grafana web interfaces
By default, all the monitoring options for Prometheus will be enabled.
Create a port forward to access the Prometheus query interface.
kubectl port-forward --namespace monitoring svc/prometheus-kube-prometheus-prometheus 9090
Open http://localhost:9090 in your web browser and explore the UI to see the raw metrics inside Prometheus.
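The same data is available from the command line through the Prometheus HTTP API. The sketch below runs an instant query against the forwarded port; the function name prom_query and the PROM_URL variable are made up for this example, and it assumes curl is installed and the port forward is still running in another terminal.

```shell
# Base URL of the port-forwarded Prometheus server.
: "${PROM_URL:=http://localhost:9090}"

# Run an instant PromQL query and print the raw JSON response.
prom_query() {
  curl --silent --get \
    --data-urlencode "query=$1" \
    "$PROM_URL/api/v1/query"
}
```

For example, prom_query up returns one series per scrape target, with a value of 1 for targets whose last scrape succeeded and 0 otherwise.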
The default username for Grafana is admin and the default password is prom-operator. You can change it in the Grafana UI later.
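The steps above only forward a port for Prometheus. Grafana can be reached the same way; the sketch below assumes the chart exposes Grafana as a service named prometheus-grafana on port 80 (a name derived from the release name prometheus) and picks local port 8080 arbitrarily. Both function names are hypothetical.

```shell
# List the services in the monitoring namespace to confirm the Grafana
# service name before forwarding.
list_monitoring_services() {
  kubectl --namespace monitoring get svc
}

# Forward local port 8080 to the Grafana service (port 80 in the cluster).
# Assumption: the Helm release is called "prometheus", so the service is
# expected to be named prometheus-grafana.
grafana_port_forward() {
  kubectl port-forward --namespace monitoring svc/prometheus-grafana 8080:80
}
```

Run grafana_port_forward in a separate terminal, then open http://localhost:8080 and sign in with the credentials above.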
Note: For security, do not expose your Prometheus or Grafana endpoints to the public internet through a LoadBalancer Service or an Ingress.
In Grafana, go to Dashboards -> Manage (the menu layout varies between Grafana versions), where you will see many dashboards that have been created for you.
Since AKS is a managed Kubernetes service, it does not give you access to control-plane components such as the etcd store, the controller manager, and the scheduler. Prometheus cannot scrape metrics from them, so the corresponding dashboards stay empty. Let’s disable these scrape targets by upgrading our Prometheus release:
helm upgrade prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set kubeEtcd.enabled=false \
--set kubeControllerManager.enabled=false \
--set kubeScheduler.enabled=false
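As the number of overrides grows, a values file can be easier to maintain than a chain of --set flags. The sketch below writes the same three settings to a file and passes it to Helm; the file name aks-values.yaml and the function name are arbitrary choices for this example.

```shell
# Write the three overrides to a values file (the file name is arbitrary).
cat > aks-values.yaml <<'EOF'
kubeEtcd:
  enabled: false
kubeControllerManager:
  enabled: false
kubeScheduler:
  enabled: false
EOF

# Apply the values file to the existing release. The first argument
# overrides the default file name.
upgrade_with_values() {
  helm upgrade prometheus \
    prometheus-community/kube-prometheus-stack \
    --namespace monitoring \
    --values "${1:-aks-values.yaml}"
}
```

Keeping the file in version control gives you a record of how the release was configured.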
After the upgrade, nothing visibly changes: the dashboards for these components remain empty, but Prometheus no longer wastes resources trying to scrape their metrics.
Note: If you are running an older version of Kubernetes, you may need to turn off HTTPS scraping of the kubelet, because older kubelets expose their metrics over plain HTTP. To do so, set the kubelet.serviceMonitor.https parameter of the Helm chart to false:
helm upgrade prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set kubeEtcd.enabled=false \
--set kubeControllerManager.enabled=false \
--set kubeScheduler.enabled=false \
--set kubelet.serviceMonitor.https=false
If you would like to clean up the Azure resources, run the following command, which deletes everything in your resource group and stops ongoing billing for these resources.
az group delete --name $RESOURCE_GROUP
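For an unattended cleanup, a variant along the following lines may be useful. It is a sketch: the function name cleanup is hypothetical, and it assumes the kubeconfig context added by az aks get-credentials is named after the cluster, which is the default behavior.

```shell
# Defaults mirror the variables set earlier; override them if yours differ.
: "${RESOURCE_GROUP:=aks-prometheus}"
: "${AKS_NAME:=aks1}"

cleanup() {
  # --yes skips the confirmation prompt; --no-wait returns immediately
  # while the deletion continues in Azure.
  az group delete --name "$RESOURCE_GROUP" --yes --no-wait
  # Remove the now-stale context from the local kubeconfig.
  # Assumption: the context is named after the cluster.
  kubectl config delete-context "$AKS_NAME"
}
```

Because --no-wait returns before the deletion completes, the resource group may still be visible in the Azure portal for a few minutes.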
Hope you enjoy monitoring cloud-native applications with Prometheus and Grafana!