Deploy ELK with SearchGuard and ElastAlert on AKS

This post collects notes, tips and traps from deploying a production-level Elasticsearch, Logstash and Kibana (ELK) stack on Azure Kubernetes Service (AKS), together with two free open-source add-ons: SearchGuard for security and ElastAlert for alerting.

Let me post some important references here, with many thanks to their authors:

Here are some of my notes:

Preparation of AKS

  • AKS nodes are not directly reachable. To access them, one has to use temporary pods as brokers, as described in this official document. I personally think this is good practice, as the temporary pods are destroyed after their one-time use.

ELK docker images customization

Private docker registry access

  • To let K8s pull images from a private docker registry, please DON’T manually connect to each K8s node and run docker login there. That is ugly, error-prone and hard to maintain (think about nodes crashing, new nodes being added when scaling out, or IP addresses changing for some reason).
  • The correct way is to create a Secret in your K8s cluster and reference it from the pod spec (imagePullSecrets), so that it is used when pulling images, as described in the official doc Pull an Image from a Private Registry.
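
A minimal sketch of the Secret-based approach, assuming a Secret of type docker-registry has already been created in the same namespace as the pod (for example with kubectl create secret docker-registry). The names regcred and private-reg, the elk namespace and the image path are placeholders chosen for illustration, not values from the actual deployment:

```yaml
# Hypothetical pod that pulls its image from a private registry.
# "regcred" is an assumed Secret name; it must exist in the same
# namespace as the pod.
apiVersion: v1
kind: Pod
metadata:
  name: private-reg            # placeholder pod name
  namespace: elk
spec:
  containers:
    - name: private-reg-container
      image: myregistry.example.com/elk/elasticsearch:custom   # placeholder image
  imagePullSecrets:
    - name: regcred            # Secret holding the registry credentials
```

Because the credentials live in the cluster rather than on individual nodes, newly added or replaced nodes need no extra setup.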

Necessary extra configurations for SearchGuard

  • network.host: 0.0.0.0 is a must-have config. Otherwise DNS resolution will fail => master election will fail => the first data node will fail => no further data nodes can come up.
  • It’s better to put the configuration entries for SearchGuard in elasticsearch.yml, define that file in a ConfigMap, and then mount it as a volume with subPath.
  • The health check endpoint for data nodes has to be changed from /_cluster/health to /_searchguard/health.
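
The three points above could be wired together roughly as follows. This is only a sketch under assumptions: the ConfigMap name, labels, image, replica count, probe timings, the config path inside the container and the HTTPS scheme on port 9200 are all illustrative guesses. Only network.host, the subPath mount and the /_searchguard/health path come from the notes above.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-config        # hypothetical name
  namespace: elk
data:
  elasticsearch.yml: |
    network.host: 0.0.0.0
    # SearchGuard (searchguard.*) entries go here
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-data
  namespace: elk
spec:
  serviceName: elasticsearch-data   # assumed headless Service
  replicas: 3                       # illustrative
  selector:
    matchLabels:
      app: elasticsearch-data
  template:
    metadata:
      labels:
        app: elasticsearch-data
    spec:
      containers:
        - name: elasticsearch
          image: myregistry.example.com/elk/elasticsearch:custom   # placeholder image
          ports:
            - containerPort: 9200
          volumeMounts:
            # subPath mounts only this one file, leaving the rest of
            # the config directory from the image untouched.
            - name: config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml   # assumed path
              subPath: elasticsearch.yml
          readinessProbe:
            httpGet:
              path: /_searchguard/health
              port: 9200
              scheme: HTTPS         # assumes TLS is enabled on the REST layer
            initialDelaySeconds: 30 # illustrative timing
            periodSeconds: 10
      volumes:
        - name: config
          configMap:
            name: elasticsearch-config
```

One thing to keep in mind with this layout: a file mounted through subPath does not pick up later ConfigMap changes automatically, so the pods have to be restarted after editing elasticsearch.yml.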

Running init_sg.sh after startup

The command looks like:

kubectl -n elk exec --tty=false elasticsearch-data-0 -- bin/init_sg.sh

  • Running it once on the first data node is sufficient. There is no need to run it on the master nodes or on the other data nodes.

Troubleshooting

to be continued…