The usual guides to Erlang/Elixir clustering on Kubernetes involve setting up a StatefulSet, and that is fine. This guide will instead detail how to achieve the same result with a regular deployment, a headless service, an obscure option of the CoreDNS kubernetes plugin, and some Erlang DNS discovery.
When starting up a distributed Erlang node (e.g. with the -name <name> option) you’re instructing the VM to obtain the fully qualified domain name (i.e. FQDN) of the host and assume that as the node name. For clustering purposes this means it will only accept requests from other nodes that are able to reach the host through that name. Here is a simple example that illustrates this:
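A sketch of what that looks like; imacpro.home stands in for whatever your machine returns, and the cookie value is arbitrary (it just has to match across the nodes we want to cluster):

```shell
$ hostname -f
imacpro.home

$ erl -name n1 -setcookie secret
(n1@imacpro.home)1> node().
'n1@imacpro.home'
```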
Notice that imacpro.home is the same as the output of hostname -f; this is the FQDN of this host.
Let’s try running a new node in another shell and cluster the two. We’ll refer to n1 not by its FQDN but by the localhost address. It should still work, right? After all, we know that both nodes are running on the same machine…
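A sketch of that attempt, using the same cookie so only the hostname is in question; net_adm:ping/1 returns pang when the connection fails:

```shell
$ erl -name n2 -setcookie secret
(n2@imacpro.home)1> net_adm:ping('n1@127.0.0.1').
pang
```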
It failed! Ok, let’s try it out with localhost instead of 127.0.0.1:
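With long names (-name), localhost isn’t a fully qualified hostname, so on top of the failed ping you should see an error report along these lines:

```shell
(n2@imacpro.home)2> net_adm:ping('n1@localhost').
** System running to use fully qualified hostnames **
** Hostname localhost is illegal **
pang
```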
Now it’s failing with even more error output. Let’s try it with the FQDN then:
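Using the same name the host knows itself by:

```shell
(n2@imacpro.home)3> net_adm:ping('n1@imacpro.home').
pong
(n2@imacpro.home)4> nodes().
['n1@imacpro.home']
```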
Finally it was able to connect to n1 and cluster with it.
What our little experiment is telling us is that the FQDN that the node assumes and the one that other nodes use to cluster with it must match.
A practical check for this is running hostname -f and ensuring that, using this name, you are able to reach the host from whatever other nodes you’re looking to cluster with.
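For example, reusing the host from the experiment above:

```shell
# on the host that will run the node
$ hostname -f
imacpro.home

# from every other node that should join the cluster
$ ping -c 1 imacpro.home
```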
Let’s now take this knowledge to Kubernetes and see how it applies. We’ll again use the simple web server Erlang app as our testbed.
Build the Docker image and deploy it in a pod. While you’re at it, note one relevant field in deployment.yaml: the subdomain field. We’ll get to why it’s relevant shortly. Let’s attach to the container and find its FQDN:
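Something like the following, where the pod name is whatever name your deployment generated (this one is used throughout the rest of the guide):

```shell
$ kubectl exec -it simple-web-service-68b97dc4bf-qxwnl -- hostname -f
simple-web-service-68b97dc4bf-qxwnl.simple-web-service-headless.default.svc.cluster.local
```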
The name that this host knows itself by is <pod-name>.<subdomain>.<namespace>.svc.<zone>; so far so good. When we start a distributed Erlang node this is the name it will assume, and from the previous lesson we know that this must also be the name that the other nodes use when clustering. You’re probably wondering why the subdomain portion of the FQDN is set to simple-web-service-headless; this is relevant, but we’ll get to why it is so in a bit.
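For reference, here is a sketch of the deployment.yaml fields that produce this FQDN; the image name and port are illustrative, while the label and the subdomain are the parts that matter:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-web-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-web-service
  template:
    metadata:
      labels:
        app: simple-web-service
    spec:
      subdomain: simple-web-service-headless  # must match the headless service name
      containers:
        - name: simple-web-service
          image: simple-web-service:latest    # illustrative image name
          ports:
            - containerPort: 8080             # assumed app port
```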
We don’t really need a StatefulSet to cluster Erlang VMs together; all they need is a way to find each other. A common discovery pattern is a headless service coupled with DNS. In a nutshell, what you’ll need to do to achieve this:

- Create a headless service that groups all the pods in the deployment
- When the VM starts up in each container, perform a DNS lookup on the service name and find all the other hostnames behind the service
The first step is creating the headless service. Its name will be simple-web-service-headless and it will select all pods with the app: simple-web-service label.
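A sketch of the manifest; clusterIP: None is what makes the service headless, and the port number is an assumption carried over from the deployment:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: simple-web-service-headless
spec:
  clusterIP: None            # headless: no virtual IP, DNS resolves to the pods directly
  selector:
    app: simple-web-service  # groups all the pods of the deployment
  ports:
    - port: 8080             # assumed app port; SRV records will carry it
```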
So let’s check that; first, find out the FQDN of the headless service:
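One way is from inside one of the pods, where the cluster’s DNS search domains expand the short name (the server address and exact output format below are illustrative):

```shell
$ kubectl exec -it simple-web-service-68b97dc4bf-qxwnl -- nslookup simple-web-service-headless
Server:    10.96.0.10
Address:   10.96.0.10#53

Name:      simple-web-service-headless.default.svc.cluster.local
Address:   172.17.0.10
```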
Now that we know this, let’s fetch its SRV records:
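Run from a pod that has dig available; the priority, weight and port columns come from the service definition above:

```shell
$ dig SRV simple-web-service-headless.default.svc.cluster.local +short
0 100 8080 172-17-0-10.simple-web-service-headless.default.svc.cluster.local.
```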
By scaling up the number of replicas in the deployment we should get back more IP addresses:
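A sketch, with the second pod IP being illustrative:

```shell
$ kubectl scale deployment simple-web-service --replicas=2
deployment.apps/simple-web-service scaled

$ dig SRV simple-web-service-headless.default.svc.cluster.local +short
0 50 8080 172-17-0-10.simple-web-service-headless.default.svc.cluster.local.
0 50 8080 172-17-0-11.simple-web-service-headless.default.svc.cluster.local.
```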
Hm… we’re getting back two records; that’s expected, as we have two pods behind the service. What’s not convenient is the format of the returned DNS records: 172-17-0-10.simple-web-service-headless.default.svc.cluster.local.. As explained previously, we need this to match the name that the node knows itself by, which in this case is simple-web-service-68b97dc4bf-qxwnl.simple-web-service-headless.default.svc.cluster.local. The Erlang VM will deny the clustering request if we use the dashed IP address form of the hostname.
From Kubernetes version 1.13 onwards, CoreDNS is the default DNS server. It embeds a plugin to be used in the Kubernetes environment, and there’s a plugin option that is relevant to our interests:
endpoint_pod_names: uses the pod name of the pod targeted by the endpoint as the endpoint name in A records, e.g., endpoint-name.my-service.namespace.svc.cluster.local. in A 1.2.3.4
By default, the endpoint-name name selection is as follows: Use the hostname of the endpoint, or if hostname is not set, use the dashed form of the endpoint IP address
(e.g., 1-2-3-4.my-service.namespace.svc.cluster.local.) If this directive is included, then name selection for endpoints changes as follows:
Use the hostname of the endpoint, or if hostname is not set, use the pod name of the pod targeted by the endpoint. If there is no pod targeted by the endpoint, use the dashed IP address form.
This looks like exactly what we need: by setting this option we should get back DNS records with a pod name prefix instead of a dashed IP address. Let’s get to it:
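The option goes inside the kubernetes block of the Corefile, which lives in a ConfigMap in the kube-system namespace. The surrounding stanza varies from cluster to cluster, so treat this as a sketch:

```shell
$ kubectl -n kube-system edit configmap coredns
```

```
kubernetes cluster.local in-addr.arpa ip6.arpa {
    pods insecure
    endpoint_pod_names
    fallthrough in-addr.arpa ip6.arpa
}
```

Depending on your setup, you may need to recreate the CoreDNS pods for the change to take effect (e.g. kubectl -n kube-system delete pod -l k8s-app=kube-dns).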
And let’s retry the SRV lookup:
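With the same two pods as before (the second pod name suffix is illustrative):

```shell
$ dig SRV simple-web-service-headless.default.svc.cluster.local +short
0 50 8080 simple-web-service-68b97dc4bf-qxwnl.simple-web-service-headless.default.svc.cluster.local.
0 50 8080 simple-web-service-68b97dc4bf-d7xkq.simple-web-service-headless.default.svc.cluster.local.
```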
Alright, now we’re cooking: both names match, and we have everything ready to go ahead with the clustering. It should also now be clear why we decided on that specific subdomain deployment field: it needs to match the headless service name so that the entire hostname matches on both sides.
Summing up, the hostname that the node itself sees is:
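```
<pod-name>.<subdomain>.<namespace>.svc.<zone>
simple-web-service-68b97dc4bf-qxwnl.simple-web-service-headless.default.svc.cluster.local
```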
The SRV DNS record that resolves externally (with the endpoint_pod_names option applied) is:
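```
<pod-name>.<service-name>.<namespace>.svc.<zone>
simple-web-service-68b97dc4bf-qxwnl.simple-web-service-headless.default.svc.cluster.local
```

Since the subdomain equals the service name, the two names are identical.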
From here on it’s pretty straightforward. The following snippet finds all the hosts backing the headless service that we’ve created for the purpose of discovery:
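A minimal sketch of that snippet, assuming node names of the form <sname>@<fqdn> with a fixed sname (simple_web_service below is an assumption, not something mandated by the setup):

```erlang
-module(discovery).
-export([connect_peers/0]).

%% Look up the SRV records of the headless service and attempt to
%% connect to every node found behind it.
connect_peers() ->
    Service = "simple-web-service-headless.default.svc.cluster.local",
    %% inet_res:lookup/3 returns SRV data as {Priority, Weight, Port, Host}
    Records = inet_res:lookup(Service, in, srv),
    Nodes = [list_to_atom("simple_web_service@" ++ Host)
             || {_Prio, _Weight, _Port, Host} <- Records],
    %% net_adm:ping/1 returns pong on success, pang otherwise
    [net_adm:ping(Node) || Node <- Nodes].
```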