Setting up a low-cost Kubernetes Playground on GKE

With microservices and container orchestration being in high demand, it’s important to be able to take these technologies out for a spin and kick the tires. The most prominent of these is probably Kubernetes, which is heavily inspired by Google’s Borg, and one of the systems I missed the most when I left Google.

The easiest way I found to get up and running with Kubernetes is to use the Google Kubernetes Engine (GKE), which is part of the Google Cloud Platform suite. It allows you to spin up a cluster in a matter of a few minutes, has competitive pricing, and comes with a Free tier that may be sufficient for the occasional experimentation (especially if you use some cost-saving tricks).

All GCP products can be managed through the web console; however, I prefer to use the shell for tutorials since they are far easier to follow, and copy-pasting a few commands is less ambiguous than describing buttons and links to click in a web UI.

Setting up gcloud and kubectl

Since we will be using the shell almost exclusively for this tutorial, we have to install the command line tools first. That means downloading and installing the Google Cloud SDK. Once you have installed the SDK, the gcloud command should be available in your shell.
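
If you don’t have the SDK yet, one way to install it on Linux or macOS is the interactive installer; this is just one option, see https://cloud.google.com/sdk for platform-specific packages:

$ curl https://sdk.cloud.google.com | bash   # download and run the interactive installer
$ exec -l $SHELL                             # restart the shell so gcloud is on the PATH

With the SDK installed, let’s authorize it to talk to your GCP account: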

$ gcloud init

The output should look something like this, and a browser tab should open automatically so you can authorize gcloud to access your GCP account.

Your browser has been opened to visit:

    https://accounts.google.com/o/oauth2/auth?redirect_uri=http%3A%2F%2Flocalhost%3A8085%2F&prompt=select_account&response_type=code&client_id=0123456789.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fappengine.admin+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcompute+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Faccounts.reauth&access_type=offline

You are now logged in as [your-account@example.com].
Your current project is [None].  You can change this setting by running:
  $ gcloud config set project PROJECT_ID

After enabling access we can install the Kubernetes command line tool kubectl:

$ gcloud components install kubectl

This should install the command line utility kubectl. You can check that it is working by calling kubectl help.
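
If kubectl help prints the usage information, everything is in place. Another quick sanity check that doesn’t need a cluster yet is asking for the client version only:

$ kubectl version --client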

Create a project

All the resources we will be using are grouped into a GCP project. So the first thing to do, once we have configured the command line tools and given them access to the GCP account, is to pick a project name and create it. The project ID is also one of the very few identifiers that have to be globally unique (i.e., there may only be one project with a given ID across all of GCP and all of its users), so you might have to try a few names before finding an unused one:

$ gcloud projects create kube-playground-1
Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/kube-playground-1].
Waiting for [operations/cp.8602796964951469078] to finish...done.
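
To double-check that the project exists, you can list your projects; the new one should show up:

$ gcloud projects list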

Next we’ll set some variables so that the following command line snippets are copy-pasteable:

$ export PROJECT_ID=kube-playground-1
$ export ZONE=us-central1-a
$ export CLUSTER=cluster-1
$ gcloud config set project $PROJECT_ID
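
A quick way to confirm that gcloud now points at the right project is to print the active configuration:

$ gcloud config list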

Before we can start allocating resources to our newly created project we’ll have to enable billing with a valid credit card. Sadly I have not yet found a good way to do that from the shell, so we’ll need to use the web-based GCP console. Go to the project’s settings and enable billing. The address looks something like the following:

https://console.developers.google.com/project/${PROJECT_ID}/settings
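
If your SDK ships the beta billing commands, they may let you link a billing account without leaving the shell; consider this a hedged alternative rather than the path I followed (the billing account ID below is a placeholder):

$ gcloud beta billing accounts list                         # find your billing account ID
$ gcloud beta billing projects link $PROJECT_ID --billing-account=XXXXXX-XXXXXX-XXXXXX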

Once we have billing enabled (don’t worry, we’ll soon scale things down to keep the bill as small as possible), we can start enabling some services:

  • Google Compute Engine (GCE) allows us to create and manage virtual machines running in Google’s datacenters, and create volumes that we can use to attach to Kubernetes pods later.
  • Google Kubernetes Engine (GKE), because that’s what we want to play with, right?

Both can be enabled from the shell:

$ gcloud services enable compute.googleapis.com
$ gcloud services enable container.googleapis.com
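
You can verify that both APIs are active by listing the enabled services for the project:

$ gcloud services list --enabled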

Create and configure the Kubernetes cluster

Now we can proceed to create a new cluster. This will spin up a Kubernetes master managed by GCP and create an instance group, i.e., a group of identical virtual machines, with 3 instances. The Kubernetes master is currently free for clusters of up to 5 nodes. Each virtual machine will automatically connect to the master and register itself as a node, so that the master can start scheduling jobs on it. This is the first action that will cost you money, but we’ll immediately reduce our footprint and minimize our expenses again.

$ gcloud --project=$PROJECT_ID container clusters create $CLUSTER \
         --zone=$ZONE \
         --machine-type=f1-micro \
         --num-nodes=3 \
         --disk-size=10 \
         --no-enable-cloud-logging \
         --enable-autorepair \
         --preemptible

Since we are planning on using this cluster mainly for experimentation and want to keep the price minimal, I selected the smallest VM (--machine-type=f1-micro), disabled Stackdriver logging (--no-enable-cloud-logging), reduced the disk size to 10GB (--disk-size=10), and enabled preemptible nodes (--preemptible).

Stackdriver logging, while an incredibly useful tool, is also quite resource hungry. It uses fluentd to ship logs from the nodes to the log aggregator run by Google. This means that every node has a 100MB - 200MB fluentd process sitting in memory, which is a noticeable overhead given the roughly 600MB of memory we get on the smallest nodes. Preemptible nodes are regularly rebooted to make room for higher-priority tasks, but they also cost much less than their non-preemptible counterparts. Since Kubernetes restarts our services whenever a node gets rebooted, this doesn’t have a huge impact on us, apart from a few seconds of downtime every once in a while. For production setups I’d suggest using the non-preemptible version though.

This command may take a while, so grab a coffee and wait for it to complete. Eventually you’ll see the following output.

NAME       LOCATION       MASTER_VERSION  MASTER_IP      MACHINE_TYPE  NODE_VERSION  NUM_NODES  STATUS
cluster-1  us-central1-a  1.8.10-gke.0    35.224.161.37  f1-micro      1.8.10-gke.0  3          RUNNING
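
You can also list the clusters in the project at any time to check their status:

$ gcloud --project=$PROJECT_ID container clusters list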

Congratulations, you’ve just created your first Kubernetes cluster. Now let’s reduce its cost. As it stands you’re running 3 virtual machines around the clock, which is more than we need for occasional experimentation.

A little-known cost-saving secret is that even though we had to create the cluster with a minimum of 3 instances, we can actually reduce that to 1 (or even 0 if we want to suspend the cluster in between hacking sessions) by editing the underlying instance group.

Let’s find the instance group we want to edit by using gcloud again. The name should be something like gke-clustername-default-pool-random-suffix.

$ gcloud --project=$PROJECT_ID compute instance-groups list
NAME                                     LOCATION       SCOPE  NETWORK  MANAGED  INSTANCES
gke-cluster-1-default-pool-0a671024-grp  us-central1-a  zone   default  Yes      3

$ export INSTANCE_GROUP=gke-cluster-1-default-pool-0a671024-grp

Once more we set the variable $INSTANCE_GROUP to the name from that command so the following commands are still copy-pasteable. Next we reduce the size to 1 so we are just using the free tier of 1 f1-micro instance per month.

$ gcloud compute instance-groups managed resize $INSTANCE_GROUP \
       --zone=$ZONE \
       --size=1

You could also reduce the size to 0 to suspend the entire cluster:

$ gcloud compute instance-groups managed resize $INSTANCE_GROUP \
       --zone=$ZONE \
       --size=0
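
To resume hacking later, simply resize back to 1. Either way, you can check how many instances the group currently has:

$ gcloud compute instance-groups managed list-instances $INSTANCE_GROUP \
       --zone=$ZONE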

Configure kubectl to talk to the cluster

Finally we need to get the credentials for kubectl to talk to the cluster we just created:

$ gcloud --project=$PROJECT_ID container clusters get-credentials $CLUSTER \
       --zone=$ZONE

After that you should be able to call kubectl get nodes and see the virtual machines that comprise your cluster.
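
As a final sanity check you can also ask for the cluster endpoints and the system pods; if those come back, the playground is ready to use:

$ kubectl cluster-info
$ kubectl get pods --all-namespaces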