Setting up a low-cost Kubernetes Playground on GKE

With Microservices and Container Orchestration in high demand, it’s important to be able to take these technologies out for a spin and kick the tires. The most prominent of these is probably Kubernetes, which is heavily inspired by Google’s Borg, and one of the systems I missed the most when I left Google.

The easiest way I found to get up and running with Kubernetes is to use the Google Kubernetes Engine (GKE), which is part of the Google Cloud Platform suite. It allows you to spin up a cluster in a matter of a few minutes, has competitive pricing, and comes with a Free tier that may be sufficient for the occasional experimentation (especially if you use some cost-saving tricks).

All the GCP products can be managed through the web console; however, I prefer to use the shell for tutorials since they are far easier to follow, and copy-pasting a few commands is less ambiguous than describing buttons and links to click in a web UI.

Setting up gcloud and kubectl

Since we will be using the shell almost exclusively for this tutorial, we have to install the command line tools first. Start by downloading and installing the Google Cloud SDK. Once you have installed the SDK you should have the gcloud command available in your shell. Next, let’s authorize it to talk to your GCP account:

$ gcloud init

The output should look something like this, and a browser tab should have opened automatically to authorize gcloud to access your GCP account.

Your browser has been opened to visit:

    https://accounts.google.com/o/oauth2/auth?redirect_uri=http%3A%2F%2Flocalhost%3A8085%2F&prompt=select_account&response_type=code&client_id=0123456789.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fappengine.admin+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcompute+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Faccounts.reauth&access_type=offline

You are now logged in as [[email protected]].
Your current project is [None].  You can change this setting by running:
  $ gcloud config set project PROJECT_ID

After enabling access we can install the Kubernetes command line tool kubectl:

$ gcloud components install kubectl

This should install the command line utility kubectl. You can check that it is working by calling kubectl help.
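
For example, printing the client version is a quick way to confirm the binary is on your PATH (kubectl version --client only inspects the local binary and does not need a cluster yet):

$ kubectl version --client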

Create a project

All resources we will be using are grouped into a GCP project. So the first thing to do once we have configured the command line tools and given them access to the GCP account is to pick a project name and create it. The project ID is one of the very few identifiers that have to be globally unique (i.e., there may only be one project with a given ID on all of GCP, across all users), so you might have to try a few names before finding an unused one:

$ gcloud projects create kube-playground-1
Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/kube-playground-1].
Waiting for [operations/cp.8602796964951469078] to finish...done.

Next we’ll set some variables so that the following command line snippets are copy-pasteable:

$ export PROJECT_ID=kube-playground-1
$ export ZONE=us-central1-a
$ export CLUSTER=cluster-1
$ gcloud config set project $PROJECT_ID

Before we can start allocating resources to our newly created project we’ll have to enable billing with a valid credit card. Sadly I have not yet found a good way to do that from the shell, so we’ll need to use the web-based GCP console. Go to the project’s settings and enable billing. The address looks something like the following:

https://console.developers.google.com/project/${PROJECT_ID}/settings

Once we have billing enabled (don’t worry, we’ll soon reduce the cost to a minimum to keep the bill as small as possible), we can start enabling some services:

  • Google Compute Engine (GCE) allows us to create and manage virtual machines running in Google’s datacenters, and to create volumes that we can later attach to Kubernetes pods.
  • Google Kubernetes Engine (GKE), because that’s what we want to play with, right?

$ gcloud services enable compute.googleapis.com
$ gcloud services enable container.googleapis.com
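
If you want to double-check that the APIs are active, gcloud can list the services enabled for the current project; the two entries above should show up in the output:

$ gcloud services list --enabled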

Create and configure the Kubernetes cluster

Now we can proceed to create a new cluster. This will spin up a Kubernetes master managed by GCP and create an instance group, i.e., a group of identical virtual machines, with 3 instances. The Kubernetes master is currently free for clusters of up to 5 nodes. Each virtual machine will automatically connect to the master and register as a node, so that the master can start scheduling jobs on it. This is the first action that will cost you money, but we’ll immediately reduce our footprint and minimize our expenses again.

$ gcloud --project=$PROJECT_ID container clusters create $CLUSTER \
         --zone=$ZONE \
         --machine-type=f1-micro \
         --num-nodes=3 \
         --disk-size=10 \
         --no-enable-cloud-logging \
         --enable-autorepair \
         --preemptible

Since we are planning on using this cluster mainly for experimentation and want to keep the price minimal, I selected the smallest VM (--machine-type=f1-micro), disabled Stackdriver logging (--no-enable-cloud-logging), reduced the disk size to 10GB (--disk-size=10), and enabled preemptible nodes (--preemptible).

Stackdriver logging, while an incredibly useful tool, is also quite resource hungry. It uses fluentd to ship logs from the nodes to the log aggregator run by Google, which means every node has a 100MB to 200MB fluentd process sitting in memory; with the roughly 600MB of memory we get on the smallest nodes, that is a noticeable overhead. Preemptible nodes are regularly rebooted to make room for higher-priority tasks, but they also cost much less than their non-preemptible counterparts. Since Kubernetes restarts our services whenever a node gets rebooted, this has little impact on us beyond a few seconds of downtime every once in a while. For production setups I’d suggest using non-preemptible nodes though.

This command may take a while, so grab a coffee and wait for it to complete. Eventually you’ll see output along these lines.

NAME       LOCATION       MASTER_VERSION  MASTER_IP      MACHINE_TYPE  NODE_VERSION  NUM_NODES  STATUS
cluster-1  us-central1-a  1.8.10-gke.0    35.224.161.37  f1-micro      1.8.10-gke.0  3          RUNNING

Congratulations, you’ve just created your first Kubernetes cluster. Now let’s reduce its cost. As it stands you’re running 3 virtual machines, which is more than we need for occasional experimentation.

A little known cost-saving secret is that even though we had to create the cluster with a minimum number of 3 instances, we can actually reduce that to 1 (or even 0 if we want to suspend the cluster in-between hacking sessions), by editing the underlying instance group.

Let’s find the instance group we want to edit by using gcloud again. The name should look something like gke-clustername-default-pool-random-suffix.

$ gcloud --project=$PROJECT_ID compute instance-groups list
NAME                                     LOCATION       SCOPE  NETWORK  MANAGED  INSTANCES
gke-cluster-1-default-pool-0a671024-grp  us-central1-a  zone   default  Yes      3

$ export INSTANCE_GROUP=gke-cluster-1-default-pool-0a671024-grp

Once more we set the variable $INSTANCE_GROUP to the name from that command so the following commands are still copy-pasteable. Next we reduce the size to 1 so we are just using the free tier of 1 f1-micro instance per month.

$ gcloud compute instance-groups managed resize $INSTANCE_GROUP \
       --zone=$ZONE \
       --size=1

You could also reduce the size to 0 to suspend the entire cluster:

$ gcloud compute instance-groups managed resize $INSTANCE_GROUP \
       --zone=$ZONE \
       --size=0
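
When you want to resume hacking, the same command scales the group back up, and Kubernetes will start rescheduling pods as soon as the node has re-registered:

$ gcloud compute instance-groups managed resize $INSTANCE_GROUP \
       --zone=$ZONE \
       --size=1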

Configure kubectl to talk to the cluster

Finally we need to get the credentials for kubectl to talk to the cluster we just created:

$ gcloud --project=$PROJECT_ID container clusters get-credentials $CLUSTER \
       --zone=$ZONE

After that you should be able to call kubectl get nodes and see the virtual machines that comprise your cluster.
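
As a quick smoke test you can list the nodes and schedule a tiny test deployment. The name hello-nginx is just an example and nginx is an arbitrary public image; on this Kubernetes version kubectl run should create a Deployment for it:

$ kubectl get nodes
$ kubectl run hello-nginx --image=nginx --port=80
$ kubectl get pods
$ kubectl delete deployment hello-nginx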

Privacy in Lightning FAQ

A few days ago I was contacted by Rachel O’Leary from CoinDesk regarding an article she was writing about the privacy of Lightning payments compared to Bitcoin on-chain transactions: Will Lightning Help or Hurt Bitcoin Privacy?. These are very common questions, so I thought I might as well publish my raw answers.

I’d like to preface this by saying that privacy is neither a binary nor a linear thing. Privacy is a multi-faceted issue and comparing two systems is difficult; a tradeoff that is sensible in one setting may not be in another.

Why is privacy a consideration for Lightning developers?

Payments in the lightning network are fundamentally different from payments in bitcoin. We move away from a broadcast medium, in which everybody sees every transaction, towards a model in which only the interested parties, i.e., sender, recipient and intermediate hops, see a payment. This is what gives us the high scalability in the first place, however it also makes it easier to identify the endpoints of a payment. Without the onion routing, every payment would contain the destination’s address in order for the payment to be routed correctly, so every node along the way would know the destination, and with sufficient network access an observer could also infer who the sender is.

Will Lightning improve or worsen privacy?

That is hard to answer, since payments are so different. In bitcoin we have every payment permanently recorded in the blockchain, with a lot of research already showing the possibility of creating extensive user profiles, tracing payments and deanonymizing many users. It takes a lot of effort to stay truly anonymous in bitcoin.

Using lightning we don’t create that permanent record for payments in the first place, so this kind of analysis is no longer possible. To learn about a payment you now have to be part of the route that the payment went through; otherwise you’ll be completely unaware of anything happening. Even if you are part of the route, all you learn is that you got some funds from one side, along with instructions where to forward them next. You do not learn anything about the sender, the recipient, or even your position in the route.

This protection is certainly not perfect: as some have pointed out, timing attacks can identify relationships between individual hops, and public knowledge of the network’s topology may allow attackers to infer additional information. However, I personally believe that we are in a better position. The payment information is ephemeral, and attackers need substantially more resources and access to learn anything about the payments that occur in the network. It’s not perfect, but it is a step forward.

Why is there dispute about this?

The dispute mostly compares the security of our onion routing with Tor’s. The key observation is that our routing is restricted to follow the topology of the overlay network created by lightning. This is different from Tor, where each hop in a circuit can be chosen arbitrarily from a set of public nodes, whereas we are restricted to choosing adjacent nodes.

This criticism is valid, and may impact the mix quality, though I can’t say what the exact impact is. While we are bound to follow the topology, it also means that to get from one point to another we have to traverse more hops, which again improves our mix quality. While Tor uses just a few hops for reasonable privacy, Lightning can use up to 20 hops, counteracting the limited choice for each individual hop.

This is less of a dispute about the privacy of bitcoin versus lightning, instead it is more about the onion routing as implemented in lightning versus the Tor network.

If I wanted to send a private Lightning transaction now, could I?

Ideally the client would always attempt to maximize your privacy for you, so there wouldn’t be anything specific required on your part. The countermeasures that all three teams (c-lightning, eclair, and lnd) are currently implementing comprise topology randomization and route randomization. Topology randomization tries to avoid having hubs that can observe traffic, by opening channels in a random fashion, which also strengthens the network as a whole against single points of failure. Route randomization involves computing multiple routes and choosing one of them at random. This may result in slightly longer routes, and thus slightly higher fees, but increases privacy a lot by making the routes less predictable.

How will Hornet improve on some of these problems?

Hornet is a protocol that optimizes bidirectional communication along an existing circuit that was established using the sphinx protocol. We implement sphinx for our onion routing, so that determines the route that all subsequent communication would take. We stepped away from the idea of implementing hornet as part of lightning since it’d create a general purpose communication layer that we don’t really need. With its single roundtrip design sphinx perfectly matches our requirements for a payment, and adding a communication layer that is not bound to payments could potentially have unintended consequences. We want lightning to be a payment network, not a way to stream movies anonymously :-)

What other privacy proposals also address this?

We could separate the onion routing from the end-to-end connectivity, i.e., first creating a publicly routed base network that allows us to send payments from A to B, and then build onion routing on top of that. This would more closely match the Tor network in which the base network is TCP/IP and the onion routing is built on top. However, in the first iteration we decided that all payments should be private and thus use onion routing, and not allow users to skip the onion routing and just use the base routing. This may eventually be picked up again, but so far it’s not on the roadmap.

Blockchain Meetup Talk

I’m a big fan of meetups; our first Zurich Bitcoin Meetup was all the way back in 2011, with just 4 people attending (more on that another time). The Zurich Bitcoin meetups have since become far more popular and better organized, thanks to Lucas, who has been organizing the locations and speakers.

And so it was my turn to speak about my research at the latest Meetup:

The talk has quite a large introduction about how Bitcoin works and why it does not scale. In the second part we talk about Duplex Micropayment Channels, how Payment Service Providers could emerge to build a fast payment network on top of Bitcoin and what some of the remaining challenges are.

I had a lot of fun, and it is always nice to have such an interested crowd. If you get a chance to give a talk at a meetup, go for it.

In the talk I said that most of this information is in my dissertation, so for those who would like to read up on these technologies the dissertation is available at Amazon: On The Scalability and Security of Bitcoin.

Getting git to play nicely with CDNs

Git is a really cool version control system. So cool, in fact, that I decided to use it to distribute the project I’m working on to several hundred Planetlab nodes. So I went ahead and created a repository with `git init --bare` somewhere under the document root of my local Apache2. Using pssh we can clone and pull from the repository simply by specifying the URL to that repo.
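
In case you want to replicate the setup, a minimal sketch could look like the following; the document root path is just an example, and the post-update hook (which simply runs git update-server-info) keeps the ref information up to date for plain-HTTP clients:

$ cd /srv/www/htdocs                              # wherever your Apache document root lives
$ git init --bare myproject.git
$ cd myproject.git
$ mv hooks/post-update.sample hooks/post-update   # the sample hook just calls git update-server-info
$ chmod +x hooks/post-update
$ git update-server-info                          # prime info/refs once so clients can discover the refs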

Obviously the traffic is still pretty high; after all, every request still ends up at my machine, so I have to serve the whole repository once for each node. Then I stumbled over CoralCDN, a free content distribution network that runs on Planetlab. So instead of cloning directly from my machine I took the URL of the repo, appended .nyud.net to the domain, and cloned from that URL instead.
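
For example, if the repository normally lives at http://myhost.example.org/myproject.git (a made-up URL), the CoralCDN-cached clone just appends .nyud.net to the host name:

$ git clone http://myhost.example.org/myproject.git           # direct: every node hits my server
$ git clone http://myhost.example.org.nyud.net/myproject.git  # cached by CoralCDN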

The drop in traffic when cloning was immediate and I was happily working with this setup for some time. Then I noticed that having the CDN cache the contents has its drawbacks: if I want to push changes in quick succession, say because I noticed a typo just after issuing the update, I have to wait for the cache to time out.

To solve this problem we let the object files, which never change thanks to git’s content-addressable design, be cached for longer, and set a short caching time for the few files that do change. Placing this .htaccess file in the repository and activating `mod_headers` and `mod_expires` should do the trick:

ExpiresActive On
ExpiresDefault A300
Header append Cache-Control "public"

<FilesMatch "(info|branches|refs|HEAD)">
  ExpiresDefault A10
  Header append Cache-Control "must-revalidate"
</FilesMatch>

This sets everything to be cacheable for 5 minutes (300 seconds), except the references, which tell git where to look for the content.
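
How the two modules get enabled depends on the distribution; where the a2enmod helper is available (e.g. openSUSE or Debian-style Apache installs) something like this should do, otherwise add the corresponding LoadModule lines to your httpd.conf:

$ sudo a2enmod headers
$ sudo a2enmod expires
$ sudo /etc/init.d/apache2 restart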

Bitcoin's Getting Some Traction

It’s an amazing time to be part of the Bitcoin family. With the Wikileaks scandal we had some quite heated discussions on whether to promote ourselves as an alternative way for them to acquire funds, but in the end we decided not to, preferring not to be associated with an organization being investigated by several countries. However, the decision seems to have already been made for us: as this article in PCWorld demonstrates, we are not the only ones making that connection.

Furthermore, people are investing more and more resources into Bitcoin as confidence in the future of the currency grows. Currently the Bitcoin economy, containing 4’464’000 coins, is worth just short of 1 million USD (MtGox). Meanwhile the growing interest has pushed the difficulty to generate blocks (the means to acquire new coins and confirm transactions) to incredible heights, and newcomers are getting frustrated at how long it takes them to earn their first real coins. Luckily the Bitcoin Faucet and a pooled mining effort should counteract part of this problem, but the trend is quite clear: people that do not invest heavily into GPUs will have nearly no chance of accumulating large quantities just by mining. But then again, where else does a country just hand you freshly printed money?

In the meantime a lot of discussion is going on about improvements to the protocol and about what should be part of the Bitcoin ecosystem; in particular, an alternative DNS system is under discussion, which would piggyback on the currency’s transactions.

That should be it for now. If you’re interested, why not give Bitcoin a try, join us on the Forum, or read up on the latest Developer discussions?

Migrating to JRockit

I’ve been bothered by the now famous PermGen Space error quite often while developing a web application on a local Jetty instance, and I was hoping that the problem wouldn’t prove to be that serious once deployed on a Tomcat server, but quite the opposite is the case.

The problem happens when the JVM runs out of permanent generation heap space, which most of the time is due to classloaders not being garbage collected correctly. Permanent generation heap space is an optimization in the Sun JVM to speed up object creation, but the default size is too small if classes are loaded and unloaded often at runtime, which is exactly how most application servers load applications. So the first, quick and dirty, solution would be to enlarge the permanent generation heap space: -XX:MaxPermSize=256m. Sadly, this still doesn’t get rid of the problem. Another solution is to use a completely different JVM altogether: JRockit.
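
For completeness, this is roughly how the quick-and-dirty MaxPermSize fix is applied before giving up on the Sun JVM; catalina.sh picks up JAVA_OPTS, and for a local Jetty you would pass the flag to java directly (paths and values here are just an example):

$ export JAVA_OPTS="$JAVA_OPTS -XX:MaxPermSize=256m"
$ $CATALINA_HOME/bin/catalina.sh run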

JRockit, a proprietary Java Virtual Machine (JVM) from BEA Systems, became part of Oracle Fusion Middleware in 2008. Many JRE class files distributed with BEA JRockit exactly replicate those distributed by Sun. JRockit overrides class files which relate closely to the JVM, therefore retaining API compatibility while enhancing the performance of the JVM. [from Wikipedia]

I wasn’t thrilled about having to change JVMs, because JRockit isn’t available in the openSuse repositories at all, and I wasn’t quite sure how hard it would be to make the switch. As it turns out, it’s incredibly easy.

Getting the package

Getting your hands on the JRockit installation package isn’t all that easy, because BEA became part of Oracle and everything is still in transition. The download location is http://edelivery.oracle.com/, where you’ll be greeted by a wizard to select the products to download. JRockit can be found under BEA Products and then BEA WebLogic Media Pack; scrolling down you’ll find the zip package you need for your operating system.

Installation

Installation is straightforward: just unzip the archive and then execute the contained installer:

$ unzip B46961-01.zip
Archive:  B46961-01.zip
  inflating: jrockit-R27.5.0-jdk1.6.0_03-linux-x64.bin
$ chmod +x jrockit-R27.5.0-jdk1.6.0_03-linux-x64.bin
$ sudo ./jrockit-R27.5.0-jdk1.6.0_03-linux-x64.bin

Now all you have to do is follow the instructions of the installer. When asked for a location to install JRockit into, I used /opt/jrockit, but any location will do just fine. The next step is optional, but if you use update-alternatives I strongly suggest you do it. We’ll add the JRockit java and the JRockit compiler (javac) as alternatives:

update-alternatives --install /usr/bin/java java /opt/jrockit/bin/java 300
update-alternatives --install /usr/bin/javac javac /opt/jrockit/bin/javac 300

So when running update-alternatives we see the JRockit VM:

$ update-alternatives --config java
There are 2 programs which provide `java’.

  Selection    Command
 -----------------------------------------------
 +    1        /usr/lib64/jvm/jre-1.6.0.u7-sun/bin/java
 *    2        /opt/jrockit/bin/java

Enter to keep the default[*], or type selection number:

So now we can easily switch between the Sun VM and the JRockit VM. That’s it. Now just check to see if we really have the JRockit VM and we’re ready to code:

$ java -version
java version "1.6.0_03"
Java(TM) SE Runtime Environment (build 1.6.0_03-b05)
BEA JRockit(R) (build R27.5.0-110_o-99226-1.6.0_03-20080528-1505-linux-ia32, compiled mode)