Kubernetes on Rails: now free for everyone!


Back in mid-2018, I slogged through learning Kubernetes in order to deploy a Rails web app using it, and I spent quite a bit of time turning that knowledge into a series of detailed blog posts.

A little while after completing those blog posts, I was going to make an editorial pass to tighten things up when I realized the content would be much better delivered via screencast rather than written out into long instructions and screenshots. In a fit of mania, I decided to do that and charge a small amount of money for access to the videos.

Now, over two years and nearly 150 customers later, purchases have died down a bit. I still get the odd purchase every couple of weeks or so, including one this week, but at this point I feel bad: although I believe the meat of the content is still valuable, some of the ecosystem has changed slightly, so the material isn’t strictly up-to-date in the handheld, step-by-step fashion I intended for the videos.

Therefore I’ve decided to make the screencasts freely available for all, and if you feel like you got something from them worthy of remuneration you can just PayPal me here:

Customer thank you

I’m so humbled that almost 150 people bought the screencasts. I wish I had asked for everybody’s first name when accepting payment so that I could display them all here (with their permission).

One thing that surprised me that I’m able to share is all the different countries that purchasers hailed from!

🇦🇷 (Argentina)
🇦🇹 (Austria)
🇦🇺 (Australia)
🇧🇦 (Bosnia and Herzegovina)
🇧🇪 (Belgium)
🇧🇷 (Brazil)
🇧🇾 (Belarus)
🇨🇦 (Canada)
🇨🇷 (Costa Rica)
🇩🇪 (Germany)
🇩🇰 (Denmark)
🇪🇸 (Spain)
🇫🇷 (France)
🇬🇧 (United Kingdom)
🇬🇷 (Greece)
🇬🇹 (Guatemala)
🇭🇰 (Hong Kong)
🇮🇩 (Indonesia)
🇮🇪 (Ireland)
🇮🇳 (India)
🇯🇵 (Japan)
🇰🇷 (South Korea)
🇱🇺 (Luxembourg)
🇲🇰 (Macedonia)
🇲🇽 (Mexico)
🇲🇾 (Malaysia)
🇳🇱 (Netherlands)
🇳🇴 (Norway)
🇳🇿 (New Zealand)
🇵🇦 (Panama)
🇵🇱 (Poland)
🇷🇴 (Romania)
🇷🇺 (Russia)
🇸🇪 (Sweden)
🇸🇬 (Singapore)
🇸🇻 (El Salvador)
🇹🇭 (Thailand)
🇹🇷 (Turkey)
🇺🇦 (Ukraine)
🇺🇸 (United States)
🇺🇾 (Uruguay)
🇿🇦 (South Africa)

If you bought the screencast and would like a shout-out here let me know and I will gladly post your name or @ or whatever you want right here.

Episodes

Without further ado, here’s the content:

Episode 1: Intro

Recorded: 2018/12/05
Duration: 08:20

We’ll clone the starter files repo in preparation for working through the course. We’ll then take a peek under the covers of Captioned Image Uploader, the example Rails application that we’ll be deploying to Kubernetes throughout the rest of the course, and spin it up locally.

Show notes

Google Cloud signup
Starter files GitHub repo

Episode 2: Introduction to Google Cloud

Duration: 25:40

We’ll register for a Google Cloud account, create a project, and prep for our application deployment by creating our database, our GKE cluster, building and pushing our Docker image, and so on using both the GCP Web console as well as the gcloud CLI.

Show notes

Google Cloud SDK install instructions
Google Cloud resource hierarchy
Container Registry quickstart
Access scopes must match IAM role permissions (“You must set access scopes on the instance to authorize access.”)

Errata

  • In the video I made a mistake when I untar'ed and installed gcloud to the /tmp directory. Don't do this, because the installer will modify your shell's path to look in /tmp for gcloud. Instead, untar and install from your home directory - that's where gcloud should live. If you already extracted and installed to /tmp it's not a big deal, though; you can just reinstall.
  • Turns out that when creating the GKE cluster, under "Advanced options" there is an "Enable VPC-native" checkbox you can check which will enable private IP networking. So if you do that you won't need to copy and paste the blob of CLI arguments to create the GKE cluster.
  • At 17:58 we give the cluster user "Full" access to Storage; on review I don't believe we needed to modify that as later on in the series we will be creating a Service Account which will have the necessary Storage permissions.

Episode 3: Introduction to Kubernetes concepts

Duration: 21:24

A guided talk through the fundamental Kubernetes resources that we’ll use to build our deployment. We’ll learn about Pods, Deployments, Jobs, CronJobs, Services, and Ingresses, and sketch a diagram of how they’ll all fit together to run our app.

Show notes

Pods documentation
Pod manifest example
Deployments documentation
Jobs documentation
Service documentation
Ingress documentation
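
To make the relationship between a Deployment and its Pods a bit more concrete, here is a minimal sketch of a Deployment manifest, written to a scratch file from the shell. The resource name, image path, and port are placeholders assumed for illustration (they are not taken from the videos), and the manifest targets the current apps/v1 API.

```shell
# Write a minimal Deployment manifest to a scratch directory.
# All names, the image path, and the port are hypothetical placeholders.
workdir="$(mktemp -d)"
manifest="$workdir/deployment.yaml"

cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: captioned-image-uploader
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most one extra Pod during a rollout
      maxUnavailable: 0  # never drop below the desired replica count
  selector:
    matchLabels:
      app: captioned-image-uploader   # must match the Pod template labels below
  template:                           # the Pod template the Deployment stamps out
    metadata:
      labels:
        app: captioned-image-uploader
    spec:
      containers:
      - name: web
        image: gcr.io/my-project/captioned-image-uploader:latest
        ports:
        - containerPort: 3000
EOF

echo "Wrote $manifest"
# With kubectl pointed at a cluster, it could then be created with:
#   kubectl apply -f "$manifest"
```

The selector and the template labels are the part worth staring at: the Deployment only manages Pods whose labels match its selector, so the two have to agree.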

Episode 4: Deploying our code

Duration: 34:39

We’ll get kubectl installed and connected to our GKE cluster, start using it to manipulate our cluster, write manifests for the Kubernetes resources we’ll need (Job, Deployment, Secrets, Service), and finally create them to get our application up and running! 🤩

Show notes

What we learned

  • Installing kubectl with gcloud components install
  • gcloud container clusters get-credentials standard-cluster-1 to tell kubectl to use the GKE cluster named standard-cluster-1
  • Kubernetes Jobs
    • Writing a manifest to run our database migration
    • Deleting a job
      kubectl 
      
  • Kubernetes manifests
    • How the template: key defines a Pod template for many different resource types
  • gcloud container images list to list available Docker images
  • gcloud sql instances list to get private IP address of SQL instance
  • gcloud sql users list --instance=captioned-image-db to get list of SQL users for instance
  • gcloud sql users set-password postgres --password=foobar to change postgres user password to foobar
  • Kubernetes Secrets
    • How to reference in manifests
    • 12:15 How to create:
      kubectl create secret generic app-secrets --from-literal=DATABASE_URL=postgres://...
      
    • 13:38 How to edit existing with
      kubectl edit secret app-secrets
      
    • They’re stored encoded with base-64
      • 19:30 Encoding plaintext into base-64 and copying to clipboard on Linux CLI using
        echo -n "whatever" | base64 --wrap=0 | xclip
        
  • kubectl commands
    • kubectl get jobs to list jobs (add -w flag to watch and update on changes)
    • kubectl get pods to list pods
    • kubectl logs db-migrate-qbxh6 to view logs (add -f flag to follow logs and update on changes)
    • kubectl delete jobs/db-migrate to delete <resource_type>/<resource_name>
  • Kubernetes Deployments
    • How to write manifest
    • Different strategies, surge, and unavailability settings
    • How the selector makes the Deployment apply to Pods with that label
    • Creating a Service for the Deployment using kubectl expose
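
Since Secret values are only Base64-encoded (not encrypted), it can help to see the round trip on the command line. This is a minimal sketch, assuming a base64 binary that accepts --decode (GNU coreutils and recent macOS both do). The plaintext is the same throwaway "whatever" value used above, and tr -d '\n' stands in for GNU's --wrap=0 so the snippet is a bit more portable.

```shell
# Encode a plaintext value the way it appears under `data:` in a Secret.
# printf avoids the trailing newline that a bare `echo` would add.
plaintext="whatever"
encoded="$(printf '%s' "$plaintext" | base64 | tr -d '\n')"
echo "$encoded"

# Decode it again, e.g. to check a value copied out of `kubectl edit secret`.
decoded="$(printf '%s' "$encoded" | base64 --decode)"
echo "$decoded"
```

The first echo prints d2hhdGV2ZXI= and the second prints whatever again.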

Errata

  • 18:15 I said you could just update a Job’s manifest and re-apply it and it will fix itself. This is true of most resource types; however, I think this is actually not the case with Jobs - you have to delete the job and recreate it.
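
The delete-and-recreate dance from the erratum above can be wrapped in a small shell function. This is only a sketch: the function name is made up, the manifest path is whatever file you keep your Job in, and --ignore-not-found is there so the first run (when no Job exists yet) doesn't fail.

```shell
# Recreate a Job from an edited manifest: delete the old Job, then apply.
# Arguments: job name, path to the manifest file (both supplied by the caller).
recreate_job() {
  job_name="$1"
  manifest="$2"
  kubectl delete "jobs/$job_name" --ignore-not-found &&
    kubectl apply -f "$manifest"
}
```

For the migration Job in this episode that would look something like recreate_job db-migrate db-migrate-job.yaml, where the file name is a hypothetical one.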

Addenda

  • At 31:35 we look at the logs for the running Rails server container, however you only see the Puma startup output. This is because the rest of the output is being written to a log file instead of output to STDOUT which is what the kubectl logs command is reading from. I didn’t bother in the screencast but we could change this behavior in Rails 5 by setting the RAILS_LOG_TO_STDOUT environment variable. Interestingly in my experience Stackdriver (GCP’s logging + monitoring solution, which we also didn’t explore) seems to be smart enough to read from the log file so it’s not a big deal.
  • One other command I forgot to mention that is pretty neat is kubectl scale, which lets you resize the number of Pods in the Deployment without having to edit and re-apply a manifest. Useful for quickly scaling up if you’re experiencing sudden load. Try it out!
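
Both addenda above boil down to one-liners, sketched here as shell functions. The function names are invented for illustration and the Deployment name is passed in by the caller; kubectl scale and kubectl set env are the real subcommands doing the work. Note that changing an environment variable edits the Pod template, so it should trigger a rollout.

```shell
# Resize a Deployment without editing its manifest.
# Arguments: deployment name, desired number of replicas.
scale_deployment() {
  kubectl scale "deployment/$1" --replicas="$2"
}

# Ask Rails to log to STDOUT so `kubectl logs` shows more than Puma startup.
# Argument: deployment name.
log_to_stdout() {
  kubectl set env "deployment/$1" RAILS_LOG_TO_STDOUT=true
}
```

Usage would be along the lines of scale_deployment my-app 5, with my-app standing in for whatever you named your Deployment.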

Episode 5: Fixing image upload using Google Cloud Storage

Duration: 14:45

It’s alive! 😍 But it’s got a problem. 😭 We’ll fix an issue with image uploads by setting up Google Cloud Storage. Along the way we’ll learn how to use IAM Service Accounts and how to pop a remote Rails console.

Show notes

Shrine Google Cloud Storage plugin
Shrine initializer code gist

What we learned

  • kubectl exec
    • Opened a remote Rails console
  • Google Cloud Storage
  • Kubernetes Secrets
    • Creation and editing
  • Google Cloud IAM Service Accounts, Roles

Episode 6: Ingress, domain name, and HTTPS!

Duration: 30:47

So far we’ve been accessing our application directly over an internal Service. We’ll replace this with a more scalable solution by creating our first Ingress, giving it a domain name, and getting a TLS certificate through Let’s Encrypt to enable HTTPS.

Show notes

Duck DNS
cert-manager
My cert-manager v0.5.2 bug workaround
Let’s Encrypt
Helm installation
IPv6 website validation

What we learned

  • Kubernetes Ingress
    • Writing a manifest
    • Connecting to a service port
    • Assigning a global static IP
    • Authoring hostname and path rules
    • TLS configuration
  • Helm package installation
  • cert-manager
    • Issuer resource type
    • Certificate resource type
    • Annotating Ingresses to do ACME HTTP01 Let’s Encrypt dance
      • How cert-manager modifies our Ingress to make /.well-known/acme-challenge path available to Let’s Encrypt
  • kubectl get shortnames (kubectl get svc vs kubectl get services)
  • GKE Ingress specifics
    • Need to make separate Ingress to support IPv6
    • IPv6 Ingresses are free
    • 28:28 GKE Ingress can’t force TLS - use the rack-ssl-enforcer gem, or your reverse proxy config if you’re using, say, nginx
  • 27:09 Grouping multiple related resources into a single manifest is a best practice
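
As a rough picture of how the Ingress pieces listed above fit together, here is a sketch that writes an Ingress manifest to a scratch file. It is written against the current networking.k8s.io/v1 API, which is newer than what the videos used, and every name in it (the Ingress, the static IP, the hostname, the Secret, the Service) is a placeholder assumed for illustration. The cert-manager annotations are left out on purpose, since their names have changed between cert-manager versions.

```shell
# Write a sketch of a GKE Ingress manifest to a scratch directory.
# All resource names and the hostname are hypothetical placeholders.
workdir="$(mktemp -d)"
manifest="$workdir/ingress.yaml"

cat > "$manifest" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    # Name of a reserved global static IP (GKE-specific annotation).
    kubernetes.io/ingress.global-static-ip-name: app-static-ip
    # cert-manager annotations are omitted; their names differ between versions.
spec:
  tls:
  - hosts:
    - example.duckdns.org      # not read by GKE, but needed by cert-manager
    secretName: app-tls        # Secret that holds the certificate and key
  rules:
  - host: example.duckdns.org
    http:
      paths:
      - path: /*
        pathType: ImplementationSpecific
        backend:
          service:
            name: app-service
            port:
              number: 80
EOF

echo "Wrote $manifest"
```

The static IP annotation refers to the IP by its name in GCP, not by its address, so the IP has to be reserved before the Ingress is created.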

Addenda

  • I knowingly say “TLS certificate” instead of the more correct “X.509 certificate” for simplicity’s sake. Let’s Encrypt uses the same wording on their site so I think that’s okay.
  • One thing I wanted to mention but forgot to in the episode is that we’re using the HTTP-01 ACME challenge type which is the only challenge type we can use with a free Duck DNS domain name. However there is also a DNS-01 challenge type which responds to challenges by creating TXT records. In my experience the DNS-01 challenge type works a lot smoother with cert-manager than the HTTP-01, and it also enables the creation of wildcard certificates. We couldn’t do this in the screencast however because it would require viewers to buy a domain name and set up GCP Cloud DNS as the DNS provider.
  • Interestingly, the GKE Ingress doesn’t even read the hosts field of the tls spec, however it is needed by cert-manager to make the Let’s Encrypt request.
  • At 13:17 I mentioned we’re creating a Kubernetes resource in the kube-system namespace. I probably should’ve used this as an opportunity to talk a bit more about namespaces and how they can be used to separate applications. So instead I encourage you to read the documentation on them yourself. One handy flag worth mentioning is --all-namespaces; for instance to see all the pods running in your cluster you can do:

    kubectl get pods --all-namespaces
    

    This will be necessary if you ever have to debug cert-manager, for instance, *cough* because it will start a pod in the kube-system namespace. When you want to then, say, inspect the logs of the pod you found you have to specify the namespace with -n; for example:

    kubectl logs cert-manager-7d4bfc44ff-tp9g6 -n kube-system -f
    
  • 20:17 regarding “self check failed,” that means that cert-manager did a pre-test to see if the domain is reachable before handing things off to Let’s Encrypt. It’s meant to stop you from prematurely making requests to Let’s Encrypt that it thinks will fail, so that you don’t get rate-limited. Which is a neat idea, except when it doesn’t work.
  • For the most detailed list of limitations with GCP’s Ingress, check out its GitHub repo

Episode 7: Boosting static asset performance using Cloud CDN

Duration: 23:00

Up until now we’ve been serving our static assets directly from our Rails server (boo, slow!). We’ll replace this with Cloud CDN (hooray, fast!). To accomplish this we’ll meet a new Kubernetes resource, BackendConfig, and learn how to wire it up through a new Service port and our Ingress.

Show notes

Rails Asset Pipeline CDN docs
Cloud CDN docs
BackendConfig docs
Apex.sh global latency testing tool
Cloud CDN in the web console
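
For a feel of what the wiring looks like, here is a sketch that writes a BackendConfig and a two-port Service into a single manifest file. Everything named here is an assumption for illustration: the resource names, the port numbers, and the cloud.google.com/v1 API version (the videos predate it). In particular, double-check the BackendConfig docs for how the ports key in the annotation should be spelled on your GKE version.

```shell
# Write a BackendConfig plus a Service with a dedicated "assets" port.
# All names and port numbers are hypothetical placeholders.
workdir="$(mktemp -d)"
manifest="$workdir/cdn.yaml"

cat > "$manifest" <<'EOF'
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: cdn-backendconfig
spec:
  cdn:
    enabled: true
---
apiVersion: v1
kind: Service
metadata:
  name: app-service
  annotations:
    # Attach the BackendConfig to the "assets" port only.
    cloud.google.com/backend-config: '{"ports": {"assets": "cdn-backendconfig"}}'
spec:
  type: NodePort
  selector:
    app: captioned-image-uploader
  ports:
  - name: http
    port: 80
    targetPort: 3000
  - name: assets      # second port, same container, meant to sit behind Cloud CDN
    port: 8080
    targetPort: 3000
EOF

echo "Wrote $manifest"
```

The idea is that the Ingress then gets an extra path rule (say /assets/*) pointing at the assets port, so only static assets go through the CDN-enabled backend while everything else keeps using the plain http port.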

Bonus Episode 1: Provisioning cloud resources with Terraform

Duration: 01:33:19

While building our application, we provisioned GCP resources using the web console and the gcloud CLI. We’ll investigate using Terraform to replace our manual work with declarative templates which will make our deploys repeatable, versionable, and all the other benefits of moving infrastructure management to code.

Addenda

  • This episode was recorded before Terraform 0.12 was released. With 0.12 you no longer need to enclose all attributes in quotes and there are now a few more types of variables besides strings.
  • The SQL user we are creating has database superuser privileges. You may want to create a user with fewer privileges for your own app.
  • The different cluster types are referred to as regional or zonal, which you can read more about here, and more about how to create the different types on GCP’s “creating a cluster” guide.
  • There is a Kubernetes provider and a resource for Secrets. This is a much handier way to set the app-secrets Secret value, and this is the way I do it in the Helm episode. See the Starter Files repo for the .tf Terraform config.


Bonus Episode 2: Charting our app with Helm

Duration: 01:04:42

We get sick of running kubectl apply over and over and decide to use Helm, Kubernetes’s package manager, to template and package up our app into a reusable chart for simplified app deployment.

Show notes

Reminder: Steps to provision a brand new GCP project using our Terraform config:

  1. Create GCP project
  2. Enable Cloud Resource Manager API
  3. Enable Compute Engine API (needed to import VPC default network before plan/apply runs)
  4. Create a service account for Terraform to use with project owner permission
  5. Generate a key for the service account, copy the .json file to provision/keyfiles/keyfile.json
  6. Update the terraform.tfvars file to set variables to your own values
  7. cd to the provision directory, run terraform init to initialize terraform provider plugins
  8. Import the VPC default network with terraform import google_compute_network.vpc_default default
  9. Now you can terraform plan -out /tmp/plan and then terraform apply /tmp/plan

Note: If terraform plan/apply fails early on with an error like "Failed to create subnetwork. Please create Service Networking connection with service 'servicenetworking.googleapis.com'", you may need to wait several minutes for the networking resources to fully initialize, then run terraform taint random_id.db-instance and redo terraform plan/apply to recreate the SQL instance.
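
Steps 7 through 9 above can be strung together in a small shell function. This is just a sketch: the function name is made up, and it assumes you run it from the directory that contains provision, with steps 1 through 6 already done by hand.

```shell
# Steps 7-9 as one function. The body runs in a subshell "( ... )" so the
# caller's working directory is left unchanged. Each step only runs if the
# previous one succeeded.
provision_project() (
  cd provision || exit 1
  terraform init &&
    terraform import google_compute_network.vpc_default default &&
    terraform plan -out /tmp/plan &&
    terraform apply /tmp/plan
)
```

If the plan or apply step hits the Service Networking error from the note above, the function simply stops there; after waiting a bit and running the taint command you can redo terraform plan and apply by hand.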
