Back in mid-2018, I slogged through learning Kubernetes in order to deploy a Rails web app using it, and I spent quite a bit of time turning that knowledge into a series of detailed blog posts.
A little while after completing those blog posts, I was going to make an editorial pass to tighten things up when I realized the content would be much better delivered via screencast rather than written out into long instructions and screenshots. In a fit of mania, I decided to do that and charge a small amount of money for access to the videos.
Now, over two years and nearly 150 customers later, purchases have died down a bit - I still get the odd purchase every couple of weeks, including one this week. But at this point I feel bad: while I believe the meat of the content is still valuable, parts of the ecosystem have changed, so the material is no longer strictly up-to-date in the hand-held, step-by-step fashion I intended the videos to be.
Therefore I’ve decided to make the screencasts freely available for all, and if you feel like you got something from them worthy of remuneration you can just PayPal me here:
Customer thank you
I’m so humbled that almost 150 people bought the screencasts. I wish I had asked for everybody’s first name when accepting payment so that I could display them all here (with their permission).
One thing that surprised me, and that I am able to share, is all the different countries that purchasers hailed from!
If you bought the screencast and would like a shout-out here let me know and I will gladly post your name or @ or whatever you want right here.
Without further ado, here’s the content:
Episode 1: Intro
We’ll clone the starter files repo in preparation for working through the course. We’ll take a peek under the covers at, and locally spin up, Captioned Image Uploader, the example Rails application that we’ll be deploying to Kubernetes throughout the rest of the course.
Episode 2: Introduction to Google Cloud
We’ll register for a Google Cloud account, create a project, and prep for our application deployment by creating our database, our GKE cluster, building and pushing our Docker image, and so on using both the GCP Web console as well as the gcloud CLI.
Google Cloud SDK install instructions
Google Cloud resource hierarchy
Container Registry quickstart
Access scopes must match IAM role permissions (“You must set access scopes on the instance to authorize access.”)
In the video I made a mistake when I untar'ed and installed `gcloud` from the `/tmp` directory. Don't do this, because the installer will modify your shell's path to look in `/tmp` for `gcloud`. Instead, untar and do the install from your home directory - that's where `gcloud` should live. If you already extracted + installed to `/tmp` it's not a big deal though; you can just reinstall.
- Turns out that when creating the GKE cluster, under "Advanced options" there is an "Enable VPC-native" checkbox you can check which will enable private IP networking. So if you do that you won't need to copy and paste the blob of CLI arguments to create the GKE cluster.
- At 17:58 we give the cluster "Full" access to Storage; on review, I don't believe we needed to modify that, since later in the series we create a Service Account which will have the necessary Storage permissions.
Episode 3: Introduction to Kubernetes concepts
A guided talk through the fundamental Kubernetes resources that we’ll use to build our deployment. We’ll learn about Pods, Deployments, Jobs, CronJobs, Services, and Ingresses, and sketch a diagram of how they’ll all fit together to run our app.
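To make the relationship between these resources concrete, here’s a rough sketch of how a Deployment’s Pod template and label selector fit together. The resource names, image, and port are placeholders I made up for illustration, not the course’s actual manifests:

```shell
# Write a sketch Deployment manifest; all names here are hypothetical.
cat > deployment-sketch.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: captioned-images-web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: captioned-images   # must match the Pod template's labels below
  template:                   # the Pod template - the same key appears in Jobs and CronJobs
    metadata:
      labels:
        app: captioned-images
    spec:
      containers:
        - name: web
          image: gcr.io/my-project/captioned-images:v1
          ports:
            - containerPort: 3000
EOF
# kubectl apply -f deployment-sketch.yaml   # once kubectl is set up (Episode 4)
```

A Service would then route traffic to these Pods by selecting the same `app: captioned-images` label.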
Episode 4: Deploying our code
We’ll get kubectl installed and connected to our GKE cluster, start using it to manipulate our cluster, write manifests for the Kubernetes resources we’ll need (Job, Deployment, Secrets, Service), and finally create them to get our application up and running! 🤩
What we learned
- `gcloud components install kubectl` to install `kubectl`
- `gcloud container clusters get-credentials standard-cluster-1` to tell `kubectl` to use the GKE cluster named `standard-cluster-1`
- Kubernetes Jobs
  - Writing a manifest to run our database migration
  - Deleting a job
- Kubernetes manifests
  - How the `template:` key defines a Pod template for many different resource types
- `gcloud container images list` to list available Docker images
- `gcloud sql instances list` to get the private IP address of the SQL instance
- `gcloud sql users list --instance=captioned-image-db` to get the list of SQL users for the instance
- `gcloud sql users set-password postgres --password=foobar` to change the `postgres` user password to `foobar`
- Kubernetes Secrets
  - How to reference them in manifests
  - 12:15 How to create: `kubectl create secret generic app-secrets --from-literal=DATABASE_URL=postgres://...`
  - 13:38 How to edit an existing Secret with `kubectl edit secret app-secrets`
  - They’re stored base64-encoded
  - 19:30 Encoding plaintext into base-64 and copying it to the clipboard on the Linux CLI using `echo -n "whatever" | base64 --wrap=0 | xclip`
- `kubectl get jobs` to list jobs (add the `-w` flag to watch and update on changes)
- `kubectl get pods` to list pods
- `kubectl logs db-migrate-qbxh6` to view logs (add the `-f` flag to follow logs and update on changes)
- `kubectl delete jobs/db-migrate` to delete a job
- Kubernetes Deployments
  - How to write a manifest
  - Different strategies, surge, and unavailability settings
  - How the selector makes the Deployment apply to Pods with that label
  - Creating a Service for the Deployment
- 18:15 I said you could just update a Job’s manifest and re-apply it and it will fix itself. This is true of most resource types; however, I believe it is actually not the case with Jobs - you have to delete the Job and recreate it.
- At 31:35 we look at the logs for the running Rails server container, however you only see the Puma startup output. This is because the rest of the output is being written to a log file instead of STDOUT, which is what the `kubectl logs` command reads from. I didn’t bother in the screencast, but we could change this behavior in Rails 5 by setting the `RAILS_LOG_TO_STDOUT` environment variable. Interestingly, in my experience Stackdriver (GCP’s logging + monitoring solution, which we also didn’t explore) seems to be smart enough to read from the log file, so it’s not a big deal.
- One other command I forgot to mention that is pretty neat is `kubectl scale`, which lets you resize the number of Pods in the Deployment without having to edit and re-apply a manifest. Useful for quickly scaling up if you’re experiencing sudden load. Try it out!
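To tie the Secrets pieces above together, here’s a sketch of creating a Secret from a manifest instead of `kubectl create secret`, plus how a container references it. The connection string is a dummy value, and the snippet names are made up for illustration:

```shell
# Kubernetes stores Secret values base64-encoded; encode a dummy connection string.
DB_URL_B64=$(echo -n "postgres://deploy:foobar@10.0.0.3/captioned_images" | base64 --wrap=0)

# Write the Secret manifest with the encoded value substituted in.
cat > app-secrets.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
data:
  DATABASE_URL: ${DB_URL_B64}
EOF

# In a Deployment or Job container spec, the Secret is referenced like this:
cat > secret-env-snippet.yaml <<'EOF'
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: app-secrets
        key: DATABASE_URL
EOF
# kubectl apply -f app-secrets.yaml   # run against your cluster
```

Note that `base64 --wrap=0` is the GNU coreutils flag; macOS’s `base64` doesn’t wrap by default.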
Episode 5: Fixing image upload using Google Cloud Storage
It’s alive! 😍 But it’s got a problem. 😭 We’ll fix an issue with image uploads by setting up Google Cloud Storage. Along the way we’ll learn how to use IAM Service Accounts and how to pop a remote Rails console.
What we learned
- Opened a remote Rails console
- Google Cloud Storage
- Kubernetes Secrets
  - Creation and editing
- Google Cloud IAM Service Accounts, Roles
Episode 6: Ingress, domain name, and HTTPS!
So far we’ve been accessing our application directly over an internal Service. We’ll replace this with a more scalable solution by creating our first Ingress, giving it a domain name, and getting a TLS certificate through Let’s Encrypt to enable HTTPS.
What we learned
- Kubernetes Ingress
- Writing a manifest
- Connecting to a service port
- Assigning a global static IP
- Authoring hostname and path rules
- TLS configuration
- Helm package installation
- Issuer resource type
- Certificate resource type
- Annotating Ingresses to do ACME HTTP01 Let’s Encrypt dance
- How cert-manager modifies our Ingress to make the `/.well-known/acme-challenge` path available to Let’s Encrypt
- `kubectl get` shortnames (`kubectl get svc` vs `kubectl get services`)
- GKE Ingress specifics
- 27:09 Grouping multiple related resources into a single manifest is a best practice
- I knowingly say “TLS certificate” instead of the more correct “X.509 certificate” for simplicity’s sake. Let’s Encrypt uses the same wording on their site so I think that’s okay.
- One thing I wanted to mention but forgot to in the episode is that we’re using the HTTP-01 ACME challenge type which is the only challenge type we can use with a free Duck DNS domain name. However there is also a DNS-01 challenge type which responds to challenges by creating TXT records. In my experience the DNS-01 challenge type works a lot smoother with cert-manager than the HTTP-01, and it also enables the creation of wildcard certificates. We couldn’t do this in the screencast however because it would require viewers to buy a domain name and set up GCP Cloud DNS as the DNS provider.
- Interestingly, the GKE Ingress doesn’t even read the `hosts` field of the `tls` spec; however, it is needed by cert-manager to make the Let’s Encrypt request.
- At 13:17 I mentioned we’re creating a Kubernetes resource in the `kube-system` namespace. I probably should’ve used this as an opportunity to talk a bit more about namespaces and how they can be used to separate applications. So instead I encourage you to read the documentation on them yourself. One handy flag worth mentioning is `--all-namespaces`; for instance, to see all the pods running in your cluster you can do:

  `kubectl get pods --all-namespaces`

  This will be necessary if you ever have to debug cert-manager, for instance, *cough*, because it starts a pod in the `kube-system` namespace. When you then want to, say, inspect the logs of the pod you found, you have to specify the namespace with `-n`; for example:

  `kubectl logs cert-manager-7d4bfc44ff-tp9g6 -n kube-system -f`
- 20:17 regarding “self check failed”: that means cert-manager did a pre-test to see if the domain is reachable before handing things off to Let’s Encrypt. It’s meant to stop you from prematurely making requests that it thinks will fail, saving you from getting rate-limited. Which is a neat idea, except when it doesn’t work.
- For the most detailed list of limitations with GCP’s Ingress, check out its GitHub repo
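Putting the Ingress pieces above together, here’s a sketch of what the final manifest roughly looks like. The resource names, static IP name, and hostname are placeholders; it’s written against the current `networking.k8s.io/v1` API rather than the `extensions/v1beta1` API used when the episode was recorded, and the cert-manager annotation key varies by version (modern releases use `cert-manager.io/issuer`):

```shell
# Write a sketch Ingress manifest; all names here are hypothetical.
cat > ingress-sketch.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: captioned-images
  annotations:
    kubernetes.io/ingress.class: "gce"
    kubernetes.io/ingress.global-static-ip-name: "captioned-images-ip"
    cert-manager.io/issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - captioned-images.example.com   # read by cert-manager, not by the GKE Ingress
      secretName: captioned-images-tls   # cert-manager stores the issued cert here
  rules:
    - host: captioned-images.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: captioned-images
                port:
                  number: 80
EOF
# kubectl apply -f ingress-sketch.yaml
```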
Episode 7: Boosting static asset performance using Cloud CDN
Up until now we’ve been serving our static assets directly from our Rails server (boo, slow!). We’ll replace this with Cloud CDN (hooray, fast!). To accomplish this we’ll meet a new Kubernetes resource, BackendConfig, and learn how to wire it up through a new Service port and our Ingress.
Bonus Episode 1: Provisioning cloud resources with Terraform
While building our application, we provisioned GCP resources using the web console and the gcloud CLI. We’ll investigate using Terraform to replace our manual work with declarative templates which will make our deploys repeatable, versionable, and all the other benefits of moving infrastructure management to code.
- This episode was recorded before Terraform 0.12 was released. With 0.12 you no longer need to enclose all attributes in quotes and there are now a few more types of variables besides strings.
- The SQL user we are creating has database superuser privileges. You may want to create a user with fewer privileges for your own app.
- The different cluster types are referred to as regional or zonal, which you can read more about here, and more about how to create the different types on GCP’s “creating a cluster” guide.
- There is a Kubernetes provider and a resource for Secrets. This is a much handier way to set the app-secrets Secret value, and this is the way I do it in the Helm episode. See the Starter Files repo for the .tf Terraform config.
Topics for further exploration:
- Terraform Kubernetes provider getting started
- Example creating a Kubernetes Secret for storing a GCP service account key
- Terraform modules
- Terraform remote state
Bonus Episode 2: Charting our app with Helm
We get sick of running
kubectl apply over and over and decide
to use Helm, Kubernetes’s package manager, to template and package
up our app into a reusable chart for simplified app deployment.
Reminder: Steps to provision a brand new GCP project using our Terraform config:
- Create GCP project
- Enable Cloud Resource Manager API
- Enable Compute Engine API (needed to import VPC default network before plan/apply runs)
- Create a service account for Terraform to use with project owner permission
- Generate a key for the service account, copy the .json file to
- Update the `terraform.tfvars` file to set variables to your own values
- `cd` to the provision directory, run `terraform init` to initialize Terraform provider plugins
- Import the VPC default network with `terraform import google_compute_network.vpc_default default`
- Now you can `terraform plan -out /tmp/plan` and then `terraform apply /tmp/plan`
Note: If you get an error when performing terraform plan/apply in the beginning like `Failed to create subnetwork. Please create Service Networking connection with service 'servicenetworking.googleapis.com'`, you may need to wait several minutes for the networking resources to fully initialize, then do `terraform taint random_id.db-instance` and redo the `apply` to recreate the SQL instance.
- Go text/template template reference
- Hugo, a static site generator written in Go - has useful explanations and tips on Go templating
- Helm quickstart
- Helm Charts documentation
- Chart Development Tips and Tricks
- The Chart Best Practices Guide - Making our captioned-images chart conform to best practices is left as an exercise to the reader 😁
- GKE guide: Using Google-managed SSL certificates
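As a taste of the Go templating the episode covers, here’s a minimal sketch of a chart’s layout with values interpolated into a Deployment template. The chart name and values are made up, and it uses Helm 3 conventions (`apiVersion: v2` in Chart.yaml; Helm 2, which was current when this was recorded, used `helm install --name` instead):

```shell
# Lay out a tiny hypothetical chart.
mkdir -p captioned-images-chart/templates

cat > captioned-images-chart/Chart.yaml <<'EOF'
apiVersion: v2
name: captioned-images-chart
version: 0.1.0
EOF

cat > captioned-images-chart/values.yaml <<'EOF'
image:
  repository: gcr.io/my-project/captioned-images
  tag: v1
replicas: 2
EOF

# Go template syntax ({{ ... }}) pulls from values.yaml and release metadata.
cat > captioned-images-chart/templates/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web
spec:
  replicas: {{ .Values.replicas }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
EOF

# helm template captioned-images ./captioned-images-chart   # render locally
# helm install captioned-images ./captioned-images-chart    # deploy (Helm 3)
```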