Docker builds images automatically by reading the instructions from a
Dockerfile -- a text file that contains all commands, in order, needed to
build a given image.
A Docker image consists of read-only layers each of which represents a
Dockerfile instruction.
The layers are stacked and each one is a delta of the
changes from the previous layer.
When you run an image and generate a container, you add a new writable layer
(the “container layer”) on top of the underlying layers.
By “ephemeral,” we mean that the container can be stopped
and destroyed, then rebuilt and replaced with an absolute minimum of setup and
configuration.
Inadvertently including files that are not necessary for building an image
results in a larger build context and larger image size.
To exclude files not relevant to the build (without restructuring your source
repository) use a .dockerignore file. This file supports exclusion patterns
similar to .gitignore files.
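A minimal .dockerignore sketch (the entries are illustrative; list whatever your build does not need):

    .git
    node_modules
    *.log
    tmp/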
minimize image layers by leveraging build cache.
if your build contains several layers, you can order them from the
less frequently changed (to ensure the build cache is reusable) to the more
frequently changed
avoid
installing extra or unnecessary packages just because they might be “nice to
have.”
Each container should have only one concern.
Decoupling applications into
multiple containers makes it easier to scale horizontally and reuse containers
Limiting each container to one process is a good rule of thumb, but it is not a
hard and fast rule.
Use your best judgment to keep containers as clean and modular as possible.
do multi-stage builds
and only copy the artifacts you need into the final image. This allows you to
include tools and debug information in your intermediate build stages without
increasing the size of the final image.
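A hedged multi-stage sketch, assuming a Go application (image tags, paths, and the package path are illustrative):

    # build stage: contains the toolchain and source
    FROM golang:1.16 AS builder
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /bin/app ./cmd/app

    # final stage: only the compiled artifact is copied in
    FROM alpine:3.14
    COPY --from=builder /bin/app /usr/local/bin/app
    ENTRYPOINT ["app"]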
avoid duplication of packages and make the
list much easier to update.
When building an image, Docker steps through the instructions in your
Dockerfile, executing each in the order specified.
the next
instruction is compared against all child images derived from that base
image to see if one of them was built using the exact same instruction. If
not, the cache is invalidated.
simply comparing the instruction in the Dockerfile with one
of the child images is sufficient.
For the ADD and COPY instructions, the contents of the file(s)
in the image are examined and a checksum is calculated for each file.
If anything has changed in the file(s), such
as the contents and metadata, then the cache is invalidated.
cache checking does not look at the
files in the container to determine a cache match.
In that case just
the command string itself is used to find a match.
Whenever possible, use current official repositories as the basis for your
images.
Using RUN apt-get update && apt-get install -y ensures your Dockerfile
installs the latest package versions with no further coding or manual
intervention.
cache busting
Docker executes these commands using the /bin/sh -c interpreter, which only
evaluates the exit code of the last operation in the pipe to determine success.
Use set -o pipefail && to ensure that an unexpected error prevents the
build from inadvertently succeeding.
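Two hedged sketches of these patterns (package names and the URL are illustrative):

    # cache busting: combine update and install in a single RUN,
    # then clean the apt cache to keep the layer small
    RUN apt-get update && apt-get install -y \
        curl \
        nginx \
     && rm -rf /var/lib/apt/lists/*

    # use bash with pipefail so a failure early in the pipe fails the build
    RUN ["/bin/bash", "-c", "set -o pipefail && wget -O - https://example.com/data.txt | wc -l > /number"]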
The CMD instruction should be used to run the software contained by your
image, along with any arguments.
CMD should almost always be used in the form
of CMD ["executable", "param1", "param2"…]
CMD should rarely be used in the manner of CMD ["param", "param"] in
conjunction with ENTRYPOINT
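Illustrative CMD forms (the binaries are examples):

    # exec form: run the service in the foreground
    CMD ["apache2", "-DFOREGROUND"]

    # or give an interactive shell for language-runtime images
    CMD ["python"]
    CMD ["perl", "-de0"]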
The ENV instruction is also useful for providing required environment
variables specific to services you wish to containerize,
Each ENV line creates a new intermediate layer, just like RUN commands
COPY
is preferred
COPY only
supports the basic copying of local files into the container
the best use for ADD is local tar file
auto-extraction into the image, as in ADD rootfs.tar.xz /
If you have multiple Dockerfile steps that use different files from your
context, COPY them individually, rather than all at once.
using ADD to fetch packages from remote URLs is
strongly discouraged; you should use curl or wget instead
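A hedged sketch of the curl-based pattern (URL and paths are illustrative); fetching, extracting, and cleaning up in one RUN keeps the downloaded archive out of the final layers:

    RUN mkdir -p /usr/src/things \
        && curl -SL https://example.com/big.tar.xz \
        | tar -xJC /usr/src/things \
        && make -C /usr/src/things all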
The best use for ENTRYPOINT is to set the image’s main command, allowing that
image to be run as though it was that command (and then use CMD as the
default flags).
the image name can double as a reference to the binary as
shown in the command
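The kind of example being referenced, sketched here with s3cmd (any CLI binary works the same way):

    ENTRYPOINT ["s3cmd"]
    CMD ["--help"]

Running the image with no arguments prints the help (the CMD default), while running it with arguments passes them straight to the s3cmd entrypoint.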
The VOLUME instruction should be used to expose any database storage area,
configuration storage, or files/folders created by your docker container.
use VOLUME for any mutable and/or user-serviceable
parts of your image
If you absolutely need
functionality similar to sudo, such as initializing the daemon as root but
running it as non-root, consider using “gosu”.
always use absolute paths for your
WORKDIR
An ONBUILD command executes after the current Dockerfile build completes.
Think
of the ONBUILD command as an instruction the parent Dockerfile gives
to the child Dockerfile
A Docker build executes ONBUILD commands before any command in a child
Dockerfile.
Be careful when putting ADD or COPY in ONBUILD. The “onbuild” image
fails catastrophically if the new build’s context is missing the resource being
added.
"My First 5 Minutes on a Server, by Bryan Kennedy, is an excellent intro into securing a server against most attacks. We have a few modifications to his approach that we wanted to document as part of our efforts of externalizing our processes and best practices. We also wanted to spend a bit more time explaining a few things that younger engineers may benefit from."
There’s a lot wrong with this: you could be using the wrong version of code that has exploits, has a bug in it, or worse, it could have malware bundled in on purpose—you just don’t know.
Keep Base Images Small
The Node.js base image, for example, includes an extra 600MB of libraries you don’t need.
A chart is a collection of files
that describe a related set of Kubernetes resources.
A single chart
might be used to deploy something simple, like a memcached pod, or
something complex, like a full web app stack with HTTP servers,
databases, caches, and so on.
Charts are created as files laid out in a particular directory tree,
then they can be packaged into versioned archives to be deployed.
A chart is organized as a collection of files inside of a directory.
values.yaml # The default configuration values for this chart
charts/ # A directory containing any charts upon which this chart depends.
templates/ # A directory of templates that, when combined with values,
# will generate valid Kubernetes manifest files.
version: A SemVer 2 version (required)
apiVersion: The chart API version, always "v1" (required)
Every chart must have a version number. A version must follow the
SemVer 2 standard.
non-SemVer names are explicitly
disallowed by the system.
When generating a
package, the helm package command will use the version that it finds
in the Chart.yaml as a token in the package name.
the appVersion field is not related to the version field. It is
a way of specifying the version of the application.
appVersion: The version of the app that this contains (optional). This needn't be SemVer.
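A minimal Chart.yaml sketch pulling these fields together (name and versions are illustrative):

    apiVersion: v1
    name: mychart
    version: 1.2.3          # chart version, SemVer 2
    appVersion: "8.2.1"     # version of the packaged application, not SemVer-constrained
    description: An example chart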
If the latest version of a chart in the
repository is marked as deprecated, then the chart as a whole is considered to
be deprecated.
deprecated: Whether this chart is deprecated (optional, boolean)
one chart may depend on any number of other charts.
dependencies can be dynamically linked through the requirements.yaml
file or brought in to the charts/ directory and managed manually.
the preferred method of declaring dependencies is by using a
requirements.yaml file inside of your chart.
A requirements.yaml file is a simple file for listing your
dependencies.
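A hedged requirements.yaml sketch (chart names, versions, and repository URLs are illustrative):

    dependencies:
      - name: apache
        version: 1.2.3
        repository: http://example.com/charts
      - name: mysql
        version: 3.2.1
        repository: http://another.example.com/charts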
The repository field is the full URL to the chart repository.
you must also use helm repo add to add that repo locally.
helm dependency update
and it will use your dependency file to download all the specified
charts into your charts/ directory for you.
When helm dependency update retrieves charts, it will store them as
chart archives in the charts/ directory.
Managing charts with requirements.yaml is a good way to easily keep
charts updated, and also share requirements information throughout a
team.
All charts are loaded by default.
The condition field holds one or more YAML paths (delimited by commas).
If this path exists in the top parent’s values and resolves to a boolean value,
the chart will be enabled or disabled based on that boolean value.
The tags field is a YAML list of labels to associate with this chart.
all charts with tags can be enabled or disabled by
specifying the tag and a boolean value.
The --set parameter can be used as usual to alter tag and condition values.
Conditions (when set in values) always override tags.
The first condition path that exists wins and subsequent ones for that chart are ignored.
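A hedged sketch of condition and tags in a parent chart (names, URLs, and versions are illustrative):

    # parent chart's requirements.yaml
    dependencies:
      - name: subchart1
        repository: http://localhost:10191
        version: 0.1.0
        condition: subchart1.enabled
        tags:
          - front-end

    # parent chart's values.yaml
    subchart1:
      enabled: true      # condition path: overrides the tag below
    tags:
      front-end: false

Passing --set tags.front-end=true or --set subchart1.enabled=false at install time flips these values as usual.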
The keys containing the values to be imported can be specified in the parent chart’s requirements.yaml file
using a YAML list. Each item in the list is a key which is imported from the child chart’s exports field.
specifying the key data in our import list, Helm looks in the exports field of the child
chart for data key and imports its contents.
the parent key data is not contained in the parent’s final values. If you need to specify the
parent key, use the ‘child-parent’ format.
To access values that are not contained in the exports key of the child chart’s values, you will need to
specify the source key of the values to be imported (child) and the destination path in the parent chart’s
values (parent).
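A hedged sketch of both import styles (chart names, keys, and values are illustrative; the two dependency entries are alternatives, not one file):

    # child chart's values.yaml
    exports:
      data:
        myint: 99

    # parent chart's requirements.yaml: exports form
    dependencies:
      - name: subchart
        repository: http://localhost:10191
        version: 0.1.0
        import-values:
          - data            # parent ends up with myint: 99 at its top level

    # child-parent form, for values outside the child's exports key
    dependencies:
      - name: subchart
        repository: http://localhost:10191
        version: 0.1.0
        import-values:
          - child: default.data
            parent: myimports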
To drop a dependency into your charts/ directory, use the
helm fetch command
A dependency can be either a chart archive (foo-1.2.3.tgz) or an
unpacked chart directory.
name cannot start with _ or ..
Such files are ignored by the chart loader.
a single release is created with all the objects for the chart and its dependencies.
Helm Chart templates are written in the
Go template language, with the
addition of 50 or so add-on template
functions from the Sprig library and a
few other specialized functions
When
Helm renders the charts, it will pass every file in that directory
through the template engine.
Chart developers may supply a file called values.yaml inside of a
chart. This file can contain default values.
Chart users may supply a YAML file that contains values. This can be
provided on the command line with helm install.
When a user supplies custom values, these values will override the
values in the chart’s values.yaml file.
Template files follow the standard conventions for writing Go templates
{{default "minio" .Values.storage}}
Values that are supplied via a values.yaml file (or via the --set
flag) are accessible from the .Values object in a template.
pre-defined, are available to every template, and
cannot be overridden
the names are case
sensitive
Release.Name: The name of the release (not the chart)
Release.IsUpgrade: This is set to true if the current operation is an upgrade or rollback.
Release.Revision: The revision number. It begins at 1, and increments with
each helm upgrade
Chart: The contents of the Chart.yaml
Files: A map-like object containing all non-special files in the chart.
Files can be
accessed using {{index .Files "file.name"}} or using the {{.Files.Get name}} or
{{.Files.GetString name}} functions.
.helmignore
access the contents of the file
as []byte using {{.Files.GetBytes}}
Any unknown Chart.yaml fields will be dropped
Chart.yaml cannot be
used to pass arbitrarily structured data into the template.
A values file is formatted in YAML.
A chart may include a default
values.yaml file
be merged into the default
values file.
The default values file included inside of a chart must be named
values.yaml
accessible inside of templates using the
.Values object
Values files can declare values for the top-level chart, as well as for
any of the charts that are included in that chart’s charts/ directory.
Charts at a higher level have access to all of the variables defined
beneath.
lower level charts cannot access things in
parent charts
Values are namespaced, but namespaces are pruned.
the scope of the values has been reduced and the
namespace prefix removed
Helm supports a special “global” value.
a way of sharing one top-level variable with all
subcharts, which is useful for things like setting metadata properties
like labels.
If a subchart declares a global variable, that global will be passed
downward (to the subchart’s subcharts), but not upward to the parent
chart.
global variables of parent charts take precedence over the global variables from subcharts.
helm lint
A chart repository is an HTTP server that houses one or more packaged
charts
Any HTTP server that can serve YAML files and tar files and can answer
GET requests can be used as a repository server.
Helm does not provide tools for uploading charts to
remote repository servers.
the only way to add a chart to $HELM_HOME/starters is to manually
copy it there.
Helm provides a hook mechanism to allow chart developers to intervene
at certain points in a release’s life cycle.
Execute a Job to back up a database before installing a new chart,
and then execute a second job after the upgrade in order to restore
data.
Hooks are declared as an annotation in the metadata section of a manifest
Hooks work like regular templates, but they have special annotations
pre-install
post-install: Executes after all resources are loaded into Kubernetes
pre-delete
post-delete: Executes on a deletion request after all of the release’s
resources have been deleted.
pre-upgrade
post-upgrade
pre-rollback
post-rollback: Executes on a rollback request after all resources
have been modified.
crd-install
test-success: Executes when running helm test and expects the pod to
return successfully (return code == 0).
test-failure: Executes when running helm test and expects the pod to
fail (return code != 0).
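A hedged example of a hook manifest tying the pieces together (the Job, image, and command are illustrative); the annotations carry the hook type, weight, and deletion policy discussed in this section:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: "{{ .Release.Name }}-post-install-job"
      annotations:
        "helm.sh/hook": post-install
        "helm.sh/hook-weight": "-5"            # weights are strings, sorted ascending
        "helm.sh/hook-delete-policy": hook-succeeded
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: post-install-job
              image: alpine:3.9
              command: ["/bin/sleep", "10"]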
Hooks allow you, the chart developer, an opportunity to perform
operations at strategic points in a release lifecycle
Tiller then loads the hook with the lowest weight first (negative to positive)
Tiller returns the release name (and other data) to the client
If the resource is a Job kind, Tiller
will wait until the job successfully runs to completion.
if the job
fails, the release will fail. This is a blocking operation, so the
Helm client will pause while the Job is run.
If they
have hook weights (see below), they are executed in weighted order. Otherwise,
ordering is not guaranteed.
good practice to add a hook weight, and set it
to 0 if weight is not important.
The resources that a hook creates are not tracked or managed as part of the
release.
leave the hook resource alone.
To destroy such
resources, you need to either write code to perform this operation in a pre-delete
or post-delete hook or add "helm.sh/hook-delete-policy" annotation to the hook template file.
Hooks are just Kubernetes manifest files with special annotations in the
metadata section
One resource can implement multiple hooks
no limit to the number of different resources that
may implement a given hook.
When subcharts declare hooks, those are also evaluated. There is no way
for a top-level chart to disable the hooks declared by subcharts.
Hook weights can be positive or negative numbers but must be represented as
strings.
sort those hooks in ascending order.
Hook deletion policies
"before-hook-creation" specifies Tiller should delete the previous hook before the new hook is launched.
By default Tiller will wait for 60 seconds for a deleted hook to no longer exist in the API server before timing out.
Custom Resource Definitions (CRDs) are a special kind in Kubernetes.
The crd-install hook is executed very early during an installation, before
the rest of the manifests are verified.
A common reason why the hook resource might already exist is that it was not deleted following use on a previous install/upgrade.
Helm uses Go templates for templating
your resource files.
two special template functions: include and required
include
function allows you to bring in another template, and then pass the results to other
template functions.
The required function allows you to declare a particular
values entry as required for template rendering.
If the value is empty, the template
rendering will fail with a user submitted error message.
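Hedged snippets for both functions (template and value names are illustrative):

    # include: render a named template and pipe the result onward
    value: {{ include "mychart.name" . | lower | quote }}

    # required: fail rendering with a clear message if the value is missing
    image: {{ required "An image tag is required" .Values.image.tag }}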
When you are working with string data, you are always safer quoting the
strings than leaving them as bare words
Quote Strings, Don’t Quote Integers
when working with integers do not quote the values
env variable values which are expected to be strings
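A hedged illustration (value names are examples):

    name: {{ .Values.MyName | quote }}   # strings: pipe through quote
    port: {{ .Values.Port }}             # integers: leave unquoted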
to include a template, and then perform an operation
on that template’s output, Helm has a special include function
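A snippet of that pattern, reconstructed from the description that follows (the template name and indentation level are as described there):

    {{ include "toYaml" $value | nindent 2 }}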
The above includes a template called toYaml, passes it $value, and
then passes the output of that template to the nindent function.
Go provides a way for setting template options to control behavior
when a map is indexed with a key that’s not present in the map
The required function gives developers the ability to declare a value entry
as required for template rendering.
The tpl function allows developers to evaluate strings as templates inside a template.
Rendering an external configuration file
(.Files.Get "conf/app.conf")
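Hedged sketches of tpl (value names and the file path follow the fragments above):

    # values.yaml
    template: "{{ .Values.name }}"
    name: "Tom"

    # in a template: evaluates the string as a template -> "Tom"
    {{ tpl .Values.template . }}

    # rendering an external configuration file shipped with the chart
    {{ tpl (.Files.Get "conf/app.conf") . }}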
Image pull secrets are essentially a combination of registry, username, and password.
Automatically Roll Deployments When ConfigMaps or Secrets change
configmaps or secrets are injected as configuration
files in containers
a restart may be required should those
be updated with a subsequent helm upgrade
The sha256sum function can be used to ensure a deployment’s
annotation section is updated if another file changes
checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
helm upgrade --recreate-pods
"helm.sh/resource-policy": keep
resources that should not be deleted when Helm runs a
helm delete
this resource becomes
orphaned. Helm will no longer manage it in any way.
create some reusable parts in your chart
In the templates/ directory, any file that begins with an
underscore(_) is not expected to output a Kubernetes manifest file.
by convention, helper templates and partials are placed in a
_helpers.tpl file.
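A hedged sketch of a partial in _helpers.tpl and its use (names are illustrative):

    {{/* templates/_helpers.tpl */}}
    {{- define "mychart.labels" -}}
    app.kubernetes.io/name: {{ .Chart.Name }}
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
    {{- end -}}

    # in templates/deployment.yaml
    metadata:
      labels:
    {{ include "mychart.labels" . | indent 4 }}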
The current best practice for composing a complex application from discrete parts
is to create a top-level umbrella chart that
exposes the global configurations, and then use the charts/ subdirectory to
embed each of the components.
SAP’s Converged charts: These charts
install SAP Converged Cloud, a full OpenStack IaaS, on Kubernetes. All of the charts are collected
together in one GitHub repository, except for a few submodules.
Deis’s Workflow:
This chart exposes the entire Deis PaaS system with one chart. But it’s different
from the SAP chart in that this umbrella chart is built from each component, and
each component is tracked in a different Git repository.
YAML is a superset of JSON
any valid JSON structure ought to be valid in YAML.
As a best practice, templates should follow a YAML-like syntax unless
the JSON syntax substantially reduces the risk of a formatting issue.
There are functions in Helm that allow you to generate random data,
cryptographic keys, and so on.
a chart repository is a location where packaged charts can be
stored and shared.
A chart repository is an HTTP server that houses an index.yaml file and
optionally some packaged charts.
Because a chart repository can be any HTTP server that can serve YAML and tar
files and can answer GET requests, you have a plethora of options when it comes
down to hosting your own chart repository.
It is not required that a chart package be located on the same server as the
index.yaml file.
A valid chart repository must have an index file. The
index file contains information about each chart in the chart repository.
The Helm project provides an open-source Helm repository server called ChartMuseum that you can host yourself.
$ helm repo index fantastic-charts --url https://fantastic-charts.storage.googleapis.com
A repository will not be added if it does not contain a valid
index.yaml
add the repository to their helm client via the helm
repo add [NAME] [URL] command with any name they would like to use to
reference the repository.
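Typical usage once the index is published (repository name and URL follow the example above):

    $ helm repo add fantastic-charts https://fantastic-charts.storage.googleapis.com
    $ helm repo update
    $ helm install fantastic-charts/mychart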
Helm has provenance tools which help chart users verify the integrity and origin
of a package.
Integrity is established by comparing a chart to a provenance record
The provenance file contains a chart’s YAML file plus several pieces of
verification information
Chart repositories serve as a centralized collection of Helm charts.
Chart repositories must make it possible to serve provenance files over HTTP via
a specific request, and must make them available at the same URI path as the chart.
We don’t want to be “the certificate authority” for all chart
signers. Instead, we strongly favor a decentralized model, which is part
of the reason we chose OpenPGP as our foundational technology.
The Keybase platform provides a public
centralized repository for trust information.
A chart contains a number of Kubernetes resources and components that work together.
A test in a helm chart lives under the templates/ directory and is a pod definition that specifies a container with a given command to run.
The pod definition must contain one of the helm test hook annotations: helm.sh/hook: test-success or helm.sh/hook: test-failure
helm test
nest your test suite under a tests/ directory like <chart-name>/templates/tests/
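A hedged test-pod sketch (image, service name, and port are illustrative):

    # templates/tests/test-connection.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: "{{ .Release.Name }}-connection-test"
      annotations:
        "helm.sh/hook": test-success
    spec:
      restartPolicy: Never
      containers:
        - name: connection-test
          image: busybox
          command: ["wget", "-qO-", "{{ .Release.Name }}-svc:8080"]

helm test <release> runs the pod and reports success when it exits 0.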
GitLab flow is a clearly defined set of best practices.
It combines feature-driven development and feature branches with issue tracking.
In Git, you add files from the working copy to the staging area. After that, you commit them to your local repo.
The third step is pushing to a shared remote repository.
The biggest problem is that many long-running branches emerge that all contain part of the changes.
It is a convention to call your default branch master and to mostly branch from and merge to this.
Nowadays, most organizations practice continuous delivery, which means that your default branch can be deployed.
Continuous delivery removes the need for hotfix and release branches, including all the ceremony they introduce.
Merging everything into the master branch and frequently deploying means you minimize the amount of unreleased code, which is in line with lean and continuous delivery best practices.
GitHub flow assumes you can deploy to production every time you merge a feature branch.
You can deploy a new version by merging master into the production branch.
If you need to know what code is in production, you can just checkout the production branch to see.
Production branch
Environment branches
have an environment that is automatically updated to the master branch.
deploy the master branch to staging.
To deploy to pre-production, create a merge request from the master branch to the pre-production branch.
Go live by merging the pre-production branch into the production branch.
Release branches
work with release branches if you need to release software to the outside world.
each branch contains a minor version
After announcing a release branch, only add serious bug fixes to the branch.
merge these bug fixes into master, and then cherry-pick them into the release branch.
Merging into master and then cherry-picking into release is called an “upstream first” policy
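A hedged command sequence for the upstream-first flow (branch names and the commit hash are illustrative):

    $ git checkout master
    $ git merge --no-ff fix-login-crash     # the bug fix lands in master first
    $ git checkout 2-3-stable               # the release branch
    $ git cherry-pick 9fceb02               # then bring the fix into the release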
Tools such as GitHub and Bitbucket choose the name “pull request” since the first manual action is to pull the feature branch.
Tools such as GitLab and others choose the name “merge request” since the final action is to merge the feature branch.
If you work on a feature branch for more than a few hours, it is good to share the intermediate result with the rest of the team.
the merge request automatically updates when new commits are pushed to the branch.
If the assigned person does not feel comfortable, they can request more changes or close the merge request without merging.
In GitLab, it is common to protect the long-lived branches, e.g., the master branch, so that most developers can’t modify them.
if you want to merge into a protected branch, assign your merge request to someone with maintainer permissions.
After you merge a feature branch, you should remove it from the source control software.
Having a reason for every code change helps to inform the rest of the team and to keep the scope of a feature branch small.
If there is no issue yet, create the issue
The issue title should describe the desired state of the system.
For example, the issue title “As an administrator, I want to remove users without receiving an error” is better than “Admin can’t remove users.”
create a branch for the issue from the master branch
If you open the merge request but do not assign it to anyone, it is a “Work In Progress” merge request.
Start the title of the merge request with [WIP] or WIP: to prevent it from being merged before it’s ready.
When they press the merge button, GitLab merges the code and creates a merge commit that makes this event easily visible later on.
Merge requests always create a merge commit, even when the branch could be merged without one.
This merge strategy is called “no fast-forward” in Git.
Suppose that a branch is merged but a problem occurs and the issue is reopened.
In this case, it is no problem to reuse the same branch name since the first branch was deleted when it was merged.
At any time, there is at most one branch for every issue.
It is possible that one feature branch solves more than one issue.
GitLab closes these issues when the code is merged into the default branch.
If you have an issue that spans across multiple repositories, create an issue for each repository and link all issues to a parent issue.
use an interactive rebase (rebase -i) to squash multiple commits into one or reorder them.
you should never rebase commits you have pushed to a remote server.
Rebasing creates new commits for all your changes, which can cause confusion because the same change would have multiple identifiers.
if someone has already reviewed your code, rebasing makes it hard to tell what changed since the last review.
never rebase commits authored by other people.
it is a bad idea to rebase commits that you have already pushed.
If you revert a merge commit and then change your mind, revert the revert commit to redo the merge.
Often, people avoid merge commits by just using rebase to reorder their commits after the commits on the master branch.
Using rebase prevents a merge commit when merging master into your feature branch, and it creates a neat linear history.
every time you rebase, you have to resolve similar conflicts.
Sometimes you can reuse recorded resolutions (rerere), but merging is better since you only have to resolve conflicts once.
A good way to prevent creating many merge commits is to not frequently merge master into the feature branch.
keep your feature branches short-lived.
Most feature branches should take less than one day of work.
If your feature branches often take more than a day of work, try to split your features into smaller units of work.
You could also use feature toggles to hide incomplete features so you can still merge back into master every day.
you should try to prevent merge commits, but not eliminate them.
Your codebase should be clean, but your history should represent what actually happened.
If you rebase code, the history is incorrect, and there is no way for tools to remedy this because they can’t deal with changing commit identifiers
Commit often and push frequently
You should push your feature branch frequently, even when it is not yet ready for review.
A commit message should reflect your intention, not just the contents of the commit.
each merge request must be tested before it is accepted.
test the master branch after each change.
If new commits in master cause merge conflicts with the feature branch, merge master back into the branch to make the CI server re-run the tests.
When creating a feature branch, always branch from an up-to-date master.
Do not merge from upstream again if your code can work and merge cleanly without doing so.
Certain methods trigger the query and, therefore, we lose our Relation.
leaving trivial ordering out of scopes altogether.
where
where
.merge() makes it easy to use scopes from other models that have been joined into the query, reducing potential duplication.
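A hedged Rails sketch of the pattern (model and scope names are illustrative):

    class Author < ApplicationRecord
      has_many :posts
      scope :active, -> { where(active: true) }
    end

    class Post < ApplicationRecord
      belongs_to :author
      # reuse Author.active in a Post query instead of duplicating the condition
      scope :by_active_authors, -> { joins(:author).merge(Author.active) }
    end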
ActiveRecord provides an easy API for doing many things with our database, but it also makes it pretty easy to do things inefficiently. The layer of abstraction hides what’s really happening.
first pure SQL, then ActiveRecord
Databases can only do fast lookups for columns with indexes, otherwise it’s doing a sequential scan
Add an index on every id column as well as any column that is used in a where clause.
use a Query class to encapsulate the potentially gnarly query.
subqueries
this Query returns an ActiveRecord::Relation
where
where
Single Responsibility Principle
Avoid ad-hoc queries outside of Scopes and Query Objects
encapsulate data access into scopes and Query objects
An ad-hoc query embedded in a controller (or view, task, etc) is harder to test in isolation and cannot be reused
CMD should be given an interactive shell (bash, python,
perl, etc)
always use COPY
ENTRYPOINT instruction can also be used in combination with a helper
script
use USER to change to a non-root
user
avoid installing or using sudo
avoid switching USER back
and forth frequently.
ONBUILD is only useful for images that are going to be built FROM a given
image
Helm will figure out where to install Tiller by reading your Kubernetes
configuration file (usually $HOME/.kube/config). This is the same file
that kubectl uses.
By default, when Tiller is installed, it does not have authentication enabled.
helm repo update
Without a max history set, the history is kept indefinitely, leaving a large number of records for helm and tiller to maintain.
helm init --upgrade
Whenever you install a chart, a new release is created.
one chart can
be installed multiple times into the same cluster. And each can be
independently managed and upgraded.
helm list function will show you a list of all deployed releases.
helm delete
helm status
you
can audit a cluster’s history, and even undelete a release (with helm
rollback).
the Helm
server (Tiller).
The Helm client (helm)
brew install kubernetes-helm
Tiller, the server portion of Helm, typically runs inside of your
Kubernetes cluster.
it can also be run locally, and
configured to talk to a remote Kubernetes cluster.
Role-Based Access Control - RBAC for short
create a service account for Tiller with the right roles and permissions to access resources.
run Tiller in an RBAC-enabled Kubernetes cluster.
run kubectl get pods --namespace
kube-system and see Tiller running.
helm inspect
Helm will look for Tiller in the kube-system namespace unless
--tiller-namespace or TILLER_NAMESPACE is set.
For development, it is sometimes easier to work on Tiller locally, and
configure it to connect to a remote Kubernetes cluster.
even when running locally, Tiller will store release
configuration in ConfigMaps inside of Kubernetes.
helm version should show you both
the client and server version.
Tiller stores its data in Kubernetes ConfigMaps, you can safely
delete and re-install Tiller without worrying about losing any data.
helm reset
The --node-selectors flag allows us to specify the node labels required
for scheduling the Tiller pod.
--override allows you to specify properties of Tiller’s
deployment manifest.
helm init --override manipulates the specified properties of the final
manifest (there is no “values” file).
The --output flag allows us skip the installation of Tiller’s deployment
manifest and simply output the deployment manifest to stdout in either
JSON or YAML format.
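Hedged examples of those flags (the label value and output file are illustrative):

    $ helm init --node-selectors "beta.kubernetes.io/os"="linux"
    $ helm init --output yaml > tiller-deployment.yaml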
By default, tiller stores release information in ConfigMaps in the namespace
where it is running.
To switch from the default backend to the secrets
backend, you’ll have to do the migration for this on your own.
a beta SQL storage backend that stores release
information in an SQL database (only postgres has been tested so far).
Once you have the Helm Client and Tiller successfully installed, you can
move on to using Helm to manage charts.
Helm requires that kubelet have access to a copy of the socat program to proxy connections to the Tiller API.
A Release is an instance of a chart running in a Kubernetes cluster.
One chart can often be installed many times into the same cluster.
helm init --client-only
helm init --dry-run --debug
A panic in Tiller is almost always the result of a failure to negotiate with the
Kubernetes API server
Tiller and Helm have to negotiate a common version to make sure that they can safely
communicate without breaking API assumptions
helm delete --purge
Helm stores some files in $HELM_HOME, which is
located by default in ~/.helm
A Chart is a Helm package. It contains all of the resource definitions
necessary to run an application, tool, or service inside of a Kubernetes
cluster.
Think of it like the Kubernetes equivalent of a Homebrew formula,
an Apt dpkg, or a Yum RPM file.
A Repository is the place where charts can be collected and shared.
Set the $HELM_HOME environment variable
each time it is installed, a new release is created.
Helm installs charts into Kubernetes, creating a new release for
each installation. And to find new charts, you can search Helm chart
repositories.
chart repository is named
stable by default
helm search shows you all of the available charts
helm inspect
To install a new package, use the helm install command. At its
simplest, it takes only one argument: The name of the chart.
If you want to use your own release name, simply use the
--name flag on helm install
additional configuration steps you can or
should take.
Helm does not wait until all of the resources are running before it
exits. Many charts require Docker images that are over 600M in size, and
may take a long time to install into the cluster.
helm status
helm inspect
values
helm inspect values stable/mariadb
override any of these settings in a YAML formatted file,
and then pass that file during installation.
helm install -f config.yaml stable/mariadb
--values (or -f): Specify a YAML file with overrides.
--set (and its variants --set-string and --set-file): Specify overrides on the command line.
Values that have been --set can be cleared by running helm upgrade with --reset-values
specified.
Chart
designers are encouraged to consider the --set usage when designing the format
of a values.yaml file.
--set-file key=filepath is another variant of --set.
It reads the file and uses its content as a value. This is useful for injecting multi-line text into values without dealing with indentation in YAML.
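Hedged examples (chart and value names are illustrative):

    $ helm install stable/mariadb --set mariadbUser=admin,mariadbDatabase=orders
    $ helm install ./mychart --set-file initScript=scripts/init.sh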
An unpacked chart directory
When a new version of a chart is released, or when you want to change
the configuration of your release, you can use the helm upgrade
command.
Because Kubernetes charts can be large and
complex, Helm tries to perform the least invasive upgrade.
It will only
update things that have changed since the last release
If both are used, --set values are merged into --values with higher precedence.
The helm get command is a useful tool for looking at a release in the
cluster.
helm rollback
A release version is an incremental revision. Every time an install,
upgrade, or rollback happens, the revision number is incremented by 1.
helm history
a release name cannot be
re-used.
you can rollback a
deleted resource, and have it re-activate.
helm repo list
helm repo add
helm repo update
The Chart Development Guide explains how to develop your own
charts.
helm create
helm lint
helm package
Charts that are archived can be loaded into chart repositories.
chart repository server
Tiller can be installed into any namespace.
Limiting Tiller to only be able to install into specific namespaces and/or resource types is controlled by Kubernetes RBAC roles and rolebindings
Release names are unique PER TILLER INSTANCE
Charts should only contain resources that exist in a single namespace.
not recommended to have multiple Tillers configured to manage resources in the same namespace.
a client-side Helm plugin. A plugin is a
tool that can be accessed through the helm CLI, but which is not part of the
built-in Helm codebase.
Helm plugins are add-on tools that integrate seamlessly with Helm. They provide
a way to extend the core feature set of Helm, but without requiring every new
feature to be written in Go and added to the core tool.
Helm plugins live in $(helm home)/plugins
The Helm plugin model is partially modeled on Git’s plugin model
helm is referred to as the porcelain layer, with
plugins being the plumbing.
command is the command that this plugin will
execute when it is called.
Environment variables are interpolated before the plugin
is executed.
The command itself is not executed in a shell. So you can’t oneline a shell script.
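A hedged plugin.yaml sketch (name, script, and metadata are illustrative):

    name: "keybase"
    version: "0.1.0"
    usage: "Integrate Keybase.io tools with Helm"
    description: "This plugin provides Keybase services to Helm."
    ignoreFlags: false
    useTunnel: false
    command: "$HELM_PLUGIN_DIR/keybase.sh"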
Helm is able to fetch Charts using HTTP/S
Variables like KUBECONFIG are set for the plugin if they are set in the
outer environment.
In Kubernetes, granting a role to an application-specific service account is a best practice to ensure that your application is operating in the scope that you have specified.
restrict Tiller’s capabilities to install resources to certain namespaces, or to grant a Helm client running access to a Tiller instance.
Service account with cluster-admin role
The cluster-admin role is created by default in a Kubernetes cluster
Deploy Tiller in a namespace, restricted to deploying resources only in that namespace
Deploy Tiller in a namespace, restricted to deploying resources in another namespace
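A hedged sketch of the cluster-admin variant (the namespace-restricted scenarios swap the ClusterRole/ClusterRoleBinding for a namespaced Role/RoleBinding):

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: tiller
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: tiller
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: cluster-admin
    subjects:
      - kind: ServiceAccount
        name: tiller
        namespace: kube-system

helm init --service-account tiller then deploys Tiller using that account.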
When running a Helm client in a pod, in order for the Helm client to talk to a Tiller instance, it will need certain privileges to be granted.
SSL Between Helm and Tiller
The Tiller authentication model uses client-side SSL certificates.
creating an internal CA, and using both the
cryptographic and identity functions of SSL.
Helm is a powerful and flexible package-management and operations tool for Kubernetes.
default installation applies no security configurations
with a cluster that is well-secured in a private network with no data-sharing or no other users or teams.
With great power comes great responsibility.
Choose the Best Practices you should apply to your helm installation
Role-based access control, or RBAC
Tiller’s gRPC endpoint and its usage by Helm
Kubernetes employs a role-based access control (or RBAC) system (as do modern operating systems) to help mitigate the damage that can be done if credentials are misused or bugs exist.
In the default installation the gRPC endpoint that Tiller offers is available inside the cluster (not external to the cluster) without authentication configuration applied.
Tiller stores its release information in ConfigMaps. We suggest changing the default to Secrets.
release information
charts
charts are a kind of package that not only installs containers you may or may not have validated yourself, but it may also install into more than one namespace.
As with all shared software, in a controlled or shared environment you must validate all software you install yourself before you install it.
Helm’s provenance tools to ensure the provenance and integrity of charts
"Helm will figure out where to install Tiller by reading your Kubernetes configuration file (usually $HOME/.kube/config). This is the same file that kubectl uses."
use the StatefulSet workload controller to maintain identity for each of the pods, and to use Persistent Volumes to persist data so it can survive a service restart.
a way to extend Kubernetes functionality with application specific logic using custom resources and custom controllers.
An Operator can automate various features of an application, but it should be specific to a single application
Kubebuilder is a comprehensive development kit for building and publishing Kubernetes APIs and Controllers using CRDs
Design declarative APIs for operators, not imperative APIs. This aligns well with Kubernetes APIs that are declarative in nature.
With declarative APIs, users only need to express their desired cluster state, while letting the operator perform all necessary steps to achieve it.
scaling, backup, restore, and monitoring. An operator should be made up of multiple controllers that specifically handle each of those features.
the operator can have a main controller to spawn and manage application instances, a backup controller to handle backup operations, and a restore controller to handle restore operations.
each controller should correspond to a specific CRD so that the domain of each controller's responsibility is clear.
If you keep a log for every container, you will likely end up with an unmanageable amount of logs.
integrate application-specific details to the log messages such as adding a prefix for the application name.
you may have to use external logging tools such as Google Stackdriver, Elasticsearch, Fluentd, or Kibana to perform the aggregations.
adding labels to metrics to facilitate aggregation and analysis by monitoring systems.
a more viable option is for application pods to expose a metrics HTTP endpoint for monitoring tools to scrape.
A good way to achieve this is to use open-source application-specific exporters for exposing Prometheus-style metrics.
DevOps is a set of practices that automates the processes between software development and IT teams, in order that they can build, test, and release software faster and more reliably.
increased trust, faster software releases, ability to solve critical issues quickly, and better manage unplanned work.
bringing together the best of software development and IT operations.
a firm handshake between development and operations
DevOps isn’t magic, and transformations don’t happen overnight.
Infrastructure as code
Culture is the #1 success factor in DevOps.
Building a culture of shared responsibility, transparency and faster feedback is the foundation of every high performing DevOps team.
'not our problem' mentality
DevOps is that change in mindset of looking at the development process holistically and breaking down the barrier between Dev and Ops.
Speed is everything.
Lack of automated test and review cycles blocks the release to production, and poor incident response time kills velocity and team confidence.
Open communication helps Dev and Ops teams swarm on issues, fix incidents, and unblock the release pipeline faster.
Unplanned work is a reality that every team faces–a reality that most often impacts team productivity.
“cross-functional collaboration.”
All the tooling and automation in the world are useless if they aren’t accompanied by a genuine desire on the part of development and IT/Ops professionals to work together.
DevOps doesn’t solve tooling problems. It solves human problems.
Forming project- or product-oriented teams to replace function-based teams is a step in the right direction.
sharing a common goal and having a plan to reach it together
join sprint planning sessions, daily stand-ups, and sprint demos.
DevOps culture across every department
open channels of communication, and talk regularly
continuous delivery: the practice of running each code change through a gauntlet of automated tests, often facilitated by cloud-based infrastructure, then packaging up successful builds and promoting them up toward production using automated deploys.
automated deploys alert IT/Ops to server “drift” between environments, which reduces or eliminates surprises when it’s time to release.
“configuration as code.”
when DevOps uses automated deploys to send thoroughly tested code to identically provisioned environments, “Works on my machine!” becomes irrelevant.
A DevOps mindset sees opportunities for continuous improvement everywhere.
regular retrospectives
A/B testing
failure is inevitable. So you might as well set up your team to absorb it, recover, and learn from it (some call this “being anti-fragile”).
Postmortems focus on where processes fell down and how to strengthen them – not on which team member f'ed up the code.
Our engineers are responsible for QA, writing, and running their own tests to get the software out to customers.
How long did it take to go from development to deployment?
How long does it take to recover after a system failure?
service level agreements (SLAs)
DevOps isn't any single person's job. It's everyone's job.
DevOps is big on the idea that the same people who build an application should be involved in shipping and running it.
developers and operators pair with each other in each phase of the application’s lifecycle.
you can deploy to production every time you merge a feature branch.
deploy a new version by merging master into the production branch.
you can have your deployment script create a tag on each deployment.
to have an environment that is automatically updated to the master branch
Commits only flow downstream, which ensures that everything is tested in all environments.
“merge request” since the final action is to merge the feature branch.
“pull request” since the first manual action is to pull the feature branch
it is common to protect the long-lived branches
After you merge a feature branch, you should remove it from the source control software
When you are ready to code, create a branch for the issue from the master branch.
This branch is the place for any work related to this change.
A merge request is an online place to discuss the change and review the code.
To automatically close linked issues, mention them with the words “fixes” or “closes,” for example, “fixes #14” or “closes #67.” GitLab closes these issues when the code is merged into the default branch.
always use the “no fast-forward” (--no-ff) strategy when you merge manually.
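For example (branch name illustrative):

    $ git checkout master
    $ git merge --no-ff feature-user-export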
you should try to avoid merge commits in feature branches
not frequently merge master into the feature branch.
Reasons to merge master into a feature branch include utilizing new code, resolving merge conflicts, and updating long-running branches. If you only need a specific change from master, you can often solve this by just cherry-picking a commit.
If your feature branch has a merge conflict, creating a merge commit is a standard way of solving this.
keep your feature branches short-lived.
split your features into smaller units of work
Splitting up work into individual commits provides context for developers looking at your code later.
Testing before merging
When using GitLab flow, developers create their branches from this master branch, so it is essential that it never breaks.
Therefore, each merge request must be tested before it is accepted.
When creating a feature branch, always branch from an up-to-date master
Trunk-based development is a version control management practice where developers merge small, frequent updates to a core “trunk” or main branch.
Gitflow and trunk-based development.
Gitflow, which was popularized first, is a stricter development model where only certain individuals can approve changes to the main code. This maintains code quality and minimizes the number of bugs.
Trunk-based development is a more open model since all developers have access to the main code. This enables teams to iterate quickly and implement CI/CD.
Developers can create short-lived branches with a few small commits compared to other long-lived feature branching strategies.
Gitflow is an alternative Git branching model that uses long-lived feature branches and multiple primary branches.
Gitflow also has separate primary branch lines for development, hotfixes, features, and releases.
Trunk-based development is far more simplified since it focuses on the main branch as the source of fixes and releases.
Trunk-based development eases the friction of code integration.
trunk-based development model reduces these conflicts.
Adding an automated test suite and code coverage monitoring for this stream of commits enables continuous integration.
When new code is merged into the trunk, automated integration and code coverage tests run to validate the code quality.
Trunk-based development strives to keep the trunk branch “green”, meaning it's ready to deploy at any commit.
With continuous integration, developers perform trunk-based development in conjunction with automated tests that run after each commit to the trunk.
If trunk-based development was like music it would be a rapid staccato -- short, succinct notes in rapid succession, with the repository commits being the notes.
Instead of creating a feature branch and waiting to build out the complete specification, developers can instead create a trunk commit that introduces the feature flag and pushes new trunk commits that build out the feature specification within the flag.
Automated testing is necessary for any modern software project intending to achieve CI/CD.
Short running unit and integration tests are executed during development and upon code merge.
Automated tests provide a layer of preemptive code review.
Once a branch merges, it is best practice to delete it.
A repository with a large amount of active branches has some unfortunate side effects
Merge branches to the trunk at least once a day
The “continuous” in CI/CD implies that updates are constantly flowing.
ED25519 is more vulnerable to quantum computation than is RSA
best practice to be using a hardware token
to use a yubikey via gpg: with this method you use your gpg subkey as an ssh key
sit down and spend an hour thinking about your backup and recovery strategy first
never share a private key between physical devices
allows you to revoke a single credential if you lose (control over) that device
If a private key ever turns up on the wrong machine,
you *know* the key and both source and destination
machines have been compromised.
centralized management of authentication/authorization
I have set up a VPS, disabled passwords, and set up a key with a passphrase to gain access. At this point my greatest worry is losing this private key, as that means I can't access the server. What is a reasonable way to back up my private key?
a mountable disk image that's encrypted
a system that can update/rotate your keys across all of your servers on the fly in case one is compromised or assumed to be compromised.
different keys for different purposes per client device
fall back to password plus OTP
relying completely on the security of your disk, against either physical or cyber attacks.
It is better to use a different passphrase for each key but it is also less convenient unless you're using a password manager (personally, I'm using KeePass)
- RSA is pretty standard, and generally speaking is fairly secure for key lengths >=2048. RSA-2048 is the default for ssh-keygen, and is compatible with just about everything.
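Hedged key-generation examples (file paths and comments are illustrative):

    $ ssh-keygen -t ed25519 -a 100 -f ~/.ssh/id_ed25519_laptop -C "laptop 2019"
    $ ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_laptop -C "laptop 2019"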
public-key authentication has the somewhat unexpected side effect of preventing MITM, per this security consulting firm
Disable passwords and only allow keys even for root with PermitRootLogin without-password
You should definitely use a different passphrase for keys stored on separate computers,
designing and implementing a REST API in an intentionally simplistic task management web application, and will cover some best practices to ensure maintainability of the code.
each individual request should have no context of the requests that came before it.
each request that modifies the database should act on one and only one row of one and only one table
The resource endpoints should return representations of the resource as data, usually XML or JSON.
POST for create, PUT for update, PATCH for upsert (update and insert).
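An illustrative route map for the task-management example (paths are hypothetical):

    POST   /api/v1/tasks        # create
    GET    /api/v1/tasks/42     # read
    PUT    /api/v1/tasks/42     # update
    PATCH  /api/v1/tasks/42     # upsert, per the convention above
    DELETE /api/v1/tasks/42     # delete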
an existing API should never be modified, except for critical bugfixes
Rather than changing existing endpoints, expose a new version
using unique database ids in the route chain allows users to access short routes, and simplifies resource lookup
while exposing internal database ids to the consumer and requiring the consumer to maintain a reference to ids on their end
The downfall is longer nested routes
require reauthentication on a per-request level
Devise.secure_compare helps avoid timing attacks
Defensive programming is a software design principle that dictates that a piece of software should be designed to continue functioning in unforeseen circumstances.