Docker builds images automatically by reading the instructions from a
Dockerfile -- a text file that contains all commands, in order, needed to
build a given image.
A Docker image consists of read-only layers each of which represents a
Dockerfile instruction.
The layers are stacked, and each one is a delta of the changes from the previous layer.
When you run an image and generate a container, you add a new writable layer
(the “container layer”) on top of the underlying layers.
Containers generated from your image should be as ephemeral as possible. By “ephemeral,” we mean that the container can be stopped and destroyed, then rebuilt and replaced with an absolute minimum of setup and configuration.
Inadvertently including files that are not necessary for building an image
results in a larger build context and larger image size.
To exclude files not relevant to the build (without restructuring your source repository), use a .dockerignore file. This file supports exclusion patterns similar to .gitignore files.
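A sketch of a .dockerignore, with entries purely illustrative:

    # .dockerignore (illustrative entries)
    .git
    node_modules
    *.log
    tmp/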
Minimize image layers by leveraging the build cache. If your build contains several layers, you can order them from the least frequently changed (to ensure the build cache is reusable) to the most frequently changed.
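A sketch of that ordering for a Node.js app (file names assumed): the rarely-changing dependency manifest is copied and installed before the frequently edited source tree, so the install layer stays cached across most edits:

    FROM node:14
    WORKDIR /app
    # Changes rarely: this layer and the install below stay cached.
    COPY package.json package-lock.json ./
    RUN npm ci
    # Changes often: only the layers from here on are rebuilt.
    COPY . .
    CMD ["node", "server.js"]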
Avoid installing extra or unnecessary packages just because they might be “nice to have.”
Each container should have only one concern.
Decoupling applications into
multiple containers makes it easier to scale horizontally and reuse containers
Limiting each container to one process is a good rule of thumb, but it is not a
hard and fast rule.
Use your best judgment to keep containers as clean and modular as possible.
Use multi-stage builds, and only copy the artifacts you need into the final image. This allows you to include tools and debug information in your intermediate build stages without increasing the size of the final image.
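A minimal sketch of a multi-stage build for a Go app (module layout assumed): the builder stage carries the toolchain, while the final image receives only the compiled artifact:

    FROM golang:1.16 AS builder
    WORKDIR /src
    COPY . .
    # Static binary so it runs on the minimal base image below.
    RUN CGO_ENABLED=0 go build -o /bin/app .

    FROM alpine:3.13
    # Only the compiled artifact is copied out of the builder stage.
    COPY --from=builder /bin/app /bin/app
    ENTRYPOINT ["/bin/app"]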
Sort multi-line arguments alphanumerically; this helps to avoid duplication of packages and makes the list much easier to update.
When building an image, Docker steps through the instructions in your
Dockerfile, executing each in the order specified.
Starting with the parent image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated. In most cases, simply comparing the instruction in the Dockerfile with one of the child images is sufficient.
For the ADD and COPY instructions, the contents of the file(s)
in the image are examined and a checksum is calculated for each file.
If anything has changed in the file(s), such
as the contents and metadata, then the cache is invalidated.
Aside from the ADD and COPY commands, cache checking does not look at the files in the container to determine a cache match. In that case, just the command string itself is used to find a match.
Whenever possible, use current official repositories as the basis for your
images.
Using RUN apt-get update && apt-get install -y in a single instruction ensures your Dockerfile installs the latest package versions with no further coding or manual intervention; splitting the two across separate RUN lines would let a cached apt-get update layer go stale. This technique is known as “cache busting.”
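A sketch of the pattern (package names and version pin illustrative); clearing the apt lists afterwards keeps the layer small:

    RUN apt-get update && apt-get install -y \
        curl \
        nginx=1.16.* \
     && rm -rf /var/lib/apt/lists/*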
Docker executes these commands using the /bin/sh -c interpreter, which only
evaluates the exit code of the last operation in the pipe to determine success.
Prepend set -o pipefail && to ensure that an unexpected error earlier in the pipe prevents the build from inadvertently succeeding.
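The docs' recommended form, sketched with a placeholder URL; the exec form invokes a shell that actually supports pipefail:

    RUN ["/bin/bash", "-c", "set -o pipefail && wget -O - https://example.com/data | wc -l > /number"]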
The CMD instruction should be used to run the software contained by your
image, along with any arguments.
CMD should almost always be used in the form CMD ["executable", "param1", "param2"]. CMD should rarely be used in the manner of CMD ["param", "param"] in conjunction with ENTRYPOINT.
The ENV instruction is also useful for providing required environment variables specific to services you wish to containerize. Each ENV line creates a new intermediate layer, just like RUN commands do.
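A common pattern from the upstream docs, sketched with illustrative values and a placeholder URL: ENV centralizes version numbers so bumps touch a single line:

    ENV PG_MAJOR 9.1
    ENV PG_VERSION 9.1.3
    RUN curl -SL https://example.com/postgres-$PG_VERSION.tar.xz | tar -xJC /usr/src/postgres
    ENV PATH /usr/local/postgres-$PG_MAJOR/bin:$PATH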
COPY is generally preferred over ADD because it is more transparent: COPY only supports the basic copying of local files into the container.
The best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
If you have multiple Dockerfile steps that use different files from your
context, COPY them individually, rather than all at once.
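The docs' Python example of this, sketched: source edits no longer invalidate the cached pip install layer:

    COPY requirements.txt /tmp/
    RUN pip install --requirement /tmp/requirements.txt
    COPY . /tmp/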
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead.
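The docs' alternative, sketched with a placeholder URL: fetch, extract, and build in one RUN so the downloaded archive never persists in a layer:

    RUN mkdir -p /usr/src/things \
        && curl -SL https://example.com/big.tar.xz \
        | tar -xJC /usr/src/things \
        && make -C /usr/src/things all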
The best use for ENTRYPOINT is to set the image’s main command, allowing that
image to be run as though it was that command (and then use CMD as the
default flags).
the image name can double as a reference to the binary, as shown in the command below.
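The docs illustrate this with s3cmd; a sketch of that pattern:

    ENTRYPOINT ["s3cmd"]
    CMD ["--help"]

Then docker run s3cmd prints the help text, while docker run s3cmd ls s3://mybucket runs the tool with real arguments.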
The VOLUME instruction should be used to expose any database storage area,
configuration storage, or files/folders created by your docker container.
You are strongly encouraged to use VOLUME for any mutable and/or user-serviceable parts of your image.
If you absolutely need functionality similar to sudo (such as initializing the daemon as root but running it as non-root), consider using “gosu”.
For clarity and reliability, you should always use absolute paths for your WORKDIR.
An ONBUILD command executes after the current Dockerfile build completes.
Think
of the ONBUILD command as an instruction the parent Dockerfile gives
to the child Dockerfile
A Docker build executes ONBUILD commands before any command in a child
Dockerfile.
Be careful when putting ADD or COPY in ONBUILD. The “onbuild” image
fails catastrophically if the new build’s context is missing the resource being
added.
"My First 5 Minutes on a Server, by Bryan Kennedy, is an excellent intro into securing a server against most attacks. We have a few modifications to his approach that we wanted to document as part of our efforts of externalizing our processes and best practices. We also wanted to spend a bit more time explaining a few things that younger engineers may benefit from."
There’s a lot wrong with this: you could be using the wrong version of the code, one with exploits or bugs in it, or worse, it could have malware bundled in on purpose; you just don’t know.
Keep Base Images Small
The Node.js base image, for example, includes an extra 600MB of libraries you don’t need.
Name servers host a domain’s DNS information in a text file called a zone file.
Start of Authority (SOA) records
specifying DNS records, which match domain names to IP addresses.
Every domain’s zone file contains the domain administrator’s email address, the name servers, and the DNS records.
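A sketch of a minimal zone file in BIND syntax (all names, addresses, and timers illustrative):

    $TTL 86400
    @       IN  SOA  ns1.example.com. admin.example.com. (
                2021010101  ; serial
                3600        ; refresh
                900         ; retry
                604800      ; expire
                86400 )     ; minimum TTL
            IN  NS   ns1.example.com.
            IN  NS   ns2.example.com.
            IN  MX   10 mail.example.com.
    www     IN  A    203.0.113.10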
Your ISP’s DNS resolver queries a root nameserver for the proper TLD nameserver. In other words, it asks the root nameserver, *Where can I find the nameserver for .com domains?*
In actuality, ISPs cache a lot of DNS information after they’ve looked it up the first time.
caching is a good thing, but it can be a problem if you’ve recently made a change to your DNS information
An A record points your domain or subdomain to your Linode’s IP address,
To create a wildcard entry, use an asterisk (*) as your subdomain.
An AAAA record is just like an A record, but for IPv6 addresses.
An AXFR record is a type of DNS record used for DNS replication
DNS Certification Authority Authorization uses DNS to allow the holder of a domain to specify which certificate authorities are allowed to issue certificates for that domain.
A CNAME record or Canonical Name record matches a domain or subdomain to a different domain.
Some mail servers handle mail oddly for domains with CNAME records, so you should not use a CNAME record for a domain that gets email.
MX records cannot reference CNAME-defined hostnames.
Chaining or looping CNAME records is not recommended.
a CNAME record does not function the same way as a URL redirect.
A DKIM record or DomainKeys Identified Mail record displays the public key for authenticating messages that have been signed with the DKIM protocol
DKIM records are implemented as text records.
An MX record or mail exchanger record sets the mail delivery destination for a domain or subdomain.
An MX record should ideally point to a domain that is also the hostname for its server.
Priority allows you to designate a fallback server (or servers) for mail for a particular domain. Lower numbers have a higher priority.
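A sketch of prioritized MX records (hostnames illustrative); delivery is attempted at the lower number first, falling back to the higher one:

    example.com. 86400 IN MX 10 mail1.example.com.
    example.com. 86400 IN MX 20 mail2.example.com.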
NS records or name server records set the nameservers for a domain or subdomain.
You can also set up different nameservers for any of your subdomains
Primary nameservers get configured at your registrar and secondary subdomain nameservers get configured in the primary domain’s zone file.
The order of NS records does not matter. DNS requests are sent randomly to the different servers
A PTR record or pointer record matches up an IP address to a domain or subdomain, allowing reverse DNS queries to function; it performs the opposite service an A record does.
PTR records are usually set with your hosting provider. They are not part of your domain’s zone file.
An SOA record or Start of Authority record labels a zone file with the name of the host where it was originally created.
Minimum TTL: The minimum amount of time other servers should keep data cached from this zone file.
An SPF record or Sender Policy Framework record lists the designated mail servers for a domain or subdomain.
An SPF record for your domain tells other receiving mail servers which outgoing server(s) are valid sources of email so they can reject spoofed mail from your domain that has originated from unauthorized servers.
Make sure your SPF records are not too strict.
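A sketch of an SPF policy published as a TXT record; the ~all softfail keeps the policy from being too strict while you verify the list of senders is complete:

    example.com. 86400 IN TXT "v=spf1 a mx ~all"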
An SRV record or service record matches up a specific service that runs on your domain or subdomain to a target domain.
Service: The name of the service must be preceded by an underscore (_) and followed by a period (.)
Protocol: The name of the protocol must be preceded by an underscore (_) and followed by a period (.)
Port: The TCP or UDP port on which the service runs.
Target: The target domain or subdomain. This domain must have an A or AAAA record that resolves to an IP address.
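A sketch of an SRV record for a SIP service (names and port illustrative); the fields after SRV are priority, weight, port, and target:

    _sip._tcp.example.com. 86400 IN SRV 10 60 5060 sipserver.example.com.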
A TXT record or text record provides information about the domain in question to other resources on the internet.
"Duet is self hosted so your data is always private, and it's completely brandable so that it matches your business. Best of all, its low one time fee means you will save hundreds over similar software
"
"For the longest time I did not know what everything meant in htop.
I thought that a load average of 1.0 on my two-core machine meant that the CPU usage was at 50%. That's not quite right. And also, why does it say 1.0?
I decided to look everything up and document it here.
They also say that the best way to learn something is to try to teach it.
"
"Rocket.Chat is an incredible product because we have an incredible developer community.
Over 200 contributors have made our platform a dynamic and innovative toolkit, from group messages and video calls to helpdesk killer features.
Our contributors are the reason we're the best cross-platform open source chat solution available today."
"The YubiKey 4 is the strong authentication bullseye the industry has been aiming at for years, enabling one single key to secure an unlimited number of applications.
Yubico's 4th generation YubiKey is built on high-performance secure elements. It includes the same range of one-time password and public key authentication protocols as in the YubiKey NEO, excluding NFC, but with stronger public/private keys, faster crypto operations and the world's first touch-to-sign feature.
With the YubiKey 4 platform, we have further improved our manufacturing and ordering process, enabling customers to order exactly what functions they want in 500+ unit volumes, with no secrets stored at Yubico or shared with a third-party organization. The best part? An organization can securely customize 1,000 YubiKeys in less than 10 minutes.
For customers who require NFC, the YubiKey NEO is our full-featured key with both contact (USB) and contactless (NFC, MIFARE) communications."
Triggering the query means we lose our Relation.
Leave trivial ordering out of scopes altogether.
.merge() makes it easy to use scopes from other models that have been joined into the query, reducing potential duplication.
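A minimal sketch of the idea (model and scope names assumed): the joined query reuses another model's scope instead of duplicating its conditions:

    class Author < ApplicationRecord
      has_many :posts
      scope :active, -> { where(active: true) }
    end

    class Post < ApplicationRecord
      belongs_to :author
      # Reuse Author.active rather than restating its where clause here.
      scope :by_active_authors, -> { joins(:author).merge(Author.active) }
    end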
ActiveRecord provides an easy API for doing many things with our database, but it also makes it pretty easy to do things inefficiently. The layer of abstraction hides what’s really happening.
Write complex queries first in pure SQL, then translate them to ActiveRecord.
Databases can only do fast lookups for columns with indexes, otherwise it’s doing a sequential scan
Add an index on every id column as well as any column that is used in a where clause.
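A sketch of adding such an index in a migration (table and column names assumed):

    class AddIndexToCommentsPostId < ActiveRecord::Migration[5.2]
      def change
        # Speeds up lookups such as Comment.where(post_id: ...)
        add_index :comments, :post_id
      end
    end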
Use a Query class to encapsulate the potentially gnarly query, such as one involving subqueries; ideally, this Query object returns an ActiveRecord::Relation.
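A minimal sketch of the pattern (names assumed); returning a Relation keeps the result chainable:

    class PopularPostsQuery
      def initialize(relation = Post.all)
        @relation = relation
      end

      # Returns an ActiveRecord::Relation, so callers can keep chaining scopes.
      def call
        @relation.where("comments_count > ?", 10).order(comments_count: :desc)
      end
    end

    PopularPostsQuery.new.call.limit(5)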
Single Responsibility Principle
Avoid ad-hoc queries outside of Scopes and Query Objects
encapsulate data access into scopes and Query objects
An ad-hoc query embedded in a controller (or view, task, etc) is harder to test in isolation and cannot be reused
In most other cases, CMD should be given an interactive shell (bash, python, perl, etc.).
For files and directories that do not require ADD's tar auto-extraction capability, you should always use COPY.
The ENTRYPOINT instruction can also be used in combination with a helper script.
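A sketch of the docs' postgres-flavored helper script (paths and service name illustrative): the script does root-level setup, then uses gosu to drop privileges before exec'ing the requested command:

    #!/bin/bash
    set -e

    if [ "$1" = 'postgres' ]; then
        chown -R postgres "$PGDATA"

        if [ -z "$(ls -A "$PGDATA")" ]; then
            gosu postgres initdb
        fi

        exec gosu postgres "$@"
    fi

    exec "$@"

The script is then wired in from the Dockerfile:

    COPY ./docker-entrypoint.sh /
    ENTRYPOINT ["/docker-entrypoint.sh"]
    CMD ["postgres"]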
If a service can run without privileges, use USER to change to a non-root user. Avoid installing or using sudo, and avoid switching USER back and forth frequently.
ONBUILD is only useful for images that are going to be built FROM a given
image
"String API. It's difficult and obtuse, and people often wish it were more like string APIs in other languages. Today, I'm going to explain just why Swift's String API is designed the way it is (or at least, why I think it is) and why I ultimately think it's the best string API out there in terms of its fundamental design."
"Everybody that has ever implemented file upload by hand in a Rails app knows that it's no cakewalk, not to mention a major security risk. That's why we use gems to handle file upload for us! But often it's hard to decide which one to choose for your project."
ED25519 is more vulnerable to quantum computation than is RSA
It is best practice to use a hardware token.
To use a YubiKey via gpg: with this method you use your gpg subkey as an SSH key.
Sit down and spend an hour thinking about your backup and recovery strategy first.
Never share a private key between physical devices; one key per device allows you to revoke a single credential if you lose (control over) that device.
If a private key ever turns up on the wrong machine,
you *know* the key and both source and destination
machines have been compromised.
centralized management of authentication/authorization
I have set up a VPS, disabled passwords, and set up a key with a passphrase to gain access. At this point my greatest worry is losing this private key, as that means I can't access the server. What is a reasonable way to back up my private key?
a mountable disk image that's encrypted
a system that can update/rotate your keys across all of your servers on the fly in case one is compromised or assumed to be compromised.
different keys for different purposes per client device
fall back to password plus OTP
relying completely on the security of your disk, against either physical or cyber attack.
It is better to use a different passphrase for each key but it is also less convenient unless you're using a password manager (personally, I'm using KeePass)
- RSA is pretty standard, and generally speaking is fairly secure for key lengths >=2048. RSA-2048 is the default for ssh-keygen, and is compatible with just about everything.
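A sketch of generating keys with these trade-offs in mind (the comment string is illustrative):

    ssh-keygen -t rsa -b 4096 -C "you@example.com"    # RSA above the 2048-bit default
    ssh-keygen -t ed25519 -C "you@example.com"        # ED25519 alternative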
public-key authentication has the somewhat unexpected side effect of preventing MITM attacks, per this security consulting firm
Disable passwords and only allow keys even for root with PermitRootLogin without-password
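The corresponding /etc/ssh/sshd_config lines, sketched (newer OpenSSH spells the last value prohibit-password):

    PasswordAuthentication no
    ChallengeResponseAuthentication no
    PermitRootLogin without-password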
You should definitely use a different passphrase for keys stored on separate computers,
A chart is a collection of files
that describe a related set of Kubernetes resources.
A single chart
might be used to deploy something simple, like a memcached pod, or
something complex, like a full web app stack with HTTP servers,
databases, caches, and so on.
Charts are created as files laid out in a particular directory tree,
then they can be packaged into versioned archives to be deployed.
A chart is organized as a collection of files inside of a directory.
values.yaml # The default configuration values for this chart
charts/ # A directory containing any charts upon which this chart depends.
templates/ # A directory of templates that, when combined with values,
# will generate valid Kubernetes manifest files.
version: A SemVer 2 version (required)
apiVersion: The chart API version, always "v1" (required)
Every chart must have a version number. A version must follow the
SemVer 2 standard.
non-SemVer names are explicitly
disallowed by the system.
When generating a
package, the helm package command will use the version that it finds
in the Chart.yaml as a token in the package name.
the appVersion field is not related to the version field. It is
a way of specifying the version of the application.
appVersion: The version of the app that this contains (optional). This needn't be SemVer.
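A minimal Chart.yaml sketch (name and versions illustrative):

    apiVersion: v1
    name: mychart             # illustrative
    version: 1.2.3            # must follow SemVer 2
    appVersion: "9.6.0"       # version of the packaged app; need not be SemVer
    description: An example chart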
If the latest version of a chart in the
repository is marked as deprecated, then the chart as a whole is considered to
be deprecated.
deprecated: Whether this chart is deprecated (optional, boolean)
one chart may depend on any number of other charts.
dependencies can be dynamically linked through the requirements.yaml
file or brought in to the charts/ directory and managed manually.
the preferred method of declaring dependencies is by using a
requirements.yaml file inside of your chart.
A requirements.yaml file is a simple file for listing your
dependencies.
The repository field is the full URL to the chart repository.
you must also use helm repo add to add that repo locally.
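A requirements.yaml sketch following the docs' shape (names and URLs illustrative):

    dependencies:
      - name: apache
        version: 1.2.3
        repository: http://example.com/charts
      - name: mysql
        version: 3.2.1
        repository: http://another.example.com/charts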
Run helm dependency update and it will use your dependency file to download all the specified charts into your charts/ directory for you.
When helm dependency update retrieves charts, it will store them as
chart archives in the charts/ directory.
Managing charts with requirements.yaml is a good way to easily keep
charts updated, and also share requirements information throughout a
team.
All charts are loaded by default.
The condition field holds one or more YAML paths (delimited by commas).
If this path exists in the top parent’s values and resolves to a boolean value,
the chart will be enabled or disabled based on that boolean value.
The tags field is a YAML list of labels to associate with this chart.
all charts with tags can be enabled or disabled by
specifying the tag and a boolean value.
The --set parameter can be used as usual to alter tag and condition values.
Conditions (when set in values) always override tags.
The first condition path that exists wins and subsequent ones for that chart are ignored.
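A sketch of condition and tags in a parent's requirements.yaml (chart, repository, and tag names illustrative):

    dependencies:
      - name: subchart1
        repository: http://localhost:10191
        version: 0.1.0
        condition: subchart1.enabled
        tags:
          - front-end

At install time these can be flipped from the command line, for example: helm install --set tags.front-end=true --set subchart1.enabled=true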
The keys containing the values to be imported can be specified in the parent chart’s requirements.yaml file
using a YAML list. Each item in the list is a key which is imported from the child chart’s exports field.
By specifying the key data in our import list, Helm looks in the exports field of the child chart for a key called data and imports its contents.
the parent key data is not contained in the parent’s final values. If you need to specify the
parent key, use the ‘child-parent’ format.
To access values that are not contained in the exports key of the child chart’s values, you will need to
specify the source key of the values to be imported (child) and the destination path in the parent chart’s
values (parent).
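A sketch of both forms in the parent's requirements.yaml (names illustrative): the exports form names a key under the child's exports, while the child-parent form maps an arbitrary child path to a parent path:

    dependencies:
      - name: subchart
        repository: http://localhost:10191
        version: 0.1.0
        import-values:
          - data                    # exports format: imports the child's exports.data
          - child: default.data     # child-parent format: source path in the child's values
            parent: myimports       # destination path in the parent's values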
To drop a dependency into your charts/ directory, use the
helm fetch command
A dependency can be either a chart archive (foo-1.2.3.tgz) or an
unpacked chart directory.
A name cannot start with _ or '.'; such files are ignored by the chart loader.
a single release is created with all the objects for the chart and its dependencies.
Helm Chart templates are written in the
Go template language, with the
addition of 50 or so add-on template
functions from the Sprig library and a
few other specialized functions
When
Helm renders the charts, it will pass every file in that directory
through the template engine.
Chart developers may supply a file called values.yaml inside of a
chart. This file can contain default values.
Chart users may supply a YAML file that contains values. This can be
provided on the command line with helm install.
When a user supplies custom values, these values will override the
values in the chart’s values.yaml file.
Template files follow the standard conventions for writing Go templates
{{default "minio" .Values.storage}}
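A sketch of the interaction (names illustrative): a template line with a default, and a user-supplied values file that overrides it:

    # templates/configmap.yaml (fragment)
    data:
      storage: {{ default "minio" .Values.storage | quote }}

    # myvalues.yaml, passed with: helm install -f myvalues.yaml ./mychart
    storage: s3

With the override the line renders as storage: "s3"; without it, the default "minio" is used.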
Values that are supplied via a values.yaml file (or via the --set
flag) are accessible from the .Values object in a template.
Certain values are pre-defined, are available to every template, and cannot be overridden. As with all values, the names are case sensitive.
Release.Name: The name of the release (not the chart)
Release.IsUpgrade: This is set to true if the current operation is an upgrade or rollback.
Release.Revision: The revision number. It begins at 1, and increments with
each helm upgrade
Chart: The contents of the Chart.yaml
Files: A map-like object containing all non-special files in the chart.
Files can be
accessed using {{index .Files "file.name"}} or using the {{.Files.Get name}} or
{{.Files.GetString name}} functions.
Files excluded by .helmignore are not accessible through it. You can also access the contents of a file as []byte using {{.Files.GetBytes}}.
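A sketch of pulling a bundled file into a ConfigMap (file path illustrative):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: {{ .Release.Name }}-config
    data:
      app.conf: {{ .Files.Get "conf/app.conf" | quote }}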
Any unknown Chart.yaml fields will be dropped
Chart.yaml cannot be
used to pass arbitrarily structured data into the template.
A values file is formatted in YAML.
A chart may include a default values.yaml file; values supplied by the user at install time are then merged into the default values file.
The default values file included inside of a chart must be named
values.yaml
accessible inside of templates using the
.Values object
Values files can declare values for the top-level chart, as well as for
any of the charts that are included in that chart’s charts/ directory.
Charts at a higher level have access to all of the variables defined
beneath.
lower level charts cannot access things in
parent charts
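A sketch of a parent chart's values.yaml setting values for itself and a mysql subchart (names illustrative); inside the subchart the prefix is pruned away:

    title: "My Site"          # visible to the top-level chart only
    mysql:                    # scoped to the mysql subchart
      max_connections: 100    # the subchart sees this as .Values.max_connections
      password: "secret"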
Values are namespaced, but namespaces are pruned.
the scope of the values has been reduced and the
namespace prefix removed
Helm supports a special “global” value:
a way of sharing one top-level variable with all
subcharts, which is useful for things like setting metadata properties
like labels.
If a subchart declares a global variable, that global will be passed
downward (to the subchart’s subcharts), but not upward to the parent
chart.
global variables of parent charts take precedence over the global variables from subcharts.
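A sketch (value names illustrative): a global declared in the parent's values.yaml is visible to every subchart:

    # parent chart's values.yaml
    global:
      app: MyWordPress        # every subchart sees this as .Values.global.app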
helm lint checks that a chart is well-formed.
A chart repository is an HTTP server that houses one or more packaged
charts
Any HTTP server that can serve YAML files and tar files and can answer
GET requests can be used as a repository server.
Helm does not provide tools for uploading charts to
remote repository servers.
the only way to add a chart to $HELM_HOME/starters is to manually
copy it there.
Helm provides a hook mechanism to allow chart developers to intervene
at certain points in a release’s life cycle.
Execute a Job to back up a database before installing a new chart,
and then execute a second job after the upgrade in order to restore
data.
Hooks are declared as an annotation in the metadata section of a manifest
Hooks work like regular templates, but they have special annotations
pre-install: Executes after templates are rendered, but before any resources are created in Kubernetes
post-install: Executes after all resources are loaded into Kubernetes
pre-delete: Executes on a deletion request before any resources are deleted from Kubernetes
post-delete: Executes on a deletion request after all of the release’s
resources have been deleted.
pre-upgrade: Executes on an upgrade request after templates are rendered, but before any resources are loaded into Kubernetes
post-upgrade: Executes on an upgrade after all resources have been upgraded
pre-rollback: Executes on a rollback request after templates are rendered, but before any resources have been rolled back
post-rollback: Executes on a rollback request after all resources
have been modified.
crd-install: Adds CRD resources before any other checks are run; used only on CRD definitions that are needed by other manifests in the chart
test-success: Executes when running helm test and expects the pod to
return successfully (return code == 0).
test-failure: Executes when running helm test and expects the pod to
fail (return code != 0).
Hooks allow you, the chart developer, an opportunity to perform
operations at strategic points in a release lifecycle
Tiller then loads the hook with the lowest weight first (negative to positive)
Tiller returns the release name (and other data) to the client
If the resource is a Job kind, Tiller will wait until the job successfully runs to completion.
if the job
fails, the release will fail. This is a blocking operation, so the
Helm client will pause while the Job is run.
If they
have hook weights (see below), they are executed in weighted order. Otherwise,
ordering is not guaranteed.
It is good practice to add a hook weight, and set it to 0 if weight is not important.
The resources that a hook creates are not tracked or managed as part of the release; helm delete, for example, leaves hook-created resources alone. To destroy such resources, you need to either write code to perform this operation in a pre-delete or post-delete hook, or add the "helm.sh/hook-delete-policy" annotation to the hook template file.
Hooks are just Kubernetes manifest files with special annotations in the
metadata section
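A sketch of a post-install Job hook (names and image illustrative), combining the annotations discussed below: a weight for ordering and a deletion policy:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: {{ .Release.Name }}-post-install-job
      annotations:
        "helm.sh/hook": post-install
        "helm.sh/hook-weight": "-5"            # weights must be strings
        "helm.sh/hook-delete-policy": hook-succeeded
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: post-install-job
              image: alpine:3.9
              command: ["/bin/sh", "-c", "echo post-install complete"]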
One resource can implement multiple hooks
There is also no limit to the number of different resources that may implement a given hook.
When subcharts declare hooks, those are also evaluated. There is no way
for a top-level chart to disable the hooks declared by subcharts.
Hook weights can be positive or negative numbers but must be represented as strings; Tiller sorts the hooks for each phase in ascending order.
Hook deletion policies
"before-hook-creation" specifies Tiller should delete the previous hook before the new hook is launched.
By default Tiller will wait for 60 seconds for a deleted hook to no longer exist in the API server before timing out.
Custom Resource Definitions (CRDs) are a special kind in Kubernetes.
The crd-install hook is executed very early during an installation, before
the rest of the manifests are verified.
A common reason why the hook resource might already exist is that it was not deleted following use on a previous install/upgrade.
Helm uses Go templates for templating
your resource files.
two special template functions: include and required
The include function allows you to bring in another template, and then pass the results to other template functions.
The required function allows you to declare a particular
values entry as required for template rendering.
If the value is empty, the template rendering will fail with a user-submitted error message.
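The canonical form, with the message and key from the docs' own example:

    value: {{ required "A valid .Values.who entry required!" .Values.who }}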
When you are working with string data, you are always safer quoting the
strings than leaving them as bare words
Quote Strings, Don’t Quote Integers
When working with integers, do not quote the values; quoted integers can cause parsing errors inside of Kubernetes. This advice does not apply to env variable values, which are expected to be strings even when they represent integers.
To include a template, and then perform an operation on that template's output, Helm has a special include function. The invocation sketched below includes a template called toYaml, passes it $value, and then passes the output of that template to the nindent function.
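Presumably the invocation the note refers to, reconstructed from its own description:

    {{ include "toYaml" $value | nindent 2 }}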
Go provides a way for setting template options to control behavior
when a map is indexed with a key that’s not present in the map
The required function gives developers the ability to declare a value entry
as required for template rendering.
The tpl function allows developers to evaluate strings as templates inside a template.
It is also useful for rendering an external configuration file, via (.Files.Get "conf/app.conf"), as sketched below.
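Two sketches of tpl in use; the first evaluates a values entry as a template, the second renders the bundled conf/app.conf through the template engine:

    {{/* .Values.template might itself contain template syntax */}}
    {{ tpl .Values.template . }}
    {{/* render a config file shipped inside the chart */}}
    {{ tpl (.Files.Get "conf/app.conf") . }}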
Image pull secrets are essentially a combination of registry, username, and password.
Automatically Roll Deployments When ConfigMaps or Secrets change
When configmaps or secrets are injected as configuration files in containers, a restart may be required should those be updated with a subsequent helm upgrade.
The sha256sum function can be used to ensure a deployment’s
annotation section is updated if another file changes
checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
Alternatively, the --recreate-pods flag for helm upgrade restarts the release's pods.
"helm.sh/resource-policy": keep
This annotation marks resources that should not be deleted when Helm runs a helm delete; such a resource becomes orphaned, and Helm will no longer manage it in any way.
You can create reusable parts in your chart: in the templates/ directory, any file that begins with an underscore (_) is not expected to output a Kubernetes manifest file.
by convention, helper templates and partials are placed in a
_helpers.tpl file.
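A sketch of a named template defined in _helpers.tpl and included from a manifest (template and label names assumed):

    {{/* templates/_helpers.tpl */}}
    {{- define "mychart.labels" -}}
    app: {{ .Chart.Name }}
    release: {{ .Release.Name }}
    {{- end -}}

    {{/* in some manifest under templates/ */}}
    metadata:
      labels:
    {{ include "mychart.labels" . | indent 4 }}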
The current best practice for composing a complex application from discrete parts
is to create a top-level umbrella chart that
exposes the global configurations, and then use the charts/ subdirectory to
embed each of the components.
SAP’s Converged charts: These charts install SAP Converged Cloud, a full OpenStack IaaS, on Kubernetes. All of the charts are collected together in one GitHub repository, except for a few submodules.
Deis’s Workflow:
This chart exposes the entire Deis PaaS system with one chart. But it’s different
from the SAP chart in that this umbrella chart is built from each component, and
each component is tracked in a different Git repository.
YAML is a superset of JSON
any valid JSON structure ought to be valid in YAML.
As a best practice, templates should follow a YAML-like syntax unless
the JSON syntax substantially reduces the risk of a formatting issue.
There are functions in Helm that allow you to generate random data,
cryptographic keys, and so on.
a chart repository is a location where packaged charts can be
stored and shared.
A chart repository is an HTTP server that houses an index.yaml file and
optionally some packaged charts.
Because a chart repository can be any HTTP server that can serve YAML and tar
files and can answer GET requests, you have a plethora of options when it comes
down to hosting your own chart repository.
It is not required that a chart package be located on the same server as the
index.yaml file.
A valid chart repository must have an index file. The
index file contains information about each chart in the chart repository.
The Helm project provides an open-source Helm repository server called ChartMuseum that you can host yourself.
$ helm repo index fantastic-charts --url https://fantastic-charts.storage.googleapis.com
A repository will not be added if it does not contain a valid
index.yaml
add the repository to their helm client via the helm
repo add [NAME] [URL] command with any name they would like to use to
reference the repository.
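For example, using the fantastic-charts repository named above:

    $ helm repo add fantastic-charts https://fantastic-charts.storage.googleapis.com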
Helm has provenance tools which help chart users verify the integrity and origin
of a package.
Integrity is established by comparing a chart to a provenance record
The provenance file contains a chart’s YAML file plus several pieces of
verification information
Chart repositories serve as a centralized collection of Helm charts.
Chart repositories must make it possible to serve provenance files over HTTP via
a specific request, and must make them available at the same URI path as the chart.
We don’t want to be “the certificate authority” for all chart
signers. Instead, we strongly favor a decentralized model, which is part
of the reason we chose OpenPGP as our foundational technology.
The Keybase platform provides a public
centralized repository for trust information.
A chart contains a number of Kubernetes resources and components that work together.
A test in a helm chart lives under the templates/ directory and is a pod definition that specifies a container with a given command to run.
The pod definition must contain one of the helm test hook annotations: helm.sh/hook: test-success or helm.sh/hook: test-failure
Run the tests with helm test <RELEASE_NAME>.
nest your test suite under a tests/ directory like <chart-name>/templates/tests/
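A sketch of such a test pod at templates/tests/test-connection.yaml (image, service name, and command illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: "{{ .Release.Name }}-connection-test"
      annotations:
        "helm.sh/hook": test-success
    spec:
      restartPolicy: Never
      containers:
        - name: connection-test
          image: busybox:1.31
          # Succeeds (exit 0) only if the service port is reachable.
          command: ["sh", "-c", "nc -z {{ .Release.Name }}-svc 80"]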