Docker builds images automatically by reading the instructions from a
Dockerfile -- a text file that contains all commands, in order, needed to
build a given image.
A Docker image consists of read-only layers each of which represents a
Dockerfile instruction.
The layers are stacked and each one is a delta of the changes from the previous layer.
When you run an image and generate a container, you add a new writable layer
(the “container layer”) on top of the underlying layers.
By "ephemeral," we mean that the container can be stopped
and destroyed, then rebuilt and replaced with an absolute minimum of setup and
configuration.
Inadvertently including files that are not necessary for building an image
results in a larger build context and larger image size.
To exclude files not relevant to the build (without restructuring your source
repository) use a .dockerignore file. This file supports exclusion patterns
similar to .gitignore files.
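A minimal sketch of a .dockerignore, assuming a Node.js-style project layout (entries are illustrative):
.git
node_modules
*.log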
minimize the number of image layers by leveraging the build cache.
if your build contains several layers, you can order them from the
least frequently changed (to ensure the build cache is reusable) to the most
frequently changed.
avoid
installing extra or unnecessary packages just because they might be “nice to
have.”
Each container should have only one concern.
Decoupling applications into
multiple containers makes it easier to scale horizontally and reuse containers.
Limiting each container to one process is a good rule of thumb, but it is not a
hard and fast rule.
Use your best judgment to keep containers as clean and modular as possible.
use multi-stage builds
and only copy the artifacts you need into the final image. This allows you to
include tools and debug information in your intermediate build stages without
increasing the size of the final image.
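A sketch of a multi-stage build, assuming a Go project (image tags and paths are illustrative):
# build stage: full Go toolchain
FROM golang:1.21 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# final stage: only the compiled artifact
FROM alpine:3.19
COPY --from=builder /out/app /usr/local/bin/app
CMD ["app"]
Only the final stage's layers end up in the shipped image; the Go toolchain stays behind in the builder stage.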
sorting multi-line arguments alphanumerically helps avoid duplication of packages and makes the
list much easier to update.
When building an image, Docker steps through the instructions in your
Dockerfile, executing each in the order specified.
the next
instruction is compared against all child images derived from that base
image to see if one of them was built using the exact same instruction. If
not, the cache is invalidated.
in most cases, simply comparing the instruction in the Dockerfile with one
of the child images is sufficient.
For the ADD and COPY instructions, the contents of the file(s)
in the image are examined and a checksum is calculated for each file.
If anything has changed in the file(s), such
as the contents and metadata, then the cache is invalidated.
aside from the ADD and COPY instructions, cache checking does not look at the
files in the container to determine a cache match. In that case, just
the command string itself is used to find a match.
Whenever possible, use current official repositories as the basis for your
images.
Using RUN apt-get update && apt-get install -y ensures your Dockerfile
installs the latest package versions with no further coding or manual
intervention.
this technique is known as "cache busting."
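A sketch, with illustrative package names sorted alphanumerically:
RUN apt-get update && apt-get install -y \
    curl \
    nginx \
    && rm -rf /var/lib/apt/lists/*
Combining update and install in a single RUN prevents a cached apt-get update layer from serving stale package lists.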
Docker executes these commands using the /bin/sh -c interpreter, which only
evaluates the exit code of the last operation in the pipe to determine success.
prefix the command with set -o pipefail && to ensure that an unexpected error prevents the
build from inadvertently succeeding.
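A sketch (the URL is a placeholder); the exec form with /bin/bash is used because not every shell supports -o pipefail:
RUN ["/bin/bash", "-c", "set -o pipefail && wget -O - https://example.com/data | wc -l > /number"]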
The CMD instruction should be used to run the software contained by your
image, along with any arguments.
CMD should almost always be used in the form
of CMD ["executable", "param1", "param2", …].
CMD should rarely be used in the manner of CMD ["param", "param"] in
conjunction with ENTRYPOINT.
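For example, for an image whose single concern is an Apache server (binary and flag are illustrative):
CMD ["apache2", "-DFOREGROUND"]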
The ENV instruction is also useful for providing required environment
variables specific to services you wish to containerize.
Each ENV line creates a new intermediate layer, just like RUN commands.
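For example (the version values and paths are illustrative):
ENV PG_MAJOR=9.6
ENV PATH=/usr/local/postgres-$PG_MAJOR/bin:$PATH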
COPY is generally preferred because it is more transparent than ADD: COPY only
supports the basic copying of local files into the container.
the best use for ADD is local tar file
auto-extraction into the image, as in ADD rootfs.tar.xz /
If you have multiple Dockerfile steps that use different files from your
context, COPY them individually, rather than all at once.
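A sketch assuming a Python project: the pip install layer is only rebuilt when requirements.txt changes, not on every source edit:
COPY requirements.txt /tmp/
RUN pip install --requirement /tmp/requirements.txt
COPY . /tmp/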
using ADD to fetch packages from remote URLs is
strongly discouraged; you should use curl or wget instead
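A sketch with a placeholder URL; fetching and extracting in one RUN avoids baking the downloaded archive into an image layer:
RUN mkdir -p /usr/src/things \
    && curl -SL https://example.com/big.tar.xz \
    | tar -xJC /usr/src/things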
The best use for ENTRYPOINT is to set the image’s main command, allowing that
image to be run as though it was that command (and then use CMD as the
default flags).
the image name can double as a reference to the binary, as shown in the
example below.
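A sketch using s3cmd as the example binary:
ENTRYPOINT ["s3cmd"]
CMD ["--help"]
Running docker run <image> then prints the help, while docker run <image> ls s3://mybucket runs that command through the same entrypoint.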
The VOLUME instruction should be used to expose any database storage area,
configuration storage, or files/folders created by your docker container.
use VOLUME for any mutable and/or user-serviceable
parts of your image
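For example, for an image that writes database files (the path is illustrative):
VOLUME /var/lib/mysql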
If you absolutely need
functionality similar to sudo (such as initializing the daemon as root but
running it as non-root), consider using "gosu".
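If the service can run without privileges at all, creating a dedicated user and switching to it is simpler (user and group names are illustrative):
RUN groupadd -r postgres && useradd --no-log-init -r -g postgres postgres
USER postgres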
always use absolute paths for your
WORKDIR
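For example, instead of RUN cd /app && ./configure, prefer (the path and script are hypothetical):
WORKDIR /app
RUN ./configure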
An ONBUILD command executes after the current Dockerfile build completes.
Think
of the ONBUILD command as an instruction the parent Dockerfile gives
to the child Dockerfile
A Docker build executes ONBUILD commands before any command in a child
Dockerfile.
Be careful when putting ADD or COPY in ONBUILD. The “onbuild” image
fails catastrophically if the new build’s context is missing the resource being
added.
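A sketch of a parent image (names are hypothetical) that tells every child build to copy in its own source:
# parent Dockerfile, built as my-python-base
FROM python:3.12
ONBUILD COPY . /app
ONBUILD RUN pip install -r /app/requirements.txt

# child Dockerfile: the ONBUILD instructions above run first
FROM my-python-base
CMD ["python", "/app/main.py"]
The child build fails if its context lacks requirements.txt, which is exactly the failure mode to watch for.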
every Kubernetes operation is exposed as an API endpoint and can be executed by an HTTP request to this endpoint.
the main job of kubectl is to carry out HTTP requests to the Kubernetes API
Kubernetes maintains an internal state of resources, and all Kubernetes operations are CRUD operations on these resources.
Kubernetes is a fully resource-centred system
Kubernetes API reference is organised as a list of resource types with their associated operations.
This is how kubectl works for all commands that interact with the Kubernetes cluster.
kubectl simply makes HTTP requests to the appropriate Kubernetes API endpoints.
it's totally possible to control Kubernetes with a tool like curl by manually issuing HTTP requests to the Kubernetes API.
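A quick way to try this, letting kubectl proxy handle authentication (the port is chosen here for illustration):
kubectl proxy --port 8080 &
curl http://127.0.0.1:8080/api/v1/namespaces/default/pods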
Kubernetes consists of a set of independent components that run as separate processes on the nodes of a cluster.
components on the master nodes:
Storage backend: stores resource definitions (usually etcd is used)
API server: provides Kubernetes API and manages storage backend
Controller manager: ensures resource statuses match specifications
Scheduler: schedules Pods to worker nodes
component on the worker nodes:
Kubelet: manages execution of containers on a worker node
creating a ReplicaSet resource triggers the ReplicaSet controller, which is a sub-process of the controller manager.
The ReplicaSet controller creates the Pod resources, which in turn triggers the scheduler, which watches for Pod definitions that are not yet scheduled to a worker node.
All of this happens by creating and updating resources in the storage backend on the master node.
The kubelet of the worker node your ReplicaSet Pods have been scheduled to instructs the configured container runtime (which may be Docker) to download the required container images and run the containers.
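For reference, a minimal ReplicaSet manifest that would set this chain in motion (name, labels, and image are illustrative):
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo
        image: nginx:1.25
Apply it with kubectl apply -f replicaset.yaml, and the controller manager, scheduler, and kubelet take over from there.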
Kubernetes components (except the API server and the storage backend) work by watching for resource changes in the storage backend and manipulating resources in the storage backend.
However, these components do not access the storage backend directly, but only through the Kubernetes API.
double usage of the Kubernetes API for internal components as well as for external users is a fundamental design concept of Kubernetes.
All other Kubernetes components and users read, watch, and manipulate the state (i.e. resources) of Kubernetes through the Kubernetes API
The storage backend stores the state (i.e. resources) of Kubernetes.
command completion is a shell feature that works by the means of a completion script.
A completion script is a shell script that defines the completion behaviour for a specific command. Sourcing a completion script enables completion for the corresponding command.
kubectl completion zsh
/etc/bash_completion.d directory (create it, if it doesn't exist)
source <(kubectl completion bash)
source <(kubectl completion zsh)
autoload -Uz compinit
compinit
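To make completion persistent, a common approach is to append the source command to the shell's rc file (default file locations assumed):
echo 'source <(kubectl completion bash)' >>~/.bashrc
echo 'source <(kubectl completion zsh)' >>~/.zshrc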
the API reference, which contains the full specifications of all resources.
kubectl api-resources
displays the resource names in their plural form (e.g. deployments instead of deployment). It also displays the shortname (e.g. deploy) for those resources that have one. Don't worry about these differences. All of these name variants are equivalent for kubectl.
.spec
this is where the custom columns output format comes in. It lets you freely define the columns and the data to display in them. You can choose any field of a resource to be displayed as a separate column in the output.
kubectl get pods -o custom-columns='NAME:metadata.name,NODE:spec.nodeName'
kubectl explain pod.spec.
kubectl explain pod.metadata.
browse the resource specifications and try it out with any fields you like!
JSONPath is a language to extract data from JSON documents (it is similar to XPath for XML).
with the custom columns output format, only a subset of the JSONPath capabilities is supported
Many fields of Kubernetes resources are lists, and this operator allows you to select items of these lists. It is often used with a wildcard as [*] to select all items of the list.
kubectl get pods -o custom-columns='NAME:metadata.name,IMAGES:spec.containers[*].image'
a Pod may contain more than one container.
The availability zones for each node are obtained through the special failure-domain.beta.kubernetes.io/zone label.
kubectl get nodes -o yaml
kubectl get nodes -o json
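A sketch of a custom-columns query that surfaces the zone label (the backslashes escape the dots inside the label key; newer clusters expose topology.kubernetes.io/zone instead):
kubectl get nodes -o custom-columns='NAME:metadata.name,ZONE:metadata.labels.failure-domain\.beta\.kubernetes\.io/zone'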
The default kubeconfig file is ~/.kube/config
with multiple clusters, then you have connection parameters for multiple clusters configured in your kubeconfig file.
Within a cluster, you can set up multiple namespaces (a namespace is a kind of "virtual" cluster within a physical cluster)
overwrite the default kubeconfig file with the --kubeconfig option for every kubectl command.
Namespace: the namespace to use when connecting to the cluster
a one-to-one mapping between clusters and contexts.
When kubectl reads a kubeconfig file, it always uses the information from the current context.
just change the current context in the kubeconfig file
to switch to another namespace in the same cluster, you can change the value of the namespace element of the current context
kubectl also provides the --cluster, --user, --namespace, and --context options that allow you to overwrite individual elements and the current context itself, regardless of what is set in the kubeconfig file.
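With plain kubectl, a more permanent switch looks like this (context and namespace names are placeholders):
kubectl config use-context my-context
kubectl config set-context --current --namespace=my-namespace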
a popular tool for switching between clusters and namespaces is kubectx.
kubectl config get-contexts
just have to download the shell scripts named kubectl-ctx and kubectl-ns to any directory in your PATH and make them executable (for example, with chmod +x)
kubectl proxy
kubectl get roles
kubectl get pod
Kubectl plugins are distributed as simple executable files with a name of the form kubectl-x. The prefix kubectl- is mandatory, and what follows it becomes the sub-command used to invoke the plugin (kubectl-x is invoked as kubectl x).
To install a plugin, you just have to copy the kubectl-x file to any directory in your PATH and make it executable (for example, with chmod +x)
krew itself is a kubectl plugin
check out the kubectl-plugins GitHub topic
The executable can be of any type, a Bash script, a compiled Go program, a Python script, it really doesn't matter. The only requirement is that it can be directly executed by the operating system.
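A minimal sketch of such a plugin, using a hypothetical name kubectl-hello:
#!/bin/bash
echo "Hello, I am a kubectl plugin!"
Save it as kubectl-hello in a directory on your PATH, make it executable with chmod +x, and run it as kubectl hello.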
kubectl plugins can be written in any programming or scripting language.
you can write more sophisticated plugins with real programming languages, for example, using a Kubernetes client library. If you use Go, you can also use the cli-runtime library, which exists specifically for writing kubectl plugins.
a kubeconfig file consists of a set of contexts
changing the current context means changing the cluster, if you have only a single context per cluster.
Zabbix by default uses a "pull" model, in which the server connects to agents on each monitored machine; the agents periodically gather the info and send it to the server.
Prometheus prefers a "pull" model, in which the server gathers info from client machines.
Prometheus requires an application to be instrumented with Prometheus client library (available in different programming languages) for preparing metrics.
exporters expose metrics for Prometheus (similar to "agents" for Zabbix)
Zabbix uses its own TCP-based communication protocol between agents and the server.
Prometheus uses HTTP with protocol buffers (+ text format for ease of use with curl).
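For example, the text format of any exposed metrics endpoint can be inspected directly (the port assumes a node_exporter listening on its default 9100):
curl http://localhost:9100/metrics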
Prometheus offers a basic tool for exploring gathered data and visualizing it in simple graphs on its native server, and also offers a minimal dashboard builder, PromDash. But Prometheus is, and is designed to be, supported by modern visualization tools like Grafana.
Prometheus offers a solution for alerting that is separated from its core into the Alertmanager application.