Kubernetes releases before v1.24 included a direct integration with Docker Engine,
using a component named dockershim. That special direct integration is no longer
part of Kubernetes.
You need to install a
container runtime
into each node in the cluster so that Pods can run there.
Kubernetes 1.26 requires that you use a runtime that
conforms with the
Container Runtime Interface (CRI).
On Linux, control groups
are used to constrain resources that are allocated to processes.
Both kubelet and the
underlying container runtime need to interface with control groups to enforce
resource management for pods and containers and set
resources such as cpu/memory requests and limits.
When the cgroupfs
driver is used, the kubelet and the container runtime directly interface with
the cgroup filesystem to configure cgroups.
The cgroupfs driver is not recommended when
systemd is the
init system.
When systemd is chosen as the init
system for a Linux distribution, the init process generates and consumes a root control group
(cgroup) and acts as a cgroup manager.
Two cgroup managers result in two views of the available and in-use resources in
the system.
Changing the cgroup driver of a Node that has joined a cluster is a sensitive operation.
If the kubelet has created Pods using the semantics of one cgroup driver, changing the container
runtime to another cgroup driver can cause errors when trying to re-create the Pod sandbox
for such existing Pods. Restarting the kubelet may not solve such errors.
The approach to mitigate this instability is to use systemd as the cgroup driver for
the kubelet and the container runtime when systemd is the selected init system.
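For example, the kubelet can select the systemd driver through its configuration file; this is a minimal sketch, with all other kubelet settings omitted:

```yaml
# Minimal sketch of a KubeletConfiguration selecting the systemd cgroup driver.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
```

The container runtime must be configured to use the same driver; with containerd, for instance, this typically means setting SystemdCgroup = true in its runc runtime options.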
Kubernetes 1.26 defaults to using v1 of the CRI API.
If a container runtime does not support the v1 API, the kubelet falls back to
using the (deprecated) v1alpha2 API instead.
You can use role-based access control
(RBAC) and other
security mechanisms to make sure that users and workloads can get access to the
resources they need, while keeping workloads, and the cluster itself, secure.
You can set limits on the resources that users and workloads can access
by managing policies and
container resources.
You need to plan how to scale to relieve increased
pressure from more requests to the control plane and worker nodes or scale down to reduce unused
resources.
Managed control plane: Let the provider manage the scale and availability
of the cluster's control plane, as well as handle patches and upgrades.
The simplest Kubernetes cluster has the entire control plane and worker node
services running on the same machine.
You can deploy a control plane using tools such
as kubeadm, kops, and kubespray.
Secure communications between control plane services
are implemented using certificates.
Certificates are automatically generated
during deployment or you can generate them using your own certificate authority.
Separate and backup etcd service: The etcd services can either run on the
same machines as other control plane services or run on separate machines.
Create multiple control plane systems: For high availability, the
control plane should not be limited to a single machine.
Some deployment tools set up the Raft
consensus algorithm to do leader election of Kubernetes services. If the
primary goes away, another service elects itself and takes over.
Groups of zones are referred to as regions.
If you installed with kubeadm, there are instructions to help you with
Certificate Management
and Upgrading kubeadm clusters.
Production-quality workloads need to be resilient and anything they rely
on needs to be resilient (such as CoreDNS).
Add nodes to the cluster: If you are managing your own cluster you can
add nodes by setting up your own machines and either adding them manually or
having them register themselves to the cluster’s apiserver.
Set up node health checks: For important workloads, you want to make sure
that the nodes and pods running on those nodes are healthy.
Authentication: The apiserver can authenticate users using client
certificates, bearer tokens, an authenticating proxy, or HTTP basic auth.
Authorization: When you set out to authorize your regular users, you will probably choose
between RBAC and ABAC authorization.
Role-based access control (RBAC): Lets you
assign access to your cluster by allowing specific sets of permissions to authenticated users.
Permissions can be assigned for a specific namespace (Role) or across the entire cluster
(ClusterRole).
Attribute-based access control (ABAC): Lets you
create policies based on resource attributes in the cluster and will allow or deny access
based on those attributes.
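As an illustration of namespace-scoped RBAC, the following sketch grants read access to Pods in a single namespace; the namespace, user, and object names are hypothetical:

```yaml
# Hypothetical Role granting read-only access to Pods in the "team-a" namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: pod-reader
rules:
- apiGroups: [""]          # "" refers to the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# Binds the Role above to a (hypothetical) authenticated user named "jane".
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: team-a
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Using ClusterRole and ClusterRoleBinding instead would grant the same permissions across the entire cluster.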
Set limits on workload resources
Set namespace limits: Set per-namespace quotas on things like memory and CPU, as sketched below.
Prepare for DNS demand: If you expect workloads to massively scale up,
your DNS service must be ready to scale up as well.
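A per-namespace quota of the kind mentioned above can be expressed with a ResourceQuota object; the namespace and the specific limits here are illustrative only:

```yaml
# Hypothetical ResourceQuota capping total CPU and memory in one namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"       # total CPU requested by all Pods in the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```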
Version constraints within the configuration
itself determine which versions of dependencies are potentially compatible,
but after selecting a specific version of each dependency Terraform remembers
the decisions it made in a dependency lock file so that it can (by default)
make the same decisions again in future.
At present, the dependency lock file tracks only provider dependencies.
Terraform does not remember version selections for remote modules, and so
Terraform will always select the newest available module version that meets
the specified version constraints.
The lock file is always named .terraform.lock.hcl, and this name is intended
to signify that it is a lock file for various items that Terraform caches in
the .terraform directory.
Terraform automatically creates or updates the dependency lock file each time
you run the terraform init command.
You should
include this file in your version control repository.
If a particular provider has no existing recorded selection, Terraform will
select the newest available version that matches the given version constraint,
and then update the lock file to include that selection.
the "trust on first use" model
You can pre-populate checksums for a variety of
different platforms in your lock file using
the terraform providers lock command,
which will then allow future calls to terraform init to verify that the
packages available in your chosen mirror match the official packages from
the provider's origin registry.
The h1: and
zh: prefixes on these values represent different hashing schemes, each
of which represents calculating a checksum using a different algorithm.
zh:: a mnemonic for "zip hash"
h1:: a mnemonic for "hash scheme 1", which is the current preferred hashing
scheme.
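A lock file entry looks roughly like the following sketch; the provider, version, and hash strings are placeholders rather than real checksums:

```hcl
# Illustrative excerpt from .terraform.lock.hcl (all values are placeholders).
provider "registry.terraform.io/hashicorp/aws" {
  version     = "4.67.0"
  constraints = "~> 4.0"
  hashes = [
    "h1:EXAMPLEPLACEHOLDERHASH=",   # "hash scheme 1", the preferred scheme
    "zh:0000000000000000000000000000000000000000000000000000000000000000",  # legacy "zip hash"
  ]
}
```

Running terraform providers lock with one or more -platform options is how checksums for additional platforms can be pre-populated in such an entry.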
To determine whether there still exists a dependency on a given provider,
Terraform uses two sources of truth: the configuration itself, and the state.
Every resource type is implemented by a provider; without providers, Terraform
can't manage any kind of infrastructure.
The Terraform Registry
is the main directory of publicly available Terraform providers, and hosts
providers for most major infrastructure platforms.
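A dependency on a registry provider is declared in a required_providers block; this is a minimal sketch, with an illustrative source address and version constraint:

```hcl
# Minimal sketch of a provider requirement; the provider and constraint are
# illustrative.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}
```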
Dependency Lock File
documents an additional HCL file that can be included with a configuration,
which tells Terraform to always use a specific set of provider versions.
Terraform CLI finds and installs providers when
initializing a working directory. It can
automatically download providers from a Terraform registry, or load them from
a local mirror or cache.
To save time and bandwidth, Terraform CLI supports an optional plugin
cache. You can enable the cache using the plugin_cache_dir setting in
the CLI configuration file.
You can use Terraform CLI to create a
dependency lock file
and commit it to version control along with your configuration.
Each provider may offer data sources
alongside its set of resource
types.
When distinguishing from data resources, the primary kind of resource (as declared
by a resource block) is known as a managed resource.
Each data resource is associated with a single data source, which determines
the kind of object (or objects) it reads and what query constraint arguments
are available.
Terraform reads data resources during the planning phase when possible, but
announces in the plan when it must defer reading resources until the apply
phase to preserve the order of operations.
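A data resource is declared with a data block; the aws_ami data source and its query arguments below are used purely for illustration:

```hcl
# Hypothetical data resource: look up the newest matching Amazon Linux 2 AMI.
data "aws_ami" "example" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}
```

The result can then be referenced elsewhere in the same module as data.aws_ami.example.id.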
Local-only data sources exist for
rendering templates,
reading local files, and
rendering AWS IAM policies.
As with managed resources, when count or for_each is present it is important to
distinguish the resource itself from the multiple resource instances it
creates. Each instance will separately read from its data source with its
own variant of the constraint arguments, producing an indexed result.
Data instance arguments may refer to computed values, in which case the
attributes of the instance itself cannot be resolved until all of its
arguments are defined.
triggers - A map of values which should cause this set of provisioners to
re-run. Values are meant to be interpolated references to variables or
attributes of other resources.
"triggers - A map of values which should cause this set of provisioners to re-run. Values are meant to be interpolated references to variables or attributes of other resources.
"
each.value — The map value corresponding to this instance. (If a set was
provided, this is the same as each.key.)
for_each keys cannot be the result of (or rely on the result of) impure functions,
including uuid, bcrypt, or timestamp, as their evaluation is deferred during the
main evaluation step.
The value used in for_each is used
to identify the resource instance and will always be disclosed in UI output,
which is why sensitive values are not allowed.
If you would like to call keys(local.map), where
local.map is an object with sensitive values (but non-sensitive keys), you can create a
value to pass to for_each with toset([for k,v in local.map : k]).
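A sketch of that workaround, assuming local.map holds sensitive values under non-sensitive keys; the resource type used here is illustrative:

```hcl
# Iterate over only the (non-sensitive) keys of local.map.
resource "aws_ssm_parameter" "example" {
  for_each = toset([for k, v in local.map : k])

  name  = each.key
  type  = "SecureString"
  value = local.map[each.key]  # the sensitive value is used, but not as the key
}
```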
for_each
can't refer to any resource attributes that aren't known until after a
configuration is applied (such as a unique ID generated by the remote API when
an object is created).
The for_each argument
does not implicitly convert lists or tuples to sets.
Transform a multi-level nested structure into a flat list by
using nested for expressions with the flatten function.
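A sketch of that pattern, with an invented local.networks structure and an illustrative subnet resource:

```hcl
locals {
  networks = {
    a = { subnets = ["10.0.1.0/24", "10.0.2.0/24"] }
    b = { subnets = ["10.1.1.0/24"] }
  }

  # Flatten the nested structure into a flat list of objects.
  subnet_objects = flatten([
    for net_key, net in local.networks : [
      for cidr in net.subnets : {
        network = net_key
        cidr    = cidr
      }
    ]
  ])
}

resource "aws_subnet" "example" {
  # Build a map with stable, unique keys so each instance is tracked by key,
  # not by position in the flattened list.
  for_each = { for s in local.subnet_objects : "${s.network}.${s.cidr}" => s }

  vpc_id     = aws_vpc.main.id  # hypothetical existing VPC resource
  cidr_block = each.value.cidr
}
```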
Instances are
identified by a map key (or set member) from the value provided to for_each
Within nested provisioner or connection blocks, the special
self object refers to the current resource instance, not the resource block
as a whole.
Conversion from list to set discards the ordering of the items in the list and
removes any duplicate elements.
The native syntax of the Terraform language is
a rich language designed to be relatively easy for humans to read and write.
Terraform's configuration language is based on a more general
language called HCL, and HCL's documentation usually uses the word "attribute"
instead of "argument."
A particular block type may have any number of required labels, or it may
require none.
After the block type keyword and any labels, the block body is delimited
by the { and } characters.
Identifiers can contain letters, digits, underscores (_), and hyphens (-).
The first character of an identifier must not be a digit, to avoid ambiguity
with literal numbers.
The # single-line comment style is the default comment style and should be
used in most cases.
The idiomatic style
is to use the Unix convention of LF-only line endings.
Indent two spaces for each nesting level.
When multiple arguments with single-line values appear on consecutive lines at
the same nesting level, align their equals signs.
Use empty lines to separate logical groups of arguments within a block.
Use one blank line to separate the arguments from
the blocks.
"meta-arguments" (as defined by
the Terraform language semantics)
Avoid separating multiple blocks of the same type with other blocks of
a different type, unless the block types are defined by semantics to
form a family.
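A small sketch following those conventions, with an illustrative resource type and values: two-space indentation, aligned equals signs, and a blank line separating arguments from nested blocks:

```hcl
resource "aws_instance" "example" {
  # Arguments first, with equals signs aligned.
  ami           = "ami-0abcdef1234567890"  # placeholder AMI ID
  instance_type = "t3.micro"

  # Nested blocks after a blank line.
  root_block_device {
    volume_size = 20
  }
}
```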
Resource names must start with a letter or underscore, and may
contain only letters, digits, underscores, and dashes.
Each resource is associated with a single resource type, which determines
the kind of infrastructure object it manages and what arguments and other
attributes the resource supports.
Each resource type is implemented by a provider,
which is a plugin for Terraform that offers a collection of resource types.
By convention, resource type names start with their
provider's preferred local name.
Most publicly available providers are distributed on the
Terraform Registry, which also
hosts their documentation.
The Terraform language defines several meta-arguments, which can be used with
any resource type to change the behavior of resources.
Use precondition and postcondition blocks to specify assumptions and guarantees about how the resource operates.
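A sketch of such custom conditions inside a lifecycle block; the resource type and the attributes being checked are illustrative:

```hcl
resource "aws_instance" "example" {
  ami           = "ami-0abcdef1234567890"  # placeholder
  instance_type = "t3.micro"

  lifecycle {
    # Assumption checked before planning changes to this resource.
    # data.aws_ami.example is assumed to be declared elsewhere.
    precondition {
      condition     = data.aws_ami.example.architecture == "x86_64"
      error_message = "The selected AMI must be for the x86_64 architecture."
    }

    # Guarantee checked after the resource is created or updated.
    postcondition {
      condition     = self.public_dns != ""
      error_message = "The instance must have a public DNS hostname."
    }
  }
}
```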
Some resource types provide a special timeouts nested block argument that
allows you to customize how long certain operations are allowed to take
before being considered to have failed.
Timeouts are handled entirely by the resource type implementation in the
provider
Most
resource types do not support the timeouts block at all.
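Where a resource type does support it, the block looks roughly like this; the resource type and durations are illustrative:

```hcl
resource "aws_db_instance" "example" {
  # ... other arguments for this hypothetical database instance ...

  timeouts {
    create = "60m"
    delete = "2h"
  }
}
```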
A resource block declares that you want a particular infrastructure object
to exist with the given settings.
Destroy resources that exist in the state but no longer exist in the configuration.
Destroy and re-create resources whose arguments have changed but which cannot be updated in-place due to remote API limitations.
Expressions within a Terraform module can access
information about resources in the same module, and you can use that information
to help configure other resources. Use the <RESOURCE TYPE>.<NAME>.<ATTRIBUTE>
syntax to reference a resource attribute in an expression.
Resources often provide
read-only attributes with information obtained from the remote API; this often
includes things that can't be known until the resource is created, like the
resource's unique random ID.
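For example, an attribute that is only known after creation can still be referenced by another resource; both resource types here are illustrative:

```hcl
# The instance ID is assigned by the remote API at creation time.
resource "aws_eip" "example" {
  instance = aws_instance.example.id  # <RESOURCE TYPE>.<NAME>.<ATTRIBUTE>
}
```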
Data sources are a special type of resource used only for looking up information.
Some dependencies cannot be recognized implicitly in configuration.
Local-only resource types exist for
generating private keys,
issuing self-signed TLS certificates,
and even generating random ids.
The behavior of local-only resources is the same as all other resources, but
their result data exists only within the Terraform state.
The count meta-argument accepts a whole number, and creates that many
instances of the resource or module.
count.index — The distinct index number (starting with 0) corresponding
to this instance.
The count value must be known
before Terraform performs any remote resource actions. This means count
can't refer to any resource attributes that aren't known until after a
configuration is applied
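A sketch of count and count.index, using an illustrative resource type:

```hcl
resource "aws_instance" "server" {
  count = 3  # literal value, known before any remote actions

  ami           = "ami-0abcdef1234567890"  # placeholder
  instance_type = "t3.micro"

  tags = {
    Name = "server-${count.index}"  # server-0, server-1, server-2
  }
}
```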
Within nested provisioner or connection blocks, the special
self object refers to the current resource instance, not the resource block
as a whole.
Before for_each was available, using count to iterate over a list of strings was
fragile, because the resource instances were still identified by their
index instead of the string values in the list.
If more than one override file defines the same top-level block, the overriding
effect is compounded, with later blocks taking precedence over earlier blocks.
Terraform has special handling of any configuration
file whose name ends in _override.tf or _override.tf.json. This special
handling also applies to a file named literally override.tf or
override.tf.json. Terraform initially skips these override files when loading configuration,
and then afterwards processes each one in turn (in lexicographical order).
If the original block defines a default value and an override block changes
the variable's type, Terraform attempts to convert the default value to
the overridden type, producing an error if this conversion is not possible.
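A sketch of the merge behavior, with file names and the resource type invented for illustration:

```hcl
# main.tf (hypothetical base configuration)
resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t3.micro"
}

# override.tf (processed afterwards, so its arguments win)
resource "aws_instance" "web" {
  instance_type = "t3.large"  # the merged configuration uses t3.large
}
```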
Processes are visible to other containers in the pod. This includes all
information visible in /proc, such as passwords that were passed as arguments
or environment variables. These are protected only by regular Unix permissions.
Container filesystems are visible to other containers in the pod through the
/proc/$pid/root link. This makes debugging easier, but it also means
that filesystem secrets are protected only by filesystem permissions.
Ephemeral containers differ from other containers in that they lack guarantees
for resources or execution, and they will never be automatically restarted, so
they are not appropriate for building applications.
Ephemeral containers are created using a special ephemeralcontainers handler
in the API rather than by adding them directly to pod.spec, so it's not
possible to add an ephemeral container using kubectl edit.
Distroless images
enable you to deploy minimal container images that reduce attack surface
and exposure to bugs and vulnerabilities.
When using an ephemeral container, it is helpful to enable process namespace
sharing so you can view processes in other containers.
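For example, a debugging ephemeral container can be added with kubectl debug; the Pod name, image, and target container below are hypothetical:

```shell
# Attach an interactive busybox ephemeral container to an existing Pod and
# target another container's process namespace.
kubectl debug -it mypod --image=busybox:1.28 --target=main-container
```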
What’s one of the biggest benefits of Docker? Clearly reproducibility: It doesn’t matter where you run your images, or when you run them: The result will always be the same.
For example, in Alpine 3.5, the package Node.js might be 2.0, and in Alpine 3.4 it's 1.9. By pinning down the repository to Alpine 3.4, you will always get Node.js 1.9, because Alpine 3.4 is an old version and not updated anymore.
Unfortunately Alpine Linux does not keep old packages.
"What's one of the biggest benefits of Docker? Clearly reproducibility: It doesn't matter where you run your images, or when you run them: The result will always be the same."
It requires disabling Docker's iptables function first, but this also means giving up
Docker's network management, so containers will not be able to access the external
network. The masquerade rule can be added back manually,
such as -A POSTROUTING ! -o docker0 -s 172.17.0.0/16 -j MASQUERADE, but this only
allows containers that belong to the 172.17.0.0/16 network to access the outside.
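Applied by hand, the rule quoted above would look roughly like this (the nat table is assumed):

```shell
# Re-add the masquerade rule for the default bridge network manually.
iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
```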
This mesh allows us to easily disable and enable parts of the internal network in a data center for maintenance or to deal with a problem.
As part of this protocol, operators define policies which decide which prefixes (a collection of adjacent IP addresses) are advertised to peers (the other networks they connect to), or accepted from peers.