"In software engineering, a fluent interface (as first coined by Eric Evans and Martin Fowler) is an implementation of an object oriented API that aims to provide more readable code.
A fluent interface is normally implemented by using method cascading (concretely method chaining) to relay the instruction context of a subsequent call (but a fluent interface entails more than just method chaining [1]). Generally, the context is
defined through the return value of a called method
self-referential, where the new context is equivalent to the last context
terminated through the return of a void context."
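A minimal sketch of those three rules in JavaScript (the QueryBuilder class and its methods are invented for illustration):

    // Each chainable method returns `this`, so the context stays self-referential;
    // a void-returning method (run) terminates the chain.
    class QueryBuilder {
      constructor() { this.parts = []; }
      select(cols) { this.parts.push(`SELECT ${cols}`); return this; }
      from(table)  { this.parts.push(`FROM ${table}`);  return this; }
      where(cond)  { this.parts.push(`WHERE ${cond}`);  return this; }
      run()        { console.log(this.parts.join(" ")); } // returns undefined: end of chain
    }

    new QueryBuilder().select("*").from("users").where("age > 21").run();
    // SELECT * FROM users WHERE age > 21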
"Amazon Ion is a richly-typed, self-describing, hierarchical data serialization format offering interchangeable binary and text representations. The text format (a superset of JSON) is easy to read and author, supporting rapid prototyping. The binary representation is efficient to store, transmit, and skip-scan parse. The rich type system provides unambiguous semantics for long-term preservation of business data which can survive multiple generations of software evolution.
Ion was built to solve the rapid development, decoupling, and efficiency challenges faced every day while engineering large-scale, service-oriented architectures. Ion has been addressing these challenges within Amazon for nearly a decade, and we believe others will benefit as well."
"My First 5 Minutes on a Server, by Bryan Kennedy, is an excellent intro into securing a server against most attacks. We have a few modifications to his approach that we wanted to document as part of our efforts of externalizing our processes and best practices. We also wanted to spend a bit more time explaining a few things that younger engineers may benefit from."
Before moving our production infrastructure over, however, we decided to start developing with them locally first, so we could shake out any issues with our applications before risking the production environment.
We are using Chef and Vagrant to provision local VMs.
Engineers at IFTTT currently all use Apple computers.
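A minimal sketch of that Chef-and-Vagrant setup (box name, cookbook path, and recipe are assumptions, not IFTTT's actual configuration):

    # Vagrantfile: provision a local VM with the Chef Solo provisioner
    Vagrant.configure("2") do |config|
      config.vm.box = "ubuntu/trusty64"
      config.vm.provision "chef_solo" do |chef|
        chef.cookbooks_path = "cookbooks"
        chef.add_recipe "myapp"
      end
    end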
Docker Engine swarm mode makes it easy to publish ports for services to make them available to resources outside the swarm. All nodes participate in an ingress routing mesh. The routing mesh enables each node in the swarm to accept connections on published ports for any service running in the swarm, even if there's no task running on the node.
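For example (service name, image, and ports are illustrative), publishing port 8080 on every node and routing it to port 80 in the service's tasks:

    docker service create --name my-web --replicas 2 \
      --publish published=8080,target=80 nginx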
"SQL Notebook is a free Windows app for exploring and manipulating tabular data. It is powered by a supercharged SQLite engine, supporting both standard SQL queries and SQL Notebook-specific commands. Everything you need to answer analysis questions about your data, regardless of its format or origin, is built into SQL Notebook."
"In this blog post, I will review some of the MySQL replication concepts that are part of the MySQL environment (and Percona Server for MySQL specifically). I will also try to clarify some of the misconceptions people have about replication.
Since I've been working on the Solution Engineering team, I've noticed that - although information is plentiful - replication is often misunderstood or incompletely understood."
globalLock.currentQueue.total: This number can indicate a possible concurrency issue if it’s consistently high. This can happen if a lot of requests are waiting for a lock to be released.
globalLock.totalTime: If this is higher than the total database uptime, the database has been in a lock state for too long.
Unlike relational databases such as MySQL or PostgreSQL, MongoDB uses JSON-like documents for storing data.
Databases operate in an environment that consists of numerous reads, writes, and updates.
When a lock occurs, no other operation can read or modify the data until the operation that initiated the lock is finished.
locks.deadlockCount: The number of times lock acquisitions have encountered deadlocks.
Is the database frequently locking from queries? This might indicate issues with the schema design, query structure, or system architecture.
From version 3.2 on, WiredTiger is the default storage engine.
MMAPv1 locks whole collections, not individual documents.
WiredTiger performs locking at the document level.
When the MMAPv1 storage engine is in use, MongoDB will use memory-mapped files to store data.
All available memory will be allocated for this usage if the data set is large enough.
db.serverStatus().mem
mem.resident: Roughly equivalent to the amount of RAM in megabytes that the database process uses
If mem.resident exceeds the value of system memory and there’s a large amount of unmapped data on disk, we’ve most likely exceeded system capacity.
If the value of mem.mapped is greater than the amount of system memory, some operations will experience page faults.
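A quick sketch of reading these fields from the mongo shell (mem.mapped is only meaningful under MMAPv1):

    // Inspect the memory section of serverStatus
    var mem = db.serverStatus().mem
    print("resident (MB): " + mem.resident)
    print("mapped (MB): " + mem.mapped)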
The WiredTiger storage engine is a significant improvement over MMAPv1 in performance and concurrency.
By default, MongoDB will reserve 50 percent of the available memory for the WiredTiger data cache.
wiredTiger.cache.bytes currently in the cache – This is the size of the data currently in the cache.
wiredTiger.cache.tracked dirty bytes in the cache – This is the size of the dirty data in the cache.
we can look at the wiredTiger.cache.bytes read into cache value for read-heavy applications. If this value is consistently high, increasing the cache size may improve overall read performance.
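A sketch of reading these metrics in the mongo shell; note that the WiredTiger metric keys literally contain spaces:

    var cache = db.serverStatus().wiredTiger.cache
    print(cache["bytes currently in the cache"])
    print(cache["tracked dirty bytes in the cache"])
    print(cache["bytes read into cache"])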
check whether the application is read-heavy. If it is, increase the size of the replica set and distribute the read operations to secondary members of the set.
If it is write-heavy, use sharding within a sharded cluster to distribute the load.
Replication is the propagation of data from one node to another.
Replica sets handle this replication.
Sometimes, data isn’t replicated as quickly as we’d like.
This is a particularly thorny problem if the lag between a primary and secondary node is high and the secondary becomes the primary.
Use the db.printSlaveReplicationInfo() or the rs.printSlaveReplicationInfo() command to see the status of a replica set from the perspective of the secondary members of the set.
It shows how far behind the secondary members are from the primary. This number should be as low as possible.
Monitor this metric closely and watch for any spikes in replication delay.
Always investigate these issues to understand the reasons for the lag.
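As a sketch of checking lag from the mongo shell:

    // Reports each secondary as "N secs (H hrs) behind the primary"
    rs.printSlaveReplicationInfo()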
One node in a replica set is primary. All others are secondary.
it’s not normal for nodes to change back and forth between primary and secondary.
use the profiler to gain a deeper understanding of the database’s behavior.
Enabling the profiler can affect system performance, due to the additional activity.
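A sketch of enabling it from the mongo shell (the 100 ms slow-operation threshold is an assumption):

    db.setProfilingLevel(1, 100)                       // profile ops slower than 100 ms
    db.system.profile.find().sort({ ts: -1 }).limit(5) // most recent profiled ops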
"globalLock.currentQueue.total: This number can indicate a possible concurrency issue if it's consistently high. This can happen if a lot of requests are waiting for a lock to be released."
In a cluster, logs should have a separate storage and lifecycle independent of nodes, pods, or containers. This concept is called cluster-level logging.
Cluster-level logging architectures require a separate backend to store, analyze, and query logs. Kubernetes does not provide a native storage solution for log data.
use kubectl logs --previous to retrieve logs from a previous instantiation of a container.
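For example (pod and container names are hypothetical):

    kubectl logs my-pod -c my-container --previous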
A container engine handles and redirects any output generated to a containerized application's stdout and stderr streams
The Docker JSON logging driver treats each line as a separate message.
By default, if a container restarts, the kubelet keeps one terminated container with its logs.
An important consideration in node-level logging is implementing log rotation, so that logs don't consume all available storage on the node.
You can also set up a container runtime to rotate an application's logs automatically.
The two kubelet flags container-log-max-size and container-log-max-files can be used to configure the maximum size for each log file and the maximum number of files allowed for each container respectively.
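A sketch of the equivalent kubelet configuration (the size and count values are assumptions):

    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    containerLogMaxSize: "10Mi"
    containerLogMaxFiles: 5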
The kubelet and container runtime do not run in containers.
On machines with systemd, the kubelet and container runtime write to journald. If systemd is not present, the kubelet and container runtime write to .log files in the /var/log directory.
System components inside containers always write to the /var/log directory, bypassing the default logging mechanism.
Kubernetes does not provide a native solution for cluster-level logging
Use a node-level logging agent that runs on every node.
implement cluster-level logging by including a node-level logging agent on each node.
the logging agent is a container that has access to a directory with log files from all of the application containers on that node.
Because the logging agent must run on every node, it is recommended to run the agent as a DaemonSet.
Node-level logging creates only one agent per node and doesn't require any changes to the applications running on the node.
Containers write stdout and stderr, but with no agreed format. A node-level agent collects these logs and forwards them for aggregation.
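A minimal sketch of such an agent deployed as a DaemonSet (the fluentd image and names are illustrative; a real deployment also needs agent configuration):

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: log-agent
      namespace: kube-system
    spec:
      selector:
        matchLabels:
          name: log-agent
      template:
        metadata:
          labels:
            name: log-agent
        spec:
          containers:
          - name: log-agent
            image: fluent/fluentd:v1.16-1
            volumeMounts:
            - name: varlog
              mountPath: /var/log   # node's log directory, mounted read-only
              readOnly: true
          volumes:
          - name: varlog
            hostPath:
              path: /var/log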
Each sidecar container prints a log to its own stdout or stderr stream.
It is not recommended to write log entries with different formats to the same log stream.
Writing logs to a file and then streaming them to stdout can double disk usage.
If you have an application that writes to a single file, it's recommended to set /dev/stdout as the destination.
It's recommended to use stdout and stderr directly and leave rotation and retention policies to the kubelet.
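A sketch of the streaming-sidecar pattern (names, image, and paths are assumptions): the app writes to a file, and a sidecar tails that file to its own stdout so the kubelet can manage it:

    apiVersion: v1
    kind: Pod
    metadata:
      name: app-with-log-sidecar
    spec:
      containers:
      - name: app
        image: busybox:1.36
        command: ["/bin/sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 1; done"]
        volumeMounts:
        - name: logs
          mountPath: /var/log/app
      - name: log-streamer   # sidecar: exposes the file via kubectl logs
        image: busybox:1.36
        command: ["/bin/sh", "-c", "tail -n+1 -F /var/log/app/app.log"]
        volumeMounts:
        - name: logs
          mountPath: /var/log/app
      volumes:
      - name: logs
        emptyDir: {}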
Using a logging agent in a sidecar container can lead to significant resource consumption. Moreover, you won't be able to access those logs using kubectl logs because they are not controlled by the kubelet.
deployment.yaml: A basic manifest for creating a Kubernetes deployment
The convention is to use the suffix .yaml for YAML files and .tpl for helpers.
It is just fine to put a plain YAML file like this in the templates/ directory.
helm get manifest
The helm get manifest command takes a release name (full-coral) and prints out all of the Kubernetes resources that were uploaded to the server. Each file begins with --- to indicate the start of a YAML document.
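For example, with the release name mentioned above:

    helm get manifest full-coral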
Names should be unique to a release
The name: field is limited to 63 characters because of limitations to the DNS system.
release names are limited to 53 characters
{{ .Release.Name }}
A template directive is enclosed in {{ and }} blocks.
The values that are passed into a template can be thought of as namespaced objects, where a dot (.) separates each namespaced element.
The leading dot before Release indicates that we start with the top-most namespace for this scope
The Release object is one of the built-in objects for Helm
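A sketch in the Helm docs' style, rendering the release name into a ConfigMap:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: {{ .Release.Name }}-configmap
    data:
      myvalue: "Hello World"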
When you want to test the template rendering, but not actually install anything, you can use helm install ./mychart --debug --dry-run
Using --dry-run will make it easier to test your code, but it won’t ensure that Kubernetes itself will accept the templates you generate.
Objects are passed into a template from the template engine.
create new objects within your templates
Objects can be simple, and have just one value. Or they can contain other objects or functions.
Release is one of the top-level objects that you can access in your templates.
Release.Namespace: The namespace to be released into (if the manifest doesn’t override)
Values: Values passed into the template from the values.yaml file and from user-supplied files. By default, Values is empty.
Chart: The contents of the Chart.yaml file.
Files: This provides access to all non-special files in a chart.
Files.Get is a function for getting a file by name
Files.GetBytes is a function for getting the contents of a file as an array of bytes instead of as a string. This is useful for things like images.
Template: Contains information about the current template that is being executed
BasePath: The namespaced path to the templates directory of the current chart
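A sketch using Files.Get (it assumes a small one-line file, token.txt, exists at the chart root):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: {{ .Release.Name }}-files
    data:
      token: {{ .Files.Get "token.txt" | quote }}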
The built-in values always begin with a capital letter. This is in keeping with Go's naming convention.
Many chart creators use only initial lowercase letters in order to distinguish local names from those built-ins.
Values can come from several sources: the values.yaml file in the chart; if this is a subchart, the values.yaml file of a parent chart; a values file passed to helm install or helm upgrade with the -f flag; and individual parameters passed with --set.
values.yaml is the default, which can be overridden by a parent chart’s values.yaml, which can in turn be overridden by a user-supplied values file, which can in turn be overridden by --set parameters.
While structuring data this way is possible, the recommendation is that you keep your values trees shallow, favoring flatness.
If you need to delete a key from the default values, you may override the value of the key to be null, in which case Helm will remove the key from the overridden values merge.
Without deleting the default key, Kubernetes would fail, because you cannot declare more than one livenessProbe handler.
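A sketch of that pattern, following the Helm docs' drupal example (chart and probe values assumed):

    # Override the probe with an exec handler and delete the chart's default
    # httpGet handler via null, so only one livenessProbe handler remains
    helm install stable/drupal \
      --set livenessProbe.exec.command=[cat,docroot/CHANGELOG.txt] \
      --set livenessProbe.httpGet=null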
When injecting strings from the .Values object into the template, we ought to quote these strings with the quote function.
Template functions follow the syntax functionName arg1 arg2...
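For example (the .Values.favorite.drink path is an assumption):

    drink: {{ quote .Values.favorite.drink }}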
While we talk about the “Helm template language” as if it is Helm-specific, it is actually a combination of the Go template language, some extra functions, and a variety of wrappers to expose certain objects to the templates.
Drawing on a concept from UNIX, pipelines are a tool for chaining together a series of template commands to compactly express a series of transformations.
pipelines are an efficient way of getting several things done in sequence
The repeat function will echo the given string the given number of times
default DEFAULT_VALUE GIVEN_VALUE. This function allows you to specify a default value inside of the template, in case the value is omitted.
all static default values should live in the values.yaml, and should not be repeated using the default command
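A sketch combining these pieces (the favorite values are assumptions): repeat echoes the string, and default supplies "tea" only when no value is set:

    food: {{ .Values.favorite.food | repeat 3 | quote }}
    drink: {{ .Values.favorite.drink | default "tea" | quote }}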
Operators are implemented as functions that return a boolean value.
To use eq, ne, lt, gt, and, or, not, and so on, place the operator at the front of the statement, followed by its parameters, just as you would a function.
if conditions can combine these operators, as in if and X Y or if or X Y
with to specify a scope
range, which provides a “for each”-style loop
block declares a special kind of fillable template area
A pipeline is evaluated as false if the value is:
a boolean false
a numeric zero
an empty string
a nil (empty or null)
an empty collection (map, slice, tuple, dict, array)
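A sketch of an if block using an operator function (values assumed):

    {{ if eq .Values.favorite.drink "coffee" }}
    mug: "true"
    {{ end }}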
Rendered naively, a block like the one above produces incorrect YAML because of the whitespacing.
When the template engine runs, it removes the contents inside of {{ and }}, but it leaves the remaining whitespace exactly as is.
{{- (with the dash and space added) indicates that whitespace should be chomped left, while -}} means whitespace to the right should be consumed.
Newlines are whitespace!
In the Helm docs' illustration of this, an * at the end of a line marks a newline character that would be removed.
Be careful with the chomping modifiers.
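For example, chomping the newlines around the directives keeps the rendered YAML clean:

    {{- if eq .Values.favorite.drink "coffee" }}
    mug: "true"
    {{- end }}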
Rather than fiddling with whitespace modifiers, indentation can be managed with the indent function.
Scopes can be changed. with can allow you to set the current scope (.) to a particular object.
Inside of the restricted scope, you will not be able to access the other objects from the parent scope.
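A sketch (favorite values assumed); inside the block, . refers to .Values.favorite:

    {{- with .Values.favorite }}
    drink: {{ .drink | quote }}
    food: {{ .food | quote }}
    {{- end }}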
The range function will "range over" (iterate through) the pizzaToppings list.
Just as with sets the scope of ., so does a range operator: on each iteration, . is set to the current element.
The toppings: |- line declares a multi-line string, so the result is not a YAML list. It's one big string.
the data in a ConfigMap's data section is composed of key/value pairs, where both the key and the value are simple strings.
The |- marker in YAML takes a multi-line string.
range can be used to iterate over collections that have a key and a value (like a map or dict).
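A sketch (pizzaToppings assumed to be a list in values.yaml); each iteration sets . to one topping, and |- collects the rendered lines into one string:

    toppings: |-
      {{- range .Values.pizzaToppings }}
      - {{ . | title | quote }}
      {{- end }}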
In Helm templates, a variable is a named reference to another object. It follows the form $name
Variables are assigned with a special assignment operator: :=
{{- $relname := .Release.Name -}}
range can capture both the index and the value: it assigns the integer index (starting from zero) to $index and the value to $topping.
For data structures that have both a key and a value, we can use range to get both
Variables are normally not “global”. They are scoped to the block in which they are declared.
There is one variable that is always global: $ always points to the root context, so a path like $.Release.Name works even inside a range or with block.
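Sketches of both forms (the values are assumptions):

    # list: capture index and value
    toppings: |-
      {{- range $index, $topping := .Values.pizzaToppings }}
      {{ $index }}: {{ $topping }}
      {{- end }}

    # map: capture key and value
    {{- range $key, $val := .Values.favorite }}
    {{ $key }}: {{ $val | quote }}
    {{- end }}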
One of the powerful features of the Helm template language is its ability to declare multiple templates and use them together.
A named template (sometimes called a partial or a subtemplate) is simply a template defined inside of a file, and given a name.
One important detail when naming templates: template names are global.
If you declare two templates with the same name, whichever one is loaded last will be the one used.
you should be careful to name your templates with chart-specific names.
templates in subcharts are compiled together with top-level templates
naming convention is to prefix each defined template with the name of the chart: {{ define "mychart.labels" }}
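A sketch following that convention (mychart and the label contents mirror the docs' pattern):

    {{- define "mychart.labels" }}
      labels:
        generator: helm
        date: {{ now | htmlDate }}
    {{- end }}
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: {{ .Release.Name }}-configmap
      {{- template "mychart.labels" }}
    data:
      myvalue: "Hello World"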