Skip to main content

Home/ Larvata/ Group items tagged scale

Rss Feed Group items tagged

張 旭

Active Record Migrations - Ruby on Rails Guides - 0 views

    • 張 旭
       
       跟 belongs_to 與 has_many 設定對應的 Migrattion
    • 張 旭
       
      has_and_belongs_to_many 的對應?
  • add_column and remove_column
  • ...114 more annotations...
  • allowing your schema and changes to be database independent.
  • each migration as being a new 'version' of the database
  • each migration modifies it to add or remove tables, columns, or entries
  • Active Record will also update your db/schema.rb file to match the up-to-date structure of your database.
  • A primary key column called id will also be added implicitly, as it's the default primary key for all Active Record models
  • roll this migration back, it will remove the table
  • timestamps macro adds two columns, created_at and updated_at
  • On databases that support transactions with statements that change the schema, migrations are wrapped in a transaction
  • reversible
  • use up and down instead of change
  • Migrations are stored as files in the db/migrate directory, one for each migration class.
  • a UTC timestamp identifying
  • Rails uses this timestamp to determine which migration should be run and in what order
  • "AddXXXToYYY" or "RemoveXXXFromYYY"
  • use a Ruby DSL
  • column type as references
  • part_number:string:index
  • a migration to remove a column
  • "CreateXXX"
  • change_column_null
  • AddUserRefToProducts
  • :references
  • produce join tables if JoinTable is part of the name
  • CreateJoinTable
  • The model and scaffold generators will create migrations appropriate for adding a new model.
  • enclosed by curly braces and follow the field type
  • create_table
  • By default, create_table will create a primary key called id
  • add an index on the new column
  • when using MySQL, the default is ENGINE=InnoDB
  • create_join_table creates an HABTM (has and belongs to many) join table
  • To customize the name of the table, provide a :table_name option:
  • create_join_table also accepts a block
  • change_table, used for changing existing tables
  • remove
  • rename
  • add_column
  • change_column
  • remove_column
  • change_column_default
  • place an SQL fragment in the :options option.
  • limit
  • precision
  • scale
  • polymorphic
  • default
  • index
  • add_foreign_key
  • Active Record only supports single column foreign keys.
  • use the old style of migration using up and down methods instead of the change method.
  • .connection.execute
  • change_table is also reversible, as long as the block does not call change, change_default or remove.
  • remove_column is reversible if you supply the column type as the third argument
  • Complex migrations may require processing that Active Record doesn't know how to reverse
  • reversible
  • Using reversible will ensure that the instructions are executed in the right order too.
  • add_column add_foreign_key add_index add_reference add_timestamps change_column_default (must supply a :from and :to option) change_column_null create_join_table create_table disable_extension drop_join_table drop_table (must supply a block) enable_extension remove_column (must supply a type) remove_foreign_key (must supply a second table) remove_index remove_reference remove_timestamps rename_column rename_index rename_table
  • :column_options option
  • have the option :null set to false by default
  • By default, the name of the join table comes from the union of the first two arguments provided to create_join_table
  • in alphabetical order
  • change_column command is irreversible.
    • 張 旭
       
      關聯物在前,被關聯物在後。 A 關聯到 B
  • If the column names can not be derived from the table names, you can use the :column and :primary_key options.
  • figure out the column name
  • foreign key for a specific column
  • foreign key by name
    • 張 旭
       
      不懂 column 跟 name 的用法差異,基本上一樣。
  • Active Record knows how to reverse the migration automatically
    • 張 旭
       
      使用內建的 method,Rails 比較容易自動 rollback
    • 張 旭
       
      除了幾個特殊的 change_ 跟 remove_
  • should use reversible or write the up and down methods instead of using the change method
  • If your migration is irreversible, you should raise ActiveRecord::IrreversibleMigration from your down method.
  • DontUseConstraintForZipcodeValidationMigration
  • rails db:migrate
  • the db:migrate task also invokes the db:schema:dump task, which will update your db/schema.rb file to match the structure of your database.
  • specify a target version
  • all migrations up to and including 20080906120000
  • run the down method on all the migrations down to, but not including, 20080906120000
  • rails db:rollback
  • db:migrate:redo task is a shortcut for doing a rollback and then migrating back up again
    • 張 旭
       
      舊版的還是 rake!
  • STEP parameter
  • db:setup task will create the database, load the schema and initialize it with the seed data
  • db:reset task will drop the database and set it up again. This is functionally equivalent to rails db:drop db:setup.
  • run a specific migration up or down, the db:migrate:up and db:migrate:down
  • the RAILS_ENV environment variable
  • db:migrate VERBOSE=false will suppress all output.
  • If you have already run the migration, then you cannot just edit the migration and run the migration again: Rails thinks it has already run the migration and so will do nothing when you run rails db:migrate.
  • must rollback the migration (for example with bin/rails db:rollback), edit your migration and then run rails db:migrate to run the corrected version.
  • editing existing migrations is not a good idea.
  • should write a new migration that performs the changes you require
  • revert method can be helpful when writing a new migration to undo previous migrations in whole or in part
  • require_relative
  • revert
  • They are not designed to be edited, they just represent the current state of the database.
  • Schema Files for
  • Schema files are also useful if you want a quick look at what attributes an Active Record object has
  • annotate_models gem automatically adds and updates comments at the top of each model summarizing the schema if you desire that functionality.
  • database-independent
  • multiple databases
  • db/schema.rb cannot express database specific items such as triggers, stored procedures or check constraints
  • you can execute custom SQL statements, the schema dumper cannot reconstitute those statements from the database
  • db:structure:dump
    • 張 旭
       
      資料庫種類不相依的 schema 付出的代價就是有些特殊的資料庫特性無法描述出來,例如 trigger;如果有在 migration 寫 SQL 的,簡單說 schema dumper 這邊就要設定成 :sql 而不是預設的 :ruby
  • set in config/application.rb by the config.active_record.schema_format setting, which may be either :sql or :ruby.
  • check them into source control.
  • db/schema.rb contains the current version number of the database
  • Validations such as validates :foreign_key, uniqueness: true are one way in which models can enforce data integrity
  • The :dependent option on associations allows models to automatically destroy child objects when the parent is destroyed.
  • Migrations can also be used to add or modify data
  • Initial
  • To add initial data after a database is created, Rails has a built-in 'seeds' feature that makes the process quick and easy.
  • db/seeds.rb
  • rails db:seed
張 旭

NAT Gateways - Amazon Virtual Private Cloud - 0 views

  • a network address translation (NAT) gateway to enable instances in a private subnet to connect to the internet or other AWS services
  • but prevent the internet from initiating a connection with those instances
  • NAT gateways are not supported for IPv6 traffic
  • ...11 more annotations...
  • must specify the public subnet in which the NAT gateway should reside
  • update the route table associated with one or more of your private subnets to point Internet-bound traffic to the NAT gateway.
  • NAT gateway is created in a specific Availability Zone and implemented with redundancy in that zone.
  • ensure that resources use the NAT gateway in the same Availability Zone
  • The main route table sends internet traffic from the instances in the private subnet to the NAT gateway. The NAT gateway sends the traffic to the internet gateway using the NAT gateway’s Elastic IP address as the source IP address
  • A NAT gateway supports 5 Gbps of bandwidth and automatically scales up to 45 Gbps
  • You can associate exactly one Elastic IP address with a NAT gateway
  • A NAT gateway supports the following protocols: TCP, UDP, and ICMP
  • cannot associate a security group with a NAT gateway.
  • create a NAT gateway in the same subnet as your NAT instance, and then replace the existing route in your route table that points to the NAT instance with a route that points to the NAT gateway
  • A NAT gateway cannot send traffic over VPC endpoints, VPN connections, AWS Direct Connect, or VPC peering connections.
張 旭

Best practices for writing Dockerfiles | Docker Documentation - 0 views

  • building efficient images
  • Docker builds images automatically by reading the instructions from a Dockerfile -- a text file that contains all commands, in order, needed to build a given image.
  • A Docker image consists of read-only layers each of which represents a Dockerfile instruction.
  • ...47 more annotations...
  • The layers are stacked and each one is a delta of the changes from the previous layer
  • When you run an image and generate a container, you add a new writable layer (the “container layer”) on top of the underlying layers.
  • By “ephemeral,” we mean that the container can be stopped and destroyed, then rebuilt and replaced with an absolute minimum set up and configuration.
  • Inadvertently including files that are not necessary for building an image results in a larger build context and larger image size.
  • To exclude files not relevant to the build (without restructuring your source repository) use a .dockerignore file. This file supports exclusion patterns similar to .gitignore files.
  • minimize image layers by leveraging build cache.
  • if your build contains several layers, you can order them from the less frequently changed (to ensure the build cache is reusable) to the more frequently changed
  • avoid installing extra or unnecessary packages just because they might be “nice to have.”
  • Each container should have only one concern.
  • Decoupling applications into multiple containers makes it easier to scale horizontally and reuse containers
  • Limiting each container to one process is a good rule of thumb, but it is not a hard and fast rule.
  • Use your best judgment to keep containers as clean and modular as possible.
  • do multi-stage builds and only copy the artifacts you need into the final image. This allows you to include tools and debug information in your intermediate build stages without increasing the size of the final image.
  • avoid duplication of packages and make the list much easier to update.
  • When building an image, Docker steps through the instructions in your Dockerfile, executing each in the order specified.
  • the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated.
  • simply comparing the instruction in the Dockerfile with one of the child images is sufficient.
  • For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file.
  • If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.
  • cache checking does not look at the files in the container to determine a cache match.
  • In that case just the command string itself is used to find a match.
    • 張 旭
       
      RUN apt-get 這樣的指令,直接比對指令內容的意思。
  • Whenever possible, use current official repositories as the basis for your images.
  • Using RUN apt-get update && apt-get install -y ensures your Dockerfile installs the latest package versions with no further coding or manual intervention.
  • cache busting
  • Docker executes these commands using the /bin/sh -c interpreter, which only evaluates the exit code of the last operation in the pipe to determine success.
  • set -o pipefail && to ensure that an unexpected error prevents the build from inadvertently succeeding.
  • The CMD instruction should be used to run the software contained by your image, along with any arguments.
  • CMD should almost always be used in the form of CMD [“executable”, “param1”, “param2”…]
  • CMD should rarely be used in the manner of CMD [“param”, “param”] in conjunction with ENTRYPOINT
  • The ENV instruction is also useful for providing required environment variables specific to services you wish to containerize,
  • Each ENV line creates a new intermediate layer, just like RUN commands
  • COPY is preferred
  • COPY only supports the basic copying of local files into the container
  • the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /
  • If you have multiple Dockerfile steps that use different files from your context, COPY them individually, rather than all at once.
  • using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead
  • The best use for ENTRYPOINT is to set the image’s main command, allowing that image to be run as though it was that command (and then use CMD as the default flags).
  • the image name can double as a reference to the binary as shown in the command
  • The VOLUME instruction should be used to expose any database storage area, configuration storage, or files/folders created by your docker container.
  • use VOLUME for any mutable and/or user-serviceable parts of your image
  • If you absolutely need functionality similar to sudo, such as initializing the daemon as root but running it as non-root), consider using “gosu”.
  • always use absolute paths for your WORKDIR
  • An ONBUILD command executes after the current Dockerfile build completes.
  • Think of the ONBUILD command as an instruction the parent Dockerfile gives to the child Dockerfile
  • A Docker build executes ONBUILD commands before any command in a child Dockerfile.
  • Be careful when putting ADD or COPY in ONBUILD. The “onbuild” image fails catastrophically if the new build’s context is missing the resource being added.
張 旭

How To Create a Kubernetes Cluster Using Kubeadm on Ubuntu 18.04 | DigitalOcean - 0 views

  • A pod is an atomic unit that runs one or more containers.
  • Pods are the basic unit of scheduling in Kubernetes: all containers in a pod are guaranteed to run on the same node that the pod is scheduled on.
  • Each pod has its own IP address, and a pod on one node should be able to access a pod on another node using the pod's IP.
  • ...12 more annotations...
  • Communication between pods is more complicated, however, and requires a separate networking component that can transparently route traffic from a pod on one node to a pod on another.
  • pod network plugins. For this cluster, you will use Flannel, a stable and performant option.
  • Passing the argument --pod-network-cidr=10.244.0.0/16 specifies the private subnet that the pod IPs will be assigned from.
  • kubectl apply -f descriptor.[yml|json] is the syntax for telling kubectl to create the objects described in the descriptor.[yml|json] file.
  • deploy Nginx using Deployments and Services
  • A deployment is a type of Kubernetes object that ensures there's always a specified number of pods running based on a defined template, even if the pod crashes during the cluster's lifetime.
  • NodePort, a scheme that will make the pod accessible through an arbitrary port opened on each node of the cluster
  • Services are another type of Kubernetes object that expose cluster internal services to clients, both internal and external.
  • load balancing requests to multiple pods
  • Pods are ubiquitous in Kubernetes, so understanding them will facilitate your work
  • how controllers such as deployments work since they are used frequently in stateless applications for scaling and the automated healing of unhealthy applications.
  • Understanding the types of services and the options they have is essential for running both stateless and stateful applications.
張 旭

Pods - Kubernetes - 0 views

  • Pods are the smallest deployable units of computing
  • A Pod (as in a pod of whales or pea pod) is a group of one or more containersA lightweight and portable executable image that contains software and all of its dependencies. (such as Docker containers), with shared storage/network, and a specification for how to run the containers.
  • A Pod’s contents are always co-located and co-scheduled, and run in a shared context.
  • ...32 more annotations...
  • A Pod models an application-specific “logical host”
  • application containers which are relatively tightly coupled
  • being executed on the same physical or virtual machine would mean being executed on the same logical host.
  • The shared context of a Pod is a set of Linux namespaces, cgroups, and potentially other facets of isolation
  • Containers within a Pod share an IP address and port space, and can find each other via localhost
  • Containers in different Pods have distinct IP addresses and can not communicate by IPC without special configuration. These containers usually communicate with each other via Pod IP addresses.
  • Applications within a Pod also have access to shared volumesA directory containing data, accessible to the containers in a pod. , which are defined as part of a Pod and are made available to be mounted into each application’s filesystem.
  • a Pod is modelled as a group of Docker containers with shared namespaces and shared filesystem volumes
    • 張 旭
       
      類似 docker-compose 裡面宣告的同一坨?
  • Pods are considered to be relatively ephemeral (rather than durable) entities.
  • Pods are created, assigned a unique ID (UID), and scheduled to nodes where they remain until termination (according to restart policy) or deletion.
  • it can be replaced by an identical Pod
  • When something is said to have the same lifetime as a Pod, such as a volume, that means that it exists as long as that Pod (with that UID) exists.
  • uses a persistent volume for shared storage between the containers
  • Pods serve as unit of deployment, horizontal scaling, and replication
  • The applications in a Pod all use the same network namespace (same IP and port space), and can thus “find” each other and communicate using localhost
  • flat shared networking space
  • Containers within the Pod see the system hostname as being the same as the configured name for the Pod.
  • Volumes enable data to survive container restarts and to be shared among the applications within the Pod.
  • Individual Pods are not intended to run multiple instances of the same application
  • The individual containers may be versioned, rebuilt and redeployed independently.
  • Pods aren’t intended to be treated as durable entities.
  • Controllers like StatefulSet can also provide support to stateful Pods.
  • When a user requests deletion of a Pod, the system records the intended grace period before the Pod is allowed to be forcefully killed, and a TERM signal is sent to the main process in each container.
  • Once the grace period has expired, the KILL signal is sent to those processes, and the Pod is then deleted from the API server.
  • grace period
  • Pod is removed from endpoints list for service, and are no longer considered part of the set of running Pods for replication controllers.
  • When the grace period expires, any processes still running in the Pod are killed with SIGKILL.
  • By default, all deletes are graceful within 30 seconds.
  • You must specify an additional flag --force along with --grace-period=0 in order to perform force deletions.
  • Force deletion of a Pod is defined as deletion of a Pod from the cluster state and etcd immediately.
  • StatefulSet Pods
  • Processes within the container get almost the same privileges that are available to processes outside a container.
張 旭

MySQL on Docker: Running ProxySQL as Kubernetes Service | Severalnines - 0 views

  • Using Kubernetes ConfigMap approach, ProxySQL can be clustered with immutable configuration.
  • Kubernetes handles ProxySQL recovery and balance the connections to the instances automatically.
  • Can be used with external applications outside Kubernetes.
  • ...11 more annotations...
  • load balancing, connection failover and decoupling of the application tier from the underlying database topologies.
  • ProxySQL as a Kubernetes service (centralized deployment)
  • running as a service makes ProxySQL pods live independently from the applications and can be easily scaled and clustered together with the help of Kubernetes ConfigMap.
  • ProxySQL's multi-layer configuration system makes pod clustering possible with ConfigMap.
  • create ProxySQL pods and attach a Kubernetes service to be accessed by the other pods within the Kubernetes network or externally.
  • Default to 6033 for MySQL load-balanced connections and 6032 for ProxySQL administration console.
  • separated by "---" delimiter
  • deploy two ProxySQL pods as a ReplicaSet that matches containers labelled with "app=proxysql,tier=frontend".
  • A Kubernetes service is an abstraction layer which defines the logical set of pods and a policy by which to access them
  • The range of valid ports for NodePort resource is 30000-32767.
  • ConfigMap - To store ProxySQL configuration file as a volume so it can be mounted to multiple pods and can be remounted again if the pod is being rescheduled to the other Kubernetes node.
張 旭

vSphere Cloud Provider | vSphere Storage for Kubernetes - 0 views

  • Containers are stateless and ephemeral but applications are stateful and need persistent storage.
  • Cloud Provider
  • Kubernetes cloud providers are an interface to integrate various node (i.e. hosts), load balancers and networking routes
  • ...8 more annotations...
  • VMware offers a Cloud Provider known as the vSphere Cloud Provider (VCP) for Kubernetes which allows Pods to use enterprise grade persistent storage.
  • A vSphere datastore is an abstraction which hides storage details (such as LUNs) and provides a uniform interface for storing persistent data.
  • the datastores can be of the type vSAN, VMFS, NFS & VVol.
  • VMFS (Virtual Machine File System) is a cluster file system that allows virtualization to scale beyond a single node for multiple VMware ESX servers.
  • NFS (Network File System) is a distributed file protocol to access storage over network like local storage.
  • vSphere Cloud Provider supports every storage primitive exposed by Kubernetes
  • Kubernetes PVs are defined in Pod specifications.
  • PVCs when using Dynamic Provisioning (preferred).
張 旭

How to create reusable infrastructure with Terraform modules - 0 views

  • auto scaling schedule
  • The easiest way to create a versioned module is to put the code for the module in a separate Git repository and to set the source parameter to that repository’s URL.
張 旭

Understanding Nginx HTTP Proxying, Load Balancing, Buffering, and Caching | DigitalOcean - 0 views

  • allow Nginx to pass requests off to backend http servers for further processing
  • Nginx is often set up as a reverse proxy solution to help scale out infrastructure or to pass requests to other servers that are not designed to handle large client loads
  • explore buffering and caching to improve the performance of proxying operations for clients
  • ...48 more annotations...
  • Nginx is built to handle many concurrent connections at the same time.
  • provides you with flexibility in easily adding backend servers or taking them down as needed for maintenance
  • Proxying in Nginx is accomplished by manipulating a request aimed at the Nginx server and passing it to other servers for the actual processing
  • The servers that Nginx proxies requests to are known as upstream servers.
  • Nginx can proxy requests to servers that communicate using the http(s), FastCGI, SCGI, and uwsgi, or memcached protocols through separate sets of directives for each type of proxy
  • When a request matches a location with a proxy_pass directive inside, the request is forwarded to the URL given by the directive
  • For example, when a request for /match/here/please is handled by this block, the request URI will be sent to the example.com server as http://example.com/match/here/please
  • The request coming from Nginx on behalf of a client will look different than a request coming directly from a client
  • Nginx gets rid of any empty headers
  • Nginx, by default, will consider any header that contains underscores as invalid. It will remove these from the proxied request
    • 張 旭
       
      這裡要注意一下,header 欄位名稱有設定底線的,要設定 Nginx 讓它可以通過。
  • The "Host" header is re-written to the value defined by the $proxy_host variable.
  • The upstream should not expect this connection to be persistent
  • Headers with empty values are completely removed from the passed request.
  • if your backend application will be processing non-standard headers, you must make sure that they do not have underscores
  • by default, this will be set to the value of $proxy_host, a variable that will contain the domain name or IP address and port taken directly from the proxy_pass definition
  • This is selected by default as it is the only address Nginx can be sure the upstream server responds to
  • (as it is pulled directly from the connection info)
  • $http_host: Sets the "Host" header to the "Host" header from the client request.
  • The headers sent by the client are always available in Nginx as variables. The variables will start with an $http_ prefix, followed by the header name in lowercase, with any dashes replaced by underscores.
  • preference to: the host name from the request line itself
  • set the "Host" header to the $host variable. It is the most flexible and will usually provide the proxied servers with a "Host" header filled in as accurately as possible
  • sets the "Host" header to the $host variable, which should contain information about the original host being requested
  • This variable takes the value of the original X-Forwarded-For header retrieved from the client and adds the Nginx server's IP address to the end.
  • The upstream directive must be set in the http context of your Nginx configuration.
  • http context
  • Once defined, this name will be available for use within proxy passes as if it were a regular domain name
  • By default, this is just a simple round-robin selection process (each request will be routed to a different host in turn)
  • Specifies that new connections should always be given to the backend that has the least number of active connections.
  • distributes requests to different servers based on the client's IP address.
  • mainly used with memcached proxying
  • As for the hash method, you must provide the key to hash against
  • Server Weight
  • Nginx's buffering and caching capabilities
  • Without buffers, data is sent from the proxied server and immediately begins to be transmitted to the client.
  • With buffers, the Nginx proxy will temporarily store the backend's response and then feed this data to the client
  • Nginx defaults to a buffering design
  • can be set in the http, server, or location contexts.
  • the sizing directives are configured per request, so increasing them beyond your need can affect your performance
  • When buffering is "off" only the buffer defined by the proxy_buffer_size directive will be used
  • A high availability (HA) setup is an infrastructure without a single point of failure, and your load balancers are a part of this configuration.
  • multiple load balancers (one active and one or more passive) behind a static IP address that can be remapped from one server to another.
  • Nginx also provides a way to cache content from backend servers
  • The proxy_cache_path directive must be set in the http context.
  • proxy_cache backcache;
    • 張 旭
       
      這裡的 backcache 是前文設定的 backcache 變數,看起來每個 location 都可以有自己的 cache 目錄。
  • The proxy_cache_bypass directive is set to the $http_cache_control variable. This will contain an indicator as to whether the client is explicitly requesting a fresh, non-cached version of the resource
  • any user-related data should not be cached
  • For private content, you should set the Cache-Control header to "no-cache", "no-store", or "private" depending on the nature of the data
張 旭

Introduction to MongoDB - MongoDB Manual - 0 views

  • MongoDB is a document database designed for ease of development and scaling
  • MongoDB offers both a Community and an Enterprise version
  • A record in MongoDB is a document, which is a data structure composed of field and value pairs.
  • ...12 more annotations...
  • MongoDB documents are similar to JSON objects.
  • The values of fields may include other documents, arrays, and arrays of documents.
  • reduce need for expensive joins
  • MongoDB stores documents in collections.
  • Collections are analogous to tables in relational databases.
  • Read-only Views
  • Indexes support faster queries and can include keys from embedded documents and arrays.
  • MongoDB's replication facility, called replica set
  • A replica set is a group of MongoDB servers that maintain the same data set, providing redundancy and increasing data availability.
  • Sharding distributes data across a cluster of machines.
  • MongoDB supports creating zones of data based on the shard key.
  • MongoDB provides pluggable storage engine API
張 旭

Best practices for building Kubernetes Operators and stateful apps | Google Cloud Blog - 0 views

  • use the StatefulSet workload controller to maintain identity for each of the pods, and to use Persistent Volumes to persist data so it can survive a service restart.
  • a way to extend Kubernetes functionality with application specific logic using custom resources and custom controllers.
  • An Operator can automate various features of an application, but it should be specific to a single application
  • ...12 more annotations...
  • Kubebuilder is a comprehensive development kit for building and publishing Kubernetes APIs and Controllers using CRDs
  • Design declarative APIs for operators, not imperative APIs. This aligns well with Kubernetes APIs that are declarative in nature.
  • With declarative APIs, users only need to express their desired cluster state, while letting the operator perform all necessary steps to achieve it.
  • scaling, backup, restore, and monitoring. An operator should be made up of multiple controllers that specifically handle each of the those features.
  • the operator can have a main controller to spawn and manage application instances, a backup controller to handle backup operations, and a restore controller to handle restore operations.
  • each controller should correspond to a specific CRD so that the domain of each controller's responsibility is clear.
  • If you keep a log for every container, you will likely end up with unmanageable amount of logs.
  • integrate application-specific details to the log messages such as adding a prefix for the application name.
  • you may have to use external logging tools such as Google Stackdriver, Elasticsearch, Fluentd, or Kibana to perform the aggregations.
  • adding labels to metrics to facilitate aggregation and analysis by monitoring systems.
  • a more viable option is for application pods to expose a metrics HTTP endpoint for monitoring tools to scrape.
  • A good way to achieve this is to use open-source application-specific exporters for exposing Prometheus-style metrics.
張 旭

Kubernetes Deployments: The Ultimate Guide - Semaphore - 1 views

  • Continuous integration gives you confidence in your code. To extend that confidence to the release process, your deployment operations need to come with a safety belt.
  • these Kubernetes objects ensure that you can progressively deploy, roll back and scale your applications without downtime.
  • A pod is just a group of containers (it can be a group of one container) that run on the same machine, and share a few things together.
  • ...34 more annotations...
  • the containers within a pod can communicate with each other over localhost
  • From a network perspective, all the processes in these containers are local.
  • we can never create a standalone container: the closest we can do is create a pod, with a single container in it.
  • Kubernetes is a declarative system (by opposition to imperative systems).
  • All we can do, is describe what we want to have, and wait for Kubernetes to take action to reconcile what we have, with what we want to have.
  • In other words, we can say, “I would like a 40-feet long blue container with yellow doors“, and Kubernetes will find such a container for us. If it doesn’t exist, it will build it; if there is already one but it’s green with red doors, it will paint it for us; if there is already a container of the right size and color, Kubernetes will do nothing, since what we have already matches what we want.
  • The specification of a replica set looks very much like the specification of a pod, except that it carries a number, indicating how many replicas
  • What happens if we change that definition? Suddenly, there are zero pods matching the new specification.
  • the creation of new pods could happen in a more gradual manner.
  • the specification for a deployment looks very much like the one for a replica set: it features a pod specification, and a number of replicas.
  • Deployments, however, don’t create or delete pods directly.
  • When we update a deployment and adjust the number of replicas, it passes that update down to the replica set.
  • When we update the pod specification, the deployment creates a new replica set with the updated pod specification. That replica set has an initial size of zero. Then, the size of that replica set is progressively increased, while decreasing the size of the other replica set.
  • we are going to fade in (turn up the volume) on the new replica set, while we fade out (turn down the volume) on the old one.
  • During the whole process, requests are sent to pods of both the old and new replica sets, without any downtime for our users.
  • A readiness probe is a test that we add to a container specification.
  • Kubernetes supports three ways of implementing readiness probes:Running a command inside a container;Making an HTTP(S) request against a container; orOpening a TCP socket against a container.
  • When we roll out a new version, Kubernetes will wait for the new pod to mark itself as “ready” before moving on to the next one.
  • If there is no readiness probe, then the container is considered as ready, as long as it could be started.
  • MaxSurge indicates how many extra pods we are willing to run during a rolling update, while MaxUnavailable indicates how many pods we can lose during the rolling update.
  • Setting MaxUnavailable to 0 means, “do not shutdown any old pod before a new one is up and ready to serve traffic“.
  • Setting MaxSurge to 100% means, “immediately start all the new pods“, implying that we have enough spare capacity on our cluster, and that we want to go as fast as possible.
  • kubectl rollout undo deployment web
  • the replica set doesn’t look at the pods’ specifications, but only at their labels.
  • A replica set contains a selector, which is a logical expression that “selects” (just like a SELECT query in SQL) a number of pods.
  • it is absolutely possible to manually create pods with these labels, but running a different image (or with different settings), and fool our replica set.
  • Selectors are also used by services, which act as the load balancers for Kubernetes traffic, internal and external.
  • internal IP address (denoted by the name ClusterIP)
  • during a rollout, the deployment doesn’t reconfigure or inform the load balancer that pods are started and stopped. It happens automatically through the selector of the service associated to the load balancer.
  • a pod is added as a valid endpoint for a service only if all its containers pass their readiness check. In other words, a pod starts receiving traffic only once it’s actually ready for it.
  • In blue/green deployment, we want to instantly switch over all the traffic from the old version to the new, instead of doing it progressively
  • We can achieve blue/green deployment by creating multiple deployments (in the Kubernetes sense), and then switching from one to another by changing the selector of our service
  • kubectl label pods -l app=blue,version=v1.5 status=enabled
  • kubectl label pods -l app=blue,version=v1.4 status-
  •  
    "Continuous integration gives you confidence in your code. To extend that confidence to the release process, your deployment operations need to come with a safety belt."
張 旭

Logstash Alternatives: Pros & Cons of 5 Log Shippers [2019] - Sematext - 0 views

  • In this case, Elasticsearch. And because Elasticsearch can be down or struggling, or the network can be down, the shipper would ideally be able to buffer and retry
  • Logstash is typically used for collecting, parsing, and storing logs for future use as part of log management.
  • Logstash’s biggest con or “Achille’s heel” has always been performance and resource consumption (the default heap size is 1GB).
  • ...37 more annotations...
  • This can be a problem for high traffic deployments, when Logstash servers would need to be comparable with the Elasticsearch ones.
  • Filebeat was made to be that lightweight log shipper that pushes to Logstash or Elasticsearch.
  • differences between Logstash and Filebeat are that Logstash has more functionality, while Filebeat takes less resources.
  • Filebeat is just a tiny binary with no dependencies.
  • For example, how aggressive it should be in searching for new files to tail and when to close file handles when a file didn’t get changes for a while.
  • For example, the apache module will point Filebeat to default access.log and error.log paths
  • Filebeat’s scope is very limited,
  • Initially it could only send logs to Logstash and Elasticsearch, but now it can send to Kafka and Redis, and in 5.x it also gains filtering capabilities.
  • Filebeat can parse JSON
  • you can push directly from Filebeat to Elasticsearch, and have Elasticsearch do both parsing and storing.
  • You shouldn’t need a buffer when tailing files because, just as Logstash, Filebeat remembers where it left off
  • For larger deployments, you’d typically use Kafka as a queue instead, because Filebeat can talk to Kafka as well
  • The default syslog daemon on most Linux distros, rsyslog can do so much more than just picking logs from the syslog socket and writing to /var/log/messages.
  • It can tail files, parse them, buffer (on disk and in memory) and ship to a number of destinations, including Elasticsearch.
  • rsyslog is the fastest shipper
  • Its grammar-based parsing module (mmnormalize) works at constant speed no matter the number of rules (we tested this claim).
  • use it as a simple router/shipper, any decent machine will be limited by network bandwidth
  • It’s also one of the lightest parsers you can find, depending on the configured memory buffers.
  • rsyslog requires more work to get the configuration right
  • the main difference between Logstash and rsyslog is that Logstash is easier to use while rsyslog lighter.
  • rsyslog fits well in scenarios where you either need something very light yet capable (an appliance, a small VM, collecting syslog from within a Docker container).
  • rsyslog also works well when you need that ultimate performance.
  • syslog-ng as an alternative to rsyslog (though historically it was actually the other way around).
  • a modular syslog daemon, that can do much more than just syslog
  • Unlike rsyslog, it features a clear, consistent configuration format and has nice documentation.
  • Similarly to rsyslog, you’d probably want to deploy syslog-ng on boxes where resources are tight, yet you do want to perform potentially complex processing.
  • syslog-ng has an easier, more polished feel than rsyslog, but likely not that ultimate performance
  • Fluentd was built on the idea of logging in JSON wherever possible (which is a practice we totally agree with) so that log shippers down the line don’t have to guess which substring is which field of which type.
  • Fluentd plugins are in Ruby and very easy to write.
  • structured data through Fluentd, it’s not made to have the flexibility of other shippers on this list (Filebeat excluded).
  • Fluent Bit, which is to Fluentd similar to how Filebeat is for Logstash.
  • Fluentd is a good fit when you have diverse or exotic sources and destinations for your logs, because of the number of plugins.
  • Splunk isn’t a log shipper, it’s a commercial logging solution
  • Graylog is another complete logging solution, an open-source alternative to Splunk.
  • everything goes through graylog-server, from authentication to queries.
  • Graylog is nice because you have a complete logging solution, but it’s going to be harder to customize than an ELK stack.
  • it depends
張 旭

Full Cycle Developers at Netflix - Operate What You Build - 1 views

  • Researching issues felt like bouncing a rubber ball between teams, hard to catch the root cause and harder yet to stop from bouncing between one another.
  • In the past, Edge Engineering had ops-focused teams and SRE specialists who owned the deploy+operate+support parts of the software life cycle
  • hearing about those problems second-hand
  • ...17 more annotations...
  • devs could push code themselves when needed, and also were responsible for off-hours production issues and support requests
  • What were we trying to accomplish and why weren’t we being successful?
  • These specialized roles create efficiencies within each segment while potentially creating inefficiencies across the entire life cycle.
  • Grouping differing specialists together into one team can reduce silos, but having different people do each role adds communication overhead, introduces bottlenecks, and inhibits the effectiveness of feedback loops.
  • devops principles
  • develops a system also be responsible for operating and supporting that system
  • Each development team owns deployment issues, performance bugs, capacity planning, alerting gaps, partner support, and so on.
  • Those centralized teams act as force multipliers by turning their specialized knowledge into reusable building blocks.
  • Communication and alignment are the keys to success.
  • Full cycle developers are expected to be knowledgeable and effective in all areas of the software life cycle.
  • ramping up on areas they haven’t focused on before
  • We run dev bootcamps and other forms of ongoing training to impart this knowledge and build up these skills
  • “how can I automate what is needed to operate this system?”
  • “what self-service tool will enable my partners to answer their questions without needing me to be involved?”
  • A full cycle developer thinks and acts like an SWE, SDET, and SRE. At times they create software that solves business problems, at other times they write test cases for that, and still other times they automate operational aspects of that system.
  • the need for continuous delivery pipelines, monitoring/observability, and so on.
  • Tooling and automation help to scale expertise, but no tool will solve every problem in the developer productivity and operations space
張 旭

Cluster Networking - Kubernetes - 0 views

  • Networking is a central part of Kubernetes, but it can be challenging to understand exactly how it is expected to work
  • Highly-coupled container-to-container communications
  • Pod-to-Pod communications
  • ...57 more annotations...
  • this is the primary focus of this document
    • 張 旭
       
      Cluster Networking 所關注處理的是: Pod 到 Pod 之間的連線
  • Pod-to-Service communications
  • External-to-Service communications
  • Kubernetes is all about sharing machines between applications.
  • sharing machines requires ensuring that two applications do not try to use the same ports.
  • Dynamic port allocation brings a lot of complications to the system
  • Every Pod gets its own IP address
  • do not need to explicitly create links between Pods
  • almost never need to deal with mapping container ports to host ports.
  • Pods can be treated much like VMs or physical hosts from the perspectives of port allocation, naming, service discovery, load balancing, application configuration, and migration.
  • pods on a node can communicate with all pods on all nodes without NAT
  • agents on a node (e.g. system daemons, kubelet) can communicate with all pods on that node
  • pods in the host network of a node can communicate with all pods on all nodes without NAT
  • If your job previously ran in a VM, your VM had an IP and could talk to other VMs in your project. This is the same basic model.
  • containers within a Pod share their network namespaces - including their IP address
  • containers within a Pod can all reach each other’s ports on localhost
  • containers within a Pod must coordinate port usage
  • “IP-per-pod” model.
  • request ports on the Node itself which forward to your Pod (called host ports), but this is a very niche operation
  • The Pod itself is blind to the existence or non-existence of host ports.
  • AOS is an Intent-Based Networking system that creates and manages complex datacenter environments from a simple integrated platform.
  • Cisco Application Centric Infrastructure offers an integrated overlay and underlay SDN solution that supports containers, virtual machines, and bare metal servers.
  • AOS Reference Design currently supports Layer-3 connected hosts that eliminate legacy Layer-2 switching problems.
  • The AWS VPC CNI offers integrated AWS Virtual Private Cloud (VPC) networking for Kubernetes clusters.
  • users can apply existing AWS VPC networking and security best practices for building Kubernetes clusters.
  • Using this CNI plugin allows Kubernetes pods to have the same IP address inside the pod as they do on the VPC network.
  • The CNI allocates AWS Elastic Networking Interfaces (ENIs) to each Kubernetes node and using the secondary IP range from each ENI for pods on the node.
  • Big Cloud Fabric is a cloud native networking architecture, designed to run Kubernetes in private cloud/on-premises environments.
  • Cilium is L7/HTTP aware and can enforce network policies on L3-L7 using an identity based security model that is decoupled from network addressing.
  • CNI-Genie is a CNI plugin that enables Kubernetes to simultaneously have access to different implementations of the Kubernetes network model in runtime.
  • CNI-Genie also supports assigning multiple IP addresses to a pod, each from a different CNI plugin.
  • cni-ipvlan-vpc-k8s contains a set of CNI and IPAM plugins to provide a simple, host-local, low latency, high throughput, and compliant networking stack for Kubernetes within Amazon Virtual Private Cloud (VPC) environments by making use of Amazon Elastic Network Interfaces (ENI) and binding AWS-managed IPs into Pods using the Linux kernel’s IPvlan driver in L2 mode.
  • to be straightforward to configure and deploy within a VPC
  • Contiv provides configurable networking
  • Contrail, based on Tungsten Fabric, is a truly open, multi-cloud network virtualization and policy management platform.
  • DANM is a networking solution for telco workloads running in a Kubernetes cluster.
  • Flannel is a very simple overlay network that satisfies the Kubernetes requirements.
  • Any traffic bound for that subnet will be routed directly to the VM by the GCE network fabric.
  • sysctl net.ipv4.ip_forward=1
  • Jaguar provides overlay network using vxlan and Jaguar CNIPlugin provides one IP address per pod.
  • Knitter is a network solution which supports multiple networking in Kubernetes.
  • Kube-OVN is an OVN-based kubernetes network fabric for enterprises.
  • Kube-router provides a Linux LVS/IPVS-based service proxy, a Linux kernel forwarding-based pod-to-pod networking solution with no overlays, and iptables/ipset-based network policy enforcer.
  • If you have a “dumb” L2 network, such as a simple switch in a “bare-metal” environment, you should be able to do something similar to the above GCE setup.
  • Multus is a Multi CNI plugin to support the Multi Networking feature in Kubernetes using CRD based network objects in Kubernetes.
  • NSX-T can provide network virtualization for a multi-cloud and multi-hypervisor environment and is focused on emerging application frameworks and architectures that have heterogeneous endpoints and technology stacks.
  • NSX-T Container Plug-in (NCP) provides integration between NSX-T and container orchestrators such as Kubernetes
  • Nuage uses the open source Open vSwitch for the data plane along with a feature rich SDN Controller built on open standards.
  • OpenVSwitch is a somewhat more mature but also complicated way to build an overlay network
  • OVN is an opensource network virtualization solution developed by the Open vSwitch community.
  • Project Calico is an open source container networking provider and network policy engine.
  • Calico provides a highly scalable networking and network policy solution for connecting Kubernetes pods based on the same IP networking principles as the internet
  • Calico can be deployed without encapsulation or overlays to provide high-performance, high-scale data center networking.
  • Calico can also be run in policy enforcement mode in conjunction with other networking solutions such as Flannel, aka canal, or native GCE, AWS or Azure networking.
  • Romana is an open source network and security automation solution that lets you deploy Kubernetes without an overlay network
  • Weave Net runs as a CNI plug-in or stand-alone. In either version, it doesn’t require any configuration or extra code to run, and in both cases, the network provides one IP address per pod - as is standard for Kubernetes.
  • The network model is implemented by the container runtime on each node.
‹ Previous 21 - 36 of 36
Showing 20 items per page