Skip to main content

Home/ Larvata/ Contents contributed and discussions participated by 張 旭

Contents contributed and discussions participated by 張 旭

張 旭

Databases and Collections - MongoDB Manual - 0 views

  • MongoDB stores data records as documents (specifically BSON documents) which are gathered together in collections.
  • A database stores one or more collections of documents.
  • In MongoDB, databases hold one or more collections of documents.
  • ...9 more annotations...
  • If a database does not exist, MongoDB creates the database when you first store data for that database.
  • The insertOne() operation creates both the database myNewDB and the collection myNewCollection1 if they do not already exist.
  • MongoDB stores documents in collections.
  • If a collection does not exist, MongoDB creates the collection when you first store data for that collection.
  • MongoDB provides the db.createCollection() method to explicitly create a collection with various options, such as setting the maximum size or the documentation validation rules.
  • By default, a collection does not require its documents to have the same schema;
  • To change the structure of the documents in a collection, such as add new fields, remove existing fields, or change the field values to a new type, update the documents to the new structure.
  • Collections are assigned an immutable UUID.
  • To retrieve the UUID for a collection, run either the listCollections command or the db.getCollectionInfos() method.
張 旭

Introduction to MongoDB - MongoDB Manual - 0 views

  • MongoDB is a document database designed for ease of development and scaling
  • MongoDB offers both a Community and an Enterprise version
  • A record in MongoDB is a document, which is a data structure composed of field and value pairs.
  • ...12 more annotations...
  • MongoDB documents are similar to JSON objects.
  • The values of fields may include other documents, arrays, and arrays of documents.
  • reduce need for expensive joins
  • MongoDB stores documents in collections.
  • Collections are analogous to tables in relational databases.
  • Read-only Views
  • Indexes support faster queries and can include keys from embedded documents and arrays.
  • MongoDB's replication facility, called replica set
  • A replica set is a group of MongoDB servers that maintain the same data set, providing redundancy and increasing data availability.
  • Sharding distributes data across a cluster of machines.
  • MongoDB supports creating zones of data based on the shard key.
  • MongoDB provides pluggable storage engine API
張 旭

MongoDB Performance - MongoDB Manual - 0 views

  • MongoDB uses a locking system to ensure data set consistency. If certain operations are long-running or a queue forms, performance will degrade as requests and operations wait for the lock.
  • performance limitations as a result of inadequate or inappropriate indexing strategies, or as a consequence of poor schema design patterns.
  • performance issues may be temporary and related to abnormal traffic load.
  • ...9 more annotations...
  • Lock-related slowdowns can be intermittent.
  • If globalLock.currentQueue.total is consistently high, then there is a chance that a large number of requests are waiting for a lock.
  • If globalLock.totalTime is high relative to uptime, the database has existed in a lock state for a significant amount of time.
  • For write-heavy applications, deploy sharding and add one or more shards to a sharded cluster to distribute load among mongod instances.
  • Unless constrained by system-wide limits, the maximum number of incoming connections supported by MongoDB is configured with the maxIncomingConnections setting.
  • When logLevel is set to 0, MongoDB records slow operations to the diagnostic log at a rate determined by slowOpSampleRate.
  • At higher logLevel settings, all operations appear in the diagnostic log regardless of their latency with the following exception
  • Full Time Diagnostic Data Collection (FTDC) mechanism. FTDC data files are compressed, are not human-readable, and inherit the same file access permissions as the MongoDB data files.
  • mongod processes store FTDC data files in a diagnostic.data directory under the instances storage.dbPath.
  •  
    "MongoDB uses a locking system to ensure data set consistency. If certain operations are long-running or a queue forms, performance will degrade as requests and operations wait for the lock."
張 旭

[Elasticsearch] 分散式特性 & 分散式搜尋的機制 | 小信豬的原始部落 - 0 views

  • 水平擴展儲存空間
  • Data HA:若有 node 掛掉,資料不會遺失
  • 若是要查詢 cluster 中的 node 狀態,可以使用 GET /_cat/nodes API
  • ...39 more annotations...
  • 決定每個 shard 要被分配到哪個 data node 上
  • 為 cluster 設置多個 master node
  • 一旦發現被選中的 master node 出現問題,就會選出新的 master node
  • 每個 node 啟動時就預設是一個 master eligible node,可以透過設定 node.master: false 取消此預設設定
  • 處理 request 的 node 稱為 Coordinating Node,其功能是將 request 轉發到合適的 node 上
  • 所有的 node 都預設是 Coordinating Node
  • coordinating node 可以直接接收 search request 並處理,不需要透過 master node 轉過來
  • 可以保存資料的 node,每個 node 啟動後都會預設是 data node,可以透過設定 node.data: false 停用 data node 功能
  • 由 master node 決定如何把分片分發到不同的 data node 上
  • 每個 node 上都保存了 cluster state
  • 只有 master 才可以修改 cluster state 並負責同步給其他 node
  • 每個 node 都會詳細紀錄本身的狀態資訊
  • shard 是 Elasticsearch 分散式儲存的基礎,包含 primary shard & replica shard
  • 每一個 shard 就是一個 Lucene instance
  • primary shard 功能是將一份被索引後的資料,分散到多個 data node 上存放,實現儲存方面的水平擴展
  • primary shard 的數量在建立 index 時就會指定,後續是無法修改的,若要修改就必須要進行 reindex
  • 當 primary shard 遺失時,replica shard 就可以被 promote 成 primary shard 來保持資料完整性
  • replica shard 數量可以動態調整,讓每個 data node 上都有完整的資料
  • ES 7.0 開始,primary shard 預設為 1,replica shard 預設為 0
  • replica shard 若設定過多,會降低 cluster 整體的寫入效能
  • replica shard 必須和 primary shard 被分配在不同的 data node 上
  • 所有的 primary shard 可以在同一個 data node 上
  • 透過 GET _cluster/health/<target> 可以取得目前 cluster 的健康狀態
  • Yellow:表示 primary shard 可以正常分配,但 replica shard 分配有問題
  • 透過 GET /_cat/shards/<target> 可以取得目前的 shard 狀態
  • replica shard 無法被分配,因此 cluster 健康狀態為黃色
  • 若是擔心 reboot 機器造成 failover 動作開始執行,可以設定將 replication 延遲一段時間後再執行(透過調整 settings 中的 index.unassigned.node_left.delayed_timeout 參數),避免無謂的 data copy 動作 (此功能稱為 delay allocation)
  • 集群變紅,代表有 primary shard 丟失,這個時候會影響讀寫。
  • 如果 node 重新回來,會從 translog 中恢復沒有寫入的資料
  • 設定 index settings 之後,primary shard 數量無法隨意變更
  • 不建議直接發送請求到master節點,雖然也會工作,但是大量請求發送到 master,會有潛在的性能問題
  • shard 是 ES 中最小的工作單元
  • shard 是一個 Lucene 的 index
  • 將 Index Buffer 中的內容寫入 Segment,而這寫入的過程就稱為 Refresh
  • 當 document 被 refresh 進入到 segment 之後,就可以被搜尋到了
  • 在進行 refresh 時先將 segment 寫入 cache 以開放查詢
  • 將 document 進行索引時,同時也會寫入 transaction log,且預設都會寫入磁碟中
  • 每個 shard 都會有對應的 transaction log
  • 由於 transaction log 都會寫入磁碟中,因此當 node 從故障中恢復時,就會優先讀取 transaction log 來恢復資料
張 旭

Database Profiler - MongoDB Manual - 0 views

  • The database profiler collects detailed information about Database Commands executed against a running mongod instance.
  • The profiler writes all the data it collects to the system.profile collection, a capped collection in the admin database.
  • db.setProfilingLevel(2)
  • ...10 more annotations...
  • The slowms and sampleRate profiling settings are global. When set, these settings affect all databases in your process.
  • db.setProfilingLevel(1, { slowms: 20 })
  • db.setProfilingLevel(0, { slowms: 20 })
  • show profile
  • The system.profile collection is a capped collection with a default size of 1 megabyte.
  • By default, sampleRate is set to 1.0, meaning all slow operations are profiled.
  • When logLevel is set to 0, MongoDB records slow operations to the diagnostic log at a rate determined by slowOpSampleRate.
  • The slowms field indicates operation time threshold, in milliseconds, beyond which operations are considered slow.
  • You cannot enable profiling on a mongos instance.
  • profiler logs information about database operations in the system.profile collection.
張 旭

MongoDB Performance Tuning: Everything You Need to Know - Stackify - 0 views

  • db.serverStatus().globalLock
  • db.serverStatus().locks
  • globalLock.currentQueue.total: This number can indicate a possible concurrency issue if it’s consistently high. This can happen if a lot of requests are waiting for a lock to be released.
  • ...35 more annotations...
  • globalLock.totalTime: If this is higher than the total database uptime, the database has been in a lock state for too long.
  • Unlike relational databases such as MySQL or PostgreSQL, MongoDB uses JSON-like documents for storing data.
  • Databases operate in an environment that consists of numerous reads, writes, and updates.
  • When a lock occurs, no other operation can read or modify the data until the operation that initiated the lock is finished.
  • locks.deadlockCount: Number of times the lock acquisitions have encountered deadlocks
  • Is the database frequently locking from queries? This might indicate issues with the schema design, query structure, or system architecture.
  • For version 3.2 on, WiredTiger is the default.
  • MMAPv1 locks whole collections, not individual documents.
  • WiredTiger performs locking at the document level.
  • When the MMAPv1 storage engine is in use, MongoDB will use memory-mapped files to store data.
  • All available memory will be allocated for this usage if the data set is large enough.
  • db.serverStatus().mem
  • mem.resident: Roughly equivalent to the amount of RAM in megabytes that the database process uses
  • If mem.resident exceeds the value of system memory and there’s a large amount of unmapped data on disk, we’ve most likely exceeded system capacity.
  • If the value of mem.mapped is greater than the amount of system memory, some operations will experience page faults.
  • The WiredTiger storage engine is a significant improvement over MMAPv1 in performance and concurrency.
  • By default, MongoDB will reserve 50 percent of the available memory for the WiredTiger data cache.
  • wiredTiger.cache.bytes currently in the cache – This is the size of the data currently in the cache.
  • wiredTiger.cache.tracked dirty bytes in the cache – This is the size of the dirty data in the cache.
  • we can look at the wiredTiger.cache.bytes read into cache value for read-heavy applications. If this value is consistently high, increasing the cache size may improve overall read performance.
  • check whether the application is read-heavy. If it is, increase the size of the replica set and distribute the read operations to secondary members of the set.
  • write-heavy, use sharding within a sharded cluster to distribute the load.
  • Replication is the propagation of data from one node to another
  • Replication sets handle this replication.
  • Sometimes, data isn’t replicated as quickly as we’d like.
  • a particularly thorny problem if the lag between a primary and secondary node is high and the secondary becomes the primary
  • use the db.printSlaveReplicationInfo() or the rs.printSlaveReplicationInfo() command to see the status of a replica set from the perspective of the secondary member of the set.
  • shows how far behind the secondary members are from the primary. This number should be as low as possible.
  • monitor this metric closely.
  • watch for any spikes in replication delay.
  • Always investigate these issues to understand the reasons for the lag.
  • One replica set is primary. All others are secondary.
  • it’s not normal for nodes to change back and forth between primary and secondary.
  • use the profiler to gain a deeper understanding of the database’s behavior.
  • Enabling the profiler can affect system performance, due to the additional activity.
  •  
    "globalLock.currentQueue.total: This number can indicate a possible concurrency issue if it's consistently high. This can happen if a lot of requests are waiting for a lock to be released."
張 旭

GitLab Auto DevOps 深入淺出,自動部署,連設定檔不用?! | 五倍紅寶石・專業程式教育 - 0 views

  • 一個 K8S 的 Cluster,Auto DevOps 將會把網站部署到這個 Cluster
  • 需要有一個 wildcard 的 DNS 讓部署在這個環境的網站有 Domain name
  • 一個可以跑 Docker 的 GitLab Runner,將會為由它來執行 CI / CD 的流程。
  • ...37 more annotations...
  • 其實 Auto DevOps 就是一份官方寫好的 gitlab-ci.yml,在啟動 Auto DevOps 的專案裡,如果找不到 gitlab-ci.yml 檔,那就會直接用官方 gitlab-ci.yml 去跑 CI / CD 流程。
  • Pod 是 K8S 中可以被部署的最小元件,一個 Pod 是由一到多個 Container 組成,同個 Pod 的不同 Container 之間彼此共享網路資源。
  • 每個 Pod 都會有它的 yaml 檔,用以描述 Pod 會使用的 Image 還有連接的 Port 等資訊。
  • Node 又分成 Worker Node 和 Master Node 兩種
  • Helm 透過參數 (parameter) 跟模板 (template) 的方式,讓我們可以在只修改參數的方式重複利用模板。
  • 為了要有 CI CD 的功能我們會把 .gitlab-ci.yml 放在專案的根目錄裡, GitLab 會依造 .gitlab-ci.yml 的設定產生 CI/CD Pipeline,每個 Pipeline 裡面可能有多個 Job,這時候就會需要有 GitLab Runner 來執行這些 Job 並把執行的結果回傳給 GitLab 讓它知道這個 Job 是否有正常執行。
  • 把專案打包成 Docker Image 這工作又或是 helm 的操作都會在 Container 內執行
  • CI/CD Pipeline 是由 stage 還有 job 組成的,stage 是有順序性的,前面的 stage 完成後才會開始下一個 stage。
  • 每個 stage 裡面包含一到多個 Job
  • Auto Devops 裡也會大量用到這種在指定 Container 內運行的工作。
  • 可以通過 health checks
  • 開 private 的話還要注意使用 Container Registry 的權限問題
  • 申請好的 wildcard 的 DNS
  • Auto Devops 也提供只要設定環境變數就能一定程度客製化的選項
  • 特別注意 namespace 有沒有設定對,不然會找不到資料喔
  • Auto Devops,如果想要進一步的客製化,而且是改 GitLab 環境變數都無法實現的客製化,這時候還是得回到 .gitlab-ci.yml 設定檔
  • 在 Docker in Docker 的環境用 Dockerfile 打包 Image
  • 用 helm upgrade 把 chart 部署到 K8S 上
  • GitLab CI 的環境變數主要有三個來源,優先度高到低依序為Settings > CI/CD 介面定義的變數gitlab_ci.yml 定義環境變數GitLab 預設環境變數
  • 把專案打包成 Docker Image 首先需要在專案下新增一份 Dockerfile
  • Auto Devops 裡面的做法,用 herokuish 提供的 Image 來打包專案
  • 在 Runner 的環境中是沒有 docker 指令可以用的,所以這邊啟動一個 Docker Container 在裡面執行就可以用 docker 指令了。
  • 其中 $CI_COMMIT_SHA $CI_COMMIT_BEFORE_SHA 這兩個都是 GitLab 預設環境變數,代表這次 commit 還有上次 commit 的 SHA 值。
  • dind 則是直接啟動 docker daemon,此外 dind 還會自動產生 TLS certificates
  • 為了在 Docker Container 內運行 Docker,會把 Host 上面的 Docker API 分享給 Container。
  • docker:stable 有執行 docker 需要的執行檔,他裡面也包含要啟動 docker 的程式(docker daemon),但啟動 Container 的 entrypoint 是 sh
  • docker:dind 繼承自 docker:stable,而且它 entrypoint 就是啟動 docker 的腳本,此外還會做完 TLS certificates
  • Container 要去連 Host 上的 Docker API 。但現在連線失敗卻是找 http://docker:2375,現在的 dind 已經不是被當做 services 來用了,而是要直接在裡面跑 Docker,所以他應該是要 unix:///var/run/docker.sock 用這種連線,於是把環境變數 DOCKER_HOST 從 tcp://docker:2375 改成空字串,讓 docker daemon 走預設連線就能成功囉!
  • auto-deploy preparationhelm init 建立 helm 專案設定 tiller 在背景執行設定 cluster 的 namespace
  • auto-deploy deploy使用 helm upgrade 部署 chart 到 K8S 上透過 --set 來設定要注入 template 的參數
  • set -x,這樣就能在執行前,顯示指令內容。
  • 用 helm repo list 看看現在有註冊哪些 Chart Repository
  • helm fetch gitlab/auto-deploy-app --untar
  • nohup 可以讓你在離線或登出系統後,還能夠讓工作繼續進行
  • 在不特別設定 CI_APPLICATION_REPOSITORY 的情況下,image_repository 的值就是預設環境變數 CI_REGISTRY_IMAGE/CI_COMMIT_REF_SLUG
  • A:-B 的意思是如果有 A 就用它,沒有就用 B
  • 研究 Auto Devops 難度最高的地方就是太多工具整合在一起,搞不清楚他們之間的關係,出錯也不知道從何查起
張 旭

MetalLB, bare metal load-balancer for Kubernetes - 0 views

  • it allows you to create Kubernetes services of type “LoadBalancer” in clusters that don’t run on a cloud provider
  • In a cloud-enabled Kubernetes cluster, you request a load-balancer, and your cloud platform assigns an IP address to you.
  • MetalLB cannot create IP addresses out of thin air, so you do have to give it pools of IP addresses that it can use.
  • ...6 more annotations...
  • MetalLB lets you define as many address pools as you want, and doesn’t care what “kind” of addresses you give it.
  • Once MetalLB has assigned an external IP address to a service, it needs to make the network beyond the cluster aware that the IP “lives” in the cluster.
  • In layer 2 mode, one machine in the cluster takes ownership of the service, and uses standard address discovery protocols (ARP for IPv4, NDP for IPv6) to make those IPs reachable on the local network
  • From the LAN’s point of view, the announcing machine simply has multiple IP addresses.
  • In BGP mode, all machines in the cluster establish BGP peering sessions with nearby routers that you control, and tell those routers how to forward traffic to the service IPs.
  • Using BGP allows for true load balancing across multiple nodes, and fine-grained traffic control thanks to BGP’s policy mechanisms.
張 旭

MetalLB, bare metal load-balancer for Kubernetes - 0 views

  • Kubernetes does not offer an implementation of network load-balancers (Services of type LoadBalancer) for bare metal clusters
  • If you’re not running on a supported IaaS platform (GCP, AWS, Azure…), LoadBalancers will remain in the “pending” state indefinitely when created.
  • Bare metal cluster operators are left with two lesser tools to bring user traffic into their clusters, “NodePort” and “externalIPs” services.
張 旭

Service | Kubernetes - 0 views

  • Each Pod gets its own IP address
  • Pods are nonpermanent resources.
  • Kubernetes Pods are created and destroyed to match the state of your cluster
  • ...23 more annotations...
  • In Kubernetes, a Service is an abstraction which defines a logical set of Pods and a policy by which to access them (sometimes this pattern is called a micro-service).
  • The set of Pods targeted by a Service is usually determined by a selector
  • If you're able to use Kubernetes APIs for service discovery in your application, you can query the API server for Endpoints, that get updated whenever the set of Pods in a Service changes.
  • A Service in Kubernetes is a REST object, similar to a Pod.
  • The name of a Service object must be a valid DNS label name
  • Kubernetes assigns this Service an IP address (sometimes called the "cluster IP"), which is used by the Service proxies
  • A Service can map any incoming port to a targetPort. By default and for convenience, the targetPort is set to the same value as the port field.
  • The default protocol for Services is TCP
  • As many Services need to expose more than one port, Kubernetes supports multiple port definitions on a Service object. Each port definition can have the same protocol, or a different one.
  • Because this Service has no selector, the corresponding Endpoints object is not created automatically. You can manually map the Service to the network address and port where it's running, by adding an Endpoints object manually
  • Endpoint IP addresses cannot be the cluster IPs of other Kubernetes Services
  • Kubernetes ServiceTypes allow you to specify what kind of Service you want. The default is ClusterIP
  • ClusterIP: Exposes the Service on a cluster-internal IP.
  • NodePort: Exposes the Service on each Node's IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You'll be able to contact the NodePort Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
  • LoadBalancer: Exposes the Service externally using a cloud provider's load balancer
  • ExternalName: Maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value. No proxying of any kind is set up.
  • You can also use Ingress to expose your Service. Ingress is not a Service type, but it acts as the entry point for your cluster.
  • If you set the type field to NodePort, the Kubernetes control plane allocates a port from a range specified by --service-node-port-range flag (default: 30000-32767).
  • The default for --nodeport-addresses is an empty list. This means that kube-proxy should consider all available network interfaces for NodePort.
  • you need to take care of possible port collisions yourself. You also have to use a valid port number, one that's inside the range configured for NodePort use.
  • Service is visible as <NodeIP>:spec.ports[*].nodePort and .spec.clusterIP:spec.ports[*].port
  • Choosing this value makes the Service only reachable from within the cluster.
  • NodePort: Exposes the Service on each Node's IP at a static port
張 旭

Introducing Infrastructure as Code | Linode - 0 views

  • Infrastructure as Code (IaC) is a technique for deploying and managing infrastructure using software, configuration files, and automated tools.
  • With the older methods, technicians must configure a device manually, perhaps with the aid of an interactive tool. Information is added to configuration files by hand or through the use of ad-hoc scripts. Configuration wizards and similar utilities are helpful, but they still require hands-on management. A small group of experts owns the expertise, the process is typically poorly defined, and errors are common.
  • The development of the continuous integration and continuous delivery (CI/CD) pipeline made the idea of treating infrastructure as software much more attractive.
  • ...20 more annotations...
  • Infrastructure as Code takes advantage of the software development process, making use of quality assurance and test automation techniques.
  • Consistency/Standardization
  • Each node in the network becomes what is known as a snowflake, with its own unique settings. This leads to a system state that cannot easily be reproduced and is difficult to debug.
  • With standard configuration files and software-based configuration, there is greater consistency between all equipment of the same type. A key IaC concept is idempotence.
  • Idempotence makes it easy to troubleshoot, test, stabilize, and upgrade all the equipment.
  • Infrastructure as Code is central to the culture of DevOps, which is a mix of development and operations
  • edits are always made to the source configuration files, never on the target.
  • A declarative approach describes the final state of a device, but does not mandate how it should get there. The specific IaC tool makes all the procedural decisions. The end state is typically defined through a configuration file, a JSON specification, or a similar encoding.
  • An imperative approach defines specific functions or procedures that must be used to configure the device. It focuses on what must happen, but does not necessarily describe the final state. Imperative techniques typically use scripts for the implementation.
  • With a push configuration, the central server pushes the configuration to the destination device.
  • If a device is mutable, its configuration can be changed while it is active
  • Immutable devices cannot be changed. They must be decommissioned or rebooted and then completely rebuilt.
  • an immutable approach ensures consistency and avoids drift. However, it usually takes more time to remove or rebuild a configuration than it does to change it.
  • System administrators should consider security issues as part of the development process.
  • Ansible is a very popular open source IaC application from Red Hat
  • Ansible is often used in conjunction with Kubernetes and Docker.
  • Linode offers a collection of several Ansible guides for a more comprehensive overview.
  • Pulumi permits the use of a variety of programming languages to deploy and manage infrastructure within a cloud environment.
  • Terraform allows users to provision data center infrastructure using either JSON or Terraform’s own declarative language.
  • Terraform manages resources through the use of providers, which are similar to APIs.
張 旭

ALB vs ELB | Differences Between an ELB and an ALB on AWS | Sumo Logic - 0 views

  • If you use AWS, you have two load-balancing options: ELB and ALB.
  • An ELB is a software-based load balancer which can be set up and configured in front of a collection of AWS Elastic Compute (EC2) instances.
  • The load balancer serves as a single entry point for consumers of the EC2 instances and distributes incoming traffic across all machines available to receive requests.
  • ...14 more annotations...
  • the ELB also performs a vital role in improving the fault tolerance of the services which it fronts.
  • he Open Systems Interconnection Model, or OSI Model, is a conceptual model which is used to facilitate communications between different computing systems.
  • Layer 1 is the physical layer, and represents the physical medium across which the request is sent.
  • Layer 2 describes the data link layer
  • Layer 3 (the network layer)
  • Layer 7, which serves the application layer.
  • The Classic ELB operates at Layer 4. Layer 4 represents the transport layer, and is controlled by the protocol being used to transmit the request.
  • A network device, of which the Classic ELB is an example, reads the protocol and port of the incoming request, and then routes it to one or more backend servers.
  • the ALB operates at Layer 7. Layer 7 represents the application layer, and as such allows for the redirection of traffic based on the content of the request.
  • Whereas a request to a specific URL backed by a Classic ELB would only enable routing to a particular pool of homogeneous servers, the ALB can route based on the content of the URL, and direct to a specific subgroup of backing servers existing in a heterogeneous collection registered with the load balancer.
  • The Classic ELB is a simple load balancer, is easy to configure
  • As organizations move towards microservice architecture or adopt a container-based infrastructure, the ability to merely map a single address to a specific service becomes more complicated and harder to maintain.
  • the ALB manages routing based on user-defined rules.
  • oute traffic to different services based on either the host or the content of the path contained within that URL.
張 旭

Open source load testing tool review 2020 - 0 views

  • Hey is a simple tool, written in Go, with good performance and the most common features you'll need to run simple static URL tests.
  • Hey supports HTTP/2, which neither Wrk nor Apachebench does
  • Apachebench is very fast, so often you will not need more than one CPU core to generate enough traffic
  • ...16 more annotations...
  • Hey has rate limiting, which can be used to run fixed-rate tests.
  • Vegeta was designed to be run on the command line; it reads from stdin a list of HTTP transactions to generate, and sends results in binary format to stdout,
  • Vegeta is a really strong tool that caters to people who want a tool to test simple, static URLs (perhaps API end points) but also want a bit more functionality.
  • Vegeta can even be used as a Golang library/package if you want to create your own load testing tool.
  • Wrk is so damn fast
  • being fast and measuring correctly is about all that Wrk does
  • k6 is scriptable in plain Javascript
  • k6 is average or better. In some categories (documentation, scripting API, command line UX) it is outstanding.
  • Jmeter is a huge beast compared to most other tools.
  • Siege is a simple tool, similar to e.g. Apachebench in that it has no scripting and is primarily used when you want to hit a single, static URL repeatedly.
  • A good way of testing the testing tools is to not test them on your code, but on some third-party thing that is sure to be very high-performing.
  • use a tool like e.g. top to keep track of Nginx CPU usage while testing. If you see just one process, and see it using close to 100% CPU, it means you could be CPU-bound on the target side.
  • If you see multiple Nginx processes but only one is using a lot of CPU, it means your load testing tool is only talking to that particular worker process.
  • Network delay is also important to take into account as it sets an upper limit on the number of requests per second you can push through.
  • If, say, the Nginx default page requires a transfer of 250 bytes to load, it means that if the servers are connected via a 100 Mbit/s link, the theoretical max RPS rate would be around 100,000,000 divided by 8 (bits per byte) divided by 250 => 100M/2000 = 50,000 RPS. Though that is a very optimistic calculation - protocol overhead will make the actual number a lot lower so in the case above I would start to get worried bandwidth was an issue if I saw I could push through max 30,000 RPS, or something like that.
  • Wrk managed to push through over 50,000 RPS and that made 8 Nginx workers on the target system consume about 600% CPU.
張 旭

Gracefully Shutdown Docker Container - Kakashi's Blog - 1 views

  • The initial idea is to make application invokes deconstructor of each component as soon as the application receives specific signals such as SIGTERM and SIGINT
  • When you run a docker container, by default it has a PID namespace, which means the docker process is isolated from other processes on your host.
  • The PID namespace has an important task to reap zombie processes.
  • ...11 more annotations...
  • This uses /bin/bash as PID1 and runs your program as the subprocess.
  • When a signal is sent to a shell, the signal actually won’t be forwarded to subprocesses.
  • By using the exec form, we can run our program as PID1
  • if you use exec form to run a shell script to spawn your application, remember to use exec syscall to overwrite /usr/bin/bash otherwise it will act as senario1
  • /bin/bash can handle repeating zombie process
  • with Tini, SIGTERM properly terminates your process even if you didn’t explicitly install a signal handler for it.
  • run tini as PID1 and it will forward the signal for subprocesses.
  • tini is a signal proxy and it also can deal with zombie process issue automatically.
  • run your program with tini by passing --init flag to docker run
  • use docker stop, docker will wait for 10s for stopping container before killing a process (by default). The main process inside the container will receive SIGTERM, then docker daemon will wait for 10s and send SIGKILL to terminate process.
  • kill running containers immediately. it’s more like kill -9 and kill --SIGKILL
« First ‹ Previous 121 - 140 of 596 Next › Last »
Showing 20 items per page