[Elasticsearch] Distributed Characteristics & the Mechanics of Distributed Search | 小信豬的原始部落

godleon.github.io/...icsearch-distributed-mechanism

elasticsearch database

shared by 張旭 on 17 Apr 21

Scale storage capacity horizontally

Data HA: if a node goes down, no data is lost

To check the status of the nodes in a cluster, use the GET /_cat/nodes API
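
As a minimal sketch of calling that API from a script, the snippet below only builds the request URL. The base address `http://localhost:9200` and the helper name `cat_nodes_url` are assumptions for illustration; `v` and `format` are the usual cat API query parameters for column headers and JSON output.

```python
from urllib.parse import urlencode, urljoin


def cat_nodes_url(base_url: str = "http://localhost:9200") -> str:
    """Build the URL for GET /_cat/nodes (hypothetical helper, not part of any client)."""
    # v=true asks for column headers, format=json asks for JSON instead of a text table.
    query = urlencode({"v": "true", "format": "json"})
    return urljoin(base_url, "/_cat/nodes") + "?" + query


print(cat_nodes_url())
```

The URL can then be fetched with any HTTP client; the request is deliberately not sent here so the sketch runs without a cluster.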

Decides which data node each shard is allocated to

Set up multiple master-eligible nodes for the cluster

As soon as the elected master node is found to have a problem, a new master node is elected

Every node starts as a master-eligible node by default; setting node.master: false turns this default off (releases from 7.9 onward configure roles through node.roles instead)

The node that handles a request is called the Coordinating Node; its job is to forward the request to the appropriate nodes

Every node is a Coordinating Node by default

A coordinating node can receive and handle a search request directly; the request does not have to be relayed through the master node

A node that can store data. Every node is a data node by default after startup; setting node.data: false disables the data node role
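
The defaults described in these notes (master-eligible, data, and coordinating all on unless switched off) can be sketched as a toy function. This is not Elasticsearch code: `effective_roles` and the role labels are made up for illustration, it only models the two legacy boolean settings quoted above, and it ignores other roles such as ingest.

```python
def effective_roles(settings: dict) -> set:
    """Toy model: derive a node's roles from the legacy boolean settings."""
    roles = {"coordinating"}  # every node can coordinate requests
    if settings.get("node.master", True):  # default: master-eligible
        roles.add("master-eligible")
    if settings.get("node.data", True):  # default: data node
        roles.add("data")
    return roles


print(sorted(effective_roles({})))
print(sorted(effective_roles({"node.master": False, "node.data": False})))
```

With both settings turned off, the node is left with the coordinating role only, which is how a dedicated coordinating node is usually described.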

The master node decides how shards are distributed across the data nodes

Every node keeps a copy of the cluster state

Only the master can modify the cluster state, and it is responsible for publishing the changes to the other nodes

Every node records detailed information about its own state

Shards are the foundation of Elasticsearch's distributed storage; they come as primary shards and replica shards

Every shard is a Lucene instance

Primary shards spread an index's data across multiple data nodes, which is how storage scales horizontally

The number of primary shards is set when the index is created and cannot be changed afterwards; changing it requires a reindex
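
One way to see why the primary shard count is fixed: a document is routed to a shard by hashing a routing key and taking the result modulo the number of primary shards, so changing the count would send existing documents to different shards. The sketch below is a simplified model under that assumption. `shard_for` is a hypothetical helper, and CRC32 is only a stand-in for the hash function Elasticsearch actually uses.

```python
import zlib


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Simplified routing: hash(routing key) modulo the primary shard count."""
    # Stand-in hash for illustration; Elasticsearch uses its own hash function.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


doc_ids = [f"doc-{i}" for i in range(1000)]
# Count the documents whose shard would change if the index went from 3 to 4 primaries.
moved = sum(shard_for(d, 3) != shard_for(d, 4) for d in doc_ids)
print(f"{moved} of {len(doc_ids)} documents would map to a different shard")
```

Because lookups use the same formula as writes, documents that "moved" would no longer be found on the shard where they were stored, which is why the data has to be reindexed.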

When a primary shard is lost, a replica shard can be promoted to primary to keep the data intact

The number of replica shards can be adjusted dynamically, up to the point where every data node holds a full copy of the data
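
Adjusting the replica count is done through the update index settings API (`PUT /<index>/_settings` with `index.number_of_replicas`). The sketch below only constructs the request. The index name `my-index`, the base address, and the helper name `set_replicas_request` are assumptions for illustration.

```python
import json
from urllib.request import Request


def set_replicas_request(index: str, replicas: int,
                         base_url: str = "http://localhost:9200") -> Request:
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = set_replicas_request("my-index", 2)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing the object to `urllib.request.urlopen` would send it; that step is left out so the example runs without a cluster.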

Starting with ES 7.0, an index defaults to 1 primary shard (down from 5 in earlier versions) and 1 replica shard

Configuring too many replica shards lowers the cluster's overall write performance

A replica shard must be allocated to a different data node than its primary shard

All primary shards may sit on the same data node
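
The two allocation rules above (a replica never shares a node with its primary, while several primaries may share one node) can be sketched with a toy allocator. This is not the real allocation algorithm: `allocate`, the round-robin placement, and the `p0`/`r0` labels are invented for illustration.

```python
def allocate(num_primaries: int, num_replicas: int, nodes: list):
    """Toy allocator: returns (assignments per node, unassigned replica copies)."""
    assignments = {node: [] for node in nodes}
    unassigned = []
    for shard in range(num_primaries):
        # Primaries are spread round-robin; with one node they all land on it.
        primary_node = nodes[shard % len(nodes)]
        assignments[primary_node].append(f"p{shard}")
        used = {primary_node}
        for _ in range(num_replicas):
            # A copy may only go to a node that holds no other copy of this shard.
            free = [n for n in nodes if n not in used]
            if free:
                assignments[free[0]].append(f"r{shard}")
                used.add(free[0])
            else:
                unassigned.append(f"r{shard}")
    return assignments, unassigned


print(allocate(3, 1, ["node-1"]))
print(allocate(3, 1, ["node-1", "node-2"]))
```

On a single node every replica stays unassigned, which is the situation that leaves a cluster in the yellow state; adding a second node lets all replicas be placed.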

GET _cluster/health/<target> returns the current health status of the cluster
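
A short sketch of working with that endpoint: one helper builds the URL, another summarizes a response. The base address and both helper names are assumptions, and `sample` is a hand-written payload with illustrative values; only the field names (`status`, `active_primary_shards`, `unassigned_shards`) follow the health API response.

```python
import json
from urllib.parse import quote


def health_url(target: str = "", base_url: str = "http://localhost:9200") -> str:
    """Build the URL for GET _cluster/health or GET _cluster/health/<target>."""
    path = "/_cluster/health"
    if target:
        path += "/" + quote(target, safe=",*")  # keep commas and wildcards readable
    return base_url + path


def summarize(health: dict) -> str:
    """Condense a health response into one line."""
    return (f"{health['status']}: {health['active_primary_shards']} primaries active, "
            f"{health['unassigned_shards']} shards unassigned")


# Hand-written sample shaped like a health response; the values are made up.
sample = json.loads('{"status": "yellow", "active_primary_shards": 3, "unassigned_shards": 3}')
print(health_url("my-index"))
print(summarize(sample))
```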

Yellow: primary shards are allocated normally, but there is a problem allocating replica shards

GET /_cat/shards/<target> returns the current shard status

The replica shards cannot be allocated, so the cluster health status is yellow

If you are worried that rebooting a machine will trigger failover, you can delay replication for a while (by adjusting index.unassigned.node_left.delayed_timeout in the index settings) to avoid needless data copying (this feature is called delayed allocation)
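
As a sketch, the setting can be applied to all indices with `PUT /_all/_settings`. The snippet below only prepares the path and JSON body; the `5m` value and the helper name `delayed_allocation_request` are examples, not recommendations from the source.

```python
import json


def delayed_allocation_request(timeout: str = "5m"):
    """Return (method, path, JSON body) for delaying replica reallocation."""
    body = {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}
    return "PUT", "/_all/_settings", json.dumps(body)


method, path, body = delayed_allocation_request()
print(method, path)
print(body)
```

A node that rejoins within the timeout can take its shards back without the cluster first copying them elsewhere.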

A red cluster means a primary shard has been lost; reads and writes are affected at that point.
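
The yellow and red states described in these notes follow from the shard states, which a small function can make explicit. This is a simplified model: `cluster_status` is a hypothetical helper and only distinguishes started shards from everything else.

```python
def cluster_status(shards: list) -> str:
    """shards: list of (kind, state) pairs, kind is 'primary' or 'replica'."""
    # Any primary that is not started means data is unavailable: red.
    if any(kind == "primary" and state != "STARTED" for kind, state in shards):
        return "red"
    # All primaries are fine but some replica is not started: yellow.
    if any(kind == "replica" and state != "STARTED" for kind, state in shards):
        return "yellow"
    return "green"


print(cluster_status([("primary", "STARTED"), ("replica", "UNASSIGNED")]))
print(cluster_status([("primary", "UNASSIGNED"), ("replica", "UNASSIGNED")]))
```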

If the node comes back, it recovers the data that had not yet been written from the translog

Once the index settings are in place, the number of primary shards cannot be changed freely

Sending requests directly to the master node is not recommended: it works, but a large volume of requests hitting the master can cause performance problems

A shard is the smallest unit of work in ES

A shard is a Lucene index

The contents of the Index Buffer are written to a Segment; this write is called a Refresh

Once a document has been refreshed into a segment, it becomes searchable

During a refresh, the segment is first written to the cache so that it can be queried

When a document is indexed, it is also written to the transaction log, which by default is written to disk

Every shard has its own transaction log

Because the transaction log is written to disk, a node recovering from a failure reads the transaction log first to restore data
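
The write path in these last notes (index buffer, refresh into a searchable segment, transaction log for recovery) can be sketched as a toy in-memory model. Everything here is an illustrative assumption: `ToyShard` is not Elasticsearch code, a plain list stands in for the on-disk translog, and flushing segments to disk is not modeled at all.

```python
class ToyShard:
    """Toy model of one shard's write path."""

    def __init__(self):
        self.index_buffer = []  # newly indexed documents, not yet searchable
        self.segments = []      # searchable after a refresh (cached, not flushed)
        self.translog = []      # stand-in for the per-shard log kept on disk

    def index(self, doc: str) -> None:
        # Each document goes to the buffer and to the translog at the same time.
        self.index_buffer.append(doc)
        self.translog.append(doc)

    def refresh(self) -> None:
        # A refresh turns the buffer contents into a new searchable segment.
        if self.index_buffer:
            self.segments.append(list(self.index_buffer))
            self.index_buffer.clear()

    def search(self, term: str) -> list:
        return [doc for seg in self.segments for doc in seg if term in doc]

    def recover(self) -> None:
        # Simulated restart: in-memory state is gone, the translog is replayed.
        self.index_buffer = list(self.translog)
        self.segments = []
        self.refresh()


shard = ToyShard()
shard.index("hello world")
before_refresh = shard.search("hello")  # buffer only, so nothing is found yet
shard.refresh()
after_refresh = shard.search("hello")
shard.index("second doc")               # indexed but never refreshed
shard.recover()
after_recovery = shard.search("second")
print(before_refresh, after_refresh, after_recovery)
```

The document that was indexed but never refreshed is still found after the simulated restart, because the translog replay restores it.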
