Skip to main content

Home/ Larvata/ Group items tagged mysql

Rss Feed Group items tagged

張 旭

我必须得告诉大家的MySQL优化原理 - 简书 - 0 views

  • 很多的查询优化工作实际上就是遵循一些原则让MySQL的优化器能够按照预想的合理方式运行而已
  • MySQL客户端/服务端通信协议是“半双工”的:在任一时刻,要么是服务器向客户端发送数据,要么是客户端向服务器发送数据,这两个动作不能同时发生。
  • 当查询语句很长的时候,需要设置max_allowed_packet参数。
  • ...70 more annotations...
  • 如果查询实在是太大,服务端会拒绝接收更多数据并抛出异常
  • 服务器响应给用户的数据通常会很多,由多个数据包组成
  • 减小通信间数据包的大小和数量是一个非常好的习惯
  • 查询中尽量避免使用SELECT *以及加上LIMIT限制
  • 在解析一个查询语句前,如果查询缓存是打开的,那么MySQL会检查这个查询语句是否命中查询缓存中的数据。
  • 两个查询在任何字符上的不同(例如:空格、注释),都会导致缓存不会命中。
  • MySQL将缓存存放在一个引用表
  • 如果查询中包含任何用户自定义函数、存储函数、用户变量、临时表、mysql库中的系统表,其查询结果 都不会被缓存。
  • MySQL的查询缓存系统会跟踪查询中涉及的每个表,如果这些表(数据或结构)发生变化,那么和这张表相关的所有缓存数据都将失效。
  • 在任何的写操作时,MySQL必须将对应表的所有缓存都设置为失效。
  • 查询缓存对系统的额外消耗也不仅仅在写操作,读操作也不例外
  • 任何的查询语句在开始之前都必须经过检查,即使这条SQL语句永远不会命中缓存
  • 如果查询结果可以被缓存,那么执行完成后,会将结果存入缓存,也会带来额外的系统消耗
  • 并不是什么情况下查询缓存都会提高系统性能,缓存和失效都会带来额外消耗
  • 用多个小表代替一个大表
  • 批量插入代替循环单条插入
  • 合理控制缓存空间大小
  • 可以通过SQL_CACHE和SQL_NO_CACHE来控制某个查询语句是否需要进行缓存
  • 不要轻易打开查询缓存,特别是写密集型应用
  • 将query_cache_type设置为DEMAND,这时只有加入SQL_CACHE的查询才会走缓存,其他查询则不会,这样可以非常自由地控制哪些查询需要被缓存。
  • 预处理则会根据MySQL规则进一步检查解析树是否合法。比如检查要查询的数据表和数据列是否存在等等。
  • 一条查询可以有很多种执行方式,最后都返回相应的结果。优化器的作用就是找到这其中最好的执行计划。
  • 查询当前会话的last_query_cost的值来得到其计算当前查询的成本。
  • 尝试预测一个查询使用某种执行计划时的成本
  • 有非常多的原因会导致MySQL选择错误的执行计划
  • MySQL值选择它认为成本小的,但成本小并不意味着执行时间短
  • 提前终止查询(比如:使用Limit时,查找到满足数量的结果集后会立即终止查询)
  • 找某列的最小值,如果该列有索引,只需要查找B+Tree索引最左端,反之则可以找到最大值
  • 在完成解析和优化阶段以后,MySQL会生成对应的执行计划,查询执行引擎根据执行计划给出的指令逐步执行得出结果。
  • 查询过程中的每一张表由一个handler实例表示。
  • MySQL在查询优化阶段就为每一张表创建了一个handler实例,优化器可以根据这些实例的接口来获取表的相关信息
  • 即使查询不到数据,MySQL仍然会返回这个查询的相关信息,比如改查询影响到的行数以及执行时间等等。
  • 结果集返回客户端是一个增量且逐步返回的过程
  • 整型就比字符操作代价低,因而会使用整型来存储ip地址,使用DATETIME来存储时间,而不是使用字符串。
  • 如果计划在列上创建索引,就应该将该列设置为NOT NULL
  • UNSIGNED表示不允许负值,大致可以使正数的上限提高一倍。
  • 没有太大的必要使用DECIMAL数据类型。即使是在需要存储财务数据时,仍然可以使用BIGINT
  • 大表ALTER TABLE非常耗时
  • MySQL执行大部分修改表结果操作的方法是用新的结构创建一个张空表,从旧表中查出所有的数据插入新表,然后再删除旧表
  • 索引是提高MySQL查询性能的一个重要途径,但过多的索引可能会导致过高的磁盘使用率以及过高的内存占用,从而影响应用程序的整体性能。
  • 索引是指B-Tree索引,它是目前关系型数据库中查找数据最为常用和有效的索引,大多数存储引擎都支持这种索引
  • B+Tree中的B是指balance,意为平衡。
  • 平衡二叉树首先需要符合二叉查找树的定义,其次必须满足任何节点的两个子树的高度差不能大于1。
  • 索引往往以索引文件的形式存储的磁盘上
  • 如何减少查找过程中的I/O存取次数?
  • 减少树的深度,将二叉树变为m叉树(多路搜索树),而B+Tree就是一种多路搜索树。
  • 页是计算机管理存储器的逻辑块,硬件及OS往往将主存和磁盘存储区分割为连续的大小相等的块
  • B+Tree为了保持平衡,对于新插入的值需要做大量的拆分页操作,而页的拆分需要I/O操作,为了尽可能的减少页的拆分操作,B+Tree也提供了类似于平衡二叉树的旋转功能。
  • 在多数情况下,在多个列上建立独立的索引并不能提高查询性能。理由非常简单,MySQL不知道选择哪个索引的查询效率更好
  • 当出现多个索引做联合操作时(多个OR条件),对结果集的合并、排序等操作需要耗费大量的CPU和内存资源,特别是当其中的某些索引的选择性不高,需要返回合并大量数据时,查询成本更高。
  • explain时如果发现有索引合并(Extra字段出现Using union),应该好好检查一下查询和表结构是不是已经是最优的
  • 索引选择性是指不重复的索引值和数据表的总记录数的比值
  • 唯一索引的选择性是1,这时最好的索引选择性,性能也是最好的。
  • 如果一个索引包含或者说覆盖所有需要查询的字段的值,那么就没有必要再回表查询,这就称为覆盖索引。
  • 对结果集进行排序的操作
  • 按照索引顺序扫描得出的结果自然是有序的
  • 如果explain的结果中type列的值为index表示使用了索引扫描来做排序。
  • 在设计索引时,如果一个索引既能够满足排序,有满足查询,是最好的。
  • 比如有一个索引(A,B),再创建索引(A)就是冗余索引。
  • 对于非常小的表,简单的全表扫描更高效。对于中到大型的表,索引就非常有效。对于超大型的表,建立和维护索引的代价随之增长,这时候其他技术也许更有效,比如分区表。
  • explain后再提测是一种美德。
  • 如果要统计行数,直接使用COUNT(*),意义清晰,且性能更好。
  • 执行EXPLAIN并不需要真正地去执行查询,所以成本非常低。
  • 确保ON和USING字句中的列上有索引。在创建索引的时候就要考虑到关联的顺序
  • 确保任何的GROUP BY和ORDER BY中的表达式只涉及到一个表中的列,这样MySQL才有可能使用索引来优化
  • MySQL关联执行的策略非常简单,它对任何的关联都执行嵌套循环关联操作,即先在一个表中循环取出单条数据,然后在嵌套循环到下一个表中寻找匹配的行,依次下去,直到找到所有表中匹配的行为为止。然后根据各个表匹配的行,返回查询中需要的各个列。
  • 最外层的查询是根据A.xx列来查询的,A.c上如果有索引的话,整个关联查询也不会使用。再看内层的查询,很明显B.c上如果有索引的话,能够加速查询,因此只需要在关联顺序中的第二张表的相应列上创建索引即可。
  • MySQL处理UNION的策略是先创建临时表,然后再把各个查询结果插入到临时表中,最后再来做查询。
  • 要使用UNION ALL,如果没有ALL关键字,MySQL会给临时表加上DISTINCT选项,这会导致整个临时表的数据做唯一性检查,这样做的代价非常高。
  • 尽可能不要使用存储过程
張 旭

MySQL :: MySQL 5.7 Reference Manual :: 20.2 Introducing InnoDB Cluster - 0 views

  • A group of MySQL servers can be configured to create a cluster using MySQL Shell
  • The cluster of servers has a single master, called the primary, which acts as the read-write master.
  • Multiple secondary servers are replicas of the master
  • ...6 more annotations...
  • A client application is connected to the primary via MySQL Router
  • MySQL Shell also requires Python 2.7 and above to run cluster provisioning scripts
  • AdminAPI, which enables you to create and administer an InnoDB cluster, using either JavaScript or Python scripting
  • Caches the metadata of the InnoDB cluster and performs high availability routing to the MySQL Server instances which make up the cluster
  • Group Replication mechanism to allow data to be replicated from the primary to the secondaries in the cluster
  • AdminAPI is available as of MySQL Shell 1.0.8.
張 旭

MySQL :: MySQL 5.7 Reference Manual :: 20.4 Getting Started with InnoDB Cluster - 0 views

  • InnoDB cluster instances are created and managed through the MySQL Shell.
  • To create a new InnoDB cluster, the MySQL Shell must be connected to the MySQL Server instance. By default, this MySQL Server instance is the seed instance of the new InnoDB cluster and hold the initial data set.
  • Sandbox instance are only suitable for deploying and running on your local machine.
  • ...3 more annotations...
  • A minimum of three instances are required to create an InnoDB cluster
  • reverts to read-only mode
  • MySQL Shell provides two scripting languages: JavaScript and Python.
crazylion lee

Overview of Different MySQL Replication Solutions - Percona Database Performance Blog - 0 views

  •  
    "In this blog post, I will review some of the MySQL replication concepts that are part of the MySQL environment (and Percona Server for MySQL specifically). I will also try to clarify some of the misconceptions people have about replication. Since I've been working on the Solution Engineering team, I've noticed that - although information is plentiful - replication is often misunderstood or incompletely understood."
張 旭

MySQL cluster vs Galera - How to make the right choice - 0 views

  • there is no “one size fits all” solution when coming to database clustering.
  • MySQL cluster contains the data nodes that store the cluster data and management node that store the cluster’s configuration.
  • MySQL clients first communicate with the management node and then connect directly to these data nodes.
  • ...13 more annotations...
  • For synchronization of data in the data nodes, MySQL cluster uses a special data engine called NDB (Network Database).
  • it uses automatic shrading aka splitting of a large database into small units.
  • MySQL cluster avoids single point failure and ensures 99.99% availability.
  • MySQL cluster can provide a response time as low as less than 3 ms.
  • Galera Cluster consists of a database server and uses the Galera Replication Plugin to manage replication.
  • a multi-master database cluster that supports synchronous replication.
  • it provides multiple, up-to-date copies of the data.
  • there is a need for instant fail-over.
  • Galera cluster allows the read and write of data in any node.
  • Galera cluster include guaranteed write consistency, automatic node provisioning, etc.
  • Upon restoring the connection, the separated nodes will sync back and rejoin the cluster automatically.
  • there is no need to have management node like MySQL cluster.
  • it gives best results with the InnoDB storage engine.
張 旭

php - CakePHP: No such file or directory (trying to connect via unix:///var/mysql/mysql... - 0 views

  •  
    'port' => '/path/to/mysql.sock'
  •  
    'port' => '/private/tmp/mysql.sock'
張 旭

How to Benchmark Performance of MySQL & MariaDB Using SysBench | Severalnines - 1 views

  • SysBench is a C binary which uses LUA scripts to execute benchmarks
  • support for parallelization in the LUA scripts, multiple queries can be executed in parallel
  • by default, benchmarks which cover most of the cases - OLTP workloads, read-only or read-write, primary key lookups and primary key updates.
  • ...21 more annotations...
  • SysBench is not a tool which you can use to tune configurations of your MySQL servers (unless you prepared LUA scripts with custom workload or your workload happen to be very similar to the benchmark workloads that SysBench comes with)
  • it is great for is to compare performance of different hardware.
  • Every new server acquired should go through a warm-up period during which you will stress it to pinpoint potential hardware defects
  • by executing OLTP workload which overloads the server, or you can also use dedicated benchmarks for CPU, disk and memory.
  • bulk_insert.lua. This test can be used to benchmark the ability of MySQL to perform multi-row inserts.
  • All oltp_* scripts share a common table structure. First two of them (oltp_delete.lua and oltp_insert.lua) execute single DELETE and INSERT statements.
  • oltp_point_select, oltp_update_index and oltp_update_non_index. These will execute a subset of queries - primary key-based selects, index-based updates and non-index-based updates.
  • you can run different workload patterns using the same benchmark.
  • Warmup helps to identify “regular” throughput by executing benchmark for a predefined time, allowing to warm up the cache, buffer pools etc.
  • By default SysBench will attempt to execute queries as fast as possible. To simulate slower traffic this option may be used. You can define here how many transactions should be executed per second.
  • SysBench gives you ability to generate different types of data distribution.
  • decide if SysBench should use prepared statements (as long as they are available in the given datastore - for MySQL it means PS will be enabled by default) or not.
  • sysbench ./sysbench/src/lua/oltp_read_write.lua  help
  • By default, SysBench will attempt to execute queries in explicit transaction. This way the dataset will stay consistent and not affected: SysBench will, for example, execute INSERT and DELETE on the same row, making sure the data set will not grow (impacting your ability to reproduce results).
  • specify error codes from MySQL which SysBench should ignore (and not kill the connection).
  • the two most popular benchmarks - OLTP read only and OLTP read/write.
  • 1 million rows will result in ~240 MB of data. Ten tables, 1000 000 rows each equals to 2.4GB
  • by default, SysBench looks for ‘sbtest’ schema which has to exist before you prepare the data set. You may have to create it manually.
  • pass ‘--histogram’ argument to SysBench
  • ~48GB of data (20 tables, 10 000 000 rows each).
  • if you don’t understand why the performance was like it was, you may draw incorrect conclusions out of the benchmarks.
張 旭

Two Timestamp Columns in MySQL - Jason Bosco - 0 views

  •  
    MySQL 自動產生「建立時間」跟「更新時間」
張 旭

MySQL :: MySQL 5.7 Reference Manual :: 19.1 Group Replication Background - 0 views

  • the component can be removed and the system should continue to operate as expected
  • network partitioning
  • split brain scenarios
  • ...8 more annotations...
  • the ultimate challenge is to fuse the logic of the database and data replication with the logic of having several servers coordinated in a consistent and simple way
  • MySQL Group Replication provides distributed state machine replication with strong coordination between servers.
  • Servers coordinate themselves automatically when they are part of the same group
  • The group can operate in a single-primary mode with automatic primary election, where only one server accepts updates at a time.
  • For a transaction to commit, the majority of the group have to agree on the order of a given transaction in the global sequence of transactions
  • Deciding to commit or abort a transaction is done by each server individually, but all servers make the same decision
  • group communication protocols
  • the Paxos algorithm. It acts as the group communication systems engine.
張 旭

MySQL :: MySQL 5.7 Reference Manual :: 19.1.1.2 Group Replication - 0 views

  • The replication group is a set of servers that interact with each other through message passing.
  • The communication layer provides a set of guarantees such as atomic message and total order message delivery.
  • a multi-master update everywhere replication protocol
  • ...8 more annotations...
  • a replication group is formed by multiple servers and each server in the group may execute transactions independently
  • Read-only (RO) transactions need no coordination within the group and thus commit immediately
  • any RW transaction the group needs to decide whether it commits or not, thus the commit operation is not a unilateral decision from the originating server
  • when a transaction is ready to commit at the originating server, the server atomically broadcasts the write values (rows changed) and the correspondent write set (unique identifiers of the rows that were updated). Then a global total order is established for that transaction.
  • all servers receive the same set of transactions in the same order
  • The resolution procedure states that the transaction that was ordered first commits on all servers, whereas the transaction ordered second aborts, and thus is rolled back on the originating server and dropped by the other servers in the group. This is in fact a distributed first commit wins rule
  • Group Replication is a shared-nothing replication scheme where each server has its own entire copy of the data
  • MySQL Group Replication protocol
crazylion lee

youtube/vitess: vitess provides servers and tools which facilitate scaling of MySQL dat... - 0 views

  •  
    "vitess provides servers and tools which facilitate scaling of MySQL databases for large scale web services."
張 旭

单表60亿记录等大数据场景的MySQL优化和运维之道 - 快课网 - 0 views

  • 存储引擎使用InnoDB
  • 变长字符串尽量使用varchar varbinary
  • 不在数据库中存储图片、文件等
  • ...34 more annotations...
  • 库名、表名、字段名、索引名使用小写字母,以下划线分割 ,需要见名知意
  • 所有字段均定义为NOT NULL ,除非你真的想存Null
  • 使用TIMESTAMP存储时间
  • 使用DECIMAL存储精确浮点数,用float有的时候会有问题
  • 单个索引字段数不超过5,单表索引数量不超过5,索引设计遵循B+ Tree索引最左前缀匹配原则
  • 建立的索引能覆盖80%主要的查询,不求全,解决问题的主要矛盾
  • 避免冗余索引
  • 索引这个东西是一把双刃剑,在加速读的同时也引入了很多额外的写入和锁,降低写入能力
  • 字段定义为varchar,但传入的值是个int,就会导致全表扫描,要求程序端要做好类型检查
  • 避免使用大表的JOIN,MySQL优化器对join优化策略过于简单
  • UPDATE、DELETE语句不使用LIMIT ,容易造成主从不一致
  • 高危操作检查,Drop前做好数据备份
  • 日志分析,主要是指的MySQL慢日志和错误日志
  • Percona公司根据Facebook OSC思路,用perl重写了一版,就是我们现在用得很多的pt-online-schema-change,软件本身非常成熟,支持目前主流版本
  • 原生主从同步肯定存在着性能和安全性问题
  • Sharding is very complex, so itʼs best not to shard until itʼs obvious that you will actually need to!
  • 有中间层控制拆分逻辑最好,否则拆分过细管理成本会很高
  • 全量binlog备份
  • xtrabackup热备
  • 采用分布式文件系统存储备份
  • 基于库级别的复制,所以如果你只有一个库,使用这个意义不大
  • 半同步复制,从5.5开始支持
  • 半同步通过从库返回ACK这种方式确认从库收到数据
  • Secondsbehindmaster来判断延时不可靠,在网络抖动或者一些特殊参数配置情况下,会造成这个值是0但其实延时很大了。通过heartbeat表插入时间戳这种机制判断延时是更靠谱的
  • Binlog格式,建议都采用row格式,数据一致性更好
  • 成熟开源事务存储引擎,支持ACID,支持事务四个隔离级别,更好的数据安全性,高性能高并发,MVCC,细粒度锁,支持O_DIRECT
  • 数据安全性至关重要,InnoDB完胜
  • 主流使用TokuDB主要是看中了它的高压缩比
  • TokuDB在测试过程中写入稳定性是非常好的
  • 单表容量在InnoDB下1TB+,使用Tokudb的lzma压缩到80GB
  • 独立写程序好一些,与程序解耦方便后期维护
  • 追踪字段值变化可以通过分析row格式binlog好一些
  • 解决了单表过大恢复时间问题,也支持online DDL
  • 物理备份采用xtrabackup热备方案比较好
張 旭

Home · sysown/proxysql Wiki - 0 views

  • bear in mind that the best way to configure ProxySQL is through its admin interface.
  • llow you to control the list of the backend servers, how traffic is routed to them, and other important settings (such as caching, access control, etc)
  • Once you've made modifications to the in-memory data structure, you must load the new configuration to the runtime, or persist the new settings to disk
  • ...4 more annotations...
  • mysql_variables: contains global variables that control the functionality for handling the incoming MySQL traffic.
  • mysql_users: contains rows for the mysql_users table from the admin interface. Basically, these define the users which can connect to the proxy, and the users with which the proxy can connect to the backend servers.
  • mysql_servers: contains rows for the mysql_servers table from the admin interface. Basically, these define the backend servers towards which the incoming MySQL traffic is routed.
  • mysql_query_rules: contains rows for the mysql_query_rules table from the admin interface. Basically, these define the rules used to classify and route the incoming MySQL traffic, according to various criteria (patterns matched, user used to run the query, etc.).
張 旭

MySQL 到底能不能放到 Docker 里跑? - 0 views

  • 忙碌又容易出错的工作其实是无意义的
  • 单机多实例运行 MySQL
  • MySQL 运行的就是个进程而且对 IO 要求比较高
  • ...12 more annotations...
  • Docker 的资源限制用的就是 cgroups
  • Percona:我们的备份、慢日志分析、过载保护等功能都是基于 pt-tools 工具包来实现的。
  • Consul:分布式的服务发现和配置共享软件
  • 容器调度的开源产品主要有 Kubernetes 和 mesos
  • 适合自己现状的需求才是最好的
  • 有机会做到计算调度和存储调度分离的情况下我们可能会转向 Kubernetes 的方案
  • 根据这个需求按照我们的资源筛选规则 (比如主从不能在同一台机器、内存配置不允许超卖等等),从现有的资源池中匹配出可用资源,然后依次创建主从关系、创建高可用管理、检查集群复制状态、推送集群信息到中间件 (选用了中间件的情况下) 控制中心、最后将以上相关信息都同步到 CMDB。
    • 張 旭
       
      感覺用 K8S 就不用那麼麻煩了。
  • 每一个工作都是通过服务端发送消息到 agent,然后由 agent 执行对应的脚本,脚本会返回指定格式的执行结果
  • 备份工具我们是用 percona-xtrabackup
  • zabbix 来实现监控告警
  • grafana 是监控画图界的扛把子,功能齐全的度量仪表盘和图形编辑器,经过简单配置就能完成各种监控图形的展示。
  • (MariaDB 不支持写 table,只能写 file),极大减少了从库复制带来的 IOPS。
張 旭

MySQL :: MySQL 5.7 Reference Manual :: 19.2.1.2 Configuring an Instance for Group Repli... - 0 views

  • store replication metadata in system tables instead of files
  • collect the write set and encode it as a hash using the XXHASH64 hashing algorithm
  • not start operations automatically when the server starts
  • ...10 more annotations...
  • for incoming connections from other members in the group
  • The server listens on this port for member-to-member connections. This port must not be used for user applications at all
  • The loose- prefix used for the group_replication variables above instructs the server to continue to start if the Group Replication plugin has not been loaded at the time the server is started.
  • For example, if each server instance is on a different machine use the IP and port of the machine, such as 10.0.0.1:33061. The recommended port for group_replication_local_address is 33061
  • does not need to list all members in the group
  • The server that starts the group does not make use of this option, since it is the initial server and as such, it is in charge of bootstrapping the group
  • start the bootstrap member first, and let it create the group
  • Creating a group and joining multiple members at the same time is not supported.
  • must only be used on one server instance at any time
  • Disable this option after the first server instance comes online
張 旭

The Backup Cycle - Full Backups - 0 views

  • xtrabackup will not overwrite existing files, it will fail with operating system error 17, file exists.
  • Log copying thread checks the transactional log every second to see if there were any new log records written that need to be copied, but there is a chance that the log copying thread might not be able to keep up with the amount of writes that go to the transactional logs, and will hit an error when the log records are overwritten before they could be read.
  • It is safe to cancel at any time, because xtrabackup does not modify the database.
  • ...15 more annotations...
  • need to prepare it in order to restore it.
  • Data files are not point-in-time consistent until they are prepared, because they were copied at different times as the program ran, and they might have been changed while this was happening.
  • You can run the prepare operation on any machine; it does not need to be on the originating server or the server to which you intend to restore.
  • you simply run xtrabackup with the --prepare option and tell it which directory to prepare,
  • All following prepares will not change the already prepared data files
  • It is not recommended to interrupt xtrabackup process while preparing backup
  • Backup validity is not guaranteed if prepare process was interrupted.
  • If you intend the backup to be the basis for further incremental backups, you should use the --apply-log-only option when preparing the backup, or you will not be able to apply incremental backups to it.
  • Backup needs to be prepared before it can be restored.
  • xtrabackup --copy-back --target-dir=/data/backups/
  • The datadir must be empty before restoring the backup.
  • MySQL server needs to be shut down before restore is performed.
  • You cannot restore to a datadir of a running mysqld instance (except when importing a partial backup).
  • rsync -avrP /data/backup/ /var/lib/mysql/
  • chown -R mysql:mysql /var/lib/mysql
crazylion lee

soundcloud/lhm - 1 views

  •  
    "Online MySQL schema migrations" # 可以不用lock table
1 - 20 of 53 Next › Last »
Showing 20 items per page