I recently ran into a case where, in a specific scenario, parallel replication turned out to be slower than single-threaded replication.
Symptoms
Starting from a certain point in time, the replica's replication lag kept growing, with no sign of coming back down.
Database version: 8.0.40; transaction isolation level: RC (Read Committed); parallel applier threads (replica_parallel_workers): 8.
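As a quick sanity check, these settings can be confirmed directly on the replica; the following is a minimal sketch using MySQL 8.0 variable names:
-- Minimal sketch: confirm the replica configuration described above.
SELECT @@version,
       @@transaction_isolation,          -- READ-COMMITTED in this case
       @@replica_parallel_workers,       -- 8 in this case
       @@replica_parallel_type,          -- LOGICAL_CLOCK, required for WRITESET-based parallelism
       @@replica_preserve_commit_order;  -- ON by default since 8.0.27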
Analysis
Checking SHOW SLAVE STATUS\G showed that Relay_Master_Log_File and Exec_Master_Log_Pos were both advancing, just slowly.
My first suspicion was heavy write volume on the source. However, analyzing the binlogs with mysql-binlog-time-extractor (for usage, see the earlier post introducing this MySQL binlog analysis tool) showed that write volume was not particularly high when the lag first appeared (2025-09-01 09:30), while the period with the heaviest writes (2025-09-01 04:57:53 - 2025-09-01 05:02:42) showed no lag at all.
+--------------------+------------------------+---------------------+---------------------+-----------+------------------------------------+
| Log_name | File_size | Start_time | End_time | Duration | GTID |
+--------------------+------------------------+---------------------+---------------------+-----------+------------------------------------+
| binary-log.005565 | 1302499830 (1.21 GB) | 2025-09-01 04:57:53 | 2025-09-01 04:58:22 | 00:00:29 | 1284696693-1284699126 |
| binary-log.005566 | 1105002721 (1.03 GB) | 2025-09-01 04:58:22 | 2025-09-01 04:58:23 | 00:00:01 | 1284699127-1284699312 |
| binary-log.005567 | 1273545902 (1.19 GB) | 2025-09-01 04:58:23 | 2025-09-01 05:02:33 | 00:04:10 | 1284699313-1284728539 |
| binary-log.005568 | 1287820910 (1.20 GB) | 2025-09-01 05:02:33 | 2025-09-01 05:02:42 | 00:00:09 | 1284728540-1284729282 |
...
| binary-log.005633 | 58514304 (55.80 MB) | 2025-09-01 09:12:53 | 2025-09-01 09:17:53 | 00:05:00 | 1286735216-1286786118 |
| binary-log.005634 | 58955596 (56.22 MB) | 2025-09-01 09:17:53 | 2025-09-01 09:22:53 | 00:05:00 | 1286786119-1286834568 |
| binary-log.005635 | 71508778 (68.20 MB) | 2025-09-01 09:22:53 | 2025-09-01 09:27:53 | 00:05:00 | 1286834569-1286880281 |
| binary-log.005636 | 107107179 (102.15 MB) | 2025-09-01 09:27:53 | 2025-09-01 09:32:53 | 00:05:00 | 1286880282-1286942223 |
| binary-log.005637 | 530205055 (505.64 MB) | 2025-09-01 09:32:53 | 2025-09-01 09:37:53 | 00:05:00 | 1286942224-1287246612 |
| binary-log.005638 | 546754562 (521.43 MB) | 2025-09-01 09:37:53 | 2025-09-01 09:42:53 | 00:05:00 | 1287246613-1287562930 |
| binary-log.005639 | 528677634 (504.19 MB) | 2025-09-01 09:42:53 | 2025-09-01 09:47:53 | 00:05:00 | 1287562931-1287868985 |
+--------------------+------------------------+---------------------+---------------------+-----------+------------------------------------+
The instance's error log contained a large number of lock wait timeout errors.
Note that this instance runs at the RC isolation level, under which MySQL normally takes only record locks. In addition, WRITESET-based parallel replication is enabled, so MySQL decides whether transactions can be applied in parallel based on the primary keys and unique indexes they modify: if two transactions conflict on a primary key or unique index, they will not be replayed in parallel. In theory, with this combination, lock wait timeouts should not occur during replay on the replica.
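For reference, the source-side settings that enable WRITESET dependency tracking can be checked as follows (a hedged sketch; variable names as in MySQL 8.0):
-- On the source: WRITESET-based dependency tracking for parallel replication.
SHOW GLOBAL VARIABLES LIKE 'binlog_transaction_dependency_tracking';  -- expected: WRITESET
SHOW GLOBAL VARIABLES LIKE 'transaction_write_set_extraction';        -- expected: XXHASH64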
I then used binlog_summary.py (for usage, see the earlier post on binlog_summary.py) to analyze the four binlog files covering the start of the lag (binary-log.005636 ~ binary-log.005639). Their workload patterns turned out to be very similar: the two most frequent operations were DELETE and INSERT on the same table, biz_schema.tbl_product_service_mapping01.
# python3 binlog_summary.py -f binary-log.005636.txt -c opr --new
TABLE_NAME DML_TYPE NUMS
biz_schema.tbl_product_service_mapping01 INSERT 71271
biz_schema.tbl_product_service_mapping01 DELETE 67434
...
A quick test script showed that, for any given unique key value, the INSERT always appears after the corresponding DELETE. So I wrote a script to extract the rows touched by the DELETE operations and load them into a test instance, and then replayed the relevant binlogs as relay logs.
To rule out interference from other tables, I set replicate-do-table = biz_schema.tbl_product_service_mapping01 for the replay, so that only this one table would be applied.
Here are the replay steps:
1. Initialize the relay log:
CHANGE MASTER TO MASTER_HOST='dummy';
STOP SLAVE;
RESET SLAVE ALL;
After running these commands, MySQL creates two files in the data directory:
- instance-20250903-0701-relay-bin.000001 (the first relay log file)
- instance-20250903-0701-relay-bin.index (the relay log index file)
Here, instance-20250903-0701 is the hostname.
2. Replace the relay log:
Replace instance-20250903-0701-relay-bin.000001 with binary-log.005636 and fix the file's ownership.
# cp binary-log.005636 /data/mysql/3306/data/instance-20250903-0701-relay-bin.000001
# chown mysql.mysql /data/mysql/3306/data/instance-20250903-0701-relay-bin.000001
3. Start the SQL thread to replay:
CHANGE MASTER TO RELAY_LOG_FILE='instance-20250903-0701-relay-bin.000001', RELAY_LOG_POS=1, MASTER_HOST='dummy';
START SLAVE SQL_THREAD;
The replay worked: it ran to completion without the replication breaking with an error.
I ran the test three times; the replays took 362.74s, 352.69s, and 361.75s, an average of 359.06 seconds.
During each replay, however, the error log recorded multiple lock wait timeout warnings.
2025-09-21T07:53:45.257279-00:00 260 [Warning] [MY-010584] [Repl] Slave SQL for channel '': Worker 5 failed executing transaction '9206ff59-2d95-4a02-88cf-04d97adfdd65:1286917678' at master log , end_log_pos 63251784; Could not execute Write_rows event on table biz_schema.tbl_product_service_mapping01; Lock wait timeout exceeded; try restarting transaction, Error_code: 1205; handler error HA_ERR_LOCK_WAIT_TIMEOUT; the event's master log FIRST, end_log_pos 63251784, Error_code: MY-001205
Clearly, the lock wait timeouts were caused by parallel replay.
Single-threaded replay should avoid the problem, so I set replica_parallel_workers to 1 and repeated the same test: the three replays took 82.39s, 83.40s, and 83.43s, an average of only 83.07 seconds.
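Switching the applier to single-threaded replay for the test is straightforward; a minimal sketch (using the same SLAVE syntax as above):
STOP SLAVE SQL_THREAD;
SET GLOBAL replica_parallel_workers = 1;
START SLAVE SQL_THREAD;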
Unexpectedly, single-threaded replay turned out to be more than four times faster than multi-threaded replay.
Next, let's focus on the lock wait timeouts.
Why did the lock waits happen?
Here is the output of sys.innodb_lock_waits captured while a lock wait was in progress:
mysql> select * from sys.innodb_lock_waits\G
*************************** 1. row ***************************
wait_started: 2025-10-12 13:11:29
wait_age: 08:00:33
wait_age_secs: 28833
locked_table: `biz_schema`.`tbl_product_service_mapping01`
locked_table_schema: biz_schema
locked_table_name: tbl_product_service_mapping01
locked_table_partition: NULL
locked_table_subpartition: NULL
locked_index: tbl_product_service_pk
locked_type: RECORD
waiting_trx_id: 5221288
waiting_trx_started: 2025-10-12 13:11:22
waiting_trx_age: 08:00:40
waiting_trx_rows_locked: 35
waiting_trx_rows_modified: 34
waiting_pid: 10
waiting_query: INSERT IGNORE INTO tbl_product ... (10512475, 1073743289) ,
waiting_lock_id: 140256432120808:10011:67:240:263:140256317861168
waiting_lock_mode: X,GAP,INSERT_INTENTION
blocking_trx_id: 5221291
blocking_pid: 14
blocking_query: INSERT IGNORE INTO tbl_product ... (10512476, 1073743289)
blocking_lock_id: 140256432125848:9282:67:240:263:140256317891856
blocking_lock_mode: S,GAP
blocking_trx_started: 2025-10-12 13:11:22
blocking_trx_age: 08:00:40
blocking_trx_rows_locked: 35
blocking_trx_rows_modified: 34
sql_kill_blocking_query: KILL QUERY 14
sql_kill_blocking_connection: KILL 14
*************************** 2. row ***************************
wait_started: 2025-10-12 13:11:29
wait_age: 08:00:33
wait_age_secs: 28833
locked_table: `biz_schema`.`tbl_product_service_mapping01`
locked_table_schema: biz_schema
locked_table_name: tbl_product_service_mapping01
locked_table_partition: NULL
locked_table_subpartition: NULL
locked_index: tbl_product_service_pk
locked_type: RECORD
waiting_trx_id: 5221291
waiting_trx_started: 2025-10-12 13:11:22
waiting_trx_age: 08:00:40
waiting_trx_rows_locked: 35
waiting_trx_rows_modified: 34
waiting_pid: 14
waiting_query: INSERT IGNORE INTO tbl_product ... (10512476, 1073743289)
waiting_lock_id: 140256432125848:9282:67:240:260:140256317892560
waiting_lock_mode: X,GAP,INSERT_INTENTION
blocking_trx_id: 5221289
blocking_pid: 12
blocking_query: NULL
blocking_lock_id: 140256432123832:9816:67:240:260:140256317879376
blocking_lock_mode: S,GAP
blocking_trx_started: 2025-10-12 13:11:22
blocking_trx_age: 08:00:40
blocking_trx_rows_locked: 34
blocking_trx_rows_modified: 33
sql_kill_blocking_query: KILL QUERY 12
sql_kill_blocking_connection: KILL 12
2 rows in set (0.01 sec)
From this output:
- PID 10's INSERT is blocked by an S,GAP lock held by PID 14.
- PID 14's INSERT is in turn blocked by an S,GAP lock held by PID 12.
performance_schema.data_locks provides more detailed lock information, including the rows being locked:
SELECT
w.REQUESTING_ENGINE_TRANSACTION_ID AS waiting_trx_id,
w.BLOCKING_ENGINE_TRANSACTION_ID AS blocking_trx_id,
l1.LOCK_MODE AS waiting_lock_mode,
l1.LOCK_DATA AS waiting_lock_data,
l2.LOCK_MODE AS blocking_lock_mode,
l2.LOCK_DATA AS blocking_lock_data
FROM performance_schema.data_lock_waits AS w
JOIN performance_schema.data_locks AS l1
ON w.REQUESTING_ENGINE_LOCK_ID = l1.ENGINE_LOCK_ID
JOIN performance_schema.data_locks AS l2
ON w.BLOCKING_ENGINE_LOCK_ID = l2.ENGINE_LOCK_ID;
+----------------+-----------------+------------------------+--------------------------+--------------------+--------------------------+
| waiting_trx_id | blocking_trx_id | waiting_lock_mode | waiting_lock_data | blocking_lock_mode | blocking_lock_data |
+----------------+-----------------+------------------------+--------------------------+--------------------+--------------------------+
| 5221288 | 5221291 | X,GAP,INSERT_INTENTION | 10512476, 1, 18158557178 | S,GAP | 10512476, 1, 18158557178 |
| 5221291 | 5221289 | X,GAP,INSERT_INTENTION | 10512477, 1, 18158557146 | S,GAP | 10512477, 1, 18158557146 |
+----------------+-----------------+------------------------+--------------------------+--------------------+--------------------------+
2 rows in set (0.00 sec)
Next, let's look at the state of threads PID 10, 14, and 12 in the SHOW PROCESSLIST output.
mysql> show processlist;
+----+-----------------+-----------+-----------+---------+---------+---------------------------------------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+-----------+-----------+---------+---------+---------------------------------------------+------------------------------------------------------------------------------------------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 1813 | Waiting on empty queue | NULL |
| 8 | root | localhost | biz_schema | Query | 0 | init | show processlist |
| 9 | system user | | NULL | Query | 1588 | Waiting for dependent transaction to commit | NULL |
| 10 | system user | | biz_schema | Query | 3585949 | Applying batch of row changes (write) | INSERT IGNORE INTO tbl_product_service_mapping01
(
c1,
c2 |
| 11 | system user | | NULL | Query | 3585949 | Waiting for preceding transaction to commit | NULL |
| 12 | system user | | NULL | Query | 3585949 | Waiting for preceding transaction to commit | NULL |
| 13 | system user | | NULL | Query | 3585949 | Waiting for preceding transaction to commit | NULL |
| 14 | system user | | biz_schema | Query | 3585949 | Applying batch of row changes (write) | INSERT IGNORE INTO tbl_product_service_mapping01
(
c1,
c2 |
| 15 | system user | | NULL | Query | 3585949 | Waiting for preceding transaction to commit | NULL |
| 16 | system user | | NULL | Query | 3585949 | Waiting for an event from Coordinator | NULL |
| 17 | system user | | NULL | Query | 3585949 | Waiting for an event from Coordinator | NULL |
+----+-----------------+-----------+-----------+---------+---------+---------------------------------------------+------------------------------------------------------------------------------------------------------+
11 rows in set, 1 warning (0.00 sec)
PID 12 runs as system user, which tells us it is a parallel-replication worker thread. Its state is Waiting for preceding transaction to commit, meaning it is waiting for the transaction ahead of it to finish committing.
PID 10 and PID 14 are both in the state Applying batch of row changes (write). On the face of it, that suggests they are busy applying batches of row changes, but in reality both threads are waiting for locks.
With SHOW FULL PROCESSLIST, the INSERT IGNORE statements in the Info column also show the specific unique key values being inserted. Those values make it possible to pinpoint the corresponding positions in the binlog, which helps in working out the transaction ordering and the lock waits.
For PID 12, however, the Info column is NULL, so there is no direct way to see which DML it is executing.
To work around this, I added log statements in the Slave_worker::slave_worker_exec_event function, immediately before and after the call to ev->do_apply_event_worker(this). With that in place, the binlog position of the event each worker thread is executing becomes visible.
int Slave_worker::slave_worker_exec_event(Log_event *ev) {
...
  ulong thread_id = thd->thread_id();
  ulong log_pos = static_cast<ulong>(ev->common_header->log_pos);
  // Log the worker thread id and the event's position before applying the event.
  std::string msg = "Executing event: worker_thread_id=" +
                    std::to_string(thread_id) +
                    ", master_log_pos=" +
                    std::to_string(log_pos);
  LogErr(INFORMATION_LEVEL, ER_CONDITIONAL_DEBUG, msg.c_str());
  ret = ev->do_apply_event_worker(this);
  // Log again once the event has been applied.
  msg = "Done executing event: worker_thread_id=" +
        std::to_string(thread_id) +
        ", master_log_pos=" +
        std::to_string(log_pos);
  LogErr(INFORMATION_LEVEL, ER_CONDITIONAL_DEBUG, msg.c_str());
  return ret;
}
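As an aside (not what was used in this analysis), performance_schema also exposes per-worker applier status; the following hedged sketch shows the transaction each worker is currently applying and can complement this kind of extra logging:
SELECT WORKER_ID,
       THREAD_ID,
       APPLYING_TRANSACTION,                        -- GTID of the transaction being applied
       APPLYING_TRANSACTION_START_APPLY_TIMESTAMP
FROM performance_schema.replication_applier_status_by_worker;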
Below are the binlog event positions that PID 10, 12, and 14 were executing when the lock wait occurred:
# PID 10
grep 'worker_thread_id=10' /data/mysql/3306/data/mysqld.err | tail -1
2025-10-12T13:11:22.639120-00:00 10 [Note] [MY-013935] [Repl] Executing event: worker_thread_id=10, master_log_pos=63245428
# PID 12
grep 'worker_thread_id=12' /data/mysql/3306/data/mysqld.err | tail -1
2025-10-12T13:11:22.725638-00:00 12 [Note] [MY-013935] [Repl] Executing event: worker_thread_id=12, master_log_pos=63248672
# PID 14
grep 'worker_thread_id=14' /data/mysql/3306/data/mysqld.err | tail -1
2025-10-12T13:11:22.646870-00:00 14 [Note] [MY-013935] [Repl] Executing event: worker_thread_id=14, master_log_pos=63251784
As you can see, the transaction handled by PID 10 appears earlier in the binlog than the one handled by PID 12.
With replica_preserve_commit_order set to ON, the replica must commit transactions in exactly the same order as the source, so PID 12 cannot proceed until PID 10 has committed.
Combined with the lock dependencies, this forms a circular wait:
- PID 10 waits for the S,GAP lock held by PID 14;
- PID 14 waits for the S,GAP lock held by PID 12;
- PID 12, constrained by commit ordering, must wait for PID 10 to commit.
This cycle leaves the three threads blocking one another until the lock wait times out, at which point MySQL retries the transactions.
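How long each such stall lasts, and how many times the applier will retry, is governed by two settings; a hedged way to check them (the values in the comments are the MySQL 8.0 defaults):
SELECT @@innodb_lock_wait_timeout,     -- 50 seconds per lock wait by default
       @@replica_transaction_retries;  -- 10 retries by default before the applier stops with an error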
Reconstructing the replayed operations
From the event positions recorded for PID 10, 12, and 14, we can reconstruct what each of the three threads was doing when the lock wait occurred:
worker_thread_id=10 master_log_pos=63245428: insert all rows with c1 = 10512475, e.g. (10512475,1), (10512475,20)...
worker_thread_id=12 master_log_pos=63248672: insert all rows with c1 = 10512477, e.g. (10512477,1), (10512477,20)...
worker_thread_id=14 master_log_pos=63251784: insert all rows with c1 = 10512476, e.g. (10512476,1), (10512476,20)...
The values shown correspond to the table's composite unique index, with c1 as its first column.
Notably, earlier in the binlog there are DELETE operations on the same c1 values, executed before these INSERTs:
delete all rows with c1 = 10512475, e.g. (10512475,1), (10512475,20)...
delete all rows with c1 = 10512477, e.g. (10512477,1), (10512477,20)...
delete all rows with c1 = 10512476, e.g. (10512476,1), (10512476,20)...
In other words, the application updates data via DELETE + INSERT.
To dig further into the lock waits, I also logged the content of every row inserted during the replay.
2025-10-12T13:11:22.639267-00:00 10 [Note] [MY-013935] [Repl] Inserted row: (18158557112, 10512475, 1, ...)
2025-10-12T13:11:22.640729-00:00 10 [Note] [MY-013935] [Repl] Inserted row: (18158557113, 10512475, 20, ...)
2025-10-12T13:11:22.642004-00:00 10 [Note] [MY-013935] [Repl] Inserted row: (18158557114, 10512475, 26, ...)
2025-10-12T13:11:22.643344-00:00 10 [Note] [MY-013935] [Repl] Inserted row: (18158557115, 10512475, 123, ...)
2025-10-12T13:11:22.644262-00:00 12 [Note] [MY-013935] [Repl] Inserted row: (18158557146, 10512477, 1, ...)
2025-10-12T13:11:22.644663-00:00 10 [Note] [MY-013935] [Repl] Inserted row: (18158557116, 10512475, 131, ...)
2025-10-12T13:11:22.646250-00:00 12 [Note] [MY-013935] [Repl] Inserted row: (18158557147, 10512477, 20, ...)
2025-10-12T13:11:22.647020-00:00 14 [Note] [MY-013935] [Repl] Inserted row: (18158557178, 10512476, 1, ...)
2025-10-12T13:11:22.647192-00:00 10 [Note] [MY-013935] [Repl] Inserted row: (18158557117, 10512475, 133, ...)
Here the first value is the auto-increment primary key, and the two values after it are the unique index columns.
The output shows that the inserts from these three transactions were interleaved.
Simulating the replica's replay
Let's reproduce the replica's replay behavior with an experiment.
First, in session 1, create a test table and insert some data.
session1> create table test.t1(id bigint auto_increment primary key,c1 int,c2 int,unique key(c1,c2));
Query OK, 0 rows affected (0.06 sec)
session1> insert into test.t1(c1,c2) values(10512475, 1),(10512475, 2),(10512476, 1),(10512476, 2),(10512477, 1),(10512477, 2);
Query OK, 6 rows affected (0.04 sec)
Records: 6 Duplicates: 0 Warnings: 0
session1> select * from test.t1;
+----+----------+------+
| id | c1 | c2 |
+----+----------+------+
| 1 | 10512475 | 1 |
| 2 | 10512475 | 2 |
| 3 | 10512476 | 1 |
| 4 | 10512476 | 2 |
| 5 | 10512477 | 1 |
| 6 | 10512477 | 2 |
+----+----------+------+
6 rows in set (0.00 sec)
Next, in session 2, run FLUSH TABLES ... FOR EXPORT on a different table. Why this step is needed is explained later, in the locking analysis.
session2> flush tables test.t2 for export;
Query OK, 0 rows affected (0.01 sec)
Then, in session 1, delete all rows from the table.
session1> delete from test.t1;
Query OK, 6 rows affected (0.02 sec)
Next, open three new sessions and run the following:
session3> begin;
Query OK, 0 rows affected (0.00 sec)
session3> insert into test.t1(c1,c2,id) values(10512475,1,100);
Query OK, 1 row affected (0.01 sec)
session4> begin;
Query OK, 0 rows affected (0.00 sec)
session4> insert into test.t1(c1,c2,id) values(10512476,1,18158557178);
Query OK, 1 row affected (0.00 sec)
session5> begin;
Query OK, 0 rows affected (0.00 sec)
session5> insert into test.t1(c1,c2,id) values(10512477,1,18158557146);
Query OK, 1 row affected (0.01 sec)
Continue by inserting more rows in sessions 3 and 4.
session3> set session innodb_lock_wait_timeout=5000;
Query OK, 0 rows affected (0.00 sec)
session3> insert into test.t1(c1,c2) values(10512475, 2);
-- blocked...
session4> set session innodb_lock_wait_timeout=5000;
Query OK, 0 rows affected (0.00 sec)
session4> insert into test.t1(c1,c2) values(10512476, 2);
-- blocked...
Then run UNLOCK TABLES in session 2 to release the table lock.
session2> unlock tables;
Query OK, 0 rows affected (0.00 sec)
Check the lock wait information from session 5.
+----------------+-----------------+------------------------+--------------------------+--------------------+--------------------------+
| waiting_trx_id | blocking_trx_id | waiting_lock_mode | waiting_lock_data | blocking_lock_mode | blocking_lock_data |
+----------------+-----------------+------------------------+--------------------------+--------------------+--------------------------+
| 23228 | 23229 | X,GAP,INSERT_INTENTION | 10512477, 1, 18158557146 | S,GAP | 10512477, 1, 18158557146 |
| 23225 | 23228 | X,GAP,INSERT_INTENTION | 10512476, 1, 18158557178 | S,GAP | 10512476, 1, 18158557178 |
+----------------+-----------------+------------------------+--------------------------+--------------------+--------------------------+
2 rows in set (0.01 sec)
As you can see, this matches the output captured during the lock waits in the replay exactly.
Locking analysis
Now for the key question: why do GAP locks appear at all under the RC (Read Committed) isolation level?
After all, the common assumption is that RC only ever takes record locks, never gap locks.
In fact, this is a direct consequence of the DELETE operations executed before the INSERTs.
In the example above, the binlog showed DELETE operations on the same records executed before the INSERTs.
In MySQL, a DELETE does not physically remove the data right away; it only marks the records as deleted (delete-marked), leaving the actual cleanup to the background purge threads. So after the logical delete, these records may still linger in the index pages for a while.
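Whether such delete-marked entries are still around when the INSERT arrives depends on how far purge has progressed; a hedged way to gauge the purge backlog is the history list length exposed in INNODB_METRICS:
-- Length of the undo history list: a rough indicator of pending purge work.
SELECT NAME, COUNT
FROM information_schema.INNODB_METRICS
WHERE NAME = 'trx_rsegs_history_len';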
When a subsequent INSERT finds that the row it is about to insert has the same unique key value as one of these delete-marked (but not yet purged) records, MySQL takes the following locks:
- an S lock on that index record;
- at the same time, an S,GAP lock on the gap next to it, i.e. the range between that record and the next one.
Let's verify this with a simplified experiment.
Experimental verification
In the replay simulation above we ran FLUSH TABLES ... FOR EXPORT.
The purpose of that step is to pause the purge threads, so that the delete-marked records are preserved and this lock behavior can be reproduced.
# Session 1: create the test table and insert data
session1> create table test.t1(id bigint auto_increment primary key,c1 int,c2 int,unique key(c1,c2));
Query OK, 0 rows affected (0.07 sec)
session1> insert into test.t1(c1,c2) values(10512476, 1),(10512476, 2);
Query OK, 2 rows affected (0.04 sec)
Records: 2 Duplicates: 0 Warnings: 0
session1> select * from test.t1;
+----+----------+------+
| id | c1 | c2 |
+----+----------+------+
| 1 | 10512476 | 1 |
| 2 | 10512476 | 2 |
+----+----------+------+
2 rows in set (0.00 sec)
# Session 2: pause the purge threads
session2> flush tables test.t2 for export;
Query OK, 0 rows affected (0.03 sec)
# Session 1: delete the data
session1> delete from test.t1;
Query OK, 2 rows affected (0.01 sec)
# Session 3: start a transaction and insert a row
session3> begin;
Query OK, 0 rows affected (0.00 sec)
session3> insert into test.t1(c1,c2,id) values(10512476,1,18158557178);
Query OK, 1 row affected (0.01 sec)
# Check the lock information
session3> select object_schema, object_name, index_name, lock_type, lock_mode, lock_status, lock_data from performance_schema.data_locks;
+---------------+-------------+------------+-----------+-----------+-------------+--------------------------+
| object_schema | object_name | index_name | lock_type | lock_mode | lock_status | lock_data |
+---------------+-------------+------------+-----------+-----------+-------------+--------------------------+
| test | t1 | NULL | TABLE | IX | GRANTED | NULL |
| test | t1 | c1 | RECORD | S | GRANTED | 10512476, 1, 1 |
| test | t1 | c1 | RECORD | S,GAP | GRANTED | 10512476, 2, 2 |
| test | t1 | c1 | RECORD | S,GAP | GRANTED | 10512476, 1, 18158557178 |
+---------------+-------------+------------+-----------+-----------+-------------+--------------------------+
4 rows in set (0.00 sec)
Explaining the lock behavior
When the row (10512476, 1, 18158557178) is inserted:
- Because (10512476, 1) still exists in the index (albeit delete-marked), MySQL takes an S lock on that record;
- At the same time, it takes an S,GAP lock on the next record, (10512476, 2), preventing other transactions from inserting into that gap;
- In addition, the newly inserted record (10512476, 1, 18158557178) inherits the GAP lock from the next record (10512476, 2).
Specifically:
- the first two locks are taken in row_ins_scan_sec_index_for_duplicate();
- the lock-inheritance logic is implemented in lock_rec_add_to_queue().
After UNLOCK TABLES, the purge threads resume, the delete-marked records are cleaned up, and the locks on them are released. Note, however, that the GAP lock on the newly inserted record itself remains:
session2> unlock tables;
Query OK, 0 rows affected (0.05 sec)
session3> select object_schema, object_name, index_name, lock_type, lock_mode, lock_status, lock_data from performance_schema.data_locks;
+---------------+-------------+------------+-----------+-----------+-------------+--------------------------+
| object_schema | object_name | index_name | lock_type | lock_mode | lock_status | lock_data |
+---------------+-------------+------------+-----------+-----------+-------------+--------------------------+
| test | t1 | NULL | TABLE | IX | GRANTED | NULL |
| test | t1 | c1 | RECORD | S,GAP | GRANTED | 10512476, 1, 18158557178 |
+---------------+-------------+------------+-----------+-----------+-------------+--------------------------+
2 rows in set (0.01 sec)
Optimizations
For the lock wait scenario analyzed above, optimizations can be made on both the application side and the database side.
1. Application-side optimizations
Application-level improvements center on index design and update logic:
- Change the unique index into an ordinary secondary index.
- Drop the auto-increment primary key and use the original unique index columns as the primary key. Unlike an ordinary unique index, a primary key insert does not take an extra S,GAP lock even when it runs into delete-marked records.
- Rework the update logic. Avoid updating data via DELETE + INSERT where possible; consider UPDATE or other changes to the business logic (see the sketch after this list) to reduce the gap locking it triggers.
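For illustration only (the table t and the payload column are made up for this sketch, with a unique key on (c1, c2)), the DELETE + INSERT pattern can often be replaced by an upsert on the unique key, which updates the existing row in place instead of re-inserting next to delete-marked entries, provided the business semantics allow it:
-- Instead of:
--   DELETE FROM t WHERE c1 = 10512476 AND c2 = 1;
--   INSERT INTO t (c1, c2, payload) VALUES (10512476, 1, 'new');
-- an upsert on the unique key (c1, c2):
INSERT INTO t (c1, c2, payload)
VALUES (10512476, 1, 'new') AS new_rows
ON DUPLICATE KEY UPDATE payload = new_rows.payload;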
2. Database-side optimizations
Set replica_preserve_commit_order to OFF. This allows the replica, when its appliers run into a wait cycle like the one above, to commit transactions independently instead of waiting on one another.
Note, however, that Group Replication requires this parameter to be ON.
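A minimal sketch of making that change on the replica (again using the SLAVE syntax used elsewhere in this post):
STOP SLAVE SQL_THREAD;
SET GLOBAL replica_preserve_commit_order = OFF;
START SLAVE SQL_THREAD;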
The table below compares replay performance on the replica under the different approaches:
| Approach | Average of three runs (seconds) |
|---|---|
| Unique index + replica_parallel_workers = 8 | 359.06 |
| Unique index + replica_parallel_workers = 1 | 83.07 |
| Ordinary secondary index + replica_parallel_workers = 8 | 33.50 |
| Unique index + replica_parallel_workers = 8 + replica_preserve_commit_order = OFF | 21.11 |
References
- https://help.aliyun.com/zh/polardb/polardb-for-mysql/resolve-the-unique-key-check-problem-in-mysql
- http://mysql.taobao.org/monthly/2015/06/02/
- https://zhuanlan.zhihu.com/p/28797400192
- https://dev.mysql.com/doc/refman/8.4/en/innodb-transaction-isolation-levels.html