
mysqld process hang in db1069 - S2 mysql instance
Closed, Resolved · Public

Description

There are several mysqld processes running on db1069, and the one running the S2 instance showed a 21-hour replication lag.
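The lag shows up in the slave status output. As a minimal sketch (this host replicates several sections, so the MariaDB multi-source form is shown; Seconds_Behind_Master and Slave_SQL_Running are the fields of interest):

-- Seconds_Behind_Master sat at ~21h while the SQL thread was applying the INSERT below
SHOW ALL SLAVES STATUS\G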

After some troubleshooting, we have observed the following:

Affected table structure:

CREATE TABLE `user_groups` (
  `ug_user` int(5) unsigned NOT NULL DEFAULT '0',
  `ug_group` varbinary(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`ug_user`,`ug_group`),
  KEY `ug_group` (`ug_group`)
) ENGINE=TokuDB DEFAULT CHARSET=binary `compression`='tokudb_zlib'

The following query has been running for a long time:

*************************** 4. row ***************************
      Id: 44
    User: system user
    Host:
      db: zhwiki
 Command: Connect
    Time: 81310
   State: update
    Info: INSERT /* User::addGroup 127.0.0.1 */ IGNORE INTO `user_groups` (ug_user,ug_group) VALUES ('1546118','bot')
Progress: 0.000

Nothing else is running:

Id	User	Host	db	Command	Time	State	Info	Progress
2	event_scheduler	localhost	NULL	Daemon	177	Waiting for next activation	NULL	0.000
42	system user		NULL	Connect	6676093	Waiting for master to send event	NULL	0.000
43	system user		NULL	Connect	81133	Waiting for prior transaction to commit	NULL	0.000
44	system user		zhwiki	Connect	81133	update	INSERT /* User::addGroup 127.0.0.1 */ IGNORE INTO `user_groups` (ug_user,ug_group) VALUES ('1546118','bot')	0.000
45	system user		NULL	Connect	97087	Waiting for room in worker thread event queue	NULL	0.000
830653	repl	10.64.37.5:57304	NULL	Binlog Dump	2815399	Master has sent all binlog to slave; waiting for binlog to be updated	NULL	0.000
839855	repl	10.64.4.11:44118	NULL	Binlog Dump	2769700	Master has sent all binlog to slave; waiting for binlog to be updated	NULL	0.000
1254774	watchdog	10.64.0.15:51206	information_schema	Sleep	5		NULL	0.000
1254775	watchdog	10.64.0.15:51627	mysql	Sleep	0		NULL	0.000
1410875	root	localhost	NULL	Sleep	469		NULL	0.000
1410978	root	localhost	NULL	Query	0	init	show full processlist	0.000
*************************** 1. row ***************************
  Type: InnoDB
  Name:
Status:
=====================================
2016-09-08 13:16:53 7ffa5a5e7700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 9 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 6475205 srv_active, 0 srv_shutdown, 177284 srv_idle
srv_master_thread log flush and writes: 6652339
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 34309329
OS WAIT ARRAY INFO: signal count 51354558
Mutex spin waits 57127135, rounds 892504058, OS waits 25672980
RW-shared spins 13506675, rounds 235123829, OS waits 6441900
RW-excl spins 9307856, rounds 143680432, OS waits 1919206
Spin rounds per wait: 15.62 mutex, 17.41 RW-shared, 15.44 RW-excl
------------
TRANSACTIONS
------------
Trx id counter 5158085477
Purge done for trx's n:o < 5158085476 undo n:o < 0 state: running but idle
History list length 572
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 1410965, OS thread handle 0x7ffa5a5e7700, query id 1309267452 localhost root init
show engine innodb status
---TRANSACTION 4225399955, not started
MySQL thread id 42, OS thread handle 0x7ffa5a585700, query id 0 Waiting for master to send event
---TRANSACTION 5158070750, not started
MySQL thread id 44, OS thread handle 0x7ffa56dce700, query id 1309267436 update
INSERT /* User::addGroup 127.0.0.1 */ IGNORE INTO `user_groups` (ug_user,ug_group) VALUES ('1546118','bot')
---TRANSACTION 4225399812, not started
MySQL thread id 1, OS thread handle 0x7ffa5a73e700, query id 0 Waiting for background binlog tasks
---TRANSACTION 5158070754, ACTIVE (PREPARED) 81069 sec
3 lock struct(s), heap size 1184, 1 row lock(s), undo log entries 3
MySQL thread id 43, OS thread handle 0x7ffa58433700, query id 1309131308 Waiting for prior transaction to commit
Trx #rec lock waits 3121 #table lock waits 0
Trx total rec lock wait time 18 SEC
Trx total table lock wait time 0 SEC
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (read thread)
I/O thread 4 state: waiting for i/o request (read thread)
I/O thread 5 state: waiting for i/o request (read thread)
I/O thread 6 state: waiting for i/o request (read thread)
I/O thread 7 state: waiting for i/o request (read thread)
I/O thread 8 state: waiting for i/o request (read thread)
I/O thread 9 state: waiting for i/o request (read thread)
I/O thread 10 state: waiting for i/o request (read thread)
I/O thread 11 state: waiting for i/o request (read thread)
I/O thread 12 state: waiting for i/o request (read thread)
I/O thread 13 state: waiting for i/o request (read thread)
I/O thread 14 state: waiting for i/o request (read thread)
I/O thread 15 state: waiting for i/o request (read thread)
I/O thread 16 state: waiting for i/o request (read thread)
I/O thread 17 state: waiting for i/o request (read thread)
I/O thread 18 state: waiting for i/o request (write thread)
I/O thread 19 state: waiting for i/o request (write thread)
I/O thread 20 state: waiting for i/o request (write thread)
I/O thread 21 state: waiting for i/o request (write thread)
I/O thread 22 state: waiting for i/o request (write thread)
I/O thread 23 state: waiting for i/o request (write thread)
I/O thread 24 state: waiting for i/o request (write thread)
I/O thread 25 state: waiting for i/o request (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0, 0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
4463767 OS file reads, 562387708 OS file writes, 535312345 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 2600, seg size 2602, 888183 merges
merged operations:
 insert 2856041, delete mark 2732633, delete 855981
discarded operations:
 insert 0, delete mark 0, delete 0
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 2594485252999
Log flushed up to   2594485252999
Pages flushed up to 2594485252999
Last checkpoint at  2594485252999
Max checkpoint age    3478212404
Checkpoint age target 3369518267
Modified age          0
Checkpoint age        0
0 pending log writes, 0 pending chkp writes
514432583 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 8791261184; in additional pool allocated 0
Total memory allocated by read views 432
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 833192288 	(151384312 + 681807976)
    Page hash           1107208 (buffer pool 0 only)
    Dictionary cache    37243056 	(35402768 + 1840288)
    File system         905296 	(812272 + 93024)
    Lock system         21253192 	(21250568 + 2624)
    Recovery system     0 	(0 + 0)
Dictionary memory allocated 1840288
Buffer pool size        524280
Buffer pool size, bytes 8589803520
Free buffers            8904
Database pages          473762
Old database pages      174877
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 2754509, not young 811412628
0.00 youngs/s, 0.00 non-youngs/s
Pages read 5226290, created 4170968, written 54158543
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 473762, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
----------------------
INDIVIDUAL BUFFER POOL INFO
----------------------
---BUFFER POOL 0
Buffer pool size        65535
Buffer pool size, bytes 1073725440
Free buffers            1144
Database pages          59149
Old database pages      21829
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 351299, not young 98612103
0.00 youngs/s, 0.00 non-youngs/s
Pages read 661252, created 524914, written 7393879
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 59149, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 1
Buffer pool size        65535
Buffer pool size, bytes 1073725440
Free buffers            1196
Database pages          59098
Old database pages      21829
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 347217, not young 98373232
0.00 youngs/s, 0.00 non-youngs/s
Pages read 660394, created 522916, written 7025071
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 59098, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 2
Buffer pool size        65535
Buffer pool size, bytes 1073725440
Free buffers            1144
Database pages          59210
Old database pages      21860
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 343753, not young 104864852
0.00 youngs/s, 0.00 non-youngs/s
Pages read 658594, created 526885, written 6525218
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 59210, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 3
Buffer pool size        65535
Buffer pool size, bytes 1073725440
Free buffers            1124
Database pages          59221
Old database pages      21870
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 338562, not young 100821994
0.00 youngs/s, 0.00 non-youngs/s
Pages read 650016, created 518134, written 6897026
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 59221, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 4
Buffer pool size        65535
Buffer pool size, bytes 1073725440
Free buffers            1026
Database pages          59380
Old database pages      21899
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 347327, not young 103727368
0.00 youngs/s, 0.00 non-youngs/s
Pages read 653023, created 523331, written 6911900
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 59380, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 5
Buffer pool size        65535
Buffer pool size, bytes 1073725440
Free buffers            1108
Database pages          59204
Old database pages      21861
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 341341, not young 102578398
0.00 youngs/s, 0.00 non-youngs/s
Pages read 647211, created 518967, written 6812852
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 59204, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 6
Buffer pool size        65535
Buffer pool size, bytes 1073725440
Free buffers            1101
Database pages          59203
Old database pages      21861
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 343242, not young 102572143
0.00 youngs/s, 0.00 non-youngs/s
Pages read 646626, created 518401, written 6240981
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 59203, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
---BUFFER POOL 7
Buffer pool size        65535
Buffer pool size, bytes 1073725440
Free buffers            1061
Database pages          59297
Old database pages      21868
Modified db pages       0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 341768, not young 99862538
0.00 youngs/s, 0.00 non-youngs/s
Pages read 649174, created 517420, written 6351616
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 59297, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
1 RW transactions active inside InnoDB
0 RO transactions active inside InnoDB
1 out of 1000 descriptors used
Main thread process no. 16572, id 140703350900480, state: sleeping
Number of rows inserted 293087206, updated 887359, deleted 58720205, read 73067195
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
Number of system rows inserted 505799786, updated 0, deleted 505799788, read 505799789
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

The replication thread never advances; the query is stuck.

An strace doesn't show anything relevant:

Process 16572 attached
restart_syscall(<... resuming interrupted call ...>) = 1
fcntl(32, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(32, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
accept(32, {sa_family=AF_INET6, sin6_port=htons(35658), inet_pton(AF_INET6, "::ffff:10.64.0.15", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 3579
fcntl(32, F_SETFL, O_RDWR)              = 0
fcntl(3579, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(3579, SOL_IP, IP_TOS, [8], 4) = 0
setsockopt(3579, SOL_TCP, TCP_NODELAY, [1], 4) = 0
futex(0x13dafe4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x13dafe0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x13d8660, FUTEX_WAKE_PRIVATE, 1) = 1
poll([{fd=32, events=POLLIN}, {fd=33, events=POLLIN}], 2, 4294967295) = 1 ([{fd=32, revents=POLLIN}])
fcntl(32, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(32, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
accept(32, {sa_family=AF_INET6, sin6_port=htons(35666), inet_pton(AF_INET6, "::ffff:10.64.0.15", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 3579
fcntl(32, F_SETFL, O_RDWR)              = 0
fcntl(3579, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(3579, SOL_IP, IP_TOS, [8], 4) = 0
setsockopt(3579, SOL_TCP, TCP_NODELAY, [1], 4) = 0
futex(0x13dafe4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x13dafe0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
poll([{fd=32, events=POLLIN}, {fd=33, events=POLLIN}], 2, 4294967295

There have been issues with MariaDB and Aria keycaches in the past (I, Manuel, have suffered them before with very similar symptoms, especially with this bug: https://jira.mariadb.org/browse/MDEV-8004).

Some Aria page cache is in use:

+-----------------------------------+---------+
| Variable_name                     | Value   |
+-----------------------------------+---------+
| Aria_pagecache_blocks_not_flushed | 0       |
| Aria_pagecache_blocks_unused      | 15736   |
| Aria_pagecache_blocks_used        | 11      |
| Aria_pagecache_read_requests      | 7354070 |
| Aria_pagecache_reads              | 667666  |
| Aria_pagecache_write_requests     | 1335959 |
| Aria_pagecache_writes             | 0       |
| Aria_transaction_log_syncs        | 0       |
+-----------------------------------+---------+
8 rows in set (0.01 sec)

Stopping the slave doesn't work either and hangs forever (strace shows nothing relevant when we try to stop it):

accept(32, {sa_family=AF_INET6, sin6_port=htons(51810), inet_pton(AF_INET6, "::ffff:10.64.0.15", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 3580
fcntl(32, F_SETFL, O_RDWR)              = 0
fcntl(3580, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(3580, SOL_IP, IP_TOS, [8], 4) = 0
setsockopt(3580, SOL_TCP, TCP_NODELAY, [1], 4) = 0
futex(0x13dafe4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x13dafe0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x13d8660, FUTEX_WAKE_PRIVATE, 1) = 1
poll([{fd=32, events=POLLIN}, {fd=33, events=POLLIN}], 2, 4294967295) = 1 ([{fd=32, revents=POLLIN}])
fcntl(32, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(32, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
accept(32, {sa_family=AF_INET6, sin6_port=htons(52870), inet_pton(AF_INET6, "::ffff:10.64.0.15", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 3581
fcntl(32, F_SETFL, O_RDWR)              = 0
fcntl(3581, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(3581, SOL_IP, IP_TOS, [8], 4) = 0
setsockopt(3581, SOL_TCP, TCP_NODELAY, [1], 4) = 0
clone(child_stack=0x7ffa56d09f30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7ffa56d0a9d0, tls=0x7ffa56d0a700, child_tidptr=0x7ffa56d0a9d0) = 42109
poll([{fd=32, events=POLLIN}, {fd=33, events=POLLIN}], 2, 4294967295) = 1 ([{fd=32, revents=POLLIN}])
fcntl(32, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(32, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
accept(32, {sa_family=AF_INET6, sin6_port=htons(53205), inet_pton(AF_INET6, "::ffff:10.64.0.15", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 3582
fcntl(32, F_SETFL, O_RDWR)              = 0
fcntl(3582, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(3582, SOL_IP, IP_TOS, [8], 4) = 0
setsockopt(3582, SOL_TCP, TCP_NODELAY, [1], 4) = 0
clone(child_stack=0x7ffa56d6bf30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7ffa56d6c9d0, tls=0x7ffa56d6c700, child_tidptr=0x7ffa56d6c9d0) = 42120
poll([{fd=32, events=POLLIN}, {fd=33, events=POLLIN}], 2, 4294967295

The only solution is to kill -9 mysqld and let it come back up again.

As a next step, I would suggest we discuss whether we want to convert this table (and any others like it) to InnoDB and see if the hang still happens, so we can isolate this to either a MariaDB bug or a TokuDB one.

Event Timeline

Restricted Application added a subscriber: Aklapper.
jcrespo subscribed.

This happened much more often before the latest upgrade (from 10.0.16, I think?). Since the upgrade to 10.0.22, it hadn't happened for some six months. The previous suspicion was TokuDB, as this only happened when replicating to TokuDB tables without primary keys. It never happened on the production databases, where we only use InnoDB, so it has to be something specific to sanitarium (replication filters, TokuDB, row-based replication, ...?).

It happened again. I am going to kill the server again, then convert the table to InnoDB, as that seemed to do the trick before.

      Id: 44
    User: system user
    Host: 
      db: zhwiki
 Command: Connect
    Time: 80247
   State: update
    Info: INSERT /* User::addGroup 127.0.0.1 */ IGNORE INTO `user_groups` (ug_user,ug_group) VALUES ('1546118','bot')
Progress: 0.000
      Id: 11
    User: system user
    Host: 
      db: zhwiki
 Command: Connect
    Time: 98697
   State: update
    Info: INSERT /* User::addGroup 127.0.0.1 */ IGNORE INTO `user_groups` (ug_user,ug_group) VALUES ('1546118','bot')
Progress: 0.000
*************************** 9. row ***************************

Same query again - that is interesting. When I hit the MariaDB bug in the past, it used to be _always_ the same kind of query (a LOAD DATA, if I remember correctly).

We will see whether it keeps happening once the table is converted to InnoDB; if not, we should feed this back to the TokuDB devs to get it looked at.

BTW, it indeed worked, which I think almost confirms it is a Toku issue (again).

I will open a bug report with the TokuDB developers today and paste the link here.

I have opened the bug report: https://bugs.launchpad.net/percona-server/+bug/1621852

Edit: As suggested by Percona, I have also created a MariaDB issue linking to Percona's bug report: https://jira.mariadb.org/browse/MDEV-10796

This happened again, and I have updated both the Percona and MariaDB bug reports with the following information (collection sketched after the links):

  • gdb stacktraces
  • show engine tokudb status
  • show table status
  • configuration for the relevant instance

https://bugs.launchpad.net/percona-server/+bug/1621852
https://jira.mariadb.org/browse/MDEV-10796
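For reference, the two status dumps were taken from the mysql client; as a sketch (the gdb stack traces were taken separately against the running mysqld process):

-- engine-level counters and lock information for TokuDB
SHOW ENGINE TOKUDB STATUS\G
-- per-table engine, row count and index sizes
SHOW TABLE STATUS LIKE 'user_groups'\G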

Again, the solution was:

kill -9 the process
alter table user_groups engine=Innodb;
start replication

As discussed with @jcrespo, it might be worth altering that table across all the wikis proactively to avoid this issue happening again (at least with this table); a rough sketch follows.
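A minimal sketch of that proactive conversion, run once per wiki database on each affected instance (zhwiki is the confirmed case; the remaining database names are placeholders):

ALTER TABLE zhwiki.user_groups ENGINE=InnoDB;
-- ...and likewise for each remaining <wikidb>.user_groups on the instance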

MariaDB just answered:


Thanks.
Lets wait and see what Percona guys come up with, TokuDB is their area of expertise. If they claim it's the server's fault, then it will be MariaDB's turn.

I would say T146121 mitigates this issue a lot. Maybe we should close it as resolved or stalled, unless we expect a proper fix from upstream?

Makes sense - I will close it as resolved and report back if Percona/MariaDB come up with any findings.

Change 313948 had a related patch set uploaded (by Legoktm):
Only make our user a bot if it's not already a bot

https://gerrit.wikimedia.org/r/313948

Legoktm subscribed.

The queries listed in the bug so far all seem to occur for the MassMessage system user "MediaWiki message delivery" (going by the user_name for the user_id provided in the queries). I looked at the code, and it looks like we're needlessly sending duplicate writes to the server. The patch I just submitted should make that basically never happen for the MassMessage user.
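The actual patch is MediaWiki PHP; in SQL terms its effect is roughly the following hypothetical sketch - read the membership first and only write when the row is missing:

-- hypothetical: check group membership before writing
SELECT 1 FROM user_groups WHERE ug_user = 1546118 AND ug_group = 'bot';
-- ...and only issue the write if the SELECT returned no row
INSERT IGNORE INTO user_groups (ug_user, ug_group) VALUES (1546118, 'bot');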

Change 313948 merged by jenkins-bot:
Only make our user a bot if it's not already a bot

https://gerrit.wikimedia.org/r/313948

Thanks a lot @Legoktm for spending time on this.
Hopefully, between your code change and converting the tables to InnoDB, we should be safe for a while.

Thanks again, and as I said, if Percona/MariaDB report back, I will update this.

Looks like MariaDB has already assigned the bug to someone (after I asked for an update):

Elena Stepanova reassigned MDEV-10796:
--------------------------------------

    Fix Version/s: 10.0
         Assignee: Lixun Peng

[~plinux],
Percona suspects MariaDB parallel replication to be the reason of the problem. Could you please review their assessment?

I think we specifically disabled parallel replication? Maybe multi-source replication is the cause, in which case it is still a bug in my view.

Yes - I am unsure why she said so:

MariaDB SANITARIUM localhost (none) > show global variables like 'slave_parallel_mode';
Empty set (0.00 sec)
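Worth noting: slave_parallel_mode only exists from MariaDB 10.1 onwards, so the empty set above is expected on a 10.0 server regardless of configuration. On 10.0 the setting to check is the worker thread count:

-- 0 means parallel replication is disabled on MariaDB 10.0
SHOW GLOBAL VARIABLES LIKE 'slave_parallel_threads';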

Interesting update on the Percona bug:

The stack trace shows TokuDB waiting for a row lock.

There is a pull request for fixing some bugs with similar symptoms as this:

  https://github.com/percona/PerconaFT/pull/360

There were some race conditions where one transaction releasing a lock could
miss the wakeup of another transaction waiting for the lock. The result is
that the waiting transaction would be stuck until tokudb_lock_timeout
expires.

Maybe that pull request could help in this case also.

That should not be it, because replication changes are committed in order, or in a single thread. However, if the issue also happens with SHOW SLAVE STATUS, it could be it, as monitoring runs that a lot.

However, we had issues with tokudb and parallel replication on the labsdb machines.
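One way to sanity-check the lock-timeout theory against this hang, as a sketch: the waiter described in the pull request is only stuck until tokudb_lock_timeout expires, yet our INSERT had been waiting 80,000+ seconds, so unless that timeout is set unusually high, something else must be pinning it.

-- the value is in milliseconds; a modest setting here argues against
-- a plain lock-timeout explanation for a day-long hang
SHOW GLOBAL VARIABLES LIKE 'tokudb_lock_timeout';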