Page MenuHomePhabricator

hidenondamaging=1 query is extremely slow on enwiki
Closed, ResolvedPublic

Description

It takes a very long time to load https://en.wikipedia.org/w/index.php?title=Special:RecentChanges&hidenondamaging=1 . This seems to be because the query that it executes hits an optimizer bug:

Query without hidenondamaging=1:

mysql:research@s3-analytics-slave [enwiki]> explain SELECT rc_id,rc_timestamp,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,page_latest,(SELECT GROUP_CONCAT(ct_tag SEPARATOR ',') FROM `change_tag` WHERE ct_rc_id=rc_id ) AS `ts_tags`,oresc_probability,'0.5' AS ores_threshold FROM `recentchanges` LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `page` ON ((rc_cur_id=page_id)) LEFT JOIN `ores_classification` ON ((rc_this_oldid = oresc_rev AND oresc_class = 1)) LEFT JOIN `ores_model` ON ((oresc_model = oresm_id AND oresm_is_current = 1)) WHERE rc_bot = '0' AND (rc_timestamp >= '20160912000000') AND ((oresm_name = 'damaging' OR oresm_name IS NULL)) AND ((oresm_is_current != 0 OR oresm_is_current IS NULL)) AND rc_new IN ('0','1') ORDER BY rc_timestamp DESC LIMIT 50 ;
+------+--------------------+---------------------+--------+---------------------------------------+--------------+---------+-----------------------------------------------------------------------+---------+-------------+
| id   | select_type        | table               | type   | possible_keys                         | key          | key_len | ref                                                                   | rows    | Extra       |
+------+--------------------+---------------------+--------+---------------------------------------+--------------+---------+-----------------------------------------------------------------------+---------+-------------+
|    1 | PRIMARY            | recentchanges       | ref    | rc_timestamp,new_name_timestamp,tmp_2 | tmp_2        | 1       | const                                                                 | 4073302 | Using where |
|    1 | PRIMARY            | watchlist           | eq_ref | wl_user,namespace_title               | wl_user      | 265     | const,enwiki.recentchanges.rc_namespace,enwiki.recentchanges.rc_title |       1 |             |
|    1 | PRIMARY            | page                | eq_ref | PRIMARY                               | PRIMARY      | 4       | enwiki.recentchanges.rc_cur_id                                        |       1 |             |
|    1 | PRIMARY            | ores_classification | ref    | oresc_winner                          | oresc_winner | 4       | enwiki.recentchanges.rc_this_oldid                                    |       8 | Using where |
|    1 | PRIMARY            | ores_model          | eq_ref | PRIMARY                               | PRIMARY      | 2       | enwiki.ores_classification.oresc_model                                |       1 | Using where |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | ct_rc_id,change_tag_rc_tag            | ct_rc_id     | 5       | enwiki.recentchanges.rc_id                                            |       1 | Using index |
+------+--------------------+---------------------+--------+---------------------------------------+--------------+---------+-----------------------------------------------------------------------+---------+-------------+
6 rows in set (0.00 sec)

mysql:research@s3-analytics-slave [enwiki]> SELECT rc_id,rc_timestamp,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,page_latest,(SELECT GROUP_CONCAT(ct_tag SEPARATOR ',') FROM `change_tag` WHERE ct_rc_id=rc_id ) AS `ts_tags`,oresc_probability,'0.5' AS ores_threshold FROM `recentchanges` LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `page` ON ((rc_cur_id=page_id)) LEFT JOIN `ores_classification` ON ((rc_this_oldid = oresc_rev AND oresc_class = 1)) LEFT JOIN `ores_model` ON ((oresc_model = oresm_id AND oresm_is_current = 1)) WHERE rc_bot = '0' AND (rc_timestamp >= '20160912000000') AND ((oresm_name = 'damaging' OR oresm_name IS NULL)) AND ((oresm_is_current != 0 OR oresm_is_current IS NULL)) AND rc_new IN ('0','1') ORDER BY rc_timestamp DESC LIMIT 50 ;
[...]
50 rows in set (0.00 sec)

Query with hidenondamaging=1:

mysql:research@s3-analytics-slave [enwiki]> explain SELECT rc_id,rc_timestamp,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,page_latest,(SELECT GROUP_CONCAT(ct_tag SEPARATOR ',') FROM `change_tag` WHERE ct_rc_id=rc_id ) AS `ts_tags`,oresc_probability,'0.5' AS ores_threshold FROM `recentchanges` LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `page` ON ((rc_cur_id=page_id)) INNER JOIN `ores_classification` ON ((rc_this_oldid = oresc_rev AND oresc_class = 1)) LEFT JOIN `ores_model` ON ((oresc_model = oresm_id AND oresm_is_current = 1)) WHERE rc_bot = '0' AND (rc_timestamp >= '20160912000000') AND ((oresm_name = 'damaging' OR oresm_name IS NULL)) AND ((oresm_is_current != 0 OR oresm_is_current IS NULL)) AND (oresc_probability > '0.5') AND rc_patrolled = '0' AND rc_new IN ('0','1') ORDER BY rc_timestamp DESC LIMIT 50 ;
+------+--------------------+---------------------+--------+---------------------------------------------+----------+---------+-----------------------------------------------------------------------+---------+----------------------------------------------+
| id   | select_type        | table               | type   | possible_keys                               | key      | key_len | ref                                                                   | rows    | Extra                                        |
+------+--------------------+---------------------+--------+---------------------------------------------+----------+---------+-----------------------------------------------------------------------+---------+----------------------------------------------+
|    1 | PRIMARY            | ores_classification | ALL    | oresc_winner                                | NULL     | NULL    | NULL                                                                  | 2903587 | Using where; Using temporary; Using filesort |
|    1 | PRIMARY            | recentchanges       | ref    | rc_timestamp,new_name_timestamp,tmp_1,tmp_2 | tmp_1    | 4       | enwiki.ores_classification.oresc_rev                                  |       1 | Using where                                  |
|    1 | PRIMARY            | watchlist           | eq_ref | wl_user,namespace_title                     | wl_user  | 265     | const,enwiki.recentchanges.rc_namespace,enwiki.recentchanges.rc_title |       1 |                                              |
|    1 | PRIMARY            | ores_model          | eq_ref | PRIMARY                                     | PRIMARY  | 2       | enwiki.ores_classification.oresc_model                                |       1 | Using where                                  |
|    1 | PRIMARY            | page                | eq_ref | PRIMARY                                     | PRIMARY  | 4       | enwiki.recentchanges.rc_cur_id                                        |       1 |                                              |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | ct_rc_id,change_tag_rc_tag                  | ct_rc_id | 5       | enwiki.recentchanges.rc_id                                            |       1 | Using index                                  |
+------+--------------------+---------------------+--------+---------------------------------------------+----------+---------+-----------------------------------------------------------------------+---------+----------------------------------------------+
6 rows in set (0.04 sec)

mysql:research@s3-analytics-slave [enwiki]> SELECT rc_id,rc_timestamp,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,page_latest,(SELECT GROUP_CONCAT(ct_tag SEPARATOR ',') FROM `change_tag` WHERE ct_rc_id=rc_id ) AS `ts_tags`,oresc_probability,'0.5' AS ores_threshold FROM `recentchanges` LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `page` ON ((rc_cur_id=page_id)) INNER JOIN `ores_classification` ON ((rc_this_oldid = oresc_rev AND oresc_class = 1)) LEFT JOIN `ores_model` ON ((oresc_model = oresm_id AND oresm_is_current = 1)) WHERE rc_bot = '0' AND (rc_timestamp >= '20160912000000') AND ((oresm_name = 'damaging' OR oresm_name IS NULL)) AND ((oresm_is_current != 0 OR oresm_is_current IS NULL)) AND (oresc_probability > '0.5') AND rc_patrolled = '0' AND rc_new IN ('0','1') ORDER BY rc_timestamp DESC LIMIT 50 ;
[...]
50 rows in set (2 min 17.63 sec)

Event Timeline

Looks like STRAIGHT_JOIN works around the optimizer bug:

mysql:research@s3-analytics-slave [enwiki]> explain SELECT /*!STRAIGHT_JOIN*/ rc_id,rc_timestamp,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,page_latest,(SELECT GROUP_CONCAT(ct_tag SEPARATOR ',') FROM `change_tag` WHERE ct_rc_id=rc_id ) AS `ts_tags`,oresc_probability,'0.5' AS ores_threshold FROM `recentchanges` LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `page` ON ((rc_cur_id=page_id)) INNER JOIN `ores_classification` ON ((rc_this_oldid = oresc_rev AND oresc_class = 1)) LEFT JOIN `ores_model` ON ((oresc_model = oresm_id AND oresm_is_current = 1)) WHERE rc_bot = '0' AND (rc_timestamp >= '20160912000000') AND ((oresm_name = 'damaging' OR oresm_name IS NULL)) AND ((oresm_is_current != 0 OR oresm_is_current IS NULL)) AND (oresc_probability > '0.5') AND rc_patrolled = '0' AND rc_new IN ('0','1') ORDER BY rc_timestamp DESC LIMIT 50 ;
+------+--------------------+---------------------+--------+---------------------------------------------+--------------+---------+-----------------------------------------------------------------------+---------+-------------+
| id   | select_type        | table               | type   | possible_keys                               | key          | key_len | ref                                                                   | rows    | Extra       |
+------+--------------------+---------------------+--------+---------------------------------------------+--------------+---------+-----------------------------------------------------------------------+---------+-------------+
|    1 | PRIMARY            | recentchanges       | ref    | rc_timestamp,new_name_timestamp,tmp_1,tmp_2 | tmp_2        | 1       | const                                                                 | 4071169 | Using where |
|    1 | PRIMARY            | watchlist           | eq_ref | wl_user,namespace_title                     | wl_user      | 265     | const,enwiki.recentchanges.rc_namespace,enwiki.recentchanges.rc_title |       1 |             |
|    1 | PRIMARY            | page                | eq_ref | PRIMARY                                     | PRIMARY      | 4       | enwiki.recentchanges.rc_cur_id                                        |       1 |             |
|    1 | PRIMARY            | ores_classification | ref    | oresc_winner                                | oresc_winner | 4       | enwiki.recentchanges.rc_this_oldid                                    |       8 | Using where |
|    1 | PRIMARY            | ores_model          | eq_ref | PRIMARY                                     | PRIMARY      | 2       | enwiki.ores_classification.oresc_model                                |       1 | Using where |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | ct_rc_id,change_tag_rc_tag                  | ct_rc_id     | 5       | enwiki.recentchanges.rc_id                                            |       1 | Using index |
+------+--------------------+---------------------+--------+---------------------------------------------+--------------+---------+-----------------------------------------------------------------------+---------+-------------+
6 rows in set (0.01 sec)

Change 311637 had a related patch set uploaded (by Catrope):
Use STRAIGHT_JOIN to work around optimizer bug for hidenondamaging=1

https://gerrit.wikimedia.org/r/311637

Change 311637 merged by jenkins-bot:
Use STRAIGHT_JOIN to work around optimizer bug for hidenondamaging=1

https://gerrit.wikimedia.org/r/311637

Halfak claimed this task.