This question is related to this one.

I have a page table with the following structure:

CREATE TABLE  mydatabase.page (
  pageid int(10) unsigned NOT NULL auto_increment,
  sourceid int(10) unsigned default NULL,
  number int(10) unsigned default NULL,
  data mediumtext,
  processed int(10) unsigned default NULL,
  PRIMARY KEY  (pageid),
  KEY sourceid (sourceid)
) ENGINE=MyISAM AUTO_INCREMENT=9768 DEFAULT CHARSET=latin1;

The data column contains text whose size is around 80KB - 200KB per record. The total size of the data stored in the data column is around 1.5GB.

Executing this query takes 0.08 seconds:

select pageid from page

But executing this query takes around 130.0 seconds:

select sourceid from page

As you can see, I've got a primary key on page.pageid and an index on page.sourceid. So should the second query really be taking THAT long?

Edit #1

EXPLAIN returned

id select_type table type  possible_keys key      key_len ref rows Extra
1  SIMPLE      page  index               sourceid 5           9767 Using index

I'm sorry, but profiling didn't work: MySQL (it's 4.1.22) did not recognize the SHOW PROFILE query.

SHOW INDEX returned

Table Non_unique Key_name  Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment
page  0          PRIMARY   1            pageid      A         9767                             BTREE 
page  1          sourceid  1            sourceid    A         3255                        YES  BTREE 
  • Please do an "EXPLAIN select sourceid from page" and enable pforiling for this query: "SET profiling = 1;", execute the query, "SHOW PROFILE;" and then disable profiling with "SET profiling = 2;" and paste the results. Commented May 11, 2009 at 8:14
  • Oops - first of all it should read "profiling" not "pforiling" and then it should be "SET profiling = 0;" to disable profiling. Commented May 11, 2009 at 8:15
  • Just edited post as requested Commented May 11, 2009 at 12:01

3 Answers

Did you try to enforce the use of the index? Like:

SELECT sourceid FROM page USE INDEX (sourceid)

As sgehrig commented, check with EXPLAIN whether the index is actually used, and share the result:

EXPLAIN select sourceid from page

It could also help to share the definition of the indexes:

SHOW INDEX FROM page
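
A minimal sketch of how the hint could be tested, assuming the index is named sourceid (the Key_name that SHOW INDEX reports in the question). SQL_NO_CACHE is there so that the query cache does not mask the real execution time; FORCE INDEX is a stricter variant of the same hint and may or may not be needed here.

```sql
-- Plan chosen with the hint; `sourceid` is the Key_name from SHOW INDEX.
EXPLAIN SELECT sourceid FROM page USE INDEX (sourceid);

-- Time the hinted query while bypassing the query cache.
SELECT SQL_NO_CACHE sourceid FROM page USE INDEX (sourceid);

-- Stricter variant: the optimizer avoids a table scan whenever
-- the named index can be used to find the rows.
SELECT SQL_NO_CACHE sourceid FROM page FORCE INDEX (sourceid);
```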
  • I've revised question and added the information you requested Commented May 11, 2009 at 12:08
  • Ahhh... select sql_no_cache sourceid from page use index (sourceid) worked and the query took 0.09 seconds. For some reason MySQL is unable to figure out which index to use on its own. I now need a query that forces an index in a cascaded join (pagedetail > page > source). Commented May 11, 2009 at 12:13
  • Can you really confirm that only the combination of SQL_NO_CACHE and USE INDEX brings the expected speed benefit? Could it be that SQL_NO_CACHE is the determining factor? Commented May 11, 2009 at 12:19
  • I use SQL_NO_CACHE to test the actual performance of the query. If I don't use this keyword (which is what I normally do), the query runs slowly the first time; after that the results come from the cache, which is always faster but does not solve the problem. SQL_NO_CACHE tells MySQL neither to read from nor to store results in the query cache, so it returns the time the query actually takes under normal circumstances. Commented May 11, 2009 at 12:43
  • That's exactly why I asked - I wanted to rule out the option that the query cache had some influence on the execution time. So obviously MySQL needs the index hint to make use of the covering index. Commented May 11, 2009 at 13:02
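
For the cascaded join mentioned in the comments (pagedetail > page > source), an index hint is written directly after the table name or alias it applies to, so it can be attached to page alone. A rough sketch, assuming hypothetical join columns pagedetail.pageid and source.sourceid, neither of which is shown in the question:

```sql
-- Assumed for illustration only: pagedetail.pageid and source.sourceid.
SELECT SQL_NO_CACHE s.sourceid, p.pageid
FROM source AS s
  INNER JOIN page AS p USE INDEX (sourceid)  -- hint applies to `page` only
    ON p.sourceid = s.sourceid
  INNER JOIN pagedetail AS pd
    ON pd.pageid = p.pageid;
```

Running the same statement under EXPLAIN would show whether the key column for page now reports sourceid.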

How different are your sourceid fields? If you have only a few different sourceid values compared to the number of rows then you can try increasing the size of the index.

  • sourceid contains around 3500 distinct values. It refers to a source table that has around 3500 rows. Each source contains 0 to 700 pages. Commented May 11, 2009 at 12:00

As MySQL 4.1.22 is fairly old (02 November 2006) I'd suspect that it doesn't support the notion of covering indexes for secondary keys. EXPLAIN shows that the query actually uses the index, so I'd assume that the additional time is needed to read all the result rows (instead of just returning the index content when using covering indexes) to extract the sourceid column.

Do you have the possibility to check the query on a more recent MySQL server version?
