6

Our MySQL database will contain an encyclopedia. The encyclopedia will be shown in pages where every page contains entries starting from a letter.

Which indexes should I use? Should I have two indexes for the field "title" (one index of the length 255 for ordering and one index of the length 1 for indexing by the first letter)?

CREATE TABLE `cyclopedy` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `title` varchar(255) collate utf8_bin NOT NULL,
  `article` mediumtext collate utf8_bin NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `title` (`title`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

Well, now I understand that length 1 is not enough for indexing by the first letter, because an UTF-8 letter may consist of three bytes.

1 Answer 1

7

An index can seek by a subset of characters, as long as you're searching from the left. E.g., "Inter%" can seek, "%net" will not.

However, the first character is not necessarily the character under which the article would be sorted. "The Internet" should go under "I", not "T". You probably need two fields, DisplayTitle and SortTitle; a single-character index on the latter may be worthwhile, but most likely a full-length index will be just fine.

Indexes are typically B-trees, and a seek will jump to the right location about equally quickly whether you have 10 or 100 entries per page. Scans are another matter, but I'd start with the simplest solution and add an extra index only if performance proves inadequate in practice.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.