Boost Performance With Mysql 5.1 Partitions: Giuseppe Maxia Mysql Community Team Lead Sun Microsystems

Download as pdf or txt
Download as pdf or txt
You are on page 1of 106

Boost performance

with
MySQL 5.1 partitions
Giuseppe Maxia
MySQL Community Team Lead
Sun Microsystems

Who's this guy?

about me
2

Giuseppe Maxia

a.k.a. The Data Charmer


MySQL Community Team Lead
Long time hacking with MySQL features
Formerly, database consultant, designer, coder.
A passion for QA
An even greater passion for open source
... and community
Passionate blogger
http://datacharmer.blogspot.com
3

MySQL 5.1 GA

MySQL 5.1 GA

MySQL 5.1 GA

Defining the problem

YOUR
NEEDS
6

The problem(s)

Too much data


Not enough RAM
Historical data
Growing data
Rotating data

Too much data

data

Not enough RAM

RAM

data

INDEXES

INDEXES

Not enough RAM

INDEXES
RAM
data

INDEXES

10

MySQL 5.1 partitions

WHAT
11

What is it?
Logical splitting of tables
Transparent to user

12

Logical splitting
No need to create separate tables
No need to move chunks of data across files

13

MySQL 5.1 partitions

SQL

TRANSPARENT
14

COMPARED TO MERGE TABLES


MERGE TABLE

separate tables
risk of duplicates
insert in each table
no constraints

15

COMPARED TO MERGE TABLES


PARTITIONED TABLE

One table
No risk of duplicates
insert in one table
constraints enforced

16

Wait a minute ...


WHAT THE HELL DOES "LOGICAL SPLIT"
REALLY MEANS?
LET ME EXPLAIN ...

17

Physical partitioning (1)


Take a map

18

Physical partitioning (2)


cut it into pieces

19

Physical partitioning (3)


What you have, is several different pieces

20

Physical partitioning (4)


If you want the map back, you need some
application (adhesive tape) and you may get it
wrong

21

Logical partitioning (1)


Take a map

22

Logical partitioning (2)


fold it to show the piece you need

23

Logical partitioning (3)


what you have is still a map, even if you see only
one part.

24

Logical partitioning (4)


if you unfold the map, you still have the whole thing

25

What partitions can do

logical split
but you can also split the data physically
granular partition (subpartitioning)
different methods (range, list, hash, key)

26

MySQL 5.1 partitions

WHY
27

4 main reasons to use partitions

To make single inserts and selects faster


To make range selects faster
To help split the data across different paths
To store historical data efficiently
If you need to delete large chunks of data
INSTANTLY

28

MySQL 5.1 partitions

HOW
29

HOW TO MAKE PARTITIONS

30

HOW TO MAKE PARTITIONS

RTFM ...

30

HOW TO MAKE PARTITIONS

RTFM ...

No, seriously, the manual has everything

30

HOW TO MAKE PARTITIONS

RTFM ...

No, seriously, the manual has everything


But if you absolutely insist ...

30

HOW TO MAKE PARTITIONS


CREATE TABLE t1 (
id int
) ENGINE=InnoDB
# or MyISAM, ARCHIVE
PARTITION BY RANGE (id)
(
PARTITION P1 VALUES LESS THAN (10),
PARTITION P2 VALUES LESS THAN (20)
)
31

HOW TO MAKE PARTITIONS


CREATE TABLE t1 (
id int
) ENGINE=InnoDB
PARTITION BY LIST (id)
(
PARTITION P1 VALUES IN (1,2,4),
PARTITION P2 VALUES IN (3,5,9)
)
32

HOW TO MAKE PARTITIONS


CREATE TABLE t1 (
id int not null primary key
) ENGINE=InnoDB
PARTITION BY HASH (id)
PARTITIONS 10;

33

HOW TO MAKE PARTITIONS


CREATE TABLE t1 (
id int not null primary key
) ENGINE=InnoDB
PARTITION BY KEY ()
PARTITIONS 10;

34

Limitations
Can partition only by INTEGER columns
OR you can partition by an expression, which must
return an integer
Maximum 1024 partitions
If you have a Unique Key or PK, the partition
column must be part of that key
No Foreign Key support
No Fulltext or GIS support
35

PARTITIONING BY DATE
CREATE TABLE t1 (
d date
) ENGINE=InnoDB
PARTITION BY RANGE (year(d))
(
PARTITION P1 VALUES
LESS THAN (1999),
PARTITION P2 VALUES
LESS THAN (2000)
)

36

PARTITIONING BY DATE
CREATE TABLE t1 (
d date
) ENGINE=InnoDB
PARTITION BY RANGE (to_days(d))
(
PARTITION P1 VALUES
LESS THAN (to_days('1999-01-01')),
PARTITION P2 VALUES
LESS THAN (to_days('2000-01-01'))
37
)

MySQL 5.1 partitions

Partition
pruning
38

Partition pruning
1a - unpartitioned table - SINGLE RECORD

select *
from
table_name
where colx =
120

Partition pruning
1a - unpartitioned table - SINGLE RECORD

D
A
T
A

I
N
D
E
X

select *
from
table_name
where colx =
120

Partition pruning
1b - unpartitioned table - SINGLE RECORD

select *
from
table_name
where colx =
350

Partition pruning
1c - unpartitioned table - RANGE

select *
from
table_name
where colx
between 120
and 230

Partition pruning
2a - table partitioned by colx - SINGLE REC
1-99
100-199
200-299
300-399
400-499
500-599

select *
from
table_name
where colx =
120

Partition pruning
2a - table partitioned by colx - SINGLE REC
1-99

D
A
T
A

100-199
200-299
300-399
400-499
500-599

I
N
D
E
X

select *
from
table_name
where colx =
120

Partition pruning
2b - table partitioned by colx - SINGLE REC
1-99
100-199
200-299
300-399
400-499
500-599

select *
from
table_name
where colx =
350

Partition pruning
2c - table partitioned by colx - RANGE
1-99
100-199
200-299
300-399
400-499
500-599

select *
from
table_name
where colx
between 120
and 230

Partition pruning

EXPLAIN
select *
before from table_name
where colx = 120

in 5.1

EXPLAIN PARTITIONS
select *
from table_name
where colx = 120

Partition pruning - not using partition column


explain partitions select count(*)
from table_name where colx=120\G
***** 1. row ****
id: 1
select_type: SIMPLE
table: table_name
partitions:
p01,p02,p03,p04,p05,p06,p07,p08,p09,
p10,p11,p12,p13,p14,p15,p16,p17,p18,
p19
type: index
...

Partition pruning - not using partition column


explain partitions select count(*)
from table_name where colx between
120 and 230\G
***** 1. row ****
id: 1
select_type: SIMPLE
table: table_name
partitions:
p01,p02,p03,p04,p05,p06,p07,p08,p09,
p10,p11,p12,p13,p14,p15,p16,p17,p18,
p19
type: index
...

Partition pruning - table partitioned by colx


explain partitions select count(*)
from table_name where colx between
120 and 230\G
***** 1. row ****
id: 1
select_type: SIMPLE
table: table_name
partitions: p02,p03
type: index
...

MySQL 5.1 partitions

WHEN
51

When to use partitions?


if you have large tables
if you know that you will always query for the
partitioning column
if you have historical tables that you want to purge
quickly
if your indexes are larger than the available RAM

52

MySQL 5.1 partitions

HANDS
ON
53

components used for testing

54

components used for testing


MySQL Sandbox
> created MySQL instances in seconds
> http://launchpad.net/mysql-sandbox

54

components used for testing


MySQL Sandbox
> created MySQL instances in seconds
> http://launchpad.net/mysql-sandbox

MySQL Employees Test Database


> has ~ 4 mil records in 6 tables
> http://launchpad.net/test-db

54

components used for testing


MySQL Sandbox
> created MySQL instances in seconds
> http://launchpad.net/mysql-sandbox

MySQL Employees Test Database


> has ~ 4 mil records in 6 tables
> http://launchpad.net/test-db

Command line ability


> fingers
> ingenuity
54

employees test database

55

How many partitions


from information_schema.partitions
+-------+-----------------+----------+
| pname | expr
| descr
|
+-------+-----------------+----------+
| p01
| year(from_date) | 1985
|
| p02
| year(from_date) | 1986
|
...
| p13
| year(from_date) | 1997
|
| p14
| year(from_date) | 1998
|
| p15
| year(from_date) | 1999
|
| p16
| year(from_date) | 2000
|
...
| p19
| year(from_date) | MAXVALUE |
+-------+-----------------+----------+

56

How many records


select count(*) from salaries;
+----------+
| count(*) |
+----------+
| 2844047 |
+----------+
1 row in set (0.01 sec)

57

How many records in 1998


not partitioned
select count(*) from salaries
where from_date between
'1998-01-01' and '1998-12-31';
+----------+
| count(*) |
+----------+
|
247489 |
+----------+
1 row in set (1.52 sec)
# NOT PARTITIONED

58

How many records in 1998


PARTITIONED

select count(*) from salaries


where from_date between
'1998-01-01' and '1998-12-31';
+----------+
| count(*) |
+----------+
|
247489 |
+----------+
1 row in set (0.41 sec)
# partition p15

59

How many records in 1999


not partitioned

select count(*) from salaries


where from_date between
'1999-01-01' and '1999-12-31';
+----------+
| count(*) |
+----------+
|
260957 |
+----------+
1 row in set (1.54 sec)
# NOT PARTITIONED

60

How many records in 1999


PARTITIONED

select count(*) from salaries


where from_date between
'1999-01-01' and '1999-12-31';
+----------+
| count(*) |
+----------+
|
260957 |
+----------+
1 row in set (0.17 sec)
# partition p16

61

Deleting 1998 records


not partitioned
delete from salaries where
from_date between '1998-01-01'
and '1998-12-31';
Query OK, 247489 rows affected
(19.13 sec)

# NOT PARTITIONED
62

Deleting 1998 records


partitioned
alter table salaries drop
partition p15;
Query OK, 0 rows affected (1.35
sec)

63

MySQL 5.1 partitions

LEVERAGING
REPLICATION
64

Replication schemes

MASTER

INNODB
NOT
PARTITIONED

INNODB
NOT
PARTITIONED

SLAVE

concurrent insert

concurrent read
INNODB
PARTITIONED
BY RANGE

SLAVE
concurrent batch processing

MyISAM
PARTITIONED
BY RANGE

SLAVE
large batch processing

65

Replication schemes

MASTER

SLAVE

INNODB
PARTITIONED

MyISAM
PARTITIONED

batch processing

concurrent insert

INNODB
NON
PARTITIONED

SLAVE
66

concurrent reads

Replication schemes - dimensions

MASTER

INNODB
NOT
PARTITIONED

ARCHIVE
PARTITIONED
SLAVE BY RANGE
(locations)

concurrent insert
ARCHIVE
PARTITIONED
BY RANGE
(date)

dimensional processing

SLAVE

dimensional processing

SLAVE

ARCHIVE
PARTITIONED
BY RANGE
(product)
67

dimensional processing

MySQL 5.1 partitions

PITFALLS
68

Expressions
Partition by function
Search by column

69

WRONG
PARTITION BY RANGE(year(from_date))
select count(*) from salaries where
year(from_date) = 1998;
+----------+
| count(*) |
+----------+
|
247489 |
+----------+
1 row in set (2.25 sec)
70

RIGHT
PARTITION BY RANGE(year(from_date))
select count(*) from salaries where
from_date between '1998-01-01' and
'1998-12-31';
+----------+
| count(*) |
+----------+
|
247489 |
+----------+
1 row in set (0.46 sec)
71

EXPLAIN
explain partitions select count(*)
from salaries where year(from_date)
= 1998\G
***** 1. row ****
id: 1
select_type: SIMPLE
table: salaries
partitions:
p01,p02,p03,p04,p05,p06,p07,p08,p09
,p10,p11,p12,p13,p14,p15,p16,p17,p1
8,p19
type: index
72
...

EXPLAIN
explain partitions select count(*)
from salaries where from_date
between '1998-01-01' and
'1998-12-31'\G
***** 1. row ****
id: 1
select_type: SIMPLE
table: salaries
partitions: p14,p15
...
73

MySQL 5.1 partitions

BENCHMARKS
74

Partitions benchmarks (laptop)

engine
innodb
myisam
archive
innodb partitioned (whole)
innodb partitioned (file per table)
myisam partitioned
archive partitioned

storage
(MB)
221
181
74
289
676
182
72

Benchmarking results (laptop)


engine
InnoDB

query year
2000
1.25

query year
2002
1.25

MyISAM

1.72

1.73

Archive

2.47

2.45

InnoDB partitioned
whole
InnoDB Partitioned
(file per table)
MyISAM partitioned

0.24

0.10

0.45

0.10

0.18

0.12

Archive partitioned

0.22

0.12

Partitioning storage (huge server)

engine
innodb (with PK)
myisam (with PK)
archive
innodb partitioned (no PK)
myisam partitioned (no PK)
archive partitioned

storage
(GB)
330
141
13
237
107
13

Benchmarking results (huge server)


engine

6 month range

InnoDB

4 min 30s

MyISAM

25.03s

Archive

22 min 25s

InnoDB partitioned by month

13.19

MyISAM partitioned by year

6.31

MyISAM partitioned by month

4.45

Archive partitioned by year

16.67

Archive partitioned by month

8.97

MySQL 5.1 partitions

TOOLS
79

The partition helper


The INFORMATION_SCHEMA.PARTITIONS table
The partition helper
> http://datacharmer.blogspot.com/2008/12/partition-

helper-improving-usability.html
> A Perl script that creates partitioning statements

... anything you are creating right now :)

80

the partition helper


./partitions_helper \
--table=mytable --column=prod_id \
--interval=1000 --start=1 --end=10000
ALTER TABLE mytable
PARTITION by range (prod_id)
(
partition p001 VALUES LESS
, partition p002 VALUES LESS
, partition p003 VALUES LESS
, partition p004 VALUES LESS
, partition p005 VALUES LESS
, partition p006 VALUES LESS
, partition p007 VALUES LESS
, partition p008 VALUES LESS
, partition p009 VALUES LESS
, partition p010 VALUES LESS
);

THAN
THAN
THAN
THAN
THAN
THAN
THAN
THAN
THAN
THAN

(1000)
(2000)
(3000)
(4000)
(5000)
(6000)
(7000)
(8000)
(9000)
(10000)
81

the partition helper


./partitions_helper
--table=mytable --column=d --interval=year \
--start=2004-01-01 --end=2009-01-01
ALTER TABLE mytable
PARTITION by range (to_date(d))
(
partition p001 VALUES LESS THAN
, partition p002 VALUES LESS THAN
, partition p003 VALUES LESS THAN
, partition p004 VALUES LESS THAN
, partition p005 VALUES LESS THAN
, partition p006 VALUES LESS THAN
);

(to_days('2004-01-01'))
(to_days('2005-01-01'))
(to_days('2006-01-01'))
(to_days('2007-01-01'))
(to_days('2008-01-01'))
(to_days('2009-01-01'))

82

the partition helper


./partitions_helper --table=mytable \
--column=d --interval=month \
--start=2008-01-01 --end=2009-01-01
ALTER TABLE mytable
PARTITION by range (to_date(d))
(
partition p001 VALUES LESS THAN
, partition p002 VALUES LESS THAN
, partition p003 VALUES LESS THAN
, partition p004 VALUES LESS THAN
, partition p005 VALUES LESS THAN
, partition p006 VALUES LESS THAN
, partition p007 VALUES LESS THAN
, partition p008 VALUES LESS THAN
, partition p009 VALUES LESS THAN
, partition p010 VALUES LESS THAN
, partition p011 VALUES LESS THAN
, partition p012 VALUES LESS THAN
, partition p013 VALUES LESS THAN
);

(to_days('2008-01-01'))
(to_days('2008-02-01'))
(to_days('2008-03-01'))
(to_days('2008-04-01'))
(to_days('2008-05-01'))
(to_days('2008-06-01'))
(to_days('2008-07-01'))
(to_days('2008-08-01'))
(to_days('2008-09-01'))
(to_days('2008-10-01'))
(to_days('2008-11-01'))
(to_days('2008-12-01'))
(to_days('2009-01-01'))
83

MySQL 5.1 partitions

TIPS
84

TIPS
For large table ( indexes > available RAM)
> DO NOT USE INDEXES
> PARTITIONS ARE MORE EFFICIENT

85

TIPS
Before adopting partitions in production, benchmark!
you can create partitions of different sizes
> in historical tables, you can partition
current data by day
recent data by month
distant past data by year
> ALL IN THE SAME TABLE!

If you want to enforce a constraint on a integer


column, you may set a partition by list, with only the
values you want to admit.
86

TIPS
For large historical tables, consider using ARCHIVE
+ partitions
> You see benefits for very large amounts of data ( > 500

MB)
> Very effective when you run statistics
> Almost the same performance of MyISAM
> but 1/10 of storage

87

Thank you!

Giuseppe Maxia
http://[email protected]
[email protected]

MySQL 5.1 partitions

BONUS
SLIDES
89

MySQL 5.1 partitions

ANNOYANCES
90

Annoyances with partitions


CREATE TABLE STATEMENT hard to read

91

Annoyances with partitions


CREATE TABLE STATEMENT hard to read

91

Annoyances with partitions


hard to read (FIXED!! in 5.1.31)

92

Annoyances with partitions


hard to read (FIXED!! in 5.1.31)

92

Annoyances with partitions


CREATE TABLE only keeps the result of the
expression for each partition.
(you can use the information_schema to ease the
pain in this case)

93

Annoyances with partitions


CREATE TABLE t1 (
d date
) ENGINE=InnoDB
PARTITION BY RANGE (to_days(d))
(
PARTITION P1 VALUES
LESS THAN (to_days('1999-01-01')),
PARTITION P2 VALUES
LESS THAN (to_days('2000-01-01'))
94
)

Annoyances with partitions


CREATE TABLE `t1` (
`d` date DEFAULT NULL
) ENGINE=InnoDB DEFAULT
CHARSET=latin1
/*!50100 PARTITION BY RANGE
(to_days(d))
(PARTITION P1 VALUES LESS THAN
(730120) ENGINE = InnoDB,
PARTITION P2 VALUES LESS THAN
(730485) ENGINE = InnoDB) */

95

Fixing annoyances with partitions


select partition_name part,
partition_expression expr,
partition_description val
from information_schema.partitions
where table_name='t1';
+------+------------+--------+
| part | expr
| val
|
+------+------------+--------+
| P1
| to_days(d) | 730120 |
| P2
| to_days(d) | 730485 |
96
+------+------------+--------+

fixing annoyances with partitions


select partition_name part ,
partition_expression expr,
from_days(partition_description) val
from information_schema.partitions
where table_name='t1';
+------+------------+------------+
| part | expr
| val
|
+------+------------+------------+
| P1
| to_days(d) | 1999-01-01 |
| P2
| to_days(d) | 2000-01-01 |
97
+------+------------+------------+

You might also like