Boost Performance With Mysql 5.1 Partitions: Giuseppe Maxia Mysql Community Team Lead Sun Microsystems
Boost Performance With Mysql 5.1 Partitions: Giuseppe Maxia Mysql Community Team Lead Sun Microsystems
Boost Performance With Mysql 5.1 Partitions: Giuseppe Maxia Mysql Community Team Lead Sun Microsystems
with
MySQL 5.1 partitions
Giuseppe Maxia
MySQL Community Team Lead
Sun Microsystems
about me
2
Giuseppe Maxia
MySQL 5.1 GA
MySQL 5.1 GA
MySQL 5.1 GA
YOUR
NEEDS
6
The problem(s)
data
RAM
data
INDEXES
INDEXES
INDEXES
RAM
data
INDEXES
10
WHAT
11
What is it?
Logical splitting of tables
Transparent to user
12
Logical splitting
No need to create separate tables
No need to move chunks of data across files
13
SQL
TRANSPARENT
14
separate tables
risk of duplicates
insert in each table
no constraints
15
One table
No risk of duplicates
insert in one table
constraints enforced
16
17
18
19
20
21
22
23
24
25
logical split
but you can also split the data physically
granular partition (subpartitioning)
different methods (range, list, hash, key)
26
WHY
27
28
HOW
29
30
RTFM ...
30
RTFM ...
30
RTFM ...
30
33
34
Limitations
Can partition only by INTEGER columns
OR you can partition by an expression, which must
return an integer
Maximum 1024 partitions
If you have a Unique Key or PK, the partition
column must be part of that key
No Foreign Key support
No Fulltext or GIS support
35
PARTITIONING BY DATE
CREATE TABLE t1 (
d date
) ENGINE=InnoDB
PARTITION BY RANGE (year(d))
(
PARTITION P1 VALUES
LESS THAN (1999),
PARTITION P2 VALUES
LESS THAN (2000)
)
36
PARTITIONING BY DATE
CREATE TABLE t1 (
d date
) ENGINE=InnoDB
PARTITION BY RANGE (to_days(d))
(
PARTITION P1 VALUES
LESS THAN (to_days('1999-01-01')),
PARTITION P2 VALUES
LESS THAN (to_days('2000-01-01'))
37
)
Partition
pruning
38
Partition pruning
1a - unpartitioned table - SINGLE RECORD
select *
from
table_name
where colx =
120
Partition pruning
1a - unpartitioned table - SINGLE RECORD
D
A
T
A
I
N
D
E
X
select *
from
table_name
where colx =
120
Partition pruning
1b - unpartitioned table - SINGLE RECORD
select *
from
table_name
where colx =
350
Partition pruning
1c - unpartitioned table - RANGE
select *
from
table_name
where colx
between 120
and 230
Partition pruning
2a - table partitioned by colx - SINGLE REC
1-99
100-199
200-299
300-399
400-499
500-599
select *
from
table_name
where colx =
120
Partition pruning
2a - table partitioned by colx - SINGLE REC
1-99
D
A
T
A
100-199
200-299
300-399
400-499
500-599
I
N
D
E
X
select *
from
table_name
where colx =
120
Partition pruning
2b - table partitioned by colx - SINGLE REC
1-99
100-199
200-299
300-399
400-499
500-599
select *
from
table_name
where colx =
350
Partition pruning
2c - table partitioned by colx - RANGE
1-99
100-199
200-299
300-399
400-499
500-599
select *
from
table_name
where colx
between 120
and 230
Partition pruning
EXPLAIN
select *
before from table_name
where colx = 120
in 5.1
EXPLAIN PARTITIONS
select *
from table_name
where colx = 120
WHEN
51
52
HANDS
ON
53
54
54
54
55
56
57
58
59
60
61
# NOT PARTITIONED
62
63
LEVERAGING
REPLICATION
64
Replication schemes
MASTER
INNODB
NOT
PARTITIONED
INNODB
NOT
PARTITIONED
SLAVE
concurrent insert
concurrent read
INNODB
PARTITIONED
BY RANGE
SLAVE
concurrent batch processing
MyISAM
PARTITIONED
BY RANGE
SLAVE
large batch processing
65
Replication schemes
MASTER
SLAVE
INNODB
PARTITIONED
MyISAM
PARTITIONED
batch processing
concurrent insert
INNODB
NON
PARTITIONED
SLAVE
66
concurrent reads
MASTER
INNODB
NOT
PARTITIONED
ARCHIVE
PARTITIONED
SLAVE BY RANGE
(locations)
concurrent insert
ARCHIVE
PARTITIONED
BY RANGE
(date)
dimensional processing
SLAVE
dimensional processing
SLAVE
ARCHIVE
PARTITIONED
BY RANGE
(product)
67
dimensional processing
PITFALLS
68
Expressions
Partition by function
Search by column
69
WRONG
PARTITION BY RANGE(year(from_date))
select count(*) from salaries where
year(from_date) = 1998;
+----------+
| count(*) |
+----------+
|
247489 |
+----------+
1 row in set (2.25 sec)
70
RIGHT
PARTITION BY RANGE(year(from_date))
select count(*) from salaries where
from_date between '1998-01-01' and
'1998-12-31';
+----------+
| count(*) |
+----------+
|
247489 |
+----------+
1 row in set (0.46 sec)
71
EXPLAIN
explain partitions select count(*)
from salaries where year(from_date)
= 1998\G
***** 1. row ****
id: 1
select_type: SIMPLE
table: salaries
partitions:
p01,p02,p03,p04,p05,p06,p07,p08,p09
,p10,p11,p12,p13,p14,p15,p16,p17,p1
8,p19
type: index
72
...
EXPLAIN
explain partitions select count(*)
from salaries where from_date
between '1998-01-01' and
'1998-12-31'\G
***** 1. row ****
id: 1
select_type: SIMPLE
table: salaries
partitions: p14,p15
...
73
BENCHMARKS
74
engine
innodb
myisam
archive
innodb partitioned (whole)
innodb partitioned (file per table)
myisam partitioned
archive partitioned
storage
(MB)
221
181
74
289
676
182
72
query year
2000
1.25
query year
2002
1.25
MyISAM
1.72
1.73
Archive
2.47
2.45
InnoDB partitioned
whole
InnoDB Partitioned
(file per table)
MyISAM partitioned
0.24
0.10
0.45
0.10
0.18
0.12
Archive partitioned
0.22
0.12
engine
innodb (with PK)
myisam (with PK)
archive
innodb partitioned (no PK)
myisam partitioned (no PK)
archive partitioned
storage
(GB)
330
141
13
237
107
13
6 month range
InnoDB
4 min 30s
MyISAM
25.03s
Archive
22 min 25s
13.19
6.31
4.45
16.67
8.97
TOOLS
79
helper-improving-usability.html
> A Perl script that creates partitioning statements
80
THAN
THAN
THAN
THAN
THAN
THAN
THAN
THAN
THAN
THAN
(1000)
(2000)
(3000)
(4000)
(5000)
(6000)
(7000)
(8000)
(9000)
(10000)
81
(to_days('2004-01-01'))
(to_days('2005-01-01'))
(to_days('2006-01-01'))
(to_days('2007-01-01'))
(to_days('2008-01-01'))
(to_days('2009-01-01'))
82
(to_days('2008-01-01'))
(to_days('2008-02-01'))
(to_days('2008-03-01'))
(to_days('2008-04-01'))
(to_days('2008-05-01'))
(to_days('2008-06-01'))
(to_days('2008-07-01'))
(to_days('2008-08-01'))
(to_days('2008-09-01'))
(to_days('2008-10-01'))
(to_days('2008-11-01'))
(to_days('2008-12-01'))
(to_days('2009-01-01'))
83
TIPS
84
TIPS
For large table ( indexes > available RAM)
> DO NOT USE INDEXES
> PARTITIONS ARE MORE EFFICIENT
85
TIPS
Before adopting partitions in production, benchmark!
you can create partitions of different sizes
> in historical tables, you can partition
current data by day
recent data by month
distant past data by year
> ALL IN THE SAME TABLE!
TIPS
For large historical tables, consider using ARCHIVE
+ partitions
> You see benefits for very large amounts of data ( > 500
MB)
> Very effective when you run statistics
> Almost the same performance of MyISAM
> but 1/10 of storage
87
Thank you!
Giuseppe Maxia
http://[email protected]
[email protected]
BONUS
SLIDES
89
ANNOYANCES
90
91
91
92
92
93
95