International Journal of Innovative Technology and Exploring Engineering (IJITEE)

ISSN: 2278-3075, Volume-3, Issue-5, October 2013

Database Partitioning: A Review Paper


Mayur Sawant, Kishor Kinage, Pooja Pilankar, Nikhil Chaudhari

Manuscript received October, 2013.
Mayur Mahadev Sawant, Department of Information Technology, MIT College of Engineering, Pune, India.
Dr. Kishor Kinage, Professor, Department of Information Technology, MIT College of Engineering, Pune, India.
Pooja Shashikant Pilankar, Department of Computer, Ramrao Adik Institute of Technology, Mumbai, India.
Nikhil Anil Chaudhari, Department of Information Technology, MIT College of Engineering, Pune, India.

Abstract— Data management is a tedious task in a growing data environment. Partitioning is among the best available solutions, although it is only partially adopted. Partitioning provides availability, easier maintenance and improved query performance to database users. This paper focuses on the three key methods of partitioning, which help to reduce response time. It also investigates composite partitioning strategies, which combine date, range and hash partitions. The paper reports encouraging results with the partitioning methods and the basic composite partitioning strategies.

Index Terms— Database partitioning, Dbms_Redefinition, Range Partitioning, Hash Partitioning, List Partitioning

I. INTRODUCTION
Partitioning allows a table, index or index-organized table to be decomposed into smaller parts called partitions. Each partition has its own name and may optionally have its own characteristics.
The partition key determines how rows are assigned to partitions. It comprises one or more columns whose values decide the partition in which a row is stored. Any table can be partitioned, but columns of the CLOB (Character Large Object) and BLOB (Binary Large Object) data types cannot serve as the partition key.
A few guidelines indicate when a table should be partitioned. Tables larger than 2 GB, tables that store historical data and tables whose data must be spread across different types of storage devices are candidates for partitioning. Partitioning such tables offers enhanced performance.
There are three reasons to partition an index: to avoid rebuilding the entire index, to perform maintenance without invalidating the entire index, and to reduce the impact of index skew. When partitioning index-organised tables, the partition key columns should belong to the primary key. Secondary indexes can be partitioned, and in index-organised tables the OVERFLOW data segments are equi-partitioned with the table partitions.
System partitioning provides scalability, availability and manageability without the database controlling data placement. The benefits of partitioning include performance, manageability and availability.
Partitioning for performance relies on partition pruning and partition-wise joins. Partition pruning lets a query read only the relevant partitions instead of the entire table. If a table contains 3 years of historical data partitioned by week, a query requesting a single week accesses a single partition instead of all 156. A partition-wise join decomposes a large join into smaller joins between partitions, which improves performance.
Partitioning for availability follows the ‘divide and conquer’ approach to managing data. It helps with maintenance operations on mission-critical tables. Storing different partitions at different locations enhances availability.

A. Partition Strategies
Oracle partitioning offers three fundamental partitioning strategies:
• Range
• Hash
• List
Using these strategies, a table can be partitioned either as a single-level or as a composite partitioned table:
• Single-Level partitioning
• Composite partitioning
Range partitioning separates the data according to ranges of values of the partitioning key. For example, the partition for July 2013 holds the rows whose partitioning key values run from 1st July 2013 to 31st July 2013. Each partition has a ‘VALUES LESS THAN’ clause that sets its exclusive upper bound, and MAXVALUE can be specified for the last partition so that it accepts every value above the highest defined bound.
Hash partitioning maps the data to partitions according to a hash algorithm. It distributes the data evenly across devices. List partitioning assigns a set of discrete values to each partition. If a table holds data about business centres across the globe, list partitioning can separate the rows by country.
Composite partitioning combines the basic distribution methods. It decomposes each partition into sub-partitions. The composite strategies are:
• Composite Range-Range partitions
• Composite Range-Hash partitions
• Composite Range-List partitions
• Composite List-Range partitions
• Composite List-Hash partitions
• Composite List-List partitions
In addition to the basic partitioning strategies, the following partitioning extensions are provided:
• Manageability Extensions
• Partitioning key Extensions
The manageability extensions provide interval partitioning and the Partition Advisor. Interval partitioning is an extension of range partitioning. A new partition is created automatically when inserted data falls beyond the range of the last existing partition. For a table partitioned by month, a new partition is created once data for the next month arrives. Interval partitioning can be used for single-level partitioned tables as well as for the following composite partitioned tables:
• Interval-Range
• Interval-Hash
• Interval-List
The Partition Advisor is part of the SQL Access Advisor. It can recommend a partitioning strategy by studying a workload taken from the SQL cache or from a SQL Tuning Set.
Partitioning key extensions add flexibility in defining the partition key. Reference partitioning and virtual column-based partitioning fall under key extensions.
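As an illustration of the three basic strategies (range, hash and list), the following minimal Python sketch routes a partitioning key to a partition. It is a conceptual model only, not Oracle code: the partition names, dates and country lists are hypothetical, and CRC32 merely stands in for whatever hash function a real database uses internally.

```python
from bisect import bisect_right
from datetime import date
import zlib

# Range partitioning: each bound plays the role of a 'VALUES LESS THAN' clause.
# The partition names and dates are hypothetical.
RANGE_BOUNDS = [date(2013, 7, 1), date(2013, 8, 1)]
RANGE_NAMES = ["p_before_jul_2013", "p_jul_2013", "p_max"]  # p_max acts like MAXVALUE


def range_partition(key):
    """Return the first partition whose exclusive upper bound is above key."""
    return RANGE_NAMES[bisect_right(RANGE_BOUNDS, key)]


def hash_partition(key, num_partitions):
    """Map a key to a partition number with a deterministic hash.

    CRC32 is only a stand-in; the hash a real database uses is internal.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


# List partitioning: each partition owns an explicit set of discrete values.
LIST_PARTITIONS = {
    "p_asia": {"India", "Japan"},
    "p_europe": {"France", "Germany"},
}


def list_partition(country):
    """Return the partition whose value list contains country."""
    for name, values in LIST_PARTITIONS.items():
        if country in values:
            return name
    # Rejecting unmapped values is a choice made for this sketch.
    raise ValueError(f"no partition accepts {country!r}")


if __name__ == "__main__":
    print(range_partition(date(2013, 7, 15)))  # p_jul_2013
    print(range_partition(date(2013, 8, 1)))   # p_max
    print(hash_partition(1001, 4))
    print(list_partition("India"))             # p_asia
```

Because the upper bounds are exclusive, 1st August 2013 falls into the catch-all partition rather than the July partition.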

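The partition pruning example from the introduction (three years of weekly partitions, a query for one week) can be checked with a short Python sketch. The start date is a hypothetical assumption, and the model only computes which partitions a date-range predicate can touch; it does not represent how a query optimizer is implemented.

```python
from bisect import bisect_right
from datetime import date, timedelta

START = date(2011, 1, 3)   # hypothetical first day stored in the table
NUM_PARTITIONS = 3 * 52    # three years of weekly partitions = 156

# Exclusive upper bound of each weekly partition.
UPPER_BOUNDS = [START + timedelta(weeks=i + 1) for i in range(NUM_PARTITIONS)]


def partition_index(day):
    """Return the position of the weekly partition that stores day."""
    if not START <= day < UPPER_BOUNDS[-1]:
        raise ValueError("date falls outside the partitioned range")
    return bisect_right(UPPER_BOUNDS, day)


def prune(first_day, last_day):
    """Return the partition positions a query on [first_day, last_day] must read."""
    return list(range(partition_index(first_day), partition_index(last_day) + 1))


if __name__ == "__main__":
    week_start = START + timedelta(weeks=10)
    touched = prune(week_start, week_start + timedelta(days=6))
    print(f"{len(touched)} of {NUM_PARTITIONS} partitions read")  # 1 of 156 partitions read
```

The single-partition result holds when the requested week lines up with a partition boundary. A seven-day window that straddles two weekly partitions would read two of them, which is still far fewer than 156.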
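Interval partitioning, described above as an extension of range partitioning, can be sketched in the same spirit. The toy class below assumes a monthly interval and a hypothetical transition date. Keys below the transition date go to one manually defined partition, and a monthly partition is created the first time a row for that month arrives. The class and partition names are illustrative and do not correspond to any database API.

```python
from datetime import date


class IntervalPartitionedTable:
    """Toy model of a monthly interval-partitioned table (names are hypothetical)."""

    def __init__(self, transition):
        # Keys below the transition date go to one manually defined range partition.
        self.transition = transition
        self.partitions = {"p_initial": []}

    def insert(self, key, row):
        """Store row and return (partition_name, created_now)."""
        if key < self.transition:
            name = "p_initial"
        else:
            name = f"p_{key.year}_{key.month:02d}"  # one partition per calendar month
        created = name not in self.partitions
        self.partitions.setdefault(name, []).append(row)
        return name, created


if __name__ == "__main__":
    table = IntervalPartitionedTable(transition=date(2013, 8, 1))
    print(table.insert(date(2013, 7, 15), "row A"))  # ('p_initial', False)
    print(table.insert(date(2013, 8, 2), "row B"))   # ('p_2013_08', True)
    print(table.insert(date(2013, 8, 30), "row C"))  # ('p_2013_08', False)
```

In this model a partition appears only when data for its month is inserted, so a month without data has no partition.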

Virtual column-based partitioning allows a table to be partitioned even when the partitioning key is not physically present in the table. The partition key can be defined by an expression that uses one or more existing columns. The expression is stored as metadata. Reference partitioning allows the partitioning of two tables that are related to one another by referential integrity.
This paper focuses on partitioning concepts. The rest of the paper is organised as follows. In Section II, three papers related to database partitioning are discussed. Section III describes the experiments conducted during the study of the topic. Section IV presents the conclusions drawn from the experiments and closes the paper with the future scope.

II. LITERATURE SURVEY
The near-uniform range partition (NURP) approach [1] is based on range partitioning. It applies the ‘divide and conquer’ rule to reduce complexity and increase the performance of the database. Partitioning traditionally relies on the uniform range partitioning algorithm. To speed up partitioning, the current technique is studied and three efficient range partitioning strategies are added. To balance the data in the partitions, three more strategies are used. The aim of these strategies is to automate partitioning. NURP-I distributes the data equally among the partitions, which uniform range partitioning does only approximately. An adjustment stage checks each partition and rebalances the data by splitting and merging. NURP-II uses a single query rather than a loop, which increases execution speed. NURP-III automates partitioning as the data grows. NURP works efficiently on large databases, and its advantage is that it automates their partitioning.
The Near-uniform range Partitioning Algorithm (NPA) [2] is an increased partitioning approach for massive data in a real-time data warehouse. The work addresses the new challenges faced by data warehouses and applies range partitioning through NPA. It also focuses on increased partitioning and on the efficiency of the data warehouse. NPA also covers multilevel partitioning. It checks small tables, which helps to determine the complexity and thus to increase performance. As the data keeps growing, real-time partitioning is carried out with a new partitioning plan. The same concept can be applied to the star schema of a data warehouse.
Shinobi [3] improves query performance by using horizontal partitioning. It clusters the physical data and improves performance by indexing frequently accessed data. The paper presents design algorithms that optimally partition the table and manage the partitions. It applies the index partitioning approach to a real-world query workload from a traffic monitoring application.
Partitioning in a data warehouse [4] is discussed together with the Exchange Partition method used during ETL (Extract, Transform, Load). Exchange partition and a basic algorithm help the ETL process to work in a simpler way.

III. APPROACHES
Database partitioning can be done by using four different methods.
3.1. expdp/impdp
3.2. Dbms_Redefinition method
3.3. EXCHANGE_PARTITION method
3.4. Partition Advisor
In the first approach, Import-Export commands are used to partition a table. This approach has two steps. The first step is to export the data from the non-partitioned table. The second is to import it into the partitioned table. Fig. 1 and Fig. 2 show the import-export commands.

Fig. 1. Export command

Fig. 2. Import command

The second approach is based on the Dbms_Redefinition method. This method has five basic steps.
3.2.1 Create a sample schema
3.2.2 Create a partitioned interim table
3.2.3 Start the redefinition process
3.2.4 Create constraints and indexes
3.2.5 Complete the redefinition process
The process first creates the sample schema that is to be partitioned. The next step is to create a partitioned interim table with the required number of partitions. With this interim table, the online redefinition process can be started. First, check whether redefinition is possible by using
Dbms_Redefinition.Can_redef_table(USER, 'Table_name').
If redefinition is possible, start the redefinition process. If the redefinition process fails for some reason, use Dbms_Redefinition.Abort_redef_table to abort it. After this step completes successfully, copy the table dependencies and synchronize the interim table. The last step is to complete the redefinition process. Use user_tables to confirm that the table is partitioned and user_tab_partitions to check the number of partitions. Fig. 3 shows the entire Dbms_Redefinition process.
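The online redefinition workflow (start, capture changes, synchronize, finish or abort) can be sketched as a toy Python model. This is not the DBMS_REDEFINITION API and does not show how Oracle implements it. The class, method names and sample rows are hypothetical, deletes are left out, and the only goal is to show why the original table can stay in use while the partitioned interim copy is built.

```python
class ToyOnlineRedefinition:
    """Toy model of online redefinition; it does not call or mirror Oracle's API."""

    def __init__(self, rows, partition_of):
        self.rows = dict(rows)            # original table: primary key -> row
        self.partition_of = partition_of  # function choosing a partition for a row
        self.interim = None               # partitioned interim table, built by start()
        self.pending = []                 # changes made while redefinition is running

    def start(self):
        """Copy a snapshot of the original rows into the partitioned interim table."""
        self.interim = {}
        for pk, row in self.rows.items():
            self._place(pk, row)

    def write(self, pk, row):
        """Insert or update a row; the table stays usable during redefinition."""
        self.rows[pk] = row
        if self.interim is not None:
            self.pending.append((pk, row))

    def sync(self):
        """Apply the changes captured since the last synchronisation."""
        for pk, row in self.pending:
            self._place(pk, row)
        self.pending.clear()

    def finish(self):
        """Synchronise one last time and return the partitioned result."""
        if self.interim is None:
            raise RuntimeError("redefinition was not started")
        self.sync()
        result, self.interim = self.interim, None
        return result

    def abort(self):
        """Discard the interim table; the original rows are left untouched."""
        self.interim = None
        self.pending.clear()

    def _place(self, pk, row):
        for partition in self.interim.values():
            partition.pop(pk, None)       # drop any older copy of the row
        self.interim.setdefault(self.partition_of(row), {})[pk] = row


if __name__ == "__main__":
    rows = {1: {"country": "India"}, 2: {"country": "France"}}
    job = ToyOnlineRedefinition(rows, lambda row: row["country"])
    job.start()
    job.write(3, {"country": "India"})  # arrives while redefinition is running
    partitions = job.finish()
    print({name: sorted(part) for name, part in partitions.items()})
    # {'India': [1, 3], 'France': [2]}
```

Row 3 is written after the snapshot was taken, yet it still reaches the partitioned result because the final synchronisation replays the captured changes.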


Fig. 3. Dbms_Redefinition process

The DBMS_REDEFINITION process provides two advantages. The redefinition is performed online, and partitioning can be done while the database is up and running. This is one of its best advantages over the other methods.
The third approach is Exchange Partition. The database should be offline for this approach because it involves DDL operations. This approach has 4 steps.
3.3.1 Create a sample schema
3.3.2 Create a partitioned interim table
3.3.3 Exchange Partition
3.3.4 Split Partition
First, a sample schema is created. Then an appropriate partitioned table is created as the destination table. The next two steps are exchange partition and split partition. Exchange partition exchanges a partition between the source and destination tables. Split partition divides a partition into smaller partitions. Fig. 4 shows the entire Exchange Partition approach.

Fig. 4. Partition Exchange process

The fourth approach is the Partition Advisor, which uses the SQL Access Advisor introduced in 10g. The SQL Access Advisor provides important details about additional indexes and materialized views that improve system performance. The Partition Advisor suggests partitioning schemes to enhance throughput.
The Partition Advisor can be used in three ways. The first is Enterprise Manager, which provides a simple interface to the SQL Access Advisor (Advisor Central > SQL Advisor > SQL Access Advisor). The other two are the DBMS_ADVISOR and DBMS_SQLTUNE packages. Fig. 5 shows the procedure using DBMS_ADVISOR.

Fig. 5. DBMS_ADVISOR procedure

IV. CONCLUSIONS
We have discussed 4 methods in the last section. These methods partition a database by using different approaches, and each of them is best in its respective environment.
The Dbms_Redefinition method is best suited when partitioning is to be done online. The SQL Access Advisor can be used when a GUI-based approach is needed. Partition exchange is appropriate when sub-partitioning is to be done offline. The Import-Export approach can be used when dealing with large data. Table 1 shows the comparison of all methods.

TABLE 1 Comparison of partitioning approaches
Methods               Online  Offline  GUI  Large Data
Import-Export         -       Yes      -    Yes
Dbms_Redefinition     Yes     -        -    -
Partition Exchange    -       Yes      -    -
Partitioning Advisor  -       -        Yes  -

V. ACKNOWLEDGMENT
We are thankful to Mr Prathamesh Chavan for his genuine guidance on data management.

REFERENCES
[1] Wen Qi, Jie Song and Yu-bin Bao, Near-uniform Range Partition Approach for Increased Partitioning in Large Database, IEEE, 978-1-4244-5265-1/10, 2010
[2] Jie Song and Yubin Bao, NPA: Increased Partitioning Approach for Massive Data in Real-time Data Warehouse, IEEE, 978-1-4244-7585-8/10, 2010
[3] Eugene Wu and Samuel Madden, Partitioning Techniques for Fine-grained Indexing, IEEE, 978-1-4244-8960-2/11, 2011
[4] Scaling to Infinity: Partitioning in Oracle Data Warehouses, SageLogix, Inc., White Paper

Mayur Mahadev Sawant received his B.E. degree in Information Technology from Mumbai University. He is currently pursuing a Master of Engineering at MIT College of Engineering, Pune University. His research areas include Database Partitioning, Keystroke Biometrics and Mouse Dynamics.

Dr. Kishor Kinage graduated from S. S. G. M. College of Engineering, Shegaon in 1989 and completed his post graduation at V. J. T. I., Mumbai in 1998. He completed his PhD at NMIMS University, Mumbai in 2011. He is currently working as Professor in Information Technology at MIT College of Engineering, Pune. His special fields of interest include Biometrics, face recognition and Geographic Information Systems.

Pooja Shashikant Pilankar received her B.E. degree in Computer Science from Solapur University. She is currently pursuing a Master of Engineering at Ramrao Adik Institute of Technology, Mumbai University. Her research areas include Database and Java.

Nikhil Anil Chaudhari received his B.E. degree in Information Technology from Pune University. He is currently pursuing a Master of Engineering at MIT College of Engineering, Pune University. His research areas include Database Partitioning and Wireless Sensor Networks.
