RDBMS in The Cloud: Oracle Database On AWS: October 2013
October 2013
Table of Contents
Abstract
Oracle Database Solutions on AWS
  Oracle Database on Amazon RDS
  Oracle Database on Amazon EC2
  Other Database Scenarios
Choosing between Amazon RDS and Amazon EC2 for an Oracle Database
  Oracle Database Feature Comparison between Amazon RDS and Amazon EC2
Oracle Licensing and Support
Starting an Oracle Database Instance on AWS
  Starting an Oracle Database Instance on Amazon RDS
  Starting an Oracle Database instance in Amazon EC2
Performance Management
  Instance Sizing
  Disk I/O Management in Amazon RDS
  Disk I/O Management in Amazon EC2
  Caching
  Database Replicas
High Availability
  High Availability Features in AWS
  High Availability Features in Oracle
  High Availability Architecture in Amazon RDS
  High Availability Architecture in Amazon EC2
Backup and Restore
  Backup and Restore on Amazon RDS
  Backup and Restore in Amazon EC2
Monitoring and Management
  Amazon RDS Monitoring
  Monitoring and Management in Amazon EC2
Security
  Amazon VPC
  Oracle Security in Amazon RDS
  Oracle Security in Amazon EC2
AWS for On-Premise Oracle Environments
  Backing up On-Premise Oracle Databases in AWS
  Disaster Recovery on AWS for On-Premise Oracle Databases
  Migrating your On-Premise Oracle Database to AWS
Managing Cost
  Reducing Cost with Reserved Instances
  Other Options to Reduce Costs
Conclusion
Further Reading
Abstract
Amazon Web Services (AWS) is a flexible, cost-effective, easy-to-use cloud computing platform. Relational database management systems (RDBMS) are widely deployed within the Amazon cloud. In this whitepaper, we help you understand how to deploy Oracle Database on AWS. You can run Oracle Database on Amazon Relational Database Service (Amazon RDS) or Amazon Elastic Compute Cloud (Amazon EC2). The goal of this whitepaper is to explain how you can run Oracle Database on both Amazon RDS and Amazon EC2, and to give you an understanding of the advantages of each approach. We review in detail how to provision and monitor your Oracle database, and how to manage scalability, performance, backup and recovery, high availability, and security in both Amazon RDS and Amazon EC2. We also describe how you can set up a disaster recovery solution between an on-premise Oracle environment and AWS, and how you can migrate your existing Oracle database to AWS. After reading this whitepaper, you will be able to make an educated decision and choose the solution that best fits your needs.
NoSQL
If your application needs to store huge quantities of data with high throughput, you may find NoSQL databases to be more appropriate for your needs than relational databases. While NoSQL databases are not relational and not ACID compliant, they are typically more adept at storing and retrieving large amounts of data with high performance. Also, NoSQL databases are significantly easier to manage. Many AWS customers run NoSQL databases in Amazon EC2, including Cassandra, MongoDB, Redis, CouchDB, and HBase. If you do not want to manage your NoSQL database, Amazon DynamoDB is a fast, fully managed AWS NoSQL database that provides predictable performance and seamless scalability. All data items are stored on solid state drives (SSDs) and are automatically replicated across multiple Availability Zones (within a region) to provide built-in high availability and data durability. For more information about DynamoDB, see http://aws.amazon.com/dynamodb.

Data Warehousing
While you can use Oracle Database for data warehousing purposes, an alternative is Amazon Redshift. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year. For more information about Amazon Redshift, see http://aws.amazon.com/redshift.

BLOB Storage
While Oracle Database supports BLOB storage and management, you may find Amazon Simple Storage Service (Amazon S3) to be a better and simpler choice for BLOB storage if your application makes heavy use of these objects (video, audio, images, and so on). Many AWS customers have found it useful to store BLOB-style data in Amazon S3 while using a NoSQL database such as DynamoDB to manage BLOB metadata. For more information about Amazon S3, see http://aws.amazon.com/s3.
Choosing between Amazon RDS and Amazon EC2 for an Oracle Database
For an Oracle database, both Amazon RDS and Amazon EC2 have advantages. Amazon RDS is easier to set up, manage, and maintain than running Oracle in Amazon EC2, and lets you focus on other tasks rather than the day-to-day administration of Oracle. Alternatively, running Oracle in Amazon EC2 gives you more control, flexibility, and choice. Depending on your application and your requirements, one might be preferable over the other.

Amazon RDS might be a better choice for you if:
- You want to focus on your business and applications, and outsource the following tasks to Amazon: provisioning of the database, management of backup and recovery, management of security patches, upgrades of minor Oracle versions, and storage management.
- You need a highly available database solution and want to take advantage of the push-button, synchronous Multi-AZ replication offered by Amazon RDS, without having to manually set up and maintain a standby database.
- You do not want to manage backups and, most importantly, point-in-time recoveries of your database.
- You would rather focus on high-level tasks, such as performance tuning and schema optimization, than on the daily administration of the database.
Amazon EC2 might be a better choice for you if:
- You need full control over the database, including access to the operating system.
- You have experienced database administrators who will be able to manage the database.
- Your database size exceeds the current maximum database size in Amazon RDS (3TB at the time of this writing).
- You need to use Oracle features or options not currently supported by Amazon RDS.
- You want to set up a disaster recovery solution between several AWS regions, or between your on-premise environment and AWS.
Oracle Database Feature Comparison between Amazon RDS and Amazon EC2
The capabilities of Amazon RDS for an Oracle database are constantly improving. However, there are a few Oracle Database features and options that are currently not supported by Amazon RDS. Here is a list of Oracle Database features that are not directly supported by Amazon RDS, but where Amazon RDS provides equivalent functionality:
- Oracle Data Guard and Oracle Active Data Guard allow you to create standby databases. Amazon RDS provides similar functionality when run in Multi-AZ mode.
- Oracle Enterprise Manager (OEM) Grid Control allows you to manage multiple Oracle database and application servers. While Amazon RDS does not support Grid Control, you can manage your Amazon RDS databases individually with OEM Database Control.
- Automatic Storage Management (ASM) simplifies the management of Oracle data files, control files, and log files. It is, however, unnecessary in Amazon RDS because Amazon RDS manages all the Oracle files for you.
- Streams can be configured to propagate database changes within and between Oracle databases, and can be used for replication purposes. Instead, you can use Amazon RDS Multi-AZ to create a synchronous standby database.
- Oracle XML DB provides native XML storage and retrieval capabilities. It is supported in Amazon RDS without the XML DB Protocol Server.

Real Application Clusters (RAC) is a cluster database with a shared cache and shared disk architecture. Currently, you cannot run RAC in Amazon EC2 either. Some of the customers that require RAC choose to host their databases outside of AWS while running the rest of their infrastructure in AWS. To do this while providing good database performance, they host their RAC database at an AWS partner datacenter, which is linked via AWS Direct Connect to the AWS datacenter where the rest of their infrastructure exists. AWS Direct Connect lets you establish a dedicated network connection between the AWS Direct Connect location and the closest AWS region. It provides low-latency 1 Gbps and 10 Gbps connections, and you can easily provision multiple connections if you need more network capacity. For more information about AWS Direct Connect, see http://aws.amazon.com/directconnect/.
Currently, the following Oracle options are not supported in Amazon RDS:
- Oracle Java support allows you to deploy Java server-side applications in the database.
- Oracle Locator and Oracle Spatial help you manage geographic and location data in a native type within the database.
If you need to use any of these database features and options, Amazon EC2 is currently the best-suited deployment platform. This list is subject to change as we add new features to Amazon RDS. To check for recent updates, see the Amazon RDS documentation at http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.Oracle.Options.html.
You then configure your instance by specifying a few parameters, such as the size of the Amazon RDS instance, the size of the database, and the I/O settings. This configuration is explained in more detail later in this whitepaper. You can also accomplish all of this by using the command-line interface (CLI) or by invoking the application programming interface (API) from several programming languages, including Java, Node.js, PHP, Python, Ruby, and .NET.
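As an illustration of the API route, the following Python sketch assembles the parameters for an Amazon RDS CreateDBInstance request. It is a minimal sketch under stated assumptions: the helper name, the identifier, the credentials, and the choice of the `oracle-se1` engine with the `license-included` model are placeholders chosen for illustration and are not taken from this whitepaper. The function only builds the request; sending it (for example with the boto3 SDK, which this whitepaper does not name) is left as a comment.

```python
def build_rds_oracle_request(identifier, instance_class, storage_gb,
                             iops=None, multi_az=False):
    """Assemble keyword arguments for an RDS CreateDBInstance call.

    All values below are illustrative placeholders, not recommendations.
    """
    params = {
        "DBInstanceIdentifier": identifier,
        "Engine": "oracle-se1",               # assumed edition for the example
        "LicenseModel": "license-included",   # assumed licensing model
        "DBInstanceClass": instance_class,    # size of the RDS instance
        "AllocatedStorage": storage_gb,       # size of the database storage, in GB
        "MultiAZ": multi_az,                  # synchronous standby in another AZ
        "MasterUsername": "admin_user",       # placeholder credentials
        "MasterUserPassword": "CHANGE_ME",
    }
    if iops is not None:
        params["Iops"] = iops                 # I/O setting: Provisioned IOPS
    return params


request = build_rds_oracle_request("orcl-demo", "db.m1.large", 100, iops=1000)
# With an SDK such as boto3 (an assumption), the request could be sent as:
#   boto3.client("rds").create_db_instance(**request)
```

Keeping the request as plain data makes it easy to review or version the instance size, storage size, and I/O settings before anything is provisioned.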
Performance Management
The performance of a relational database instance on AWS depends on many factors, including the Amazon RDS or Amazon EC2 instance type, the configuration of the database software, the application workload, and for Oracle databases running on Amazon EC2 instances, the storage configuration. The following sections describe various options that are available to you to tune the performance of the AWS infrastructure on which your Oracle database is running.
Instance Sizing
Increasing the performance of a database requires an understanding of which of the server's resources is the performance constraint. If the database performance is limited by CPU, memory, or network throughput, you can scale up the memory, compute, and network performance by choosing a larger instance type. For many customers, increasing the performance of a single DB instance is the easiest way to increase the performance of their application overall. One way to achieve this is with vertical scalability: you scale the instance size up (or down) to match the hardware performance requirements of the database. In the Amazon RDS and Amazon EC2 environments, vertical scalability is very easy.
When running high-performance databases, the High-Memory Instances can be a good option because they allow you to maximize the amount of memory available to the SGA (System Global Area) of the database. The Cluster Compute Instances combine very high CPU capability and high memory. You should also consider the High I/O Instances because they feature local SSD drives and offer the most I/O of any instance type. The High Memory Cluster Instances feature a high amount of memory with local SSD storage and can be a good choice for the largest database instances. For more information about the impact of the instance size on I/O performance, see the Disk I/O Management sections.

In AWS, it is very easy and quick to scale vertically (change the instance type and size) if you find out that you undersized or oversized your instance. The method to change the size of the instance depends on the type of AMI that you selected:
- EBS-backed instances are instances where the root device is stored in Amazon Elastic Block Store (Amazon EBS). In this case, you can just stop the instance, change the instance type (through the AWS Management Console, the CLI, or an API), and restart the instance.
- Instance-store backed instances are instances where the root device is stored on the instance's internal storage. In this case, you would save any changes to the root device (for example, by rebundling an AMI), terminate the instance, and start a new one.

In both scenarios, the instance size change can be accomplished within a few minutes.
Note: To determine whether your instance is EBS-backed or instance-store backed, you can look at the Amazon EC2 instances dashboard in the AWS Management Console, or parse the output of the CLI command ec2-describe-images -v <AMI ID> and look for the value of the rootDeviceType field.
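The two resize procedures can be summarized as a small decision helper. This is a sketch only: the function name and the wording of the steps are illustrative, and it assumes that rootDeviceType takes the values `ebs` and `instance-store`.

```python
def resize_steps(root_device_type):
    """Return the ordered steps for changing the instance type.

    root_device_type is the value of the rootDeviceType field
    (assumed to be "ebs" or "instance-store").
    """
    if root_device_type == "ebs":
        return [
            "stop the instance",
            "change the instance type",
            "start the instance",
        ]
    if root_device_type == "instance-store":
        return [
            "save root device changes (for example, rebundle an AMI)",
            "terminate the instance",
            "launch a new instance of the desired type",
        ]
    raise ValueError(f"unknown root device type: {root_device_type!r}")


for step in resize_steps("ebs"):
    print(step)
```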
AWS Management Console or the Amazon RDS APIs, you can provision from 1,000 IOPS to 30,000 IOPS with corresponding storage ranging from 100GB to 3TB. You can start small and scale up in increments of 1,000 IOPS. Here are some important things to know about Provisioned IOPS in Amazon RDS:
- The ratio of Provisioned IOPS to the amount of storage (in GB) should be between 3 and 10. For example, a 100GB database could have Provisioned IOPS set between 300 and 1,000.
- Oracle on Amazon RDS uses a page size of 8 KB. On a DB instance with a full duplex I/O channel bandwidth of 1000 megabits per second (Mbps), such as the m2.4xl instance, the maximum IOPS for page I/O is about 12,500 IOPS in each direction for 8 KB I/O. A workload consisting of 50% reads and 50% writes could reach 12,500 IOPS with 16 KB I/O or 25,000 IOPS with 8 KB I/O.
- If you are using Provisioned IOPS storage, we recommend that you use the instances optimized for Provisioned IOPS (currently the m1.large, m1.xlarge, m2.2xlarge, and m2.4xlarge instance classes). The available network bandwidth for Provisioned IOPS for the m1.large instance class is 500 Mbps, compared to 1000 Mbps for an m1.xlarge, m2.2xlarge, or m2.4xlarge instance. For a similar IOPS-intensive workload, the number of realized IOPS for instances with a 1000 Mbps storage network bandwidth will be higher than for other instances.
- You can convert an Amazon RDS database using standard storage to use Provisioned IOPS storage.
- The actual amount of your I/O throughput may vary depending on your workload. For Oracle, the maximum IOPS rate is 25,000 with an 8 KB page size.
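The sizing rules above (1,000 to 30,000 IOPS, increments of 1,000, and an IOPS-to-storage ratio between 3 and 10) can be combined into a quick check of which Provisioned IOPS values are available for a given storage size. This is a minimal sketch: the function name is hypothetical, and it assumes 3TB means 3,072 GB.

```python
MIN_IOPS, MAX_IOPS, IOPS_STEP = 1000, 30000, 1000
MIN_STORAGE_GB, MAX_STORAGE_GB = 100, 3072   # assumes 3TB = 3,072 GB
MIN_RATIO, MAX_RATIO = 3, 10                 # Provisioned IOPS per GB of storage


def rds_piops_choices(storage_gb):
    """List the Provisioned IOPS values that satisfy every rule for storage_gb."""
    if not MIN_STORAGE_GB <= storage_gb <= MAX_STORAGE_GB:
        raise ValueError("storage must be between 100 GB and 3 TB")
    low = max(MIN_IOPS, MIN_RATIO * storage_gb)
    high = min(MAX_IOPS, MAX_RATIO * storage_gb)
    # Round the lower bound up to the next multiple of the 1,000 IOPS increment.
    first = -(-low // IOPS_STEP) * IOPS_STEP
    return list(range(first, high + 1, IOPS_STEP))


print(rds_piops_choices(500))
```

For a 100GB database, the only value that satisfies both the ratio and the 1,000 IOPS minimum is 1,000.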
Instance Storage
An Amazon EC2 instance comes with a certain amount of local storage, which is ephemeral. Any data saved on an instance will not be available after that instance is terminated by the customer, or if the underlying hardware fails, which would cause an instance restart to happen on a different server. This characteristic makes instance storage a challenging option for database persistent storage. However, instance storage can have the following benefits:
- Ephemeral disks offer good performance for sequential disk access, and don't impact your network connectivity. Some customers have found it useful to use ephemeral disks to store temporary files to conserve network bandwidth.
- High I/O instances (discussed later) offer unmatched I/O performance and are recommended for database workloads, provided you implement a backup or replication strategy that addresses the ephemeral nature of this storage.
- High Storage instances offer 48TB of internal storage, which can allow you to run very large databases on instance storage. Just as for High I/O instances, you should mitigate the risk of losing your local storage with a backup or replication strategy.
EBS Volumes
AWS offers a storage service called Amazon Elastic Block Store (Amazon EBS), which provides persistent block-level storage volumes. Amazon EBS volumes are off-instance storage that persist independently from the life of an instance. Amazon EBS volume data is mirrored across multiple servers in an Availability Zone (datacenter) to prevent the loss of data from the failure of any single component. It is easy to back them up to Amazon Simple Storage Service (Amazon S3)
using snapshots. These attributes make EBS volumes suitable for data files, log files, and the flash recovery area. The maximum size of an EBS volume is 1TB, but you can address larger database sizes by striping your data across multiple volumes. There are two types of EBS volumes, standard EBS volumes and Provisioned IOPS volumes, as described in the following sections.

Standard EBS Volumes
Standard EBS volumes provide about 100 IOPS on average, with the ability to burst to hundreds of IOPS on a best-effort basis. Standard EBS volumes are great for applications with moderate or bursty I/O requirements, as well as for boot volumes. Standard EBS volumes tend to perform better for sequential reads and writes (as opposed to random reads and writes).

Provisioned IOPS Volumes
Provisioned IOPS volumes are designed to deliver predictable and consistent high performance for I/O intensive workloads such as databases. With Provisioned IOPS volumes, you specify an IOPS (I/O operations per second) rate when creating a volume, and then Amazon EBS provisions that rate for the lifetime of the volume. Here are some important characteristics of Provisioned IOPS volumes:
- Amazon EBS currently supports up to 4,000 IOPS per Provisioned IOPS volume. By striping across 10 volumes, you could consistently provide your database with up to 40,000 IOPS.
- The number of provisioned IOPS applies to I/O operations with a size of 16KB or less. Beyond 16KB, the number of IOPS decreases proportionally with the size of the I/O. For example, if you provision a 4,000 IOPS volume and your average I/O size is 32KB, you should expect 2,000 IOPS. If your I/O size is 64KB, then you should expect 1,000 IOPS.
- There is a maximum ratio of 10 between the provisioned IOPS and the volume size (in GB). For example, if you provision a 50GB volume, the maximum provisioned IOPS that you could request would be 500.
While providing more consistent performance, Provisioned IOPS can be more cost effective than standard EBS volumes if your database consistently generates a high I/O workload. With Provisioned IOPS you pay for the number of provisioned IOPS (whereas you pay for actual usage for standard EBS volumes). This has the added benefit of making your I/O cost more predictable.
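The Provisioned IOPS volume characteristics described in this section reduce to two small formulas, sketched below. The function and constant names are illustrative; the limits (4,000 IOPS per volume, a 16KB full-rate I/O size, and 10 IOPS per GB) are the ones stated in this section.

```python
MAX_IOPS_PER_VOLUME = 4000   # per Provisioned IOPS volume
FULL_RATE_IO_KB = 16         # provisioned rate applies up to this I/O size
MAX_IOPS_PER_GB = 10         # maximum ratio of provisioned IOPS to volume size


def max_provisioned_iops(volume_gb):
    """Largest IOPS rate that can be requested for a volume of this size."""
    return min(MAX_IOPS_PER_VOLUME, MAX_IOPS_PER_GB * volume_gb)


def expected_iops(provisioned_iops, avg_io_kb):
    """IOPS to expect once the average I/O size is taken into account."""
    if avg_io_kb <= FULL_RATE_IO_KB:
        return provisioned_iops
    # Beyond 16KB, IOPS decrease proportionally with the I/O size.
    return provisioned_iops * FULL_RATE_IO_KB / avg_io_kb


print(max_provisioned_iops(50))    # 50GB volume
print(expected_iops(4000, 32))     # 4,000 IOPS volume, 32KB average I/O
```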
EBS-Optimized Instances
EBS-optimized instances enable Amazon EC2 instances to fully utilize the provisioned IOPS on an EBS volume. EBS-optimized instances deliver dedicated throughput between Amazon EC2 and Amazon EBS, with options between 500 Mbps and 1,000 Mbps, depending on the instance type. When attached to EBS-optimized instances, Provisioned IOPS volumes are designed to deliver within 10% of their provisioned performance 99.9% of the time. The combination of EBS-optimized instances and Provisioned IOPS volumes helps to ensure that instances are capable of consistent and high EBS I/O performance. Most databases with high I/O requirements should benefit from this feature. You can also use EBS-optimized instances with standard EBS volumes if you need predictable bandwidth between your instances and EBS. Note that EBS-optimized instances are currently available only on some of the larger instance types. For more details, see http://aws.amazon.com/ec2/instance-types/.

Choosing the Right Instance Type
If your performance is limited by disk I/O, changes to the configuration of your disk resources may be in order. EBS volumes are connected via the network, and therefore the network throughput available to your instance can have an impact on disk performance. A rule of thumb is that the larger the instance, the more network throughput it can provide. To take full advantage of standard or Provisioned IOPS volumes, you could use the instances running on a 10
Gigabit network, such as the Cluster Compute, High Memory Cluster, and High I/O Instances, or choose the EBS-optimized instances described in the previous section. An increase in network throughput can have a significant impact on network-attached storage performance, so be sure to choose the appropriate instance type.

Spreading the I/O Activity on Several EBS Volumes
To scale up random I/O performance, you can increase the number of EBS volumes your data resides on, for example by using 8 x 100GB EBS volumes instead of 1 x 800GB EBS volume. Aggregating multiple EBS volumes increases the total IOPS of the logical volume. EBS volumes can be aggregated using different techniques, such as Linux software RAID, Logical Volume Manager (LVM), or Oracle Automatic Storage Management (ASM). A single standard EBS volume can provide an average of approximately 100 IOPS, and single instances with arrays of 10+ attached EBS disks can average 1,000 IOPS.

Provisioned IOPS offer an easier and more predictable way to provide your database with high I/O. You can stripe two or more volumes together in order to reach multiple thousands of IOPS. By striping across 10 Provisioned IOPS volumes of 4,000 IOPS each on EBS-optimized instances, you could provide your database with storage volumes capable of up to 40,000 IOPS. However, remember that utilizing striping techniques generally reduces the operational durability of the logical volume by a degree inversely proportional to the number of EBS volumes in the stripe set.

EBS volume data is natively replicated, so using RAID 0 (striping) might provide you with sufficient redundancy and availability. You could use RAID 10 (striping and mirroring) instead of RAID 0 to improve the availability of your aggregate, but in most cases using RAID 10 will decrease your I/O performance compared to RAID 0. The choice between RAID 0 and RAID 10 will depend on your availability requirements and on the number of volumes in your aggregate.
At any rate, you should take snapshots of your EBS volumes as often as possible. Data, logs, and temporary files will benefit from being stored on independent EBS volumes or volume aggregates because they present different I/O patterns. In order to take advantage of additional EBS volumes, be sure to evaluate the network load to help ensure that your instance size is sufficient to provide the network bandwidth required.
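To see why the network load matters when you add volumes, the following rough estimate caps the aggregate IOPS of a stripe set at what the storage network link can carry. It is a back-of-the-envelope sketch, not a sizing tool: the function names are hypothetical, 1 Mbps is taken as 1,000,000 bits per second, 1KB as 1,024 bytes, and protocol overhead is ignored.

```python
def bandwidth_iops_ceiling(link_mbps, io_kb):
    """Upper bound on IOPS that a network link can carry at a given I/O size."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    return bytes_per_second / (io_kb * 1024)


def striped_iops(volumes, iops_per_volume, link_mbps, io_kb):
    """Aggregate IOPS of a stripe set, limited by the storage network link."""
    return min(volumes * iops_per_volume,
               bandwidth_iops_ceiling(link_mbps, io_kb))


# Ten standard volumes (~100 IOPS each) on a 1,000 Mbps link: volume-limited.
print(striped_iops(10, 100, 1000, 16))
# Ten 4,000 IOPS volumes on the same link with 16KB I/O: link-limited.
print(round(striped_iops(10, 4000, 1000, 16)))
```

Under these assumptions, the link rather than the volumes becomes the constraint for large stripe sets with large I/O sizes, which is one reason to benchmark the actual workload on the candidate instance types.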
Oracle ASM disk groups provide three types of redundancy: normal, high, and external. With normal and high redundancy, files are replicated within the disk group. With external redundancy, ASM does not provide any redundancy for the disk group. When setting up ASM for a group of volumes, we recommend using external redundancy since
Amazon EBS volumes are already redundant within an Availability Zone. Oracle ASM best practices, like using different disk groups for data and log files, work and recovery areas, also apply in Amazon EBS.
Benchmarking
As described previously, Amazon EC2 offers many options to optimize and tune your I/O subsystem. We encourage you to benchmark your actual application on several instance types and storage configurations in order to select the most appropriate configuration. For example, you could use Orion (Oracle I/O Numbers), an Oracle I/O calibration tool that mimics the type of I/O performed by Oracle databases and allows you to measure I/O performance for storage systems.
Caching
Whether using Oracle on Amazon EC2 or Amazon RDS, Oracle users confronted with heavy workloads should look into reducing this load by caching data so that the web and application servers do not have to repeatedly access the database. There are several tools (including tools from Oracle) that can address your caching needs:
- Memcached: An open-source, high-performance, distributed memory object caching system. It is an in-memory key-value store for small chunks of arbitrary data (strings, objects) such as results of database calls. Memcached is widely adopted and mostly used to speed up dynamic web applications by alleviating database load.
- Amazon ElastiCache: A web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from a fast, managed, in-memory caching system, instead of relying entirely on slower disk-based databases. ElastiCache is protocol-compliant with Memcached, so code, applications, and popular tools that you use today with existing Memcached environments will work seamlessly with the service. ElastiCache simplifies and offloads the management, monitoring, and operation of a Memcached environment, enabling you to focus on the differentiating parts of your applications.
- Oracle In-Memory Database Cache: A cache built using Oracle TimesTen In-Memory Database that can be deployed in the application tier as an in-memory cache database. Applications perform read/write operations on the cache tables using SQL and PL/SQL with automatic persistence, transactional consistency, and data synchronization with the underlying Oracle database. The writes on the cache can be synchronously or asynchronously replicated to the database.
- Oracle Coherence: A highly scalable and fault-tolerant distributed cache engine. It is an in-memory data grid designed to improve reliability, scalability, and performance. You can provision and configure a Coherence cluster that caches data, and automatically and transparently fails over and redistributes its data management services when a server becomes inoperative or is disconnected from the network.
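Whichever tool you choose, the application-side idea is the same: look in the cache first and query the database only on a miss. The sketch below shows this cache-aside pattern with an in-process dictionary standing in for a Memcached or ElastiCache client. The class, its method names, and the `db_reads` counter are hypothetical and exist only to make the effect visible.

```python
class CacheAside:
    """Read-through helper: serve from the cache, fall back to the database."""

    def __init__(self, load_from_db, cache=None):
        self._load = load_from_db            # callable that queries the database
        self._cache = {} if cache is None else cache
        self.db_reads = 0                    # number of reads that hit the database

    def get(self, key):
        if key in self._cache:               # cache hit: no database access
            return self._cache[key]
        value = self._load(key)              # cache miss: query the database once
        self.db_reads += 1
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a cached entry after the underlying row has changed."""
        self._cache.pop(key, None)


# A dictionary stands in for the Oracle database in this example.
fake_db = {"emp:7839": "KING"}
employees = CacheAside(fake_db.__getitem__)
employees.get("emp:7839")
employees.get("emp:7839")
print(employees.db_reads)
```

A real deployment would also need an expiry or invalidation policy so that cached values do not drift from the database.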
Database Replicas
A technique to provide higher performance is to spread the database query load across multiple instances. This technique is often referred to as scaling out or horizontal scalability.
Figure 2: Oracle Active Data Guard being used to create a scalable reader farm
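One way an application can use such a reader farm is to send read-only queries to the replicas in turn and everything else to the primary. The following sketch illustrates that routing idea only: the class name and endpoint strings are hypothetical, and classifying statements by their first keyword is a simplification (for example, it would misroute SELECT ... FOR UPDATE).

```python
import itertools


class ReplicaRouter:
    """Route read-only statements to replicas (round-robin), others to the primary."""

    def __init__(self, primary, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def endpoint_for(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        # Simplification: only plain SELECT statements are treated as reads.
        if verb == "SELECT":
            return next(self._replicas)
        return self.primary


router = ReplicaRouter("primary-db", ["reader-1", "reader-2"])
print(router.endpoint_for("SELECT ename FROM emp"))
print(router.endpoint_for("UPDATE emp SET sal = sal * 1.1"))
```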
High Availability
By combining the tools provided to you by Oracle and by AWS, you can build highly available databases that will be a solid foundation for your applications.
Snapshots are the easiest way to back up the data contained in your EBS volumes and Amazon RDS databases. EBS volumes that operate with 20GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% and 0.5%. EBS volumes with no snapshots, or modifications greater than 20GB since their last snapshot was taken, will be less durable. EBS volumes are contained within a given Availability Zone. The snapshots are stored in Amazon S3 and are designed to survive the concurrent loss of data in two facilities. In the unlikely event of the failure of a full Availability Zone, you would be able to create new volumes from your most recent snapshots.
Backup: EBS snapshots and Amazon RDS backups are stored in Amazon S3, which is designed to provide 99.999999999% durability, and can sustain the loss of a full Availability Zone. Backups are discussed more in the Backup and Restore section.
Instance: You can detach volumes and re-attach to new instances of Amazon EC2 servers in the event of an instance failure. You can also quickly start new instances.
Region: You can create zone independence by utilizing multiple Availability Zones. Availability Zones are distinct locations that are engineered to be insulated from failures in other Availability Zones.
Global: Since AWS is global, you can implement disaster recovery architectures spanning multiple regions.
System: The Amazon EC2 API control layer design supports redundancy and fault tolerance. The Amazon EC2 API has a 99.95% Annual Uptime Percentage SLA.
Flashback Table recovers tables to a specific point in time, which can be helpful when a logical corruption is limited to one table or a set of tables instead of the entire database.
Flashback Transaction Query allows you to see all the changes made by a specific transaction.
Flashback Query lets you query any data as it existed at a point in time in the past.
Total Recall is based on the Flashback feature and allows users to query data "AS OF" an earlier time. This allows companies to "archive" data for auditing and regulatory compliance. The main difference between Flashback and Total Recall is that with Total Recall data is permanently stored in an archive tablespace and ages out only after a user-defined retention time.
In addition, the following features can be used if you run your Oracle database on Amazon EC2:
The following additional Flashback features are available, beyond the Flashback capabilities described in the previous list:
Flashback Drop recovers an accidentally dropped table.
Flashback Transaction lets you undo the effects of a single transaction.
Flashback Database lets you restore the entire database to a specific point in time using Oracle-optimized flashback logs (without using a backup). Amazon RDS has similar functionality with point-in-time restore.
Online Data Reorganization and Definition offers you the flexibility to modify table physical attributes and transform both data and table structure while allowing users full access to the database. For example, you can move a table to a different tablespace, partition it, add or drop a column, and create and rebuild indexes, while the table remains open for read and write operations.
Transportable Tablespaces is a feature that allows you to quickly move a user tablespace across Oracle databases. It can help you reduce database upgrade time by moving all user tablespaces from a database running an earlier software release to an empty destination database running a current software release.
Edition-based Redefinition allows you to achieve online application upgrades. Code changes are installed in the privacy of a new edition. Data changes are made safely by writing only to new columns or new tables not seen by the old edition. An editioning view exposes a different projection of a table into each edition to allow each to see just its own columns. A cross-edition trigger propagates data changes made by the old edition into the new edition's columns, or vice versa.
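To make the Flashback features described above more concrete, the following sketch builds the corresponding Oracle SQL statements as strings. The helper function names and the `hr.employees` table are hypothetical. The statements follow standard Oracle Flashback syntax, but you should confirm them against the SQL reference for your database version before use. Flashback Table also requires row movement to be enabled on the table.

```python
from datetime import datetime

ORACLE_TS_FORMAT = "YYYY-MM-DD HH24:MI:SS"


def _oracle_timestamp(moment):
    """Render a datetime as an Oracle TO_TIMESTAMP(...) expression."""
    return "TO_TIMESTAMP('{}', '{}')".format(
        moment.strftime("%Y-%m-%d %H:%M:%S"), ORACLE_TS_FORMAT
    )


def flashback_query(table, moment):
    """Flashback Query: read the table as it was at `moment`."""
    return "SELECT * FROM {} AS OF TIMESTAMP {}".format(table, _oracle_timestamp(moment))


def flashback_table(table, moment):
    """Flashback Table: rewind the table contents to `moment`."""
    return "FLASHBACK TABLE {} TO TIMESTAMP {}".format(table, _oracle_timestamp(moment))


def flashback_drop(table):
    """Flashback Drop: restore an accidentally dropped table."""
    return "FLASHBACK TABLE {} TO BEFORE DROP".format(table)


# Hypothetical table name and timestamp, for illustration only.
when = datetime(2013, 10, 1, 9, 30, 0)
print(flashback_query("hr.employees", when))
print(flashback_table("hr.employees", when))
print(flashback_drop("hr.employees"))
```

The table name is interpolated directly into the statement, so these helpers should only be called with trusted identifiers.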
In addition to these engine features, you should design an architecture that protects you against hardware failures, datacenter problems, and disasters by using replication technologies. We will review how to set up Oracle replication in the following sections.
A failover to the standby typically takes three minutes, and will occur in the event of any of the following:
Loss of availability in primary Availability Zone
Loss of network connectivity to primary
Compute unit failure on primary
Storage failure on primary
Scaling of the compute class of your DB Instance up or down
Software patching
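Because a failover takes a few minutes, applications generally need to reconnect rather than fail on the first error. The sketch below retries a connection with exponential backoff. The `connect` callable is a hypothetical stand-in for your driver's connect function, and the built-in `ConnectionError` is an assumption: a real Oracle driver raises its own exception types, which you would catch instead.

```python
import time


def connect_with_retry(connect, attempts=12, first_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call `connect()` until it succeeds, backing off between attempts."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(delay)
            # Double the wait each time, capped at max_delay.
            delay = min(delay * 2, max_delay)
```

With these default values the function waits for a total of 211 seconds across its eleven pauses, which comfortably covers a typical three-minute failover. Connecting through the DB Instance endpoint name, rather than a cached IP address, lets the retried connection reach the new primary.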
To choose the Multi-AZ option, all you have to do is to enable it while creating the database. Figure 4 shows how it looks in the AWS Management Console.
You can also convert an existing Oracle RDS database to the Multi-AZ mode. Running Amazon RDS in Multi-AZ has additional benefits:
The Amazon RDS daily backups are taken from the standby instance, which means that there is no I/O impact to your primary Amazon RDS instance.
When the database engine needs to be upgraded (and if you enabled the automatic minor version upgrade), the patches are applied to the standby instance first. When complete, the standby is promoted to be the new primary. The availability impact is then limited to the failover time, resulting in a shorter maintenance window.
Physical: A physical standby database replicates the exact contents of its primary database. The data in the database will be exactly the same as the primary database.
Logical: A logical standby database converts the redo generated at the primary database into data and SQL and then re-applies those SQL transactions on the logical standby. Thus, physical structures and organization will be different from the primary database. Users can read from the logical standby databases while the changes are being applied and can even write to tables in the logical standby database that are not being maintained by the replication. However, there are a number of unsupported objects that can make this mode unusable for some databases.
Snapshot: A snapshot standby database receives and archives, but does not apply, redo data from a primary database, so it cannot be used for read replicas or for high availability.
Because of the limitations of the logical and snapshot standby databases, we will focus on physical standbys in this discussion. Oracle Data Guard maintains the standby databases as transaction-consistent copies of the primary database, and the replication between the primary and the standby databases can be configured to be either synchronous or asynchronous. Oracle Data Guard has three protection modes that allow you to maximize protection, availability or performance.
To take full advantage of the AWS environment, these instances should be placed in distinct Availability Zones. If the production database becomes unavailable, you can switch any standby database to become the new primary, which minimizes the downtime associated with the incident.
Oracle Data Guard does not support setting up read replicas because the physical standby database cannot be open for reads while, at the same time, archiving transactions from the primary database. Oracle Data Guard can function either in managed-recovery mode or in read-only mode, but not in both modes at the same time. Oracle Data Guard is meant for data protection, high availability and disaster recovery.
Oracle Active Data Guard, which we described previously in the Creating Read Replicas in Amazon EC2 section, is an option built upon Oracle Data Guard that allows for the setup of read replicas. In addition to the functionality described previously, Oracle Active Data Guard enables read-only access to the standby databases while at the same time being kept up-to-date by archiving transactions from the primary database. This allows you to run read queries and reports on the standby instances, and to perform your backups from a standby instance.
Both Oracle Data Guard and Oracle Active Data Guard are often used as the foundation of highly available Oracle environments, allowing you to set up one or several standby databases to which the primary database can fail over in the event of a failure or a disaster. Figure 5 shows an example of this architecture.
Replication
Replication is the process of copying and maintaining database objects, such as tables, in multiple databases that make up a distributed database system. Changes applied at one database are captured and stored locally before being forwarded and applied at each of the remote databases. Replication uses distributed database technology to share data between multiple sites, so that the same data is available at multiple locations. Replication can increase availability of applications because alternate data access options are available. For example, the application can continue to function if parts of the distributed database are down, as replicas of the data might still be accessible. In Amazon EC2, replication could be used to set up multiple copies of a database across multiple Availability Zones, or could be used as part of a disaster recovery strategy to replicate data across multiple AWS regions. While the primary focus of this section is high availability, the replication technologies described in the following list can also be used to increase performance by spreading the workload across multiple databases:
Oracle Basic and Advanced Replication: Oracle Basic replication can replicate data, but cannot replicate other objects such as procedures and indexes. Replication is one-way and snapshot copies are read-only. Oracle Advanced replication supports multi-master configuration and allows for data to be updated on any replicated instance. It allows data and other database objects, like indexes and procedures, to be replicated.
Oracle GoldenGate: Oracle GoldenGate is a high-performance software application for real-time transactional change data capture, transformation, and delivery. It includes log-based bidirectional data replication. It can help organizations eliminate the downtime caused by both unplanned and planned outages, and improve system performance and scalability. The software can be configured to minimize downtime during system upgrade, migration, and maintenance activities. It can also be used in the context of disaster recovery by creating and maintaining an immediate failover with up-to-the-minute data. It synchronizes data for distributed applications in real-time across geographies. For AWS, this could be Availability Zones within a region or across different regions.
Third-party replication solutions: Besides the Oracle tools mentioned previously, there are several third-party solutions which you can use to set up replication between several Oracle databases. Dbvisit Replicate and Quest SharePlex for Oracle are examples of third-party solutions.
Automated Backups
When automated backups are turned on for your DB Instance, Amazon RDS automatically performs a full daily snapshot of your data (during your preferred backup window) and captures transaction logs as updates to your DB Instance are made. You can retain your automated database backups for up to 35 days. During the backup window, storage I/O may be suspended while your data is being backed up. This I/O suspension typically lasts a few minutes at most. This I/O suspension is avoided with Multi-AZ DB deployments, as in this case the backup is taken from the standby instance.
retention period, up to the last five minutes. Your automatic backup retention period can be configured to up to 35 days.
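As a simple illustration of these limits, the sketch below computes the approximate window of restorable times for a given retention period. It assumes that automated backups have been running for the whole retention period, and it treats the five-minute lag as a fixed value. The function names are hypothetical.

```python
from datetime import datetime, timedelta

MAX_RETENTION_DAYS = 35               # upper bound stated in the text
RESTORE_LAG = timedelta(minutes=5)    # "up to the last five minutes"


def restore_window(now, retention_days):
    """Return the (earliest, latest) restorable times for a retention period."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    return now - timedelta(days=retention_days), now - RESTORE_LAG


def can_restore_to(target, now, retention_days):
    """Check whether `target` falls inside the restorable window."""
    earliest, latest = restore_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2013, 10, 31, 12, 0)
print(restore_window(now, 7))
```

The real earliest and latest restorable times depend on when backups and transaction logs were actually captured, so treat this only as an estimate.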
Database Snapshots
DB Snapshots are user-initiated backups of your DB Instance. These full database backups will be stored by Amazon RDS until you explicitly delete them. You can create a new DB Instance from a DB Snapshot whenever you want, for example to create a development or a test database identical to the production database or to upgrade your production database. Note that when you perform a restore operation to a point in time or from a DB Snapshot, a new DB Instance is created with a new endpoint.
Note that the Oracle Amazon Machine Images provided by Oracle already include the Oracle Secure Backup Cloud module install tool. If you used an AMI to provision your database, you can find the install tool in the /home/oracle/scripts/osbws directory. If not, you can download it from the Oracle Technology Network (OTN) website.
Backup to EBS Volumes
As explained previously in the Performance Management section, Amazon EBS volumes are off-instance block devices that persist independently from the life of the instance, and can directly attach to your instances. Using RMAN to back up your database to EBS volumes is the closest thing to backing up to disk in an on-premise environment, and an alternative to using the Cloud Backup module described in the previous section. Another benefit is that once your backups to EBS volumes are done, you can create snapshots of your EBS volumes through the AWS Management Console, CLI or API. EBS volume snapshots are stored in Amazon S3, which provides a highly durable and reliable repository for your data.
EBS Snapshots
If your database is stored in EBS volumes, you can, in addition to RMAN backups, put your tablespaces in hot backup mode and take snapshots of the underlying Amazon EBS volumes. EBS snapshots can be done through the AWS Management Console, the CLI and the API.
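The order of operations matters here: enter hot backup mode, start the snapshots, then leave hot backup mode even if a snapshot call fails. The sketch below captures that sequence. The `run_sql` and `create_snapshot` callables are hypothetical placeholders for your database driver and for whichever AWS CLI or API call you use to start a snapshot. The `ALTER DATABASE BEGIN BACKUP` and `END BACKUP` statements assume a database running in ARCHIVELOG mode.

```python
from contextlib import contextmanager


@contextmanager
def hot_backup_mode(run_sql):
    """Keep the database in hot backup mode for the duration of the block."""
    run_sql("ALTER DATABASE BEGIN BACKUP")
    try:
        yield
    finally:
        # Always leave backup mode, even if a snapshot call fails.
        run_sql("ALTER DATABASE END BACKUP")


def snapshot_database_volumes(run_sql, create_snapshot, volume_ids):
    """Start a snapshot of every volume while the database is in backup mode."""
    with hot_backup_mode(run_sql):
        return [create_snapshot(volume_id) for volume_id in volume_ids]
```

A call might look like `snapshot_database_volumes(cursor.execute, start_ebs_snapshot, volume_ids)`, where `start_ebs_snapshot` is your own wrapper function. Keeping the archived redo logs generated during the backup is still necessary to make the snapshot recoverable.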
Amazon S3 and Amazon Glacier are integrated in such a way that you can store your backups in Amazon S3 and have them automatically moved to Amazon Glacier after a period of time that you specify for long-term archiving. You can also specify an expiration time that defines how long the backups will be kept in Amazon Glacier. For example, you could store each new database backup in Amazon S3 for 15 days (or any period of time during which you are most likely to use it to restore your database). You could then have the backups automatically moved to Amazon Glacier after 15 days, with an expiration time of 365 days, which would cause the backup to be removed from Amazon Glacier after a year. This would allow you to store your older backups at a minimal cost while being able to retrieve them within a few hours if needed.
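The 15-day and 365-day example above could be expressed as an S3 lifecycle rule along the following lines. This is a sketch: the dictionary follows the general shape of an S3 lifecycle configuration as accepted by AWS SDKs, but the rule ID and the `oracle-backups/` prefix are hypothetical, and you should check the field names against the current Amazon S3 documentation before applying it.

```python
import json


def backup_lifecycle_rule(prefix, days_in_s3=15, expire_after_days=365):
    """Build a lifecycle rule: move backups to Glacier, then expire them."""
    if expire_after_days <= days_in_s3:
        raise ValueError("expiration must come after the Glacier transition")
    return {
        "ID": "archive-oracle-backups",          # hypothetical rule name
        "Filter": {"Prefix": prefix},            # only objects under this prefix
        "Status": "Enabled",
        "Transitions": [{"Days": days_in_s3, "StorageClass": "GLACIER"}],
        "Expiration": {"Days": expire_after_days},
    }


lifecycle = {"Rules": [backup_lifecycle_rule("oracle-backups/")]}
print(json.dumps(lifecycle, indent=2))
```

Both day counts are measured from the creation of the object, so a backup in this example spends 15 days in Amazon S3 and is removed one year after it was written.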
Amazon RDS metrics include many database-specific metrics, such as database connections, free storage space, read and write I/O per second, read and write latency, read and write throughput, and available RAM. For a full list of Amazon RDS metrics, see the CloudWatch documentation at http://docs.amazonwebservices.com/AmazonCloudWatch/latest/DeveloperGuide/CW_Support_For_AWS.html#rdsmetricscollected. We also wrote an Amazon RDS monitoring guide, available at http://aws.amazon.com/articles/2934. In addition, you can use Oracle Enterprise Manager 11g Database Control (OEM) when you enable the OEM Database Control option for your DB Instance, which can be done in the AWS Management Console or with the API. For example, if you chose the default OEM port 1158 (and configured the database security group to allow this traffic to get through the firewall), you would be able to access OEM through https://<your-RDS-CNAME>:1158/em to manage and monitor your database.
Drop any directory
Grant the "any privilege" privilege (the master user can grant privileges, just not the "any privilege" privilege)
Grant any role
Instead, Amazon RDS provides wrapper procedures for many common DBA tasks that require advanced privileges, including:
Setting the time zone of the database
Managing tablespaces
Flushing the shared pool or the buffer cache
Switching online log files
Checkpointing the database
Managing the redo logs
Accessing the alert and listener logs
Managing tracefiles
Killing a session
For an exhaustive list of all the administrative commands that you can run in Amazon RDS, see the Amazon RDS documentation at http://docs.amazonwebservices.com/AmazonRDS/latest/UserGuide/Appendix.Oracle.CommonDBATasks.html
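As an illustration of how such wrapper procedures are typically invoked, the sketch below builds the PL/SQL blocks for a few of the tasks listed above. The procedure names in the `rdsadmin.rdsadmin_util` package are believed to match the Amazon RDS documentation, but treat them as assumptions and confirm them at the URL above before use. The Python helper names are hypothetical.

```python
# Assumed procedure names; verify each one in the Amazon RDS documentation.
RDS_ADMIN_CALLS = {
    "switch_logfile": "rdsadmin.rdsadmin_util.switch_logfile",
    "checkpoint": "rdsadmin.rdsadmin_util.checkpoint",
    "flush_shared_pool": "rdsadmin.rdsadmin_util.flush_shared_pool",
    "flush_buffer_cache": "rdsadmin.rdsadmin_util.flush_buffer_cache",
}


def rds_admin_block(task):
    """Return an anonymous PL/SQL block that runs the wrapper for `task`."""
    return "BEGIN {}; END;".format(RDS_ADMIN_CALLS[task])


def rds_kill_session_block():
    """Return a PL/SQL block that kills a session identified by bind variables."""
    return "BEGIN rdsadmin.rdsadmin_util.kill(sid => :sid, serial => :serial); END;"


print(rds_admin_block("checkpoint"))
print(rds_kill_session_block())
```

The session identifiers are passed as bind variables (`:sid` and `:serial`) so that the driver supplies the values rather than string formatting.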
Security
There are different ways that you can control access to your Oracle database. You can use Amazon VPC, security groups, and Oracle security features.
Amazon VPC
Amazon Virtual Private Cloud (Amazon VPC) lets you provision a private, isolated section of the AWS cloud where you can launch AWS resources in a virtual network that you define. Provisioning your Oracle databases in Amazon VPC gives you more options to secure your Oracle environment and more flexibility in the management of your network topology. You can use Amazon VPC both for Amazon RDS and for deploying your database on Amazon EC2 instances. Here are some key advantages of deploying your databases in Amazon VPC:
You create your own subnets and you can configure routing tables, networking gateways and network Access Control Lists (ACL).
You can define different inbound and outbound rules in an Amazon VPC security group.
You can deploy your databases on private subnets that are not reachable from the Internet. Your databases run in a secure, private environment, where they can be accessed by web servers and/or application servers, but not from the Internet.
You can choose whether you want to have your web and application servers on public or private subnets.
You can establish IPsec VPN connections between AWS and your own datacenters, which encrypts the network traffic between AWS and the customer premises. This allows your administrators and your applications to securely access your databases from your internal network.
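To show the effect of restrictive inbound rules, the sketch below models a security group that admits Oracle listener traffic only from an application subnet. It is a local model of the rule evaluation, not a call to the AWS API. The CIDR block and the rule structure are hypothetical, and port 1521 is Oracle's default listener port.

```python
from ipaddress import ip_address, ip_network

ORACLE_LISTENER_PORT = 1521  # Oracle's default listener port

# Hypothetical inbound rules: only the application subnet may reach the listener.
INBOUND_RULES = [
    {"port": ORACLE_LISTENER_PORT, "cidr": "10.0.1.0/24"},
]


def is_allowed(source_ip, port, rules=INBOUND_RULES):
    """Return True if any inbound rule admits traffic from source_ip to port."""
    source = ip_address(source_ip)
    return any(
        rule["port"] == port and source in ip_network(rule["cidr"])
        for rule in rules
    )


print(is_allowed("10.0.1.25", ORACLE_LISTENER_PORT))    # application server
print(is_allowed("203.0.113.9", ORACLE_LISTENER_PORT))  # Internet host
```

Anything that no rule admits is denied, which mirrors how the inbound rules of a security group work: they list what is allowed, and everything else is dropped.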
In addition, the following features are applicable to Amazon EC2 in Amazon VPC, but not to Amazon RDS in Amazon VPC:
The private IP addresses of your Amazon EC2 instances are persistent and you can choose them. The permanent private IP address associated to an Amazon VPC instance facilitates the configuration of the application servers that access the database.
If your databases are deployed on private subnets that are not reachable from the Internet, you can use Network Address Translation (NAT) to let your database instances access the Internet to download patches or new software. With NAT, your instances have outbound Internet access, but are unreachable from the Internet.
You can assign several network interfaces (called Elastic Network Interfaces), and multiple IP addresses per interface, so that your database instances can have several private (and public) IP addresses. Your Oracle database could be part of several subnets. For example, you can have an application subnet that is reachable by your application servers, a backup subnet, and an administration subnet.
If you want to use Amazon VPC, you have three options:
Use Amazon RDS.
Use an Oracle AMI running on Xen virtualization. You can find out which hypervisor an AMI is designed for by, for example, parsing the output of the CLI command ec2-describe-images -v <AMI ID> and looking for the value of the hypervisor field. Oracle AMIs are designed to run either on the Xen hypervisor or on the Oracle VM (OVM) hypervisor. Because OVM is currently not supported in Amazon VPC, choose an Oracle AMI based on Xen.
Amazon VPC offers many more features than listed here. For more information, see the Amazon VPC documentation at http://aws.amazon.com/vpc/.
resources that you control. For more information on how to use IAM to manage administrative access to your instances, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/UsingIAM.html.
You also have the option of deploying most Oracle security features, such as:
The Oracle security features described previously in the Oracle Security in Amazon RDS section.
Encrypted backups, as mentioned previously in the Backup and Restore section. The Oracle backup utility, RMAN, can transparently encrypt data written to backup sets and decrypt those backup sets when they are needed for database restore operations. To create encrypted backups, the database must be configured to use Advanced Security or Oracle Secure Backup. Note that Oracle Secure Backup includes Cloud Module, which allows you to back up your database directly to Amazon S3, as described in the Backup and Restore section.
Oracle Label Security is a tool for classifying data and for managing access to data on a "need to know" basis.
Oracle Database Vault restricts access to specific areas in an Oracle database from any user, including users who have administrative access. For example, you can restrict administrative access to critical data such as employee salaries, customer medical records, or other sensitive information.
This list is not exhaustive. Oracle has many other security features that are not listed here and can also be used in Amazon EC2.
AWS Storage Gateway, described at http://aws.amazon.com/storagegateway. You can also choose a third-party gateway, such as the ones provided by Aspera, Panzura, Riverbed and TwinStrata. Features of storage gateways may include deduplication, encryption, and compression of your data, in order to secure and to speed up the transfer to Amazon S3. You can use the storage gateway to move non-Oracle data, such as data stored in files.
Use the AWS Import/Export service: If the database is very large, moving the backup or export files over the network might be time-prohibitive, even when using a storage gateway. In this case, it can be faster and more reliable to store an Oracle backup on local drives and physically ship them to AWS using the AWS Import/Export service. On receipt of the drives, AWS engineers copy the data to Amazon EBS, Amazon S3, or Amazon Glacier (your choice). We then ship your drives back to you. For more information about the AWS Import/Export service, see http://aws.amazon.com/importexport/.
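A quick estimate helps decide when a network transfer becomes time-prohibitive. The sketch below computes the number of days needed to move a backup of a given size over a link of a given speed. The 80% utilization figure is an assumption, and the function name is hypothetical.

```python
def transfer_days(size_tb, link_mbps, utilization=0.8):
    """Days needed to move size_tb terabytes over a link_mbps network link."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be greater than 0 and at most 1")
    bits = size_tb * 1e12 * 8                    # decimal terabytes to bits
    usable_bps = link_mbps * 1e6 * utilization   # usable bits per second
    return bits / usable_bps / 86400             # seconds to days


# 10 TB over a 100 Mbps link at 80% utilization
print(round(transfer_days(10, 100), 1))  # prints 11.6
```

At more than eleven days for 10 TB over 100 Mbps, shipping drives is likely to be faster. The same backup over a 1 Gbps link would take a little over one day.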
replication continue until you are ready to switch over and make the Amazon EC2 database the primary instance. For a diagram and more details about how to set up a standby database on AWS, see the High Availability Architecture in Amazon EC2 section. Note that the standby database running in Amazon EC2 can also be used to perform database backups to Amazon S3 for added protection. In this scenario, the RTO (Recovery Time Objective) is minimized because in case of an outage, all you have to do is fail over. To fail over, switch the standby database to become the primary and start the rest of your applications on Amazon EC2 using AMIs that you have already prepared and regularly tested. The RPO (Recovery Point Objective) is also minimized because the standby database can be configured to be only several minutes behind the primary database. Both the RTO and the RPO can be down to a few minutes. In short, this solution offers much improved RTO and RPO at a cost barely greater than in the previous, backup-based scenario.
Use the AWS Import/Export service: If your database is large and can sustain extended downtime, it might be easier and faster to use the AWS Import/Export service and ship us disk drives containing a backup or an export of your database, as described in the Backing up On-Premise Oracle Databases in AWS section.
Use a standby database in Amazon EC2: If the database cannot sustain extended downtime, an option is to set up a standby database in Amazon EC2, as described in the Disaster Recovery using a Standby Database section. A great advantage of this scenario is that it minimizes the amount of downtime that you will incur during the migration, down to a few minutes. After you have switched over to AWS, you can set up the on-premise database as the new standby, allowing you to temporarily switch back to your on-premise environment if an issue is discovered in your new AWS environment.
Managing Cost
Managing the cost of the IT infrastructure is often an important driver for cloud adoption. The cost advantages inherent to AWS should make running Oracle Database on Amazon EC2 a cost-effective proposition. Amazon RDS should further reduce your costs by reducing the management and administration tasks that you have to perform. In addition, you can take advantage of several strategies to manage your database cost effectively.
Conclusion
AWS provides you with two deployment platforms to deploy your Oracle databases: Amazon EC2 and Amazon RDS. In this whitepaper we have explained how to manage performance, high availability, monitoring and security in both environments. Whether you choose to deploy Oracle in Amazon EC2 or in Amazon RDS, you will benefit from the advantages inherent to AWS, including:
Ease and speed of provisioning Oracle instances and storage in AWS
No capital expense
High security
High availability and durability
Low cost
Pay-as-you-go pricing
Further Reading
In addition to this whitepaper, we recommend that you consult the following documents and web sites:
Amazon RDS Getting Started Guide: http://docs.aws.amazon.com/AmazonRDS/latest/GettingStartedGuide/
Working with Oracle on Amazon RDS: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Oracle.html
Amazon EC2 Getting Started Guide: http://docs.amazonwebservices.com/AWSEC2/latest/GettingStartedGuide/
AWS Security Center: http://aws.amazon.com/security/
AWS Architecture Center: http://aws.amazon.com/architecture/
AWS Cloud Computing Whitepapers: http://aws.amazon.com/whitepapers/