This site requires JavaScript to be enabled
1284 views

10.0 - Updated on 2024-03-13 by John Grover

9.0 - Updated on 2022-03-28 by John Grover

8.0 - Updated on 2022-03-24 by Denise Moser

7.0 - Updated on 2021-10-06 by John Grover

6.0 - Updated on 2021-10-06 by John Grover

5.0 - Updated on 2021-03-22 by John Grover

4.0 - Updated on 2019-10-29 by Chelsea Noble

Overview

This document describes the hosting strategy adopted by the Databases Services team for hosting Oracle databases in the Amazon cloud (AWS). The document describes the factors that affect the tenancy of a particular database as well as the guidelines for deploying, migrating, monitoring, recovering, and backing up Oracle databases in AWS. You are encouraged to speak with the database services team when you are in the early stages of a new project that requires a database to ensure an optimal configuration.

Database Tenancy

General Considerations

EC2 offers a wide range of instance sizes and optimizations. We have chosen to run only one database per host which is generally considered a best practice. One database per host allows us to choose the instance type and optimizations that are the best fit for the application.

We do not intend to use the multitenant (pluggable databases) feature which is separately licensed.

In this section, a “tenant” is an application that uses a backend Oracle database. It could connect to the database as a single user or multiple users (schemas). A “single-tenant” database supports a single application. A “multi-tenant” database supports many applications.

Fork-lift migration

A fork-lift migration is a database migration by simply copying the database to an EC2 instance that closely matches the physical server that the database currently runs on. No other changes are made to the database or server OS. The database is moved as a whole. Such operations require minimal planning.

Fork-lift migrations are a good choice generally, but not always the best choice. The following are the major factors to take into consideration when migrating an Oracle database to AWS or implementing a new Oracle database in AWS:

Database version-specific features

Applications with the same Oracle feature requirements may be hosted in the same database. For example, APEX applications can be hosted on a single Oracle Standard Edition database. Such a migration would involve exporting and importing all APEX applications from different databases and consolidating them into one database. Note that we currently do not have a license for Standard Edition, but if the license is obtained in the future, then databases that do not require enterprise edition features may be moved to a standard edition environment.

Size

Many small databases can be consolidated into a single database. Large databases (>100GB) can be migrated as such. The DBA should determine good candidates based on database manageability history.

Sensitivity of the data

Databases that contain highly sensitive data should not be consolidated with databases that do not contain highly sensitive data. Generally, the databases with sensitive data will have more stringent security requirements (security groups, public synonyms, human user privileges, etc.) which could interfere with the functionality of other apps.

Availability expectations

Databases with the same availability expectations (planned and unplanned downtime) and the same maintenance window schedule can be consolidated into a single database. If a customer has a demonstrated need for a highly available database, an HA instance will be made available. An HA instance would have a physical standby running a separate AZ. Also, availability requirements in terms of on-times and off-times of server instances should also be considered when determining tenancy. For example, it is expected that most non-production database servers will be shut down during non-working hours (exceptions may apply).

Compute requirements

If an application has documented or observed resource requirements that are unique or significant to the extent that it could affect the performance of other applications, then it should be considered for single tenancy.

Unique behavioral and/or security requirements

Certain third-party applications are written so that they require elevated or specific database level privileges to run, employ sub-standard security, or are otherwise “badly behaved”. The database for such an application should be evaluated as a candidate for single tenancy.

On a case-by-case basis, an application’s database should be evaluated against the above criteria to determine the tenancy of the database.

To maximize our investment in the AWS infrastructure the default implementation will be an Oracle Enterprise Edition 19c on an r4.4xlarge EC2 instance with Redhat Linux 7 as the OS.

Database Deployment

Using Amazon’s Elastic Compute Cloud Service

Deploying an Oracle Database Server on an Elastic Compute Cloud (EC2) instance offers us the greatest flexibility. EC2 gives us full control of the operating system and the database.

Advantages and limitations of EC2

An EC2 instance provides the same functionality as a physical machine that we use in our data center, except that there is no console access available. We will use only HVM (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/virtualization_types.html) instances. EC2 offers the added benefit of “upgrading” to a larger instance without the need to reinstall. Minimal reconfiguration of the software on the instance may be needed (e.g. Oracle SGA memory).

One minor disadvantage is that EC2 instances can only be purchased with certain combinations of CPU and Memory. We cannot choose an arbitrary combination of CPU and memory. Network throughput is generally limited by the size of the instance i.e. for high network throughput a bigger EC2 instance would be required.

Oracle Database deployment on EC2

EC2 instances can be created using the AWS Console or an API. Using an API allows operations to be scripted so that they can be made repeatable and automated.

We have chosen to script AWS resource creation as well operating system packages and database software installations. This is the best practice when working in virtual environments.

Ansible is an excellent automation tool. It uses the boto API for AWS to perform AWS resource creation and ssh to perform Linux operations like database software installs.

For AWS resource creations and Linux installs, we use available Ansible modules like ec2, yum, etc.

For Database software installs, we use silent installs triggered by Ansible “command” module.

Ansible scripts are stored in our bitbucket repository for version control and are divided into logical “ansible roles” that can be run independently. These scripts are designed to be run by the system admin who has privileges to create ec2 hosts and has root OS access.

Server Commissioning

An EC2 instance must be started from an AMI. There are several community AMIs available in AWS but our system admin has created our own AMI that we use to spawn all EC2 Linux instances.

Once a server commissioning request is received by the system admin, they:

  1. Clone bitbucket ansible repository to get the latest version.
  2. Set variables according to the commissioning request. These variables are region, vpc-id, instance-tier, subnet, security-group, size of EBS volumes, IP address, and tags for provisioning the ec2 instance, database name, and hostname for the Oracle install.
  3. Run the ansible script

The above will non-interactively do the following:

  1. Create an EC2 host
  2. Create users, groups, and RPMs needed for the Oracle install.
  3. Silently install the Grid Infrastructure binaries, followed by the management agent and Oracle Database binaries, and start an Oracle database named per the commissioning request.

The final result is an EC2 instance with a running oracle database.

Using Amazon’s Relational Database Service

Some of the limitations of Amazon’s Oracle RDS offering make it unsuitable as a general-purpose platform for hosting Oracle Database Servers in our environment. For most OIT-managed Oracle databases, we will provide Notre Dame standard management of the instances.

Advantages and Limitations of Using Amazon’s Oracle RDS

For non-OIT departments that want to implement an Oracle RDS solution on their own, we can offer the following analysis of the service. There are some limitations that should be considered before committing to the RDS platform.

  1. Amazon’s License Included model is only for Oracle Standard Edition 1. All other versions require licensing. As we reduce our Oracle Database footprint on campus there may be a point in time when it becomes advantageous to go to a processor-based licensing model vs the current site license model. This link discusses cloud licensing: (http://www.oracle.com/us/corporate/pricing/cloud-licensing-070579.pdf)
  2. Oracle RDS does not allow access to SYS or SYSDBA privileges. If your custom solution or your vendor’s software requires this level of access then RDS is not the solution. Banner is an example of this. (See “Best Practices for Running Oracle Database on Amazon Web Services” published by Amazon December 2014. http://d0.awsstatic.com/whitepapers/best-practices-for-running-oracle-database-on-aws.pdf)
  3. RDS does not allow special database parameters that are sometimes needed to fix bugs and performance issues. (We have had various parameters set for Advance, Banner, and ODS)
  4. RDS does not support micro patching. This is important for vendor-provided solutions that may need bug fixes applied.
  5. RDS Limits the number of database directories to 10 currently. Directories support various levels of integration between other Notre Dame internal databases and Vendor hosted solutions and 10 may not be enough for our needs. (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.Oracle.CommonDBATasks.html)
  6. RDS does not support Oracle Data Guard. This is important for HA, DR, transitioning into RDS, and the ability to transition to another provider in the future. Data Guard will allow us to move the database from one hosted solution to another with minimal downtime because users could use the database while it was being copied to a new location. The only option with RDS appears to be data pump. It would mean locking all users out of the database. Then it could take days to export a large database, then copy the files to a new site and then load the data into a new database.
  7. RDS only guarantees a backup within 5 minutes unless you implement Mult-AZ. With an EC2 instance, we can use Oracle Data Guard which guarantees zero data loss of all "committed" transactions.
  8. Certain Oracle features are not currently supported by RDS. These include Real Application Clusters (RAC), Real Application Testing, Data Guard / Active Data Guard, Oracle Enterprise Manager Grid Control, Automated Storage Management, Database Vault, Streams, Java Support, Locator, Spatial, Oracle XML DB Protocol Server, and Network access utilities (such as utl_http, utl_tcp, utl_smtp, and utl_mail).
  9. RDS backups are limited to 35 days. You will need to manually manage backups if you need a retention period longer than 35 days.
  10. RDS instances cannot be shut down. They can only be terminated. Terminating an RDS instance will, by default, delete all the backups. You also need to explicitly request a backup when stopping your instance to avoid data loss. Terminating an RDS also means that when it is re-created, the instance endpoint address will be different from the last time it existed.

Oracle deployment on RDS

To deploy an Oracle RDS instance, follow the steps outlined in the migration section below. Also, make sure that the Core Services team is aware of the new instance so that is included in the master server list.

High Availability

Oracle High Availability is accomplished using Data Guard. This requires at least two servers with different AWS AZs. Redo generated at the primary database is shipped to the standby database and applied in real-time. Three configurations are available Maximum Performance, Maximum Availability, and Maximum Protection. There is a tradeoff between performance and data loss in case of failure of the primary in these configurations.

For more information about data guard please refer to https://docs.oracle.com/cd/B19306_01/server.102/b14239/concepts.htm

Data Guard is only available in Enterprise Edition.

We intend to use Data Guard in Maximum Availability mode with Fast-Start Failover (FSFO) enabled using an observer instance. The general steps are:

  1. Create a physical standby database.
  2. Set protection mode to Maximum Availability (requires database restart)
  3. Create an Observer instance, which is another Linux instance with an Oracle database home installed.
  4. Start data guard broker. The broker runs from the database home on the primary, standby, and observer instances.

Once set up FSFO can be enabled in the broker configuration. FSFO will automatically failover to and start the standby database in case of a catastrophic event on the primary. Manual switchovers to the standby can also be performed if needed. For example during maintenance windows.

Note that Multi-AZ RDS instances do not use Data Guard. They implement high availability using storage sync the details of which have not been revealed by Amazon.

Data Guard in Maximum Availability mode with Fast Start Failover (FSFO) enabled

Migration Methodology

How to migrate an Oracle database to EC2

The migration of Oracle databases to AWS is done using one of the 3 methods below:

How to migrate an Oracle database to RDS.

  1. Build an AWS RDS instance, or decide which one to target.
  2. Backup source database using Oracle export backup, compression is advised to reduce transfer time to the cloud.
  3. Transfer the database export to the RDS instance using the dbms_file_transfer package.
  4. Use the dbms_datapump package to import data to the RDS instance. This process is described in the RDS user guide here: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Oracle.Procedural.Importing.html
  5. Review and test.

Database Monitoring

Oracle Enterprise Manager Grid Control (OEM)

All database servers are monitored using Oracle Enterprise Manager. This requires installing an agent on each database server. OEM monitors hosts, ASM, databases, listeners, data guard, and a multitude of metrics associated with these components.

An alert is generated when a metric breaches the threshold set by the DBA or a service down event occurs. This alert is sent as an email and SMS to the on-call DBA. For more information on OEM, refer to

http://www.oracle.com/technetwork/oem/enterprise-manager/documentation/index.html

Database Backup and Recovery

Data backups

Although EBS snapshots (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html) can be used to back up an Oracle database, they do not provide point-in-time recovery. We will therefore use Oracle Recovery Manager (RMAN) to back up our database directly to an S3 bucket. In order to back up to S3, a separately licensed OSB module for S3 is required. The OSB module presents S3 as an SBT_TAPE device to RMAN.

Oracle charges a “per channel” license fee for OSB. The database administrators, therefore, need to be judicious about how they use these channels.

Once downloaded and installed, OSB can be used by configuring sbt_tape channels as shown below. All backups to S3 are encrypted using AES256 encryption. This is enabled by using an Oracle wallet to store the encryption keys. This is covered later in the security section.

CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 31 DAYS;

CONFIGURE DEFAULT DEVICE TYPE TO 'SBT_TAPE';

CONFIGURE CONTROLFILE AUTOBACKUP ON;

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE 'SBT_TAPE' TO '%F';

CONFIGURE DEVICE TYPE 'SBT_TAPE' PARALLELISM 1 BACKUP TYPE TO COMPRESSED BACKUPSET;

CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE 'SBT_TAPE' TO 1;

CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE 'SBT_TAPE' TO 1;

CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS 'SBT_LIBRARY=/u01/app/oracle/product/12.1.0.2/db_1/lib/libosbws.so, SBT_PARMS=(osb_ws_pfile=/u01/app/oracle/product/12.1.0.2/db_1/dbs/osbws.ora)' FORMAT '%d-%U';

CONFIGURE MAXSETSIZE TO UNLIMITED;

CONFIGURE ENCRYPTION FOR DATABASE ON;

CONFIGURE ENCRYPTION ALGORITHM 'AES256';

CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE;

RMAN backup is done on the following schedule. The backup jobs are scheduled using the OEM jobs category “RMAN script” on a repeating weekly, daily, minutely or monthly frequency.

No backups are kept on disk.

The retention is set as a “recovery window” in days. We have chosen a default of 31 days as this will work for most applications. Any backup pieces not needed for recovery as defined by the recovery window are considered obsolete and are deleted every Sunday after the full backup. Monthly archival backups are kept for 18 months, obsolescence is controlled by the KEEP directive rather than the recovery window.

A catalog of backups is kept in the OEM repository database.

Server Backups

The server snapshots are performed periodically by Core Services and copies are kept in a separate AWS account for DR.

Recovery Scenarios

  1. Individual databases will be recoverable to a point in time in the recovery window using the full, incremental, and archive log backup pulled from S3.
  2. In the event a server is lost, it will be re-instantiated from the snapshot and the EBS volumes reattached.
  3. In the event that a server and its EBS volumes are lost the server will be re-instantiated from the snapshot with new, empty EBS volumes, and the databases recovered individually.

The recovery time objective for option 1 will be less than 30 minutes except in cases of very large databases where the backup file copying and database restore time may exceed this limit.

For option 2 the recovery time objective is also less than 30 minutes. In this case, since attaching the EBS volumes will negate the need to copy data files there should not be an exception for large databases.

When the server and storage are lost the recovery time will be dependent again on how quickly the backup files can be copied from S3 and the recovery performed. The size and quantity of the databases involved will determine the complete RTO.

It is recommended that any databases with a small RTO use a HA setup (Data Guard) that spans two different AZs.

Database Security Standards

Encrypted EBS volumes:

Based on best practices published by The Center for Internet Security (CIS) for Oracle. The currently available document is For Oracle Database 18c (CIS Oracle Database 18c Benchmark version 1.0.0). Adapted for practice at the University of Notre Dame by the Databases Services team.

Unless specifically authorized databases instances will reside in the private subnet of the VPC with the appropriate security groups attached.

All EBS volumes for databases are encrypted using Amazon EBS encryption http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html

The Notre Dame password policy will be enforced.

Glossary