Anthony - Awr Report Interpretation


Anthony AWR report INTERPRETATION

PART I

What is AWR?

AWR stands for Automatic Workload Repository. There can be many types of database
performance issues; AWR is most useful when the whole database is slow. AWR consists of:

1. A background process
2. A set of tables
3. A set of reports
4. Snapshots of statistics, taken every hour
5. Snapshots of high-cost SQL, taken every hour

If you do not have Oracle Enterprise Manager tool installed then you can create the AWR reports
manually using the following commands:

1. Creating Snapshot
BEGIN
DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
END;
/

The list of snapshot Ids, along with the database Ids, is available in the
view DBA_HIST_SNAPSHOT.

2. Dropping a Range of Snapshots.

Refer to the SNAP_ID column of the DBA_HIST_SNAPSHOT view to see the available snapshots. To
delete the snapshots with SNAP_ID from 102 to 122:
BEGIN
DBMS_WORKLOAD_REPOSITORY.DROP_SNAPSHOT_RANGE (low_snap_id => 102,
high_snap_id => 122, dbid => 8187786345);
END;
/

3. Modifying Snapshot Settings

To set the retention period to 43200 minutes (30 days), the interval between snapshots to
30 minutes, and the number of Top SQL to flush for each SQL criterion to 100, use the
following:
BEGIN
DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS (retention => 43200,
interval => 30, topnsql => 100, dbid => 7123356265);
END;
/

The dbid is optional.
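
Both retention and interval are given in minutes, so it can help to do the conversion explicitly before calling the procedure. The following is a minimal Python sketch of that arithmetic only; the helper names days_to_minutes and snapshots_retained are hypothetical and are not part of any Oracle API.

```python
MINUTES_PER_DAY = 24 * 60


def days_to_minutes(days):
    """Convert a retention period in days to the minutes value AWR expects."""
    return days * MINUTES_PER_DAY


def snapshots_retained(retention_minutes, interval_minutes):
    """Rough count of snapshots kept per instance once retention is full."""
    return retention_minutes // interval_minutes


retention = days_to_minutes(30)
print(retention)                          # 43200
print(snapshots_retained(retention, 30))  # 1440
```

With a 30-minute interval and 30 days of retention, roughly 1440 snapshots per instance are kept, which is worth bearing in mind when sizing the space AWR will need.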

4. Extract the AWR Data

The awrextr.sql script extracts the AWR data for a range of snapshots from the database into a
Data Pump export file. Once created, this dump file can be transported to another system where
the extracted data can be loaded. To run the awrextr.sql script, you need to be connected to the
database as the SYS user.
To extract AWR data at the SQL prompt, enter:

SQL> @$ORACLE_HOME/rdbms/admin/awrextr.sql

5. Load the AWR Data

Once the export dump file is transported to the target system, you can load the extracted AWR
data using the awrload.sql script. The awrload.sql script will first create a staging schema where
the snapshot data is transferred from the Data Pump file into the database. The data is then
transferred from the staging schema into the appropriate AWR tables. To run the awrload.sql
script, you need to be connected to the database as the SYS user.

To load AWR data at the SQL prompt, enter:
SQL> @$ORACLE_HOME/rdbms/admin/awrload.sql

6. Generate AWR Reports

The awrrpt.sql SQL script generates an HTML or text report that displays statistics for a range of
snapshot Ids. To generate a report, run the script at the SQL prompt:

SQL> @$ORACLE_HOME/rdbms/admin/awrrpt.sql
First, you need to specify whether you want an HTML or a text report.

Enter value for report_type: text

Specify the number of days for which you want to list snapshot Ids.
Enter value for num_days: 2

After the list displays, you are prompted for the beginning and ending snapshot Id for the
workload repository report.
Enter value for begin_snap: 95
Enter value for end_snap: 97

Next, accept the default report name or enter a report name. The default name is accepted in the
following example:
Enter value for report_name:
Using the report name awrrpt_1_95_97
The workload repository report is generated.

awrrpt.sql
The awrrpt.sql SQL script generates an HTML or text report that displays statistics for a range of
snapshot Ids.

awrrpti.sql
The awrrpti.sql SQL script generates an HTML or text report that displays statistics for a range
of snapshot Ids on a specified database and instance.

awrsqrpt.sql
The awrsqrpt.sql SQL script generates an HTML or text report that displays statistics of a
particular SQL statement for a range of snapshot Ids. Run this report to inspect or debug the
performance of a SQL statement.

awrsqrpi.sql
The awrsqrpi.sql SQL script generates an HTML or text report that displays statistics of a
particular SQL statement for a range of snapshot Ids on a specified database and instance. Run
this report to inspect or debug the performance of a SQL statement on a specific database and
instance.

awrddrpt.sql
The awrddrpt.sql SQL script generates an HTML or text report that compares detailed
performance attributes and configuration settings between two selected time periods.

awrddrpi.sql
The awrddrpi.sql SQL script generates an HTML or text report that compares detailed
performance attributes and configuration settings between two selected time periods on a specific
database and instance.

OK, let's start analyzing the AWR report (Part I)

This is an example AWR report. Here I discuss the findings of a sample AWR report that I
generated.

Report Header

This section is self-explanatory. It provides the database name and Id, the instance (if RAC),
platform information and the snap interval (the duration of the database workload under review).
This report is for instance number 2 of my RAC environment. If you need to analyze a RAC
environment, do it separately for each instance in the RAC to see whether all the
instances are balanced the way they should be.

Load Profile
This section provides a snapshot of the database workload that occurred during the snapshot
interval.

DB time(s):
It’s the amount of time Oracle has spent performing database user calls. Note that it does not
include background processes.

DB CPU(s):
It’s the amount of CPU time spent on user calls. Like DB time, it does not include
background processes. In the Load Profile the value is shown in seconds.

Redo size:
For example, the table above shows that an average transaction generates about 19,000 bytes of
redo data, and that around 48,000 bytes of redo are generated per second.
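
The two redo figures are related: dividing the per-second value by the per-transaction value gives the transaction rate. A small Python sketch of that arithmetic, using the rounded figures quoted above (the function name is hypothetical):

```python
def transactions_per_second(redo_per_second, redo_per_transaction):
    """Transaction rate implied by the two Redo size columns of the Load Profile."""
    return redo_per_second / redo_per_transaction


tps = transactions_per_second(48_000, 19_000)
print(round(tps, 2))  # 2.53
```

The result should roughly match the Transactions line of the same Load Profile; a large mismatch usually means one of the figures was misread.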

Logical reads:
Logical reads = consistent gets + db block gets

Block Changes:
The number of blocks modified during the sample interval

Physical reads:
Number of block requests that caused a physical I/O operation

Physical writes:
Number of physical writes performed

User calls:
Number of calls made by user sessions, such as logins, parses, executes and fetches

Parses:
The total of all parses; both hard and soft.

Hard Parses:
The parses requiring a completely new parse of the SQL statement. These consume both latches
and shared pool area.

Soft Parses:
Soft parses are not listed but derived by subtracting the hard parses from parses. A soft parse
reuses a previous hard parse; hence it consumes far fewer resources.
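
Because soft parses are derived rather than listed, the subtraction can be written out. A minimal Python sketch, assuming hypothetical Load Profile values of 250 parses and 4.5 hard parses per second (the function names are illustrative only):

```python
def soft_parses(total_parses, hard_parses):
    """Soft parses are whatever is left after removing hard parses."""
    return total_parses - hard_parses


def soft_parse_pct(total_parses, hard_parses):
    """Share of parses that were soft, as a percentage (None if nothing was parsed)."""
    if total_parses == 0:
        return None
    return 100.0 * soft_parses(total_parses, hard_parses) / total_parses


# Hypothetical per-second values read from a Load Profile
parses, hard = 250.0, 4.5
print(soft_parses(parses, hard))               # 245.5
print(round(soft_parse_pct(parses, hard), 2))  # 98.2
```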

Sorts:
Number of sorts performed

Logons:
Number of logons during the interval

Executes:
Number of SQL executions

Transactions:
Number of transactions per second

Instance Efficiency Percentages (Target 100%)

Instance efficiency should be close to 100 %

Execute to Parse % and Parse CPU to Parse Elapsd %:

If these values are low, as in the above case of 3.40 and 0.01, there could be a
parsing problem. You may need to look at bind variable issues or a shared pool sizing issue.
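
To see why low values point at parsing, it helps to look at how these two ratios are commonly defined. The Python sketch below assumes the usual definitions (executions versus parse calls, and parse CPU time versus parse elapsed time); the input numbers are hypothetical and were chosen only to reproduce 3.40 and 0.01.

```python
def execute_to_parse_pct(parses, executions):
    """100 * (1 - parses / executions): near 0 means almost every execute needed a parse."""
    if executions == 0:
        return None
    return 100.0 * (1 - parses / executions)


def parse_cpu_to_parse_elapsed_pct(parse_cpu_time, parse_elapsed_time):
    """Share of parse elapsed time spent on CPU: low values mean parses mostly waited."""
    if parse_elapsed_time == 0:
        return None
    return 100.0 * parse_cpu_time / parse_elapsed_time


# Hypothetical counts: 966 parse calls for every 1000 executions
print(round(execute_to_parse_pct(966, 1000), 2))           # 3.4
# Hypothetical times: 1 unit of parse CPU for 10000 units of parse elapsed time
print(round(parse_cpu_to_parse_elapsed_pct(1, 10000), 2))  # 0.01
```

Read this way, 3.40 says that statements are re-parsed almost every time they run, and 0.01 says that nearly all parse time is spent waiting rather than on CPU.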

Redo NoWait%

Usually this statistic is 99 or greater.

In-memory Sort %
This tells you how efficient your sort_area_size, hash_area_size or pga_aggregate_target settings
are. If the sort, hash and PGA parameters are not sized adequately, the in-memory sort
percentage will go down.

Soft parse %
A soft parse of 98.20 % means that about 1.80 % (100 - soft parse) of parses are hard
parses. You might want to look at your bind variable issues.

Latch Hit %
should be close to 100.

% Non-Parse CPU:
Most of our statements were already parsed so we weren't doing a lot of re parsing. Re parsing is
high on CPU and should be avoided.

Shared Pool Statistics

Memory Usage % is the shared pool usage. Here we have used 73.86 percent of our shared
pool, and of that almost 94 percent is being re-used. If Memory Usage % is too high, for example
90 %, it could mean that your shared pool is too small; if it is around 50 %, it could
mean that your shared pool is too large.

Top 5 Timed Foreground Events

It’s critical to look into this section. If you turn off the timed_statistics parameter, the
Time(s) column won’t appear. Wait analysis should be done with respect to Time(s): there could be
millions of waits, but if together they add up to only a second or so, they do not matter. Time is
therefore the most important component.
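
Ranking by time rather than by wait count can be sketched in a few lines of Python. The event rows below are hypothetical, not taken from the sample report; only the event names are real Oracle wait events.

```python
# (event name, number of waits, total wait time in seconds) -- hypothetical values
events = [
    ("latch: cache buffers chains", 5_000_000, 1.2),
    ("db file sequential read", 2_000_000, 1_800.0),
    ("log file sync", 50_000, 900.0),
]


def avg_wait_ms(waits, time_s):
    """Average wait in milliseconds; 0.0 when the event never waited."""
    return 1000.0 * time_s / waits if waits else 0.0


# Order by total time waited, not by the number of waits
ranked = sorted(events, key=lambda e: e[2], reverse=True)
for name, waits, time_s in ranked:
    print(f"{name}: {time_s:.1f} s total, {avg_wait_ms(waits, time_s):.3f} ms avg")
```

In this made-up data the latch event has by far the most waits but ends up last, because its five million waits add up to barely a second.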

There are several different types of waits, so you may see different waits in your AWR
report. Let’s discuss the most common ones.

db file type waits

db file sequential read

This wait comes from the physical side of the database: it is a single-block read. It is related to
memory starvation and non-selective index use. A sequential read is typically an index read
followed by a table read, because the index lookup tells Oracle exactly which block to go to.

db file scattered read

Caused by full table scans, which may be due to insufficient indexes or missing up-to-date
statistics.

Direct Path writes

You won’t see these unless you are doing appends or data loads.

Direct Path reads:
Can happen if you are doing a lot of parallel query activity.

db file parallel writes / read:
If you are doing a lot of partition activity, expect to see this wait event. It could be a table or
index partition.

db file single write:

If you see this event, you probably have a lot of data files in your database.

direct path read temp or direct path write temp:

This wait event shows temp file activity (sorts, hashes, temp tables, bitmaps).
Check the PGA, sort area or hash area parameters. You might want to increase them.

buffer type waits

These show what is going on in your memory.

latch: cache buffers chains:

Check for hot objects.

free buffer waits:

Insufficient buffers, processes holding buffers too long, or an overloaded I/O subsystem. Also check
whether your DB writers are getting clogged up.

buffer busy waits:

See what is causing them further along in the report. Most of the time they are data block related.

gc buffer busy:
Seen in a RAC environment. May be caused by not enough memory on your nodes or an
overloaded interconnect. Also look at the RAC-specific section of the report.

latch: cache buffers lru chain – Free list issues, hot blocks
latch: cache buffer handles – Free list issues, hot blocks
buffer busy – See what is causing them further along in report
no free buffers – Insufficient buffers, dbwr contention

Log Type Waits


log file parallel write – Look for log file contention
log buffer space – Look at increasing log buffer size
log file switch (checkpoint incomplete) – May indicate excessive db files or slow IO subsystem

log file switch (archiving needed) – Indicates archive files are written too slowly
log file switch completion – May need more log files per
log file sync – Could indicate excessive commits

GC Events
gc cr multi block request – Full table or index scans
gc current multi block request – Full table or index scans
gc cr block 2-way – Blocks are busy in another instance, check for block level contention or hot
blocks
gc cr block 3-way – Blocks are busy in another instance, check for block level contention or hot
blocks
gc cr block busy – Blocks are busy in another instance, check for block level contention or hot
blocks
gc cr block congested – cr block congestion, check for hot blocks or busy interconnect
gc cr block lost – Indicates interconnect issues and contention
gc current block 2-way – Blocks are busy in another instance, check for block level contention
or hot blocks
gc current block 3-way – Blocks are busy in another instance, check for block level contention
or hot blocks
gc current block busy – Block is already involved in GC operation, shows hot blocks or
congestion
gc current block congested – current block congestion, check for hot blocks or busy
interconnect
gc current block lost - Indicates interconnect issues and contention

Undo Events
undo segment extension – If excessive, tune undo
latch: In memory undo latch – If excessive could be bug, check for your version, may have to
turn off in memory undo
wait for a undo record – Usually only during recovery of large transactions, look at turning off
parallel undo recovery.

What Next?
Determine wait events of concern
Drill down to specific sections of report for deeper analysis
Use custom scripts, ADDM and ASH to investigate issues

10 Steps to Analyze AWR Report in Oracle
PART II

Once you have generated an AWR report in Oracle, the next task is to analyze it. By reading the
AWR report you can solve issues such as a slow database, high wait events, slow queries and many
more. Although it is a lengthy report, analyzing the relevant parts of it can help you
troubleshoot issues quickly and easily.

When the whole database is slow, there are two possibilities:

1. An issue with the database machine. OS Watcher is the best tool to start with.

2. A database performance issue. In that case the AWR report is the place to look.

If a particular query is not performing well, I would suggest looking at the execution plan of
the query, the statistics of the underlying tables and so on. In this case AWR won't help much.

Recommendations before getting an AWR Report

1. Collect Multiple AWR Reports: It's always good to have two AWR reports, one for a good
period (when the database was performing well) and a second for when performance was poor. This
way a remote DBA can easily compare the good and bad reports to find the culprit.

2. Stick to a Particular Time: "The database is performing slowly" is not enough to resolve
performance issues. We need a specific time, such as "the database was slow yesterday from 1 PM
until 4 PM". Here, the DBA will get a report for these three hours.

3. Split a Large AWR Report into Smaller Reports: Instead of one report for a long period, such as
one report for 4 hours, it is better to have four reports of one hour each. This helps to isolate
the problem.

In case of RAC environment

Generate one report for each instance. Once you have generated the AWR report, it is time to
analyze it. Since the AWR report is huge and the areas to look into
depend on the problem, here I list the most common areas for a DBA to look into, which
will give a clear picture of the issue.

Steps to Analyze AWR Report

1. Database Details:

This is the first, top part of the AWR report. Here, cross-check the database name, instance
and database version against the database that has the performance issue.
The report also shows RAC=YES if it is a RAC database.

2. Host Configuration:

This gives you the host name, platform, CPUs, sockets, RAM and so on. The important thing to note
is the number of cores in the system. In this example there are 12 CPU cores.

3. Snap Shot Detail:

These are the details of the snapshots taken: the snap start time and end time. The difference
between them is shown as "Elapsed". Here is a new term, "DB Time".

DB Time= session time spent in database.

DB Time= CPU Time + Non IDLE wait time.

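You may find that DB time is very large compared to Elapsed time. This is not a concern: DB time
is summed across all sessions, so many concurrent sessions can accumulate more DB time than
wall-clock time. Check that the report covers the period that had the performance problem. If it
does, fine; otherwise, take a report for the problem period.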
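
One way to read the two numbers together is to divide DB time by elapsed time, which gives the average number of sessions that were active in the database at any moment. A small Python sketch with hypothetical values (a 60-minute snapshot interval showing 480 minutes of DB time; the function name is illustrative only):

```python
def avg_active_sessions(db_time_min, elapsed_min):
    """Average number of sessions busy in the database over the interval."""
    return db_time_min / elapsed_min


# Hypothetical header values: Elapsed = 60 min, DB Time = 480 min
print(avg_active_sessions(480.0, 60.0))  # 8.0
```

Here eight sessions were, on average, working or waiting in the database throughout the hour, which is why DB time can be several times the elapsed time without anything being wrong.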

Next is Cache Sizes, which is just detail about SGA components.

4. Load Profile:

Here are a few important stats for a DBA to look into. The first is "DB CPU(s)" per second. Before
that, let's understand how DB CPU works. Suppose you have 12 cores in the system. Then, per
wall-clock second, you have 12 seconds of CPU time to work with.

So, if "DB CPU(s)" per second in this report is greater than the number of cores shown in Host
Configuration (step 2), the environment is CPU bound and either needs more CPUs or needs a further
check on whether this happens all the time or only for a fraction of the time. In my experience
there are very few cases where the system is CPU bound.

In this case, the machine has 12 cores and DB CPU(s) per second is 6.8. So this is not a CPU
bound case.
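
The comparison above can be written as a short Python sketch. It uses the 12 cores and 6.8 DB CPU(s) per second from this example; the function names are hypothetical.

```python
def cpu_share_used(db_cpu_per_second, cores):
    """Fraction of the available CPU seconds per wall-clock second used by the database."""
    return db_cpu_per_second / cores


def is_cpu_bound(db_cpu_per_second, cores):
    """True when the database asks for more CPU seconds than the cores can supply."""
    return db_cpu_per_second > cores


print(is_cpu_bound(6.8, 12))                    # False
print(round(100 * cpu_share_used(6.8, 12), 1))  # 56.7
```

So the database is using a little over half of the CPU capacity of the host, which leaves headroom.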

The next stats to look at are Parses and Hard parses. If the ratio of hard parses to parses is
high, the database is performing too many hard parses. In that case, look at parameters such as
cursor_sharing, and at the application level for bind variables and so on.

5. Instance Efficiency Percentages

In these statistics, look at "% Non-Parse CPU". If this value is near 100%, it means
most of the CPU resources are used for operations other than parsing, which is good for
database health.

6. Top 5 Timed Foreground Events:

This is another of the most important stats to consider when looking at an AWR report for any
database performance issue. It lists the top 5 foreground wait events.

First, check the wait class. If the wait class is User I/O, System I/O, Other and so on, this could
be fine, but if the wait class is "Concurrency" there could be a serious problem. Next,
look at Time (s), which shows how long the DB was waiting on the event, and then at Avg
Wait (ms). If Time (s) is high but Avg Wait (ms) is low, you can ignore it. If both are
high, or Avg Wait (ms) is high, it needs further investigation.

In the above screenshot, most of the resources are taken by DB CPU = 64% of DB time. DB CPU
taking most of the resources is a normal situation.

Let's take an example in which the event is "log file switch (checkpoint incomplete)", with
high waits, a huge Time (s), large values in Avg Wait (ms) and wait class Configuration.
Here you have to investigate and resolve log file switch (checkpoint incomplete).

Host CPU, Instance CPU and Memory Statistics are self-explanatory. Next is RAC Statistics; I
did not find any issue in these stats most of the time.

7. Time Model Statistics:

This is a detailed breakdown of system resource consumption. Stats are ordered by Time (s) and
% of DB Time.

A noticeable result is that the sum of all % of DB Time is > 100%. Why is this?

Because the times are cumulative. In this case SQL execute elapsed time takes 89% of DB
time, which includes its sub-parts such as parse time elapsed, hard parse elapsed time and so on.
So, if you find that hard parse elapsed time is taking a large %, investigate further, and so on.

The DBA has to look for any stat that is taking an abnormal % of DB time.
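
The double counting is easy to see by adding the column up. In the Python sketch below, the 89% and 64% figures are the ones quoted in this document; the two parse percentages are hypothetical, added only to illustrate overlapping categories.

```python
# "% of DB Time" per time model statistic
time_model = {
    "sql execute elapsed time": 89.0,  # from the sample report
    "DB CPU": 64.0,                    # from the sample report
    "parse time elapsed": 6.0,         # hypothetical
    "hard parse elapsed time": 4.0,    # hypothetical, already counted in parse time
}

total = sum(time_model.values())
print(total)  # 163.0
```

The total exceeds 100 because the rows overlap: CPU time is also part of SQL execution time, and hard parse time is part of parse time. Each row should therefore be read against DB time on its own, not as a slice of a pie.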

8. Operating System Statistics - Detail

This is the information related to the OS; it shows the load status of the system.

This report shows that the system was 62 and 70% idle at the time of the report, so there is no
resource crunch at system level. But if you find very high busy, user or sys %, which in turn
leads to a low idle %, investigate what is causing it. OS Watcher is the tool that can help in this
direction.
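
The idle, busy, user and sys percentages are all shares of the total CPU time (busy plus idle) over the interval. A minimal Python sketch of that calculation, using hypothetical time values chosen to give 62% idle (the function and key names are illustrative only):

```python
def cpu_percentages(busy_time, idle_time, user_time, sys_time):
    """Express each OS CPU time counter as a percentage of busy + idle time."""
    total = busy_time + idle_time
    if total == 0:
        return None
    return {
        "idle_pct": 100.0 * idle_time / total,
        "busy_pct": 100.0 * busy_time / total,
        "user_pct": 100.0 * user_time / total,
        "sys_pct": 100.0 * sys_time / total,
    }


# Hypothetical counters, all in the same time unit
stats = cpu_percentages(busy_time=380_000, idle_time=620_000,
                        user_time=300_000, sys_time=80_000)
print(stats["idle_pct"])  # 62.0
print(stats["busy_pct"])  # 38.0
```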

Next, a very crucial part of the AWR report for a DBA is SQL Statistics, which has the details of
all SQL queries executed during the report interval.

We will explore a few of them to understand how to analyze these reports. Let's start with

9. SQL Ordered by Elapsed Time:

As the name explains, this lists SQL queries ordered by elapsed time within the reported
interval.

In this report, look for queries with low executions and a high Elapsed Time per Exec (s); such a
query could be a candidate for troubleshooting or optimization. In the above report, you can see
that the first query has the maximum elapsed time but no executions. So you have to investigate it.

An important point: if executions are 0, it doesn't mean the query is not executing. This may be
the case when the query was still executing when the AWR snapshot was taken, so the query's
completion was not covered in the report.
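
The per-execution figure, and the special case of 0 executions, can be sketched in Python as follows. The rows and SQL identifiers are hypothetical; the function name is illustrative only.

```python
def elapsed_per_exec(elapsed_s, executions):
    """Elapsed seconds per execution; None when no execution completed in the interval."""
    if executions == 0:
        return None
    return elapsed_s / executions


# (sql id, total elapsed seconds, executions) -- hypothetical rows
rows = [("sql_a", 5400.0, 0), ("sql_b", 1200.0, 4), ("sql_c", 900.0, 90_000)]

for sql_id, elapsed_s, executions in rows:
    per_exec = elapsed_per_exec(elapsed_s, executions)
    if per_exec is None:
        print(f"{sql_id}: still running at snapshot time, investigate")
    else:
        print(f"{sql_id}: {per_exec:.2f} s per execution")
```

In this made-up data sql_b, at 300 seconds per execution over only four runs, is a much better tuning candidate than sql_c, which is fast per run but executed very often.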

10. SQL Ordered by CPU Time:

In this report, SQL queries are listed by the CPU time they consumed, i.e. the queries causing
high load on the system. The top few queries could be candidates for optimization.

From the above stats, look for the queries using the highest CPU time. If a query shows 0
executions, this doesn't mean the query is not executing. It might be the same case as in SQL
ordered by Elapsed Time: the query was still executing when the snapshot was taken.

There are many other stats in the AWR report that a DBA needs to consider. I have
listed only ten of them, but these are the most commonly used stats for any performance-related
information.
