ORA-4031 On RAC Database With 'ges big msg p' Consuming Memory (Doc ID 1433164.1)


In this Document
  Symptoms
  Changes
  Cause
  Solution
  References

Type: PROBLEM
Status: PUBLISHED
Last Major Update: 07-Aug-2019
Last Update: 05-Mar-2020

APPLIES TO:

Oracle Database - Enterprise Edition - Version 10.2.0.5 and later
Oracle Database Cloud Schema Service - Version N/A and later
Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Information in this document applies to any platform.

SYMPTOMS

This incident was observed on Oracle 10.2.0.5 on a Solaris SPARC 64-bit server, on a RAC database using manual memory management.

The database may encounter a short hang, roughly 3 to 4 minutes, and there may be no error messages in the alert log.

The buffer cache will be large; in this example it is 64 GB.
The shared pool in this example is 10 GB.
There are most likely many subpools in the shared pool.

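The current sizes can be confirmed with a quick query against the documented V$SGA_DYNAMIC_COMPONENTS view (an illustrative sketch, not part of the original note; with manual memory management these simply reflect the DB_CACHE_SIZE and SHARED_POOL_SIZE settings):

-- Hedged sketch: current buffer cache and shared pool sizes in GB.
select component, round(current_size/1024/1024/1024, 1) size_gb
from   v$sga_dynamic_components
where  component in ('shared pool', 'DEFAULT buffer cache');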


You may also notice an ORA-4031 trace file in the dump directory, with the same timestamp as the hang.

A review of the ORA-4031 trace file will show the subpools, and from it we can get the largest memory consumers in each subpool.

This is an example with seven subpools, showing the top 5 consumers in each subpool; a query sketch for pulling a similar breakdown live follows the listing.

subpool 1 major consumers (top 5)
-------------------------
"gcs resources "          205908312
"sql area "               172895856
"gcs shadows "            118403984
"library cache "           78437344
"Checkpoint queue "        74761344

subpool 2 major consumers (top 5)
-------------------------
"gcs resources "          205869184
"gcs shadows "            169042816
"sql area "               159811352
"FileOpenBlock "          114943056
"Checkpoint queue "        75017376

subpool 3 major consumers (top 5)
-------------------------
"gcs resources "          266302280
"sql area "               199968128
"gcs shadows "            169021384
"library cache "           76432024
"Checkpoint queue "        74761344

subpool 4 major consumers (top 5)
-------------------------
"gcs resources "          266281424
"sql area "               199432544
"gcs shadows "            118437168
"library cache "           76502416
"Checkpoint queue "        74761344

subpool 5 major consumers (top 5)
-------------------------
"ges big msg p "         1848340560   <----- note largest consumer
"gcs resources "          205907944
"gcs shadows "            118439152
"Checkpoint queue "        75017376
"library cache "           60400992

subpool 6 major consumers (top 5)
-------------------------
"sql area "               206921392
"gcs resources "          205888416
"gcs shadows "            118414264
"library cache "           75584432
"Checkpoint queue "        74761344

subpool 7 major consumers (top 5)
-------------------------
"gcs resources "          205875496
"sql area "               172248416
"free memory "            146475520
"gcs shadows "            118418472
"Checkpoint queue "        75273408
"library cache "           74915600


If there is an ASH report covering the time of the hang, you may notice Dynamic Remastering (DRM) occurring. It will show a wait event like this:

gcs drm freeze in enter server mode
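
As a rough cross-check (an illustrative sketch, not part of the original note, and requiring the Diagnostics Pack license), DRM freeze waits around the hang window can also be pulled straight from ASH; the event filter and time window are assumptions to adjust:

-- Hedged sketch: DRM-related waits captured by ASH in the last hour.
select inst_id, event,
       min(sample_time) first_seen,
       max(sample_time) last_seen,
       count(*)         samples
from   gv$active_session_history
where  event like 'gcs drm%'
and    sample_time > systimestamp - interval '1' hour
group  by inst_id, event
order  by inst_id, samples desc;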

Note: This ORA-04031 error can lead to an instance crash. The alert log shows the following information for such a crash:

...
Errors in file /<diagnostic_dest>/diag/rdbms/<dbname>/<instname>/trace/<sid>_lmd1_27987.trc
(incident=289814):
ORA-04031: unable to allocate 8504 bytes of shared memory ("shared pool","unknown object","sga
heap(1,0)","ges big msg pool")
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
DDE: Problem Key 'ORA 4031' was completely flood controlled (0x6)
Further messages for this problem key will be suppressed for up to 10 minutes
Mon Dec 11 19:41:07 2017
LMD0 (ospid: 27985) waits for event 'SGA: allocation forcing component growth' for 0
secs..........*****
Errors in file /<diagnostic_dest>/diag/rdbms/<dbname>/<instname>/trace/<sid>_lmhb_27999.trc
(incident=285944):
ORA-29770: global enqueue process LMD0 (OSID 27985) is hung for more than 70 seconds
Incident details in:
/<diagnostic_dest>/diag/rdbms/<dbname>/<instname>/incident/incdir_285944/<sid>_lmhb_27999_i285944.trc
LOCK_DBGRP: GCR_SYSTEST debug event locked group GR+DB_PANRPRD by memno 1
ERROR: Some process(s) is not making progress.
LMHB (ospid: 27999) is terminating the instance.
Please check LMHB trace file for more details.
Please also check the CPU load, I/O load and other system properties for anomalous behavior
ERROR: Some process(s) is not making progress.
LMHB (ospid: 27999): terminating the instance due to error 29770

CHANGES

CAUSE

The data show that in most subpools the major memory consumer is 'gcs resources', which is directly due to the very large buffer cache. The sum of the GCS and GES components across the subpools is a significant portion of the total shared pool, so the shared pool is under stress from the RAC components (GCS and GES).

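As an illustrative check of that claim (not part of the original note), the GCS/GES footprint can be compared against the total shared pool using the documented V$SGASTAT view; the LIKE patterns are assumptions about component names on your version:

-- Hedged sketch: RAC (GCS/GES) consumers in the shared pool vs. the whole pool.
select name, bytes
from   v$sgastat
where  pool = 'shared pool'
and    (lower(name) like 'gcs%' or lower(name) like 'ges%')
order  by bytes desc;

select sum(bytes) total_shared_pool
from   v$sgastat
where  pool = 'shared pool';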

When RAC remastering occurs, it allocates a large 'ges big msg pool', which is responsible for redistributing resources, mainly locks. The GES big msg pool is allocated from a single heap, so it lives in a single subpool and can cause a short-term memory shortage when there are many subpools. If the shared pool is already under stress, this sudden message pool request can push the shared pool into an ORA-4031 condition.

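For additional context (an assumption of this rewrite, not part of the original note), on 11g and later the frequency and duration of DRM operations since startup can be reviewed from the documented V$DYNAMIC_REMASTER_STATS view; column names vary by version, so select everything:

-- Hedged sketch: how often DRM has run and how long its phases took (11g+).
select * from gv$dynamic_remaster_stats;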

SOLUTION

None of the above scenarios is a bug; the occurrences are all normal and expected.

What it means is that the shared pool is undersized for this RAC environment and needs to be increased, so that it can handle the GCS and GES components as well as a sudden memory spike when Dynamic Remastering occurs.

The solution is to increase the size of the shared pool.
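
A minimal sketch of that change (the value is a placeholder, not a sizing recommendation from this note), assuming an spfile is in use:

-- Hedged sketch: grow the shared pool on all RAC instances.
-- 12G is a placeholder; a dynamic increase also requires headroom under
-- sga_max_size, otherwise use scope=spfile and restart the instances.
alter system set shared_pool_size = 12G scope=both sid='*';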

Other solutions to consider:


Will reducing the number of subpools help?

Possibly. When the number of subpools is manually reduced (by setting _kghdsidx_count), each subpool will be larger, so the shared pool would be better equipped to handle a large requirement like the GES big msg pool.

*** The _kghdsidx_count parameter should not be set without guidance from Oracle Support. ***

The downside of this solution is that the number of latch sets is also reduced (each subpool has its own latch set), so you increase the risk of latch contention. At best, this solution is just a way to avoid an increase to the shared pool size, at the risk of latch problems.

To determine the number of subpools, use this SQL:


select KSMCHIDX "SubPool", sum(ksmchsiz) Bytes
from x$ksmsp
group by ksmchidx;
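
Alternatively (a hedged sketch, not from the original note), the subpool count can be read from the hidden parameter itself, which avoids scanning X$KSMSP; X$ fixed tables are undocumented, so treat this as illustrative and use it only with Oracle Support guidance:

-- Hedged sketch: current value of _kghdsidx_count (number of shared pool subpools).
select a.ksppinm parameter, b.ksppstvl value
from   x$ksppi a, x$ksppcv b
where  a.indx = b.indx
and    a.ksppinm = '_kghdsidx_count';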

Will switching to Automatic Shared Memory Management (ASMM) help?

Possibly. ASMM allows the shared pool to take memory from the buffer cache, offsetting an ORA-4031. But ASMM will require an increase in the SGA. The reason is that with ASMM enabled, each subpool is further divided into 4 durations, so the largest contiguous piece of shared pool is shared_pool_size / (number of subpools) / 4. Additionally, unless some excess is built into SGA_TARGET, if the buffer cache fills up and cannot provide any free memory, the shared pool will not be able to grow. This solution works best if you build in excess memory to handle unexpected memory requirements.

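A minimal sketch of such a configuration (all values are placeholders, not sizing advice from this note): set SGA_TARGET above the sum of the current pools so the shared pool has room to grow, and keep an explicit SHARED_POOL_SIZE, which under ASMM acts as a floor:

-- Hedged sketch: enable ASMM with headroom above the current 64 GB cache + 10 GB pool.
alter system set sga_max_size      = 90G scope=spfile sid='*';
alter system set sga_target        = 85G scope=spfile sid='*';
-- Under ASMM an explicit shared_pool_size acts as a minimum for the pool.
alter system set shared_pool_size  = 12G scope=spfile sid='*';
-- Restart the instances for the new sga_max_size to take effect.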

NOTE: It is NOT recommended to run queries on X$KSMSP when the database instance is under load.
Performance of the database will be impacted, especially nowadays with very large SGAs. Bug 14020215 was
filed for ORA-600 errors and unplanned outages running queries directly on X$KSMSP. There is a view,
X$KSMSP_NWEX, in later versions of 11g that is safer to use for investigation of memory usage. However, we
STRONGLY recommend you not run these queries unless given specific instructions from Oracle Support to do
so.

REFERENCES

BUG:10076669 - ONE OF THE RAC INSTANCE CRASHED WITH ORA-600 [KJMCHKISEQ:!SEQ] AFTER ORA-4031
NOTE:396940.1 - Troubleshooting and Diagnosing ORA-4031 Error [Video]
NOTE:844879.1 - ORA-4031 Due To Large 'GCS RESOURCES' And 'GCS SHADOWS'

Related Products

Oracle Database Products > Oracle Database Suite > Oracle Database > Oracle Database - Enterprise Edition > RDBMS > ORA-4031 errors raised from shared pool
Oracle Cloud > Oracle Platform Cloud > Oracle Database Cloud Service > Oracle Database Cloud Schema Service
Oracle Cloud > Oracle Infrastructure Cloud > Oracle Cloud at Customer > Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine)
Oracle Cloud > Oracle Platform Cloud > Oracle Cloud Infrastructure - Database Service > Oracle Cloud Infrastructure - Database Service
Oracle Cloud > Oracle Platform Cloud > Oracle Database Backup Service > Oracle Database Backup Service
Oracle Cloud > Oracle Platform Cloud > Oracle Database Cloud Service > Oracle Database Cloud Exadata Service
Oracle Cloud > Oracle Platform Cloud > Oracle Database Cloud Service > Oracle Database Exadata Express Cloud Service
Oracle Cloud > Oracle Platform Cloud > Oracle Database Cloud Service > Oracle Database Cloud Service

Keywords
BUFFER CACHE; CACHE; FROZEN; HANGING; RAC; SHARED POOL
Errors
04031; ORA-4031
