Erman Arslan's Oracle Forum - Re: Batch performance

Re: Batch performance

Posted by ErmanArslansOracleBlog on Jan 30, 2024; 7:13am
URL: http://erman-arslan-s-oracle-forum.124.s1.nabble.com/Batch-performance-tp12269p12287.html

*Send me LGWR Trace.
*Send me Cellsrv - alert.log
*Send me the Alert Log (not the whole of it, just the related part)
*Your DB version - 4 digit, Your Exadata Image version and Your Exadata Generation and model (For instance X8-2M)
*Any IORM Plan in place / active?
*Also go through the MOS Note Exadata: Troubleshooting High Log File Parallel Write Waits (Doc ID 2634755.1), and send me your findings.
*Also review the ADDM findings:

Findings
--------------------
For Read I/O Storage seems good. See TBS IO Stats .. Most of the time, you re doing your I/Os in less than 2 ms.
But! For the LGWR IO, there may be something.. Actually it is in us second level, not bad but it is related with the waits.

Regarding the log file sync events;

LGWR writes commit records and writes enough information to recover transactions.
LGWR performance may impact negatively on OLTP response times.
For instance; in case of a commit operation, that is when a session wants to commit, it posts the LGWR to write the redo information of its transactions and commit record, as well.
In this state, the session waits the LGWR and its wait event will be Log file Sync.. The session waits in this wait while LGWR is writing the redo information in to the redologs..
In details we can say that;

When a user session commits or rolls back , Oracle/LGWR needs to flush the contents of log buffer to the Redolog files. So, Log file sync occcurs in these operations.
When we wait for log file sync, we actually wait for; LGWR to wake up if idle, gather the redo to be written, issue IO (this IO is a log file parallel write, can be monitored using log file parallel wait event), process after the IO and acknowledge user session/foreground..
In addition to that, we wait for user session/foreground to wake up lastly.

*If avg time per log file sync operation << avg wait time per log file parallel write operation -> IO is bottleneck.
*If the avg wait time per log file sync operation >> avg wait time per log -> CPU bottlenecek , Run queue problem or bug. In such a situation, check CPU consumption..

you got 162.95us for the average log file parallel write..
you got 210.09us for the average log file sync

Also check the top blocking sessions event ->

Top Blocking Sessions .. There is LGWR there, and in the top session you also got LGWR.
That top LGWR session is waited on log parallel write mostly.

Also , your Flash cache hit is low and you got something with the parses.
You re mostly on the CPU, but there are also log file sync events.

For the CPU, you need to check the CPU related things. Start by analyzing Top SQL SQL ordered by CPU Time. Tuning a little thing may have e big effect on that, because you got lots of executions and those executions in total makes that CPU time.
You may check these sqls ->
hgb9kh19dvzz
1f3n8fuf5sy5r
19x1189chq3xd

Also someone runs a query using SQL Developer.. Why? -> buw2v54n4sjrb

ADDM Findings (related part)
------------------------------
Finding 2: Commits and Rollbacks
Impact is .21 active sessions, 8.3% of total activity.
------------------------------------------------------
Waits on event "log file sync" while performing COMMIT and ROLLBACK operations
were consuming significant database time.

Recommendation 1: Application Analysis
Estimated benefit is .21 active sessions, 8.3% of total activity.
-----------------------------------------------------------------
Action
Investigate application logic for possible reduction in the number of
COMMIT operations by increasing the size of transactions.
Rationale
The application was performing 57801 transactions per minute with an
average redo size of 3221 bytes per transaction.

Recommendation 2: Host Configuration
Estimated benefit is .21 active sessions, 8.3% of total activity.
-----------------------------------------------------------------
Action
Investigate the possibility of improving the performance of I/O to the
online redo log files.
Rationale
The average size of writes to the online redo log files was 3 K and the
average time per write was 0 milliseconds.
Rationale
The total I/O throughput on redo log files was 7.8 M per second for
reads and 3.3 M per second for writes.
Rationale
The redo log I/O throughput was divided as follows: 0% by RMAN and
recovery, 29% by Log Writer, 0% by Archiver, 0% by Streams AQ and 70% by
all other activity.

Symptoms That Led to the Finding:
---------------------------------
Wait class "Commit" was consuming significant database time.
Impact is .2 active sessions, 7.91% of total activity