regarding interconnect latency

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

regarding interconnect latency

satish
Dear Erman,

Today we have faced the performance issue on one of the rac node where cpu idle was almost down to 5%.we have killed few processes and on the database server and the instance was back to normal


Our findings:

Below are the top five pid's and corresponding sessions and serial# with sql_id and wait event

top - 11:32:21 up 141 days, 21:57,  3 users,  load average: 11.98, 11.59, 7.59
Tasks: 1009 total,   7 running, 1002 sleeping,   0 stopped,   0 zombie
%Cpu(s): 85.1 us,  1.2 sy,  0.0 ni, 13.5 id,  0.1 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem : 26384160+total, 97140816 free, 14500060 used, 15220072+buff/cache
KiB Swap: 26843545+total, 26843545+free,        0 used. 21531563+avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 8473 oraprod   20   0 25.366g 2.186g 2.161g R 100.0  0.9  10:11.39 oracle_8473_pro
12827 oraprod   20   0 25.354g 1.159g 1.146g R  99.7  0.5  13:27.29 oracle_12827_pr
15413 oraprod   20   0 25.348g 1.070g 1.059g R  99.7  0.4  11:40.56 oracle_15413_pr
16672 oraprod   20   0 25.348g 1.076g 1.066g R  99.7  0.4   9:48.79 oracle_16672_pr
15642 oraprod   20   0 25.348g 1.076g 1.065g R  99.3  0.4  11:18.79 oracle_15642_pr




SPID      SID  serial USERNAME   STATUS   SQL_ID         PROGRAM           MODULE   OSUSER  MACHINE              EVENT

                                                                           
8473       66    7471 APPS       ACTIVE   3n1zm19fstads  frmweb@erpprodapp e:SQLAP: applpro nodeapp01.ttd.com resmgr:cpu quantum
                                                         01.ttd.com (TNS V frm:FNDR d
                                                         1-V3)             SRUN


12827    1472   33109 APPS       ACTIVE   3n1zm19fstads  frmweb@erpprodapp e:SQLAP: applpro nodeapp02.ttd.com resmgr:cpu quantum
                                                         02.ttd.com (TNS V frm:FNDR d
                                                         1-V3)             SRUN

15413      51   33477 APPS       ACTIVE   3n1zm19fstads  frmweb@erpprodapp e:SQLAP: applpro nodeapp01.ttd.com resmgr:cpu quantum
                                                         01.ttd.com (TNS V frm:FNDR d


16672     829    8588 APPS       ACTIVE   3n1zm19fstads  frmweb@erpprodapp e:SQLAP: applpro nodeapp02.ttd.com resmgr:cpu quantum
                                                         02.ttd.com (TNS V frm:FNDR d
                                                         1-V3)             SRUN


15642      76   34918 APPS       ACTIVE   3n1zm19fstads  frmweb@erpprodapp e:SQLAP: applpro nodeapp02.ttd.com resmgr:cpu quantum
                                                         02.ttd.com (TNS V frm:FNDR d
                                                         1-V3)             SRUN

All the above top processes are with resmgr:cpu quantum..not sure what exactly caused the issue.we have killed these processes

sql_id was same for all the above processes

SELECT USER_NAME,USER_NAME VALUE, DESCRIPTION DESCRIPTION, NVL('N', 'N'), NVL(TO_NUMBER(NULL), -1), NULL, NVL('Y', 'Y'), NVL(TO_CHAR(TO_DATE(NULL), 'J'), 0), NVL(TO_CHAR(TO_DATE(NULL), 'J'), 0) FROM FND_USER WHERE (END_DATE IS NULL AND EMPLOYEE_ID IN (SELECT PERSON_ID FROM PER_PEOPLE_F WHERE LOCAL_NAME IN (SELECT TO_USER FROM WF_NOTIFICATIONS)))  AND 'Y' = 'Y'  AND TO_DATE(:A00, 'YYYY/MM/DD HH24:MI:SS') BETWEEN NVL(TRUNC(TO_DATE(NULL)), TO_DATE(:A01, 'YYYY/MM/DD HH24:MI:SS')) AND NVL(TRUNC(TO_DATE(NULL)), TO_DATE(:A02, 'YYYY/MM/DD HH24:MI:SS'))    ORDER BY USER_NAME, 3


Module  SQLAP:frm:FNDRSRUN
Action  SQLAP/XXSAS_AP_VERF_DESK
Program frmweb@nodeapp1.ttd.com (TNS V1-V3)
event GC CR BLOCK2 WAY

Any suggestion after looking into above data?
How can we check if we have interconnect network latency?

Thanks for all the support
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

satish
SQL> sho parameter RESOURCE_MANAGER_PLAN

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
resource_manager_plan                string      SCHEDULER[0x72F78]:DEFAULT_MAI
                                                 NTENANCE_PLAN
SQL>
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

ErmanArslansOracleBlog
Administrator
Did you check what those processes were doing?
Did you check awr reports? 

12 Eyl 2018 Çar 09:49 tarihinde satish [via Erman Arslan's Oracle Forum] <[hidden email]> şunu yazdı:
SQL> sho parameter RESOURCE_MANAGER_PLAN

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
resource_manager_plan                string      SCHEDULER[0x72F78]:DEFAULT_MAI
                                                 NTENANCE_PLAN
SQL>



If you reply to this email, your message will be added to the discussion below:
http://erman-arslan-s-oracle-forum.2340467.n4.nabble.com/regarding-interconnect-latency-tp6698p6699.html
To start a new topic under RAC, email [hidden email]
To unsubscribe from Erman Arslan's Oracle Forum, click here.
NAML
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

satish
we have checked the processes and these are frmweb processes running the below sql as updated in my previous update

SELECT USER_NAME,USER_NAME VALUE, DESCRIPTION DESCRIPTION, NVL('N', 'N'), NVL(TO_NUMBER(NULL), -1), NULL, NVL('Y', 'Y'), NVL(TO_CHAR(TO_DATE(NULL), 'J'), 0), NVL(TO_CHAR(TO_DATE(NULL), 'J'), 0) FROM FND_USER WHERE (END_DATE IS NULL AND EMPLOYEE_ID IN (SELECT PERSON_ID FROM PER_PEOPLE_F WHERE LOCAL_NAME IN (SELECT TO_USER FROM WF_NOTIFICATIONS)))  AND 'Y' = 'Y'  AND TO_DATE(:A00, 'YYYY/MM/DD HH24:MI:SS') BETWEEN NVL(TRUNC(TO_DATE(NULL)), TO_DATE(:A01, 'YYYY/MM/DD HH24:MI:SS')) AND NVL(TRUNC(TO_DATE(NULL)), TO_DATE(:A02, 'YYYY/MM/DD HH24:MI:SS'))    ORDER BY USER_NAME, 3

The issue occured for 10minutes
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

ErmanArslansOracleBlog
Administrator
Ok. At first, I didn't see the sqlid and session info..
now it is ok.
Well. it is interesting..

1)what is the cost of that query?
2)Is that still executed in the system? (so is it now executed with reasonable durations)
3)there are resource manager specific bugs, did you check them?
   3a)actually, your OS is fully loaded, I mean your Cpus are fully loaded.. It is not the resource manager that makes these processes wait.. They already allocated the cpu and they were spinning on that. So by knowing this, we can suspect from the wait -> event GC CR BLOCK2 WAY ..  Again there are bugs for this GC wait. Did you check them? (I say check the bugs, because this problematic query seems to be an ordinary one...)
4)Are you sure about that sql? Because these seems to be coming from a customization, and maybe you were just seeing the last sql executed, but not the one , which was currently executed by those cpu intensive session? Are you sure it is the correct sql?
5)Did you speak with the developers, who developed this customization -> XXSAS_AP_VERF_DESK ? (maybe they can give you a clue.)
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

satish
Thanks erman,will check with developers.can you plz give us some good query to get the CPU consuming sessions.There are many on internet but still need this favour from you.
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

ErmanArslansOracleBlog
Administrator
Check the problematic time intervals using AWR reports.
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

satish
sure erman,my senior dba was looking into it.
I want to setup one alert which will mail us top consuming CPU sessions,we don't have oem.

Can you pls share any sql query so that I will use it in my script
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

ErmanArslansOracleBlog
Administrator
I would use OS utilities for that.
You can use top outputs for example..
Parse the output of top in your script, and then send an alert email accordingly.

There are lots of sql queris for checking the top cpu consuming session, please use one of them..

The purpose of this forum is not sending you sqls..
Reply | Threaded
Open this post in threaded view
|

Re: regarding interconnect latency

satish
thanks for the update erman