Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Troubleshoot Oracle - hung process

I'm trying to understand a problem we're having with a Java process that is hanging. This process has been running in production for about 4 months and earlier this week it started hanging. When I look at a thread dump of the process all of the relevant threads (3) have stacks like the following:

    "TxnParser_1" prio=6 tid=0x69bd3400 nid=0x2534 runnable [0x6aa2f000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at oracle.net.ns.Packet.receive(Unknown Source)
        at oracle.net.ns.DataPacket.receive(Unknown Source)
        at oracle.net.ns.NetInputStream.getNextPacket(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.net.ns.NetInputStream.read(Unknown Source)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1099)
        at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1070)
        at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:478)
        at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:207)
        at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:790)
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1039)
        at oracle.jdbc.driver.T4CStatement.executeMaybeDescribe(T4CStatement.java:830)
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1132)
        at oracle.jdbc.driver.OracleStatement.executeInternal(OracleStatement.java:1687)
        at oracle.jdbc.driver.OracleStatement.execute(OracleStatement.java:1653)
        - locked <0x40e22f88> (a oracle.jdbc.driver.T4CStatement)
        - locked <0x28f8d398> (a oracle.jdbc.driver.T4CConnection)
        at com.gcg.data.LogParsingInfo.initFromDB(LogParsingInfo.java:262)
        at com.gcg.om.OmQueueEntry.initParseInfoFromDB(OmQueueEntry.java:104)
        at com.gcg.om.GenericQueueEntry.run(GenericQueueEntry.java:237)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

There are no threads waiting for locks so the process isn't deadlocked. These 3 threads that are doing the work are just blocked waiting for a response from Oracle, at least that is what it looks like to me.

Looking at Oracle, when I query v$session, it looks like one of the connections associated with these threads is currently executing a query, although I can't see the sql.

select ... from v$session where ...;
SQL_ADDRESS      SQL_HASH_VALUE SQL_ID        SQL_CHILD_NUMBER SQL_EXEC_START SQL_EXEC_ID PREV_SQL_ADDR    PREV_HASH_VALUE PREV_SQL_ID   PREV_CHILD_NUMBER PREV_EXEC_START PREV_EXEC_ID
---------------- -------------- ------------- ---------------- -------------- ----------- ---------------- --------------- ------------- ----------------- --------------- ------------
              00              0                                                           0000000239F59EE8      1483377872 fqr8pndc6p36h                 5 26-JUL-12           32080545
              00              0                                                           0000000239F59EE8      1483377872 fqr8pndc6p36h                 5 26-JUL-12           32080546
0000000148CABD88     1784444892 a16hxxtp5sxyw                                             0000000239F59EE8      1483377872 fqr8pndc6p36h                 5 26-JUL-12           32080544

select * from v$sql where sql_id = 'a16hxxtp5sxyw';

no rows selected

My questions are:

  1. Am I correct in my analysis that the process is simply blocked waiting for a response from Oracle?
  2. What should I be looking for in Oracle to understand why this process is blocking?

Updated:

Based on the comment regarding looking in DBA_WAITERS and DBA_LOCKS

select * from dba_waiters;

no rows selected

select * from dba_locks where BLOCKING_OTHERS <> 'Not Blocking';

no rows selected 

There were 98 rows in dba_locks but since all are 'Not Blocking' I don't think it is a locking issue? The process in question has been in this state for over 3 hours so any deadlock would have been detected by now.

I'm of the theory that the Oracle instance is not "healthy", but I'm at a loss as to what to look at. I have a request in to reboot the Oracle server, but that hasn't been done yet.

Follow-up question: Is it normal for the v$session to contain a sql_id that doesn't exist in v$sql and if so, under what conditions?

like image 369
sceaj Avatar asked Jul 26 '12 16:07

sceaj


People also ask

How to check hung threads in Oracle?

If the application console is available, then press Control+\ (on Oracle Solaris or Linux), or Control+Break (on Windows) to cause the HotSpot VM to print a thread dump. On Oracle Solaris and Linux operating systems the thread dump can also be obtained by sending a SIGQUIT to the process (command kill -QUIT pid).

Why is process hanging?

In computing, a hang or freeze occurs when either a process or system ceases to respond to inputs. A typical example is when computer's graphical user interface (such as Microsoft Windows) no longer responds to the user typing on the keyboard or moving the mouse.

How to Detect hung threads in java?

The Java ThreadPool does not have a mechanism for detecting hanging threads. Using a strategy like fixed threadpool ( Executors. newFixedThreadPool() ) will not work because if some tasks hang over time, all of the threads will eventually be in a hung state.

What is Jstack command?

Description. The jstack command prints Java stack traces of Java threads for a specified Java process. For each Java frame, the full class name, method name, byte code index (BCI), and line number, when available, are printed. C++ mangled names aren't demangled.


1 Answers

The problem was solved and the answer was right in the v$session table. Apparently Oracle sessions can block for reasons other than just locking. Notice the column FINAL_BLOCKING_SESSION - it identifies the session that is the root cause of the blocking. We investigated session 845 and found that the client process (identified by MACHINE and PORT) no longer existed. The DBA killed session 845 and all returned to normal.

SID     SERIAL# STATUS    PROGRAM          TYPE SQL_ID        PREV_SQL_ID    BLOCKING_SESSION_STATUS BLOCKING_INSTANCE BLOCKING_SESSION FINAL_BLOCKING_SESSION_STATUS FINAL_BLOCKING_INSTANCE FINAL_BLOCKING_SESSION EVENT
------- ------- --------- ---------------- ---- ------------- -------------- ----------------------- ----------------- ---------------- ----------------------------- ----------------------- ---------------------- ----------------------------
 108    22447   ACTIVE    Gcg log parser 1 USER               fqr8pndc6p36h  VALID                   1                 1581             VALID                         1                       845                    library cache: mutex X
 639    40147   ACTIVE    Gcg log parser 3 USER               fqr8pndc6p36h  VALID                   1                 1581             VALID                         1                       845                    library cache: mutex X
 742    34683   ACTIVE    Gcg log parser 2 USER a16hxxtp5sxyw fqr8pndc6p36h  VALID                   1                 1581             VALID                         1                       845                    library cache: mutex X
like image 61
sceaj Avatar answered Oct 02 '22 22:10

sceaj