Application server CPU goes above 80% and the server hangs; the same problem repeats roughly every 24 hours

I have an IBM WebSphere Application Server 8.5 installation working with Db2 11.1. It has been running for 2 years. For the past month the application server has been hanging: the DB CPU goes to 0, the application server CPU goes above 80%, and the server hangs. The same problem repeats nearly every 24 hours. The db2diag and application server logs show the following.

db2diag error from today:

2020-12-09-10.03.24.732486+120 I1234525159E610 LEVEL: Error
PID : 5737 TID : 139739072030464 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : WPJCR
APPHDL : 0-38161 APPID: ::ffff:x.42258.201209075007
UOWID : 199 ACTID: 1
AUTHID : DB2INST1 HOSTNAME: ERTUWCMDB1Az
EDUID : 1760 EDUNAME: db2agent (WPJCR) 0
FUNCTION: DB2 UDB, common communication, sqlcctest, probe:50
MESSAGE : sqlcctest RC
DATA #1 : Hexdump, 2 bytes
0x00007F1789BFCDE0 : 3600 6.

2020-12-09-10.03.24.732661+120 I1234525770E601 LEVEL: Error
PID : 5737 TID : 139739072030464 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : WPJCR
APPHDL : 0-38161 APPID: ::ffff:x.42258.201209075007
UOWID : 199 ACTID: 1
AUTHID : DB2INST1 HOSTNAME: ERTUWCMDB1Az
EDUID : 1760 EDUNAME: db2agent (WPJCR) 0
FUNCTION: DB2 UDB, base sys utilities, sqeAgent::AgentBreathingPoint, probe:10
CALLED : DB2 UDB, common communication, sqlcctest
RETCODE : ZRC=0x00000036=54

[11/3/20 6:42:13:596 EET] 000006ad XATransaction E J2CA0027E: An exception occurred while invoking rollback on an XA Resource Adapter from DataSource jdbc/wpjcrdbDS, within transaction ID {XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54), data(000001758c648aa7000000082a775800f8c220c5f6bdab92156eae0be31e28ea7605ade8000001758c648aa7000000082a775800f8c220c5f6bdab92156eae0be31e28ea7605ade8000000010000000000000000000000000001)} : com.ibm.db2.jcc.am.XaException: [jcc][t4][2041][12326][4.25.13] Error executing XAResource.rollback(). Server returned XAER_NOTA. ERRORCODE=-4203, SQLSTATE=null

After a while the DB CPU goes to 0, the application server CPU goes above 80%, and the server hangs. The same problem repeats after nearly 24 hours.

Is this a deadlock or a lock timeout due to data corruption?

db2
websphere
database-deadlocks
lock-timeout
asked on Stack Overflow Dec 10, 2020 by noha Abdallah • edited Dec 10, 2020 by noha Abdallah

1 Answer

Without seeing any other app server logs, the combination of you noting that

  1. "nearly 24 hour the problem repeats"
  2. the sqeAgent::AgentBreathingPoint error (see IBM technote https://www.ibm.com/support/pages/what-does-agentbreathingpoint-error-mean-db2 for more info)
  3. the "works from 2 years. Since a month the Application server hangs"

would lead me to look for a recent change in your network that introduced a connection timeout, closing connections after 24 hours. Replacing a router or upgrading firmware can cause this when the new settings differ from the old ones. Does this occur at about the same time every day and, if so, does it occur as the app goes from a quiet state (like overnight) to a busy state (like the start of a workday)?

Based on your answer, it sounds like the entire connection pool is becoming "stale" overnight: the connections are not being used, and a network timeout disconnects them from the db server. You can try changing the WAS data source settings, setting "Minimum connections" to 0 and "Unused Timeout" to perhaps 12 hours. This allows the connection pool to drain overnight as the server traffic quiesces. As the app load starts in the morning, new connections are obtained, avoiding the errors. If your "Maximum Connections" setting is very large, you may experience some slowness while the connection pool is being filled.
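If you would rather script the pool change than click through the admin console, something like the following wsadmin (Jython) sketch might work. It assumes the data source is the one with JNDI name jdbc/wpjcrdbDS from your J2CA0027E message, that the pool attributes are named minConnections and unusedTimeout, and that unusedTimeout is stored in seconds; the file name tune_pool.py and the helper functions are made up for illustration. Check the attribute names and values against your own configuration before saving anything.

```python
# Sketch only: run inside wsadmin, e.g.  wsadmin -lang jython -f tune_pool.py
# The file name, helper names, and the 12-hour value are illustrative assumptions.

JNDI_NAME = 'jdbc/wpjcrdbDS'             # data source named in the J2CA0027E message
UNUSED_TIMEOUT_SECONDS = 12 * 60 * 60    # "perhaps 12 hours", assuming the value is in seconds


def pool_attributes(min_connections, unused_timeout_seconds):
    """Build a [[name, value], ...] attribute list for AdminConfig.modify."""
    return [['minConnections', str(min_connections)],
            ['unusedTimeout', str(unused_timeout_seconds)]]


def tune_pool(admin_config, jndi_name, attrs):
    """Apply attrs to the connection pool of every data source with this JNDI name.

    Returns the number of pools modified. Nothing is saved here.
    """
    changed = 0
    for ds in admin_config.list('DataSource').splitlines():
        if admin_config.showAttribute(ds, 'jndiName') == jndi_name:
            pool = admin_config.showAttribute(ds, 'connectionPool')
            admin_config.modify(pool, attrs)
            changed = changed + 1
    return changed


# wsadmin predefines AdminConfig; outside wsadmin this block does nothing.
try:
    AdminConfig
except NameError:
    pass
else:
    if tune_pool(AdminConfig, JNDI_NAME, pool_attributes(0, UNUSED_TIMEOUT_SECONDS)):
        AdminConfig.save()   # persist the change to the configuration repository
```

The helper takes the AdminConfig object as a parameter so the logic can be tried against a stub before touching a real cell. After saving, the application server will probably need a restart (and a node sync in a clustered setup) before the new pool settings take effect.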

answered on Stack Overflow Dec 10, 2020 by F Rowe • edited Dec 10, 2020 by F Rowe

User contributions licensed under CC BY-SA 3.0