Oracle Active Session Status on a Client's Abnormal Shutdown

https://stackoverflow.com/questions/23636814

21-07-2023

Question

Hello.

A few days ago our DB team detected sessions with an 'ACTIVE' status that belong to clients which are no longer active. An investigation showed two main sources of such sessions:

remote SQL Developer connections (actually, this case is not very interesting),
abnormal shutdowns of Tomcat, where our application runs (for example, 'kill -9')

It seems strange to me that all of these sessions are in the 'ACTIVE' status. Could somebody please clarify how this can happen? Perhaps the underlying server processes are waiting on the corresponding sockets. Since everything is fine after a proper Tomcat shutdown, the root cause seems to be related to transactions.

Would it help if we set the 'IDLE_TIME' (for all connections) and 'EXPIRE_TIME' (for all our RAC instances)?

Am I right that, with the above parameters set, the following should happen?

When a client connects, its session is marked as 'ACTIVE'.
Regardless of the 'ACTIVE' status, a 'ping' process driven by the 'EXPIRE_TIME' parameter probes the client.
If the ping fails for the EXPIRE_TIME period, Oracle kills the session even though it is 'ACTIVE'.
If the client responds to pings but does no processing, its session becomes 'INACTIVE' after the IDLE_TIME period and (if the 'IDLE_TIME' parameter is set) 'SNIPED' some time later. After that, the 'SMON' process starts housekeeping for this session (and for others with the 'SNIPED' status).

UPDATE:

It seems that the only way to deal with this situation is to configure the Oracle instances. Here are the results of my investigation:

https://community.oracle.com/thread/873226?start=0&tstart=0

Dead Connection Detection is controlled by the server-side sqlnet.ora parameter SQLNET.EXPIRE_TIME = <# of minutes>.

The other option would be to set idle_time in a profile and then have a job kill the SNIPED sessions (once idle_time is reached, the session changes from INACTIVE to SNIPED).

If I open up a connection and go off to lunch, an IDLE_TIME limit will cause my session to be terminated after 15 minutes of inactivity. An EXPIRE_TIME of 15 minutes will merely have Oracle send a packet to my client application to verify that the client has not failed. If the client is up, it will answer the ping, and my session will stay around indefinitely. EXPIRE_TIME only causes sessions to be killed if the client application fails to respond to the ping, implying that the client process has failed or that the client system has failed. IDLE_TIME will kill sessions that don't have activity for a period of time, but that generally doesn't work well with applications that maintain a connection pool, since the assumption there is that the connection pool will have a fair number of connections that are idle for periods of the day and since applications that use connection pools tend to react poorly to connections in the pool getting killed.
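The contrast between the two settings in the quote above can be sketched as a toy model. The Python function below is purely illustrative: `session_outcome`, its parameters and the returned labels are names invented for this sketch, not Oracle APIs, and timing is reduced to a single "minutes since last activity" value.

```python
from typing import Optional


def session_outcome(client_alive: bool,
                    idle_minutes: int,
                    idle_time: Optional[int] = None,
                    expire_time: Optional[int] = None) -> str:
    """Toy model of what happens to an idle session (all names are hypothetical).

    idle_time   -- profile limit in minutes, None if not set
    expire_time -- sqlnet.ora probe interval in minutes, None if not set
    """
    # EXPIRE_TIME only removes sessions whose client fails to answer the probe.
    if expire_time is not None and not client_alive and idle_minutes >= expire_time:
        return "closed by dead connection detection"
    # IDLE_TIME snipes any session that stays idle too long, even if the
    # client is still up and would happily answer a probe.
    if idle_time is not None and idle_minutes >= idle_time:
        return "SNIPED"
    # Neither limit applies: the session stays, even if the client is gone.
    return "kept"


# Client went to lunch for an hour but is still running.
print(session_outcome(True, 60, idle_time=15))     # SNIPED
print(session_outcome(True, 60, expire_time=15))   # kept
# Client process or machine has failed.
print(session_outcome(False, 60, expire_time=15))  # closed by dead connection detection
print(session_outcome(False, 60))                  # kept
```

The last call reflects the situation in the question: with neither parameter set, nothing removes the session of a client that has disappeared.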
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2233109200346833212

TCP/IP doesn't interrupt things by default, by design. When a connection goes away, the client and/or server do not immediately get notified. So, if your client connects to the database and does nothing and you unplug your client (blue screen it, kill it, pull out the network cable, crash the computer, whatever) the odds are the session will stay in the database. The server will not know that it will never receive a message from you. We have dead client detection for that if it becomes an issue:

http://download.oracle.com/docs/cd/B12037_01/network.101/b10776/sqlnet.htm#sthref476

As for the active session, that is really easy. You open a connection, you submit a request over this connection like "lock table T". Table t is locked by your transaction in your session. You then submit a block of code like:

begin
  loop
    dbms_lock.sleep(5);
  end loop;
end;
/

your session will be active for as long as that code is running - the client process is blocked on a socket read waiting for the server to send back a result - a response (which of course will never come). The server isn't touching the network at all right now - it is active with your code. So, if your client 'dies' right now - the block of code will continue to run, and run, and run - and since it never commits - it'll just hang in there and run and of course any locks you have will remain in place.
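The point that a busy server "isn't touching the network" can be illustrated outside Oracle with plain sockets. The following Python sketch is an analogy under stated assumptions, not a reproduction of Oracle's server process: a local socket pair stands in for the client/server connection, and the client side is closed cleanly, which is roughly what the operating system does when a process is killed while its host stays up. With an unplugged cable or a crashed host, even the later read would not report anything.

```python
import socket

# A connected pair of sockets stands in for the client/server link.
server_side, client_side = socket.socketpair()

# The "server" is busy running code and is not reading from the socket.
# Meanwhile the client goes away (here: its end of the connection is closed).
client_side.close()

# Nothing has interrupted the server so far. It learns that the peer is
# gone only when it next touches the socket:
data = server_side.recv(1024)
peer_gone = (data == b"")  # an empty read means the peer closed the connection
print("peer gone:", peer_gone)

server_side.close()
```

As long as the server-side code never gets back to reading from the socket, as with the endless loop above, it has no occasion to notice that the client has disappeared.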
http://www.databaseskill.com/4267817/
http://www.dba-oracle.com/t_connect_time_idle_expire_timeout.htm

The sqlnet.expire_time parameter is used to set a time interval, in minutes, to determine how often a probe should be sent verifying that client/server connections are active. If you need to ensure that connections are not left open indefinitely (or up to the time set by operating system-specific parameters), you should set a value that is greater than 0. This protects the system from connections left open due to an abnormal client termination.
https://asktom.oracle.com/pls/apex/f?p=100:11:0::NO::P11_QUESTION_ID:453256655431

If the session is waiting for a resource (locks or latches), and this wait time exceeds the idle_time setting in the profile, does the session get sniped, even if it is in the middle of a transaction and waiting for a lock?

If so, will there be any entries in the alert log?

Followup

if waiting for a lock, you are active -- not idle.

This page caught my attention. Six months ago I had an Oracle Support case for a Data Guard issue, and one of the Oracle engineers noticed that I use IDLE_TIME. He told me that this parameter doesn't work very well, because Oracle doesn't release the resources of sessions marked as sniped until the next time the user tries to use the session (it waits to tell the client that the session was killed before clearing the session resources).

Followup

... after an investigation... the "session" is there, that will not go away until the client acknowledges it, but the "transaction" is gone.

Tom, I've altered a profile to have IDLE_TIME=240 (4 hours) and made sure my resource_limit parameter is set to TRUE. When I query v$session I see some "sniped" sessions, but also "inactive" ones that have been idle for more than a day. All those users have this profile assigned to them. If the user session was connected before idle_time was set, would those sessions be affected by this change or not? I made the change quite some time ago. Is there anything else I should have done?

Followup

If the user session was connected before idle_time was set, they are "grandfathered" in -- they will not be sniped. It only affects new sessions.
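The two follow-ups in this thread (a session waiting for a lock counts as active, and sessions connected before the limit was set are not affected) can be combined into a small filter. This is a minimal Python sketch of those two statements only. The `Session` class, its fields and `sniped_sids` are hypothetical names for illustration and do not correspond to columns of v$session.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    sid: int
    connected_before_limit: bool  # connected before IDLE_TIME was set on the profile
    waiting_on_lock: bool         # waiting for a lock counts as active, not idle
    idle_minutes: int             # minutes since the last activity


def sniped_sids(sessions: List[Session], idle_time: int) -> List[int]:
    """Return the sids that this toy model would expect to become SNIPED."""
    return [
        s.sid
        for s in sessions
        if not s.connected_before_limit   # "grandfathered" sessions are skipped
        and not s.waiting_on_lock         # lock waiters are active
        and s.idle_minutes >= idle_time   # idle for at least the limit
    ]


sessions = [
    Session(sid=1, connected_before_limit=False, waiting_on_lock=False, idle_minutes=300),
    Session(sid=2, connected_before_limit=True, waiting_on_lock=False, idle_minutes=1500),
    Session(sid=3, connected_before_limit=False, waiting_on_lock=True, idle_minutes=300),
    Session(sid=4, connected_before_limit=False, waiting_on_lock=False, idle_minutes=30),
]

print(sniped_sids(sessions, idle_time=240))  # [1]
```

Session 2 matches the case described above: it has been idle for more than a day but predates the profile change, so it stays "inactive".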
http://agstamy.blogspot.ru/2011/03/profile-and-resource-limit.html
Additional material and recommendations: https://rebby.com/blog.php?detail=32

Solution

We tested the parameters listed in the investigation above, and everything worked correctly!

Licensed under: CC-BY-SA with attribution
