EXPDP snapshot too old live monitoring
-
03-10-2020 - |
Question
The likely errors that raises "snapshot too old"s during datapump export are well documented all over the internet such as :
- undo_retention parameter being too short
- redo logs groups are too small or there are too few
- undo tablespace is too small
Is there a way to monitor those failure cases while the job is running ?
The kind of solution I'm looking for would be a query that would report :
- total percentage of the snapshot's undo tablespace usage until it needs to auto-extend or maxes out
- total percentage of redo logs usage until it overwrites itself
- time left until undo retention is reached by the snapshot
Any query that does part of these things would be welcome. Does looking up such information requires oracle analytic packages? Is it possible at all to query such a thing ?
Solution
You need to fully understand how undo works to troubleshoot this error. I recommend this blog post:
http://blog.oracle48.nl/wordpress/oracle-database-undo-space-explained/
You can use this query to see the current profile of your Undo tablespace(s):
SELECT DISTINCT STATUS,TABLESPACE_NAME, SUM(BYTES), COUNT(*)
FROM DBA_UNDO_EXTENTS
GROUP BY STATUS, TABLESPACE_NAME;
ACTIVE
extents contain uncommitted or currently-rolling-back transactions.UNEXPIRED
extents contain transactions required to be kept in the Undo tablespace to meet the undo_retention parameterEXPIRED
extents are transactions still in the Undo tablespace that are older than the undo_retention parameter.
The database will delete the oldest expired extents to make room for new active ones. If there are no expired extents, it will attempt to allocate a new extent. If that means extending the datafile, it will do that.
If the datafile cannot extend, the database will delete some Unexpired extents to make room, violating the undo_retention value. Thus, the undo_retention parameter is not a guarantee, but a guideline that the database will attempt to adhere to as best it can.
Now, there are two common failure modes here:
If a query tries to read from a deleted extent, it will fail with
ORA-01555 Snapshot Too Old
If the whole undo tablespace becomes full of active undo, transactions will fail with
ORA-30036 unable to extend segment in Undo tablespace
total percentage of the snapshot's undo tablespace usage until it needs to auto-extend or maxes out
If the sum of Active and Unexpired bytes in DBA_UNDO_EXTENTS is close to the size of the relevant undo tablespace, you have a chance of experiencing an ORA-01555.
If the sum of Active bytes is close to the size of the relevant undo tablespace, you are likely to experience an ORA-30036 error.
total percentage of redo logs usage until it overwrites itself
I'm not aware of anything you can change in the redo log configuration that will effect Snapshot Too Old errors. Where did you get this info?
time left until undo retention is reached by the snapshot
It is very difficult if not impossible to do this accurately, because it depends on every other transaction in the database. If you subtract the query running time from the undo_retention parameter, you will have a good idea, but that's only assuming that the undo tablespace was big enough to meet the undo_retention value. Once that line has been crossed, all bets are off.