Question

In a normal replicated world, the following sequence happens:

  • MAIN ASE: delete N rows issued
  • RS: delete N rows sent to replication target
  • TARGET: delete same N rows

However, what happens if the following sequence of events occurs?

  • TARGET: delete N rows (before main server has them deleted)
  • MAIN ASE: delete N rows issued
  • RS: delete N rows sent to replication target
  • TARGET: No such rows to delete?!?!?!

What happens then?

Does the rep server fail? Does the main ASE transaction abort? Does everyone act like this never happened and replication worked (since end result is as if it did)?

P.S. Yes, I know, typically you shouldn't ever pull such a stunt, and data should never be changed on the replication target. And if you must delete on the target first, you should use "set replication off" in the main server session. The question is "what happens if this does occur, in violation of 'should'".

Solution

What happens from a SAP/Sybase Replication Server (SRS) perspective?

It depends on the SRS version and on how the DBA has configured the SRS instance.


Prior to SRS 15.2 the 'empty' DELETE would cause no problems, i.e., the DELETE would be issued against the target, no rows would be found/affected, and SRS would continue on to the next transaction.

The key here is that a DELETE statement that does not find any rows to delete simply completes successfully (i.e., without error) with @@rowcount=0.
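As a rough illustration of that behavior, here is a minimal T-SQL sketch that can be run in any ASE session. The #demo table and its columns are hypothetical names made up for this example; they are not part of the replication setup discussed here.

```sql
-- Hypothetical scratch table; the names are illustrative only.
create table #demo (id int not null, note varchar(30) null)
insert into #demo values (1, 'only row')

-- No row has id = 42, so this DELETE matches nothing.
delete from #demo where id = 42

-- Read both globals in a single statement, since each later statement resets them.
select rows_affected = @@rowcount, error_code = @@error

drop table #demo
```

Both columns should come back as 0: the statement succeeded, it just had nothing to do. This is why an older SRS, which only reacts to errors, has nothing to complain about.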

NOTE: The same 'logic' applies to UPDATEs, INSERT/SELECTs and SELECT/INTOs that complete successfully but affect no rows.

NOTE: The same 'logic' applies to DML statements that complete successfully but affect a different number of rows.


Starting with SRS 15.2, a replication server error class was introduced, primarily to detect when a transaction affects a different number of rows in the target than in the source ... and when that happens, to raise an error (e.g., 5185) and suspend the DSI connection.

Each transaction coming over from the source is tagged with the number of affected rows. If the transaction, when applied to the target, affects a different number of rows then SRS will (by default) suspend the DSI and dump an error message to the errorlog.

The DBA can modify the default rowcount mismatch behavior by assigning a different action to the corresponding error number in the replication server error class (see the assign action command for more details).
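For example, a DBA who wants rowcount mismatches logged as warnings rather than suspending the DSI might issue something along these lines from an isql session connected to the Replication Server. This is a sketch only: it assumes the connection uses the default rs_repserver_error_class and that 5185 is the mismatch error seen in the errorlog; check the class name and error number in your own environment before running it.

```sql
-- Assumption: the default Replication Server error class is in use,
-- and 5185 is the rowcount mismatch error reported in the errorlog.
assign action warn
    for rs_repserver_error_class
    to 5185
```

Choosing warn still leaves an entry in the errorlog for each mismatch, so the discrepancy can be researched later, whereas ignore would hide it completely.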

Obviously (?) the DBA should research any rowcount mismatch errors to determine if they are ok/valid, or if they indicate an issue with the target being out of sync with the source.


Other scenarios the DBA would want to investigate could include:

  • replicated stored procs that don't affect the same set of rows on the target
  • custom function strings that modify a transaction in such a way as to (explicitly) modify the number of affected rows in the target
  • lack of a primary key for a table

Keep in mind that while SRS is routinely used to keep a DR/backup database in sync with the primary database, SRS can also be used to:

  • distribute subsets of data (distribution model)
  • consolidate data (roll-up model)
  • maintain historical/audit data (eg, convert DELETEs/UPDATEs to save 'old' image of data in a history table)
  • propagate sub-/super-sets of data per application requirements

Net result is that the occurrence of rowcount mismatches may or may not be ok ... it all depends on the individual SRS environment.

OTHER TIPS

From Control Row Count Validation in Replication Server Administration Guide Volume 2:

To disable row count validation for all database connections, enter:

configure replication server set dsi_row_count_validation to 'off'

You must suspend and resume all database connections to Replication Server after you execute configure replication server with dsi_row_count_validation. The change in setting takes effect after you resume database connections.
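A minimal sketch of that suspend/resume step for a single connection is shown below. It borrows the SYDNEY_DS.pubs2 connection name from the manual's examples purely as a placeholder; in practice the pair would be repeated for every database connection the Replication Server manages.

```sql
-- Placeholder connection name; repeat for each database connection.
suspend connection to SYDNEY_DS.pubs2

resume connection to SYDNEY_DS.pubs2
```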

To enable row count validation for a specific connection, for example the pubs2 database in the SYDNEY_DS data server, enter:

alter connection to SYDNEY_DS.pubs2 set dsi_row_count_validation to 'on'

You need not suspend and resume a database connection when you set dsi_row_count_validation for the connection; the parameter takes effect immediately. However, the new setting affects the batch of replicated objects that Replication Server processes after you execute the command. Changing the setting does not affect the batch of replicated objects that Replication Server is currently processing.

I think older versions of the repserver may require a DSI suspend/resume after making the DSI-specific change.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange