Question

When I look in one of our CDC tables, I see four rows in the table with a __$start_lsn value of 0x000CB13700041C06001B.

My question is this: when SQL Server wrote the four rows containing this lsn to the CDC table, did it write the only four rows that will ever have this lsn, or is it possible that the next transaction could include more rows with the same lsn?

Or, put another way: when I query a CDC table for a particular LSN, can I know for sure that I will never see more rows with the same LSN in the future?

Solution

did it write the only four rows that will ever have this lsn

Yes. Consider how the SQL Server documentation describes "Querying for All New Changes Since the Last Set of Changes":

For typical applications, querying for change data will be an ongoing process, making periodic requests for all of the changes that occurred since the last request. For such queries, you can use the function sys.fn_cdc_increment_lsn to derive the lower bound of the current query from the upper bound of the previous query. This method ensures that no rows are repeated because the query interval is always treated as a closed interval where both end-points are included in the interval. Then, use the function sys.fn_cdc_get_max_lsn to obtain the high end-point for the new request interval. See the template Enumerate All Changes Since Previous Request for sample code to systematically move the query window to obtain all changes since the last request.

(My emphasis)

The technique for moving beyond the current set of changes is to increment the highest lsn in the current set and use the result as your new lower bound. The documentation describes this as a way to obtain all changes since the last request. If a later change could carry an lsn you had already seen, this method would silently skip it. Therefore, we can conclude that no further changes can possibly have the same lsn as one you've already seen.
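The moving window described above can be modelled in a few lines of Python. This is only a sketch of the reasoning, not SQL Server code: `increment_lsn` stands in for sys.fn_cdc_increment_lsn, the maximum over the in-memory list stands in for sys.fn_cdc_get_max_lsn, and the table contents and the second LSN value are made up for illustration.

```python
LSN_BYTES = 10  # SQL Server LSNs are binary(10) values


def increment_lsn(lsn: bytes) -> bytes:
    """Return the next LSN value (models sys.fn_cdc_increment_lsn)."""
    value = int.from_bytes(lsn, "big") + 1
    return value.to_bytes(LSN_BYTES, "big")


def changes_between(table, from_lsn, to_lsn):
    """Closed-interval read: both end-points are included."""
    return [row for row in table if from_lsn <= row[0] <= to_lsn]


def poll(table, last_to_lsn):
    """One polling cycle. Returns (new rows, new upper bound)."""
    if not table:
        return [], last_to_lsn
    # Stand-in for sys.fn_cdc_get_max_lsn (assumption: highest LSN captured so far).
    to_lsn = max(row[0] for row in table)
    # Lower bound is one past the previous upper bound.
    from_lsn = increment_lsn(last_to_lsn)
    if from_lsn > to_lsn:
        return [], last_to_lsn  # nothing new since the last request
    return changes_between(table, from_lsn, to_lsn), to_lsn


# The LSN from the question, plus a hypothetical later commit LSN.
lsn_a = bytes.fromhex("000CB13700041C06001B")
lsn_b = bytes.fromhex("000CB13700041C2A0003")

table = [(lsn_a, f"row {i}") for i in range(1, 5)]
first, bookmark = poll(table, bytes(LSN_BYTES))

# A later transaction is captured; its rows carry a higher LSN.
table += [(lsn_b, "row 5"), (lsn_b, "row 6")]
second, bookmark = poll(table, bookmark)
third, bookmark = poll(table, bookmark)

print(len(first), len(second), len(third))  # 4 2 0
```

The second poll starts at 0x000CB13700041C06001C, so the four rows at 0x000CB13700041C06001B are never read again. That scheme only returns every change if no new row can arrive at or below the previous upper bound, which is exactly the guarantee asked about.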

OTHER TIPS

In addition to Damien_The_Unbeliever's excellent answer, I'll add that LSNs are the fundamental identifier of change within the database. From this Books Online (BOL) article:

Every record in the SQL Server transaction log is uniquely identified by a log sequence number (LSN). LSNs are ordered such that if LSN2 is greater than LSN1, the change described by the log record referred to by LSN2 occurred after the change described by the log record LSN.

(emphasis mine)

You can also infer a couple of other things from the second sentence. Because LSNs are inherently ordered, you can order by them in your query and get the order in which the changes were committed to the log. You can also infer the uniqueness from this statement: if LSN2 = LSN1, then those records were committed to the log at the same time. In a CDC table, __$start_lsn is the commit LSN of the transaction, so rows that share a value were committed together in the same transaction.
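As a rough illustration of both points, the sketch below sorts some made-up change rows by LSN and groups the rows that share one. The row contents and the second LSN are hypothetical, and `seqval` is simplified to an integer here; in a real change table the __$seqval column that orders changes within a transaction is a binary value.

```python
from itertools import groupby
from operator import itemgetter

LSN_A = bytes.fromhex("000CB13700041C06001B")  # LSN from the question
LSN_B = bytes.fromhex("000CB13700041C2A0003")  # hypothetical later commit

# Rows as they might come back from an unordered read (made-up data).
rows = [
    {"start_lsn": LSN_B, "seqval": 1, "change": "insert order 7"},
    {"start_lsn": LSN_A, "seqval": 2, "change": "update customer 1"},
    {"start_lsn": LSN_A, "seqval": 1, "change": "insert customer 1"},
    {"start_lsn": LSN_B, "seqval": 2, "change": "delete order 3"},
]

# Equal-length byte strings compare in the same order as the numbers they encode,
# so sorting on start_lsn gives commit order; seqval orders within a transaction.
ordered = sorted(rows, key=itemgetter("start_lsn", "seqval"))

# Rows sharing a start_lsn are grouped as one transaction.
transactions = [
    (lsn.hex().upper(), [r["change"] for r in group])
    for lsn, group in groupby(ordered, key=itemgetter("start_lsn"))
]

for lsn, changes in transactions:
    print(lsn, changes)
```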

Licensed under: CC-BY-SA with attribution