Question

I have a scenario where I need to validate a domain using values in another domain. Rule is : Each value in domain 1 must exist in domain 2. The list of unique values in second domain will be around 50 million. How do I do this in DQS?

Was it helpful?

Solution

You are correct in thinking that the ref integrity scenario is not directly supported in this release of DQS. However, by using a combination of existing techniques, you can get this functionality. I'll list them below. Also you may want to take a look at some of the short screencasts I've done on DQS - here.

To simulate referential integrity: 1) Create a domain from each source field in a KB 2) Create a composite domain from those source fields 3) Create a matching rule for the composite domain in your KB where the value from the first source table is the 'pivot' record is set as a prerequisite - more detail here and here.

Example of Composite Rule

OTHER TIPS

Cross table checks seems to be a missing feature in a lot of data profiling tools. Only way i can see is to create a view somewhere with a left join. Generally this is going to mean having the data you want to profile restored to your own environment (no bad thing if you can meet your own latency requirements) and creating views to define relationships between the tables you wish to test.

Checking referential integrity seemed like the obvious first step in data profiling but searching the web it doesn't seem to be a hot issue. Am i missing something?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top