Question

I have n databases and need to collect their data into a single database. The databases share some specific values (e.g. currencies) that I want to update from the collector database, and of course those updates should reach all databases. So in the first case I need to avoid primary key collisions, because I don't want to lose data. I think setting auto_increment_increment and auto_increment_offset can do the trick, but in the second case I have to identify specific rows across all databases. What kind of technology and topology can I use?

Solution

I have a data warehouse that multiple data sources feed into. I generate a new PK, but bring in the original PK from the source along with it. For many of the feeds I also bring in a 'source key', typically an environment variable or something similar that identifies the actual source, and then create a composite key from the source key and the original PK to differentiate between original PKs that are the same but come from different sources. This data is stored in my aggregations, and I update a dimension to keep a record of the sources, so I can query based on that table.
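A minimal sketch of that layout, using Python's built-in sqlite3 module purely for illustration. The table name, column names and source keys ("shop_a", "shop_b") are hypothetical, and a real warehouse would use its own DBMS and schema:

```python
import sqlite3


def build_warehouse():
    """Create an in-memory collector table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE collected_currency (
            id         INTEGER PRIMARY KEY AUTOINCREMENT, -- new PK generated by the collector
            source_key TEXT    NOT NULL,                  -- identifies which database the row came from
            source_pk  INTEGER NOT NULL,                  -- original PK in the source database
            code       TEXT    NOT NULL,
            rate       REAL    NOT NULL,
            UNIQUE (source_key, source_pk)                -- composite key: unique per source
        )
        """
    )
    return conn


def load_from_source(conn, source_key, rows):
    """Insert (source_pk, code, rate) rows, tagging each with its source."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO collected_currency (source_key, source_pk, code, rate) "
            "VALUES (?, ?, ?, ?)",
            [(source_key, pk, code, rate) for pk, code, rate in rows],
        )


def find_row(conn, source_key, source_pk):
    """Look up a collected row by its source and original PK."""
    cur = conn.execute(
        "SELECT id, code, rate FROM collected_currency "
        "WHERE source_key = ? AND source_pk = ?",
        (source_key, source_pk),
    )
    return cur.fetchone()


conn = build_warehouse()
# Both sources use PK 1, but the composite key keeps them apart.
load_from_source(conn, "shop_a", [(1, "EUR", 1.0), (2, "USD", 1.1)])
load_from_source(conn, "shop_b", [(1, "GBP", 0.85)])
```

Because every collected row keeps its source key and original PK, an update made in the collector can be addressed back to the exact row in the right source database, without the sources having to coordinate their own key ranges.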

I read a Brent Ozar post a while back about a technique where you alter the seed for one lot so that it counts down from 0, and you end up with negative integers. For more than two sources you could extend it with number patterns (odd/even, multiples, ending-in). It wasn't really aimed at this scenario, as you need a lot of foresight to choose which pattern to implement based on the number of sources you think you may have.
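To make the number patterns concrete, here is a small Python sketch that only simulates the key arithmetic; it does not talk to any database. The function names are hypothetical, and the starting values (1 and -1) and the step of 3 are assumptions chosen for illustration:

```python
from itertools import islice


def id_sequence(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    current = offset
    while True:
        yield current
        current += increment


def first_ids(offset, increment, count):
    """Return the first `count` keys a source with this pattern would generate."""
    return list(islice(id_sequence(offset, increment), count))


# Two sources: one counts up from 1, the other counts down from -1.
positive = first_ids(1, 1, 5)
negative = first_ids(-1, -1, 5)

# Three sources sharing a step of 3, each with its own starting offset.
source_ids = [first_ids(offset, 3, 5) for offset in (1, 2, 3)]
```

The step has to be at least as large as the number of sources you will ever have, which is where the foresight comes in: adding a fourth source to a step-of-3 scheme means its keys would overlap with an existing source.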

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange