database synchronization - MS Access

https://stackoverflow.com/questions/1801658

05-07-2019
|

Question

I have an issue at the moment where multiple (same schema) access 2003 databases are used on laptops.

I need to find an automated way to synchronize the data into a central access database.

Data on the laptops is only appended to so update/delete operations wont be an issue.

Which tools will allow me to do this easily? What factors will affect the decision on the best tool or solution?

Solution

It is possible to use the Jet replication built into Access, but I will warn you, it is quite flaky. It will also mess up your PK on whatever tables you do it on because it picks random signed integers to try and avoid key collisions, so you might end up with -1243482392912 as your next PK on a given record. That's a PITA to type in if you're doing any kind of lookup on it (like a customer ID, order number, etc.) You can't automate Access synchronization (maybe you can fake something like it by using VBA. but still, that will only be run when the database is opened).

The way I would recommend is to use SQL Server 2005/2008 on your "central" database and use SQL Server Express Editions as the back-end on your "remote" databases, then use linked tables in Access to connect to these SSEE databases and replication to sync them. Set up either merge replication or snapshot replication with your "central" database as the publisher and your SSEE databases as subscribers. Unlike Access Jet replication, you can control the PK numbering but for you, this won't be an issue as your subscribers will not be pushing changes.

Besides the scalability that SQL server would bring, you can also automate this using the Windows Synchronization manager (if you have synchronized folders, that's the annoying little box that pops up and syncs them when you logon/logoff), and set it up so that it synchronizes at a given interval, on startup, shutdown, or at a time of day, and/or when computer is idle, or only synchronizes on demand. Even if Access isn't run for a month, its data set can be updated every time your users connect to the network. Very cool stuff.

OTHER TIPS

Access Replication can be awkward, and as you only require append queries with some checking, it would probably be best to write something yourself. If the data collected by each laptop cannot overlap, this may not be too difficult.

You will need to consider the primary keys. It may be best to incorporate the user or laptop name in the key to ensure that records relate correctly.

The answers in this thread are filled with misinformation about Jet Replication from people who obviously haven't used it and are just repeating things they've heard, or are attributing problems to Jet Replication that actually reflect application design errors.

It is possible to use the Jet replication built into Access, but I will warn you, it is quite flaky.

Jet Replication is not flakey. It is perfectly reliable when used properly, just like any other complex tool. It is true that certain things that cause no problems in a non-replicated database can lead to issues when replicated, but that stands to reason because of the nature of what replication by any database engine entails.

It will also mess up your PK on whatever tables you do it on because it picks random signed integers to try and avoid key collisions, so you might end up with -1243482392912 as your next PK on a given record. That's a PITA to type in if you're doing any kind of lookup on it (like a customer ID, order number, etc.)

Surrogate Autonumber PKs should never be exposed to users in the first place. They are meaningless numbers used for joining records behind the scenes, and if you're exposing them to users IT'S AN ERROR IN YOUR APPLICATION DESIGN.

If you do need sequence numbers, you'll have to roll your own and deal with the issue of how to prevent collisions between your replicas. But that's an issue for replication in any database engine. SQL Server offers the capability of allocating blocks of sequence numbers for individual replicas at the database engine level and that's a really nice feature, but it comes at the cost of increased administrative overhead from maintaining multiple SQL Server instances (with all the security and performance issues that entails). In Jet Replication, you'd have to do this in code, but that's hardly a complicated issue.

Another alternative would be to use a compound PK, where one column indicates the source replica.

But this is not some flaw in the Replication implementation of Jet -- it's an issue for any replication scenario with a need for meaningful sequence numbers.

You can't automate Access synchronization (maybe you can fake something like it by using VBA. but still, that will only be run when the database is opened).

This is patently untrue. If you install the Jet synchronizer you can schedule synchs (direct, indirect or Internet synchs). Even without it, you could schedule a VBScript to run periodically and do the synchronization. Those are just two methods of accomplishing automated Jet synchroniziation without needing to open your Access application.

A quote from MS documentation:

Use Jet and Replication Objects

JRO is really not the best way to manage Jet Replication. For one, it has only one function in it that DAO itself lacks, i.e., the ability to initiate an indirect synch in code. But if you're going to add a dependency to your app (JRO requires a reference, or can be used via late binding), you might as well add a dependency on a truly useful library for controlling Jet Replication, and that's the TSI Synchronizer, created by Michael Kaplan, once the world's foremost expert on Jet Replication (who has since moved onto internationalization as his area of concentration). It gives you full programmatic control of almost all the replication functionality that Jet exposes, including scheduling synchs, initiating all kinds of synchronization, and the much-needed MoveReplica command (the only legal way to move or rename a replica without breaking replication).

JRO is one of the ugly stepchildren of Microsoft's aborted ADO-Everywhere campaign. Its purpose is to provide Jet-specific functionality to supplement what is supported in ADO itself. If you're not using ADO (and you shouldn't be in an Access app with a Jet back end), then you don't really want to use JRO. As I said above, it adds only one function that isn't already available in DAO (i.e., initiating an indirect synch). I can't help but think that Microsoft was being spiteful by creating a standalone library for Jet-specific functionality and then purposefully leaving out all the incredibly useful functions that they could have supported had they chosen to.

Now that I've disposed of the erroneous assertions in the answers offered above, here's my recomendation:

Because you have an append-only infrastructure, do what @Remou has recommended and set up something to manually send the new records whereever they need to go. And he's right that you still have to deal with the PK issue, just as you would if you used Jet Replication. This is because that's necessitated by the requirement to add new records in multiple locations, and is common to all replication/synchronization applications.

But one caveat: if the add-only scenario changes in the future, you'll be hosed and have to start from scratch or write a whole lot of hairy code to manage deletes and updates (this is not easy -- trust me, I've done it!). One advantage of just using Jet Replication (even though it's most valuable for two-way synchronizations, i.e., edits in multiple locations) is that it will handle the add-only scenario without any problems, and then easily handle full merge replication should it become a requirement in the future.

Last of all, a good place to start with Jet Replication is the Jet Replication Wiki. The Resources, Best Practices and Things Not to Believe pages are probably the best places to start.

You should read into Access Database Replication, as there is some information out there.

But I think that in order for it to work correctly with your application, you will have to roll out a custom made solution using the methods and properties available for that end.

Use Jet and Replication Objects (JRO) if you require programmatic control over the exchange of data and design information among members of the replica set in Microsoft Access databases (.mdb files only). For example, you can use JRO to write a procedure that automatically synchronizes a user's replica with the rest of the set when the user opens the database. To replicate a database programmatically, the database must be closed.

If your database was created with Microsoft Access 97 or earlier, you must use Data Access Objects (DAO) to programmatically replicate and synchronize it.

You can create and maintain a replicated database in previous versions of Microsoft Access by using DAO methods and properties. Use DAO if you require programmatic control over the exchange of data and design information among members of the replica set. For example, you can use DAO to write a procedure that automatically synchronizes a user's replica with the rest of the set when the user opens the database.

You can use the following methods and properties to create and maintain a replicated database:

MakeReplica method

Synchronize method

ConflictTable property

DesignMasterID property

KeepLocal property

Replicable property

ReplicaID property

ReplicationConflictFunction property

Microsoft Jet provides these additional methods and properties for creating and maintaining partial replicas (replicas that contain a subset of the records in a full replica):

ReplicaFilter property

PartialReplica property

PopulatePartial method

You should definitely read the Synchronizing Data part of the documentation.

I used replication in a00 for years, until forced to upgrade to a07 (when it went away). The most problematic issue we ran into, at the enterprise level, was managing the CONFLICTS. If not managed timely, or there are too many, users get frustrated and the data becomes unreliable.

Replication did work well when our remote sites were not always connected to the internet. This allowed them to work with their data, and synchronize when they could. At least twice daily.

We install a separate database on the remote computers that managed the synchronization, so the user only had to click an icon on their desktop to evoke the synchronization.

The user had a separate button to push/pull in feeds off a designated FTP file that would update from the Legacy systems.

This process worked quite well, as we had 30 of these "nodes" working around the country, managing their data and updating to the FTP servers.

If you are seriously considering this path, let me know and I can send you my documentation.

You can write your own synchronization software that connects to the laptop selects the diff from it's db and inserts it to the master. It is depends on your data scheme how easy this operation will be. (if you have many tables with FKs... you will need to do it smartly). I think it will be the most efficient if you write it yourself.

Automating this kind of behavior is called replication, and Accesss Supports that apparently, but I've never seen it implemented.

As I guess most of the time the laptop is not connected to the main DB it is not a good idea anyway (to replicate data).

if you will look for a 3rd party tool to do it - look for something that can easily do the diff between the tables before copying, and can do it incrementally of course.

FWIW:

Autonumbers. I agree with David - they should never be exposed. To remove that temptation, I use a Random autonumber.
Replication. I used this extensively some years back, with scheduled syncs, and using GUIDs as the PK. I repeatedly found that any hiccups over the network corrupted the replicas, with the result that I had to salvage data, and re-issue replicas. Painful!

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow