Read-only generic data access layer: best practice

https://softwareengineering.stackexchange.com/questions/364773

26-01-2021

Question

I am trying to write some kind of "generic" data access library to access the data of my company's ERP software, which is our core application where all our related data is managed.

I am a student worker, and my boss and coworkers regularly ask me to write small apps for all kinds of tasks, from continuous analysis of specific customers to automatic device measurement.

For all those apps, which as mentioned can cover very different domains, I need to pull the data from the same DB, our ERP DB.

All of the data will be read only, since any change will be persisted back to the ERP DB.

So I thought of making a library where I just create a model that mirrors the DB and expose some repositories returning the model classes or an interface implementing the same properties.

So my questions are:

- Is this a good practice, or should I create a data access layer for every app?
- Are there any patterns for this use case? I searched a lot but didn't find anything about read-only scenarios, apart from using AsNoTracking() with EF.
- This way the repos will return more information than the apps require 99% of the time, but it will save me from writing duplicate code.
  - So instead of writing optimized queries tailored to a specific app and returning a custom class with only the required data, I am returning a lot of info, doing some logic and mapping it to whatever I need. I understand that this won't be a good idea for huge datasets, heavily accessed apps or apps that need very quick responses, but what about normal cases?

Solution

Having a generic data access layer to interface a database is a standard approach, and there are several ORM tools which actually allow you to generate the model classes automatically from an existing database.

Since you mentioned Entity Framework, this is called the "database first" approach in that context. The naive way of applying Microsoft's tools to generate model classes would lead to a data access layer with (approximately) one class per table, so one that "mirrors the DB". However, from what you wrote it is clear you will need only a subset of those model classes. For generating code for just this subset, see this older SO post.

If you need those only in a read-only manner, you could simply not use any insert, update or delete operations, or hide them from the DB context, as shown here. When your apps don't need write access, they could connect to the DB using a special user which has only the required access rights. AsNoTracking is AFAIK just a performance optimization option for read-only access, see this SO post.
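The two measures described above (exposing no write operations, and connecting with restricted rights) can be sketched independently of any ORM. The following is a minimal illustration in Python with SQLite, not Entity Framework code: the `customers` table, the `Customer` model, `CustomerRepository`, `open_read_only` and `create_demo_database` are all hypothetical names invented for this example, and SQLite's `mode=ro` merely stands in for a database user that has only read rights.

```python
import sqlite3
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Customer:
    """Immutable model mirroring a hypothetical ERP `customers` table."""
    id: int
    name: str


class CustomerRepository:
    """Exposes read operations only; there are no insert/update/delete methods."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def get_all(self) -> list[Customer]:
        rows = self._connection.execute("SELECT id, name FROM customers ORDER BY id")
        return [Customer(id=row[0], name=row[1]) for row in rows]


def open_read_only(db_path: Path) -> sqlite3.Connection:
    # mode=ro makes SQLite itself reject writes, similar in spirit to
    # connecting with a database user that has only SELECT rights.
    return sqlite3.connect(f"{db_path.resolve().as_uri()}?mode=ro", uri=True)


def create_demo_database(db_path: Path) -> None:
    # Stand-in for the real ERP database, only so the sketch can run.
    connection = sqlite3.connect(db_path)
    connection.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    connection.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )
    connection.commit()
    connection.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / "erp_demo.db"
        create_demo_database(path)
        connection = open_read_only(path)
        print(CustomerRepository(connection).get_all())
        connection.close()
```

The repository interface is the first line of defense and the restricted connection is the second: even if some app bypasses the repository and issues an INSERT on this connection, the database refuses it.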

So the question remains whether one should put all of those generated classes into one shared library, or rather generate a separate data access library for each app.

This actually depends on the apps: how independent they are, how independently they need to be maintained, how much maintenance is actually required, and whether there is a lot of manually written code on top of the generated model classes which can be shared between the different apps. If there is no such code, it probably won't hurt if each of the apps gets its own individually generated data access layer, each one including only the model classes it requires. Even if there is some overlap in the tables, and some classes get generated 10 times, there is no real maintenance problem. If the underlying tables change, one can simply regenerate those classes. Of course, for this you should put the generation commands for each app's layer into a command line script which becomes part of your code base.
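To make the idea of a per-app, re-runnable generation step concrete, here is a minimal sketch in Python against SQLite. It is not the Entity Framework tooling the answer refers to; `generate_model_source`, `class_name_for` and `TYPE_MAP` are hypothetical names, and the type mapping is deliberately simplistic. Each app would keep its own list of table names in its code base and rerun the script whenever the underlying tables change.

```python
import sqlite3

# Hypothetical mapping from declared SQLite column types to Python annotations.
TYPE_MAP = {"INTEGER": "int", "TEXT": "str", "REAL": "float"}


def class_name_for(table: str) -> str:
    # "order_items" -> "OrderItems"
    return "".join(part.capitalize() for part in table.split("_"))


def generate_model_source(connection: sqlite3.Connection, tables: list[str]) -> str:
    """Return Python source with one frozen dataclass per requested table."""
    lines = ["from dataclasses import dataclass", ""]
    for table in tables:
        # Table names come from the app's own checked-in list, not user input.
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        if not columns:
            raise ValueError(f"table not found: {table}")
        lines += ["", "@dataclass(frozen=True)", f"class {class_name_for(table)}:"]
        for _cid, name, declared_type, *_rest in columns:
            annotation = TYPE_MAP.get(declared_type.upper(), "object")
            lines.append(f"    {name}: {annotation}")
        lines.append("")
    return "\n".join(lines)
```

An app that needs only customer data would call `generate_model_source(connection, ["customers"])` and write the result to a file in its own project, so tables it does not use never appear in its model.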

If, however, you were going to write a lot of duplicate code on top of those generated classes, for example some repositories or common business logic which can be shared between those apps, it will start to pay off to put that code into a shared library. This is a huge improvement for maintenance and later evolution of the programs; however, it comes at a cost:

- Whenever your library changes for some reason, you will have to recompile all of the dependent apps to make sure you did not make any backwards-incompatible changes (or you have to adapt all of the dependent apps to those changes at once).
- You also need to decide whether you want all of your apps to be treated as "one system" (so you give them a common version number and redeploy them all at once when something changes), or whether you still want to deploy them separately, which will lead to a situation where you have several different versions of your shared lib in production at the same time.

In general this is a trade-off, see this older SE.SE post about when having a "core library" is a good or bad idea. The topmost answer leads to another link which might be of interest to you, a good discussion of "redundancies vs. dependencies" in library creation.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange