Question

I would like to know what the best practices are for sharing Prolog code / libraries with other programmers (and with oneself across multiple projects). I am using SWI-Prolog myself, but am also interested in how other Prologs tackle this.

For comparison, Java has Maven+JARs, Python has EasyInstall+PythonEggs, and there are probably many others for other languages as well. But are there any for Prolog?

SWI-Prolog Packs

In SWI-Prolog there are Packs, supported by module library(prolog_pack). The downsides to these are:

  1. You have to create either an archive file or a Git repository for each pack. Say I want to create 10 packs. Now I need to create 10 Git repositories. I sometimes make edits that impact multiple files, potentially residing in multiple packs/repos, requiring me to commit several Git repositories for a single (multi-file) edit.
  2. In order to create a pack you have to hand-pick a number of files that 'belong together'. Sometimes I find out that a file X belongs to pack A as well as pack B. Now I need to maintain copies of file X in repositories A and B, or I need to create yet another pack C consisting of only X and import C into A and B.
  3. Packs are published on a public Web site. Most of my libraries are only interesting for me. A bunch of them are interesting to specific people that I collaborate with, and only a few are 'ready' for wider/public dissemination.
  4. The pack maintainer has to specify inter-pack dependencies. For intricate hierarchies of libraries that seems like unnecessary work to me. I already use Prolog modules quite stringently and would like to simply use the hierarchy of Prolog module imports as the dependency graph.

Git submodule

Another approach, one that I've used until now, is Git submodules. Dependencies between libraries are achieved by importing one repository into another. This has some of the same downsides as SWI-Prolog packs:

  1. A Git repository for each library, and thus lots of repositories to maintain.
  2. The maintainer has to choose the files per repo wisely and has to specify which Git submodule inclusions are needed.
  3. Updating existing libraries is very difficult. I have found out (the hard way) that most of the people I hand my code to are unable to update a Git repository with many intricately interdependent submodule imports successfully. (I have the utmost respect for the occasional Git guru who uses submodules and always gets it right, but most non-programmers and quite a few programmers I work with find it too difficult.)

My ideal approach

My personal preferences for a perfect Prolog code sharing methodology would be:

  1. The number of libraries you disseminate and the number of Git repositories you have are independent. Specifically, I can have a sizable repository, parts of which get disseminated in different ways. If somebody likes to (re)use my Prolog module with DCG helper predicates, then I can simply disseminate that single file (plus potential dependencies) to that person.
  2. You do not have to hand-pick and manually copy individual files, rather you let an algorithm traverse the hierarchy of module imports to extract those files that (apparently) belong together. The files are downloaded when the program is run for the first time. The files may all belong to the same Git repository or to several, the algorithm should not care at all about the mapping between repositories and libraries or between repositories and files.
  3. The maintainer of the code is able to decide whether a library gets published publicly or to a limited group of people (or to the limited group including only the maintainer).
  4. The hierarchy of module-imports between files is all you need for dependency-tracking.

The above implies that my ideal library sharing approach is file-based and not package-based. If Prolog module A uses Prolog module B and A is loaded, then B is either loaded from a local file (if it is there) or it is downloaded from a repository. I am not sure how common a file-based approach is in other languages. The aforementioned Maven+JARs and EasyInstall+PythonEggs are both package-based.

I am very interested in what other Prolog programmers use and think about this topic!


Solution

I guess such a simple traversal algorithm can give you a collection of modules, provided you have already annotated which modules belong to a package and which do not. It will yield a subset of the modules that do not yet belong to a package.
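As a rough illustration of such a traversal, here is a minimal SWI-Prolog sketch. It assumes that dependencies are written as plain use_module/1,2 directives with atom file names (as in use_module(helper)), that the files can be read with the default operators, and it skips library(...) imports. The predicate names file_imports/2, import_closure/2 and local_file/3 are made up for this example.

```prolog
:- use_module(library(lists)).

% file_imports(+File, -Specs): Specs are the file specifications found in
% use_module/1,2 directives. The file is only read, never loaded.
file_imports(File, Specs) :-
    setup_call_cleanup(
        open(File, read, In),
        read_imports(In, Specs),
        close(In)).

read_imports(In, Specs) :-
    read_term(In, Term, []),
    (   Term == end_of_file
    ->  Specs = []
    ;   nonvar(Term),
        import_spec(Term, Spec)
    ->  Specs = [Spec|Rest],
        read_imports(In, Rest)
    ;   read_imports(In, Specs)
    ).

import_spec((:- use_module(Spec)), Spec).
import_spec((:- use_module(Spec, _)), Spec).

% import_closure(+File, -Files): Files is File followed by every local
% file reachable from it through use_module directives.
import_closure(File, Files) :-
    absolute_file_name(File, Abs, [file_type(prolog), access(read)]),
    closure([Abs], [], Files).

closure([], Seen, Files) :-
    reverse(Seen, Files).
closure([F|Fs], Seen, Files) :-
    (   memberchk(F, Seen)
    ->  closure(Fs, Seen, Files)
    ;   file_imports(F, Specs),
        file_directory_name(F, Dir),
        findall(Dep,
                ( member(Spec, Specs),
                  local_file(Spec, Dir, Dep) ),
                Deps),
        append(Fs, Deps, Queue),
        closure(Queue, [F|Seen], Files)
    ).

% Assumption: only atom specs such as helper are treated as local files;
% library(X) and other compound specs are ignored here.
local_file(Spec, Dir, Path) :-
    atom(Spec),
    absolute_file_name(Spec, Path,
                       [ file_type(prolog), access(read),
                         relative_to(Dir), file_errors(fail) ]).
```

Calling import_closure('a.pl', Files) would list a.pl followed by the local files it reaches. Modules that are already annotated as belonging to a package could be filtered out of that list afterwards.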

But I have the feeling that this misses the point. I think software engineering for packages has a different goal than simply delivering one package. Usually one is faced with multiple packages, and these packages can have dependencies which are rooted in the dependencies of the modules themselves.

Mathematically:

   M: The set of modules
   P: The set of packages
   p(m): The package a module belongs to or null.

So if I have module dependencies, I can derive package dependencies from it:

   d(m1,m2): The module m1 depends on the module m2
   d'(p1,p2): The package p1 depends on the package p2

   d'(p1,p2) <=> exists m1,m2 (p(m1)=p1 & p(m2)=p2 & d(m1,m2))
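
The definition of d' can be written down almost literally in Prolog. In this sketch all facts are made-up examples: module_package/2 plays the role of p(m), module_depends/2 plays the role of d, and the package names constraints and core are hypothetical.

```prolog
% p(m): hypothetical assignment of modules to packages.
module_package(clpfd,  constraints).
module_package(helper, constraints).
module_package(apply,  core).
module_package(lists,  core).

% d(m1,m2): hypothetical module dependencies.
module_depends(clpfd, helper).
module_depends(clpfd, apply).
module_depends(apply, lists).

% d'(p1,p2): there exist m1, m2 with p(m1)=p1, p(m2)=p2 and d(m1,m2).
package_depends(P1, P2) :-
    module_depends(M1, M2),
    module_package(M1, P1),
    module_package(M2, P2).

% Variant (an addition, not part of the formula above): ignore the
% dependency of a package on itself.
external_package_depends(P1, P2) :-
    package_depends(P1, P2),
    P1 \== P2.
```

Note that d' as defined also relates a package to itself whenever two of its own modules depend on each other, which is why the second predicate may be the more useful one in practice.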

Your algorithm might derive one package p which then might depend on some packages p1, ..., pm which have already been used to annotate the existing modules. But software engineering has found many ways to identify multiple packages; typical architectures are vertical layering, horizontal layering, etc. Maybe there are also algorithms for that.

But matters are not that simple. Packages are usually defined to help in the co-evolution of modules and to facilitate change management and release management. If modules co-evolve one does not want to release one module after the other. One wants to release a set of modules that has reached the same evolution level, so that this set of modules can fruitfully interact.

But if we have evolution of modules we will also have evolution of packages. And this evolution will happen whether you have a single package or go with multiple packages for your stuff. And I am not sure whether the existing package systems for Prolog already help here. What I see for SWI-Prolog is, for example, versioning of packages and a todo list for packages:
http://www.swi-prolog.org/howto/PackTodo.txt

The above todos all make sense. But they do not directly address package dependencies and their evolution. In Jekejeke Prolog I am currently experimenting with improving the handling of module dependencies. The idea is, for example, that the end-user can load a module clpfd via the following command:

   ?- use_module(library(clpfd)).

If a package that contains a module clpfd is installed and activated, the command will succeed. If no such package is installed, or it is installed but not yet activated, the command will fail. The home package, or more precisely the module clpfd, will use other modules and thus other packages. If it uses a module local to its own package it can do so as follows, without library/1:

   ?- use_module(helper).

But if it uses a module that is not local to its own package it typically does so differently. For example a module clpfd might use a module apply. And it will do so with library/1:

   ?- use_module(library(apply)).

Now we see that inspecting the clpfd or helper module does not tell us where the apply module will be taken from. We only know the package dependency once we have a particular set of packages at hand and resolve the module name apply to its package.
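
The same late resolution can be observed in SWI-Prolog (not in Jekejeke Prolog, which the rest of this answer is about): where library(apply) comes from depends on the library search path at the moment of asking. A small sketch, with resolve_library/2 as a hypothetical helper name:

```prolog
% resolve_library(+Name, -Path): Path is the file that library(Name)
% resolves to right now. Fails if no such library file is found.
resolve_library(Name, Path) :-
    absolute_file_name(library(Name), Path,
                       [ file_type(prolog),
                         access(read),
                         file_errors(fail)
                       ]).
```

Asking resolve_library(apply, Path) before and after another directory is added to the library search path may give different answers, which is exactly why the package dependency cannot be read off the source file alone.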

This flexibility has its pros and cons. The cons are that we cannot establish fixed package dependencies, and that tools for versioning etc. that rely on fixed package dependencies will not work. So a solution would be to bootstrap versioning for packages from versioning for modules, similar to how we derive dependencies between packages from dependencies between modules.

But I am not yet sure how this would work. The complexity can surely be reduced if we can distinguish between public and private modules. For example the above module helper could be used solely by clpfd, and can be left out when determining package dependencies and package versioning.

My ideas so far:

  • Derive package dependencies from modules
  • Derive package versioning from modules
  • Allow private and public modules inside packages.

Bye

Other tips

About SWI-Prolog packs:

  1. I see a package as a collection of Prolog modules. A single module should belong to a single package only, not to both package A and package B.
  2. If file X wants to belong to package A and package B then it should actually belong to a package C that is a dependency of both A and B.
  3. You can already have private packages now; you just have to be really careful not to publish them automatically during installation. This needs improvement indeed. Patches welcome :)
  4. I see explicit dependencies as the only sane solution. There is already requires/1 in pack.pl. It needs some work to support version ranges.
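
For reference, a pack.pl along the following lines is roughly what points 2 and 4 amount to. Every value below is a placeholder, and the pack names a and c are hypothetical; only the attribute names come from the pack metadata format.

```prolog
% pack.pl for a hypothetical pack a; all values are placeholders.
name(a).
title('Example pack A').
version('0.1.0').
author('Jane Doe', 'jane@example.org').
download('https://example.org/packs/a-0.1.0.zip').
requires(c).    % hypothetical pack c holds the shared file X
```

Pack b would carry the same requires(c) line, so that file X lives in one place only.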

Imho, the biggest issue with the current pack spec/implementation is that it does not mandate semantic versioning. Some packages explicitly state they use it, some use it but do not point it out. Semantic versioning, if respected by pack maintainers, would make version conflict detection/resolution much easier. The other issue is that SWI-Prolog packs are, well, for SWI-Prolog, and leave other Prologs out in the cold.
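
To illustrate why semantic versioning helps, here is a minimal sketch of a compatibility check. It assumes versions are atoms of the form 'MAJOR.MINOR.PATCH' with purely numeric parts (no pre-release tags), and it treats an available version as compatible when the major number matches and the remainder is not older than what is required. version_numbers/2 and compatible/2 are names invented for this example, not part of library(prolog_pack).

```prolog
:- use_module(library(apply)).

% version_numbers(+Version, -Numbers): '1.4.1' becomes [1,4,1].
version_numbers(Version, Numbers) :-
    atomic_list_concat(Parts, '.', Version),
    maplist(atom_number, Parts, Numbers).

% compatible(+Required, +Available): same major version, and the
% minor/patch part of Available is equal to or newer than Required.
compatible(Required, Available) :-
    version_numbers(Required,  [Major|RestRequired]),
    version_numbers(Available, [Major|RestAvailable]),
    RestAvailable @>= RestRequired.   % standard order compares element-wise
```

With such a rule, compatible('1.2.0', '1.4.1') holds while compatible('1.2.0', '2.0.0') does not. Without an agreed versioning scheme there is no way to make that call from the version atoms alone.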

I wish the pack spec/implementation was kept as simple and magic-free as possible. There is enough of Prolog "AI this" and "AI that" code already. An alternative package manager could be implemented on top of these and be released as a pack :).

License: CC-BY-SA with attribution
Not affiliated with StackOverflow