Question

More complex questions in this vein have been asked and answered elsewhere, so this one may be too trivial to pose. On a cluster, I have a program compiled on the head node and linked against shared Boost libraries installed there in /usr/local/lib; this Boost version is newer than the one shipped with the cluster's distribution. The program runs fine on the head node, where I have LD_LIBRARY_PATH exported. If I try to run it through the Torque scheduler, the libraries aren't available on the compute node(s), which produces the "error while loading shared libraries ..." message. Aside from rebuilding Boost as a static library and linking the program that way, is there an easy way to make these libraries available to all nodes? Simply exporting the path in the job script, i.e.

export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

does not work, because the path doesn't exist on the compute nodes. This seems like such a stupid question, but what would be the easiest way of making a library that is local to the head node available to all compute nodes?


Solution

The most common approach is to install the library on the compute nodes as well. Sometimes this means installing it on a subset of the nodes, marking those nodes with a feature, and then requiring that feature for any job that needs the library. Other times it means installing the library everywhere so that you don't have to worry about it. These are the two most common approaches to this problem.
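As a rough sketch of the "subset of nodes plus a feature" approach, a Torque job script could look something like the one below. It assumes an administrator has already installed the newer Boost under /usr/local/lib on some compute nodes and tagged them with a node property; the property name `newboost`, the job name, and the program name `my_program` are placeholders I made up for illustration, not anything from your cluster.

```shell
#!/bin/bash
#PBS -N boost_job
# Request 2 nodes, 4 cores each, restricted to nodes carrying the
# (hypothetical) "newboost" property set by the cluster administrator.
#PBS -l nodes=2:ppn=4:newboost

# Assumed install location of the newer Boost on the tagged nodes.
BOOST_LIB_DIR="${BOOST_LIB_DIR:-/usr/local/lib}"

# Prepend a directory to LD_LIBRARY_PATH, but only if it really exists
# on this node, so a misconfigured node fails with a clear message.
prepend_lib_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "library directory $dir not found on ${HOSTNAME:-this node}" >&2
        return 1
    fi
    export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Only launch the program when actually running as a Torque job
# (Torque sets PBS_JOBID and PBS_O_WORKDIR in the job environment).
if [ -n "${PBS_JOBID:-}" ]; then
    cd "$PBS_O_WORKDIR" || exit 1
    prepend_lib_dir "$BOOST_LIB_DIR" || exit 1
    ./my_program   # hypothetical program name
fi
```

You would submit this with qsub as usual. The directory check is there so that a node which was tagged but never got the library fails early with a readable message instead of the loader error.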

If managing library versions on the compute nodes is too hard, then linking the library statically into your program is probably the only option you have.

Licensed under: CC-BY-SA with attribution