Question

I am setting up a small, 256-core compute cluster at my university for fluid dynamics simulations. The code we use is written in a mix of C and Fortran and currently runs just fine on a large supercomputer.

Our cluster has 16 compute nodes with 16 AMD CPU cores each. We also have an 8-core Dell box that we would like to use as a "head" or "login" node. This box, however, has an Intel Xeon processor.

We would like to NFS-mount each user's home directory to the login node and restrict their direct access to the compute nodes. This would require users to compile and run their programs via mpirun on the login node. Our questions are:

  1. Is this possible with a mixed CPU system like this? Or would we run into problems with compiling on Intel and executing on AMD?
  2. If this is a problem, is there a workaround? Could we somehow have users transparently compile their code on a compute node while logged in only to the login node?
  3. In a cluster with a head node, should only the home directory be shared via NFS mount? Or are there other directories which we should also share between compute and head node(s)?

If there's a good resource out there that could help, we'd appreciate that, too. We've found so many suggestions and ideas on various pages... It'd be nice to be pointed towards one that the community considers reputable. (Disclaimer... we aren't computer scientists, we are just regular scientists.)


Solution

Intel and AMD processors are by and large binary compatible, though differences in things like cache sizes and instruction scheduling can make a particular code run sub-optimally on AMD if it was compiled with optimisations for Intel, and vice versa. There are some vendor-specific differences in the instruction sets, but those extensions are usually not very useful in scientific computing anyway.

Since (1) is not a problem, no workaround is needed. Still, keep in mind that some compilers enable by default the instruction sets and optimisations of the processor on which the code is being compiled. One therefore has to be extra careful with the compiler options when the head node uses CPUs from a different vendor, or even from the same vendor but of a different generation. This is especially true for Intel's compiler suite; GCC is less aggressive by default. On the other hand, one can usually tell the compiler explicitly which architecture to target and optimise for, e.g. by providing the appropriate -march=... (instruction set) and -mtune=... (instruction scheduling) options to GCC, as sketched below.
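
For illustration, here is a minimal sketch of the idea in shell form. The source files, MPI wrapper compilers and the "barcelona" architecture name are assumptions for the example; substitute the actual CPU model of your compute nodes (your compiler's manual lists the valid -march values).

# Risky on a mixed cluster: -march=native targets the *build* machine,
# i.e. the Intel head node, not the AMD compute nodes.
mpicc -O2 -march=native -o solver solver.c

# Safer: name the compute-node architecture explicitly
# (assuming AMD "Barcelona"-class CPUs here; substitute your own).
mpicc -O2 -march=barcelona -mtune=barcelona -o solver solver.c

# Most conservative: a baseline instruction set both vendors implement.
mpif90 -O2 -march=x86-64 -o solver solver.f90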

As for sharing the file system, it depends on how your data storage is organised. Parallel applications often need to access the same files from all ranks (e.g. configuration files, databases, etc.) and therefore require both the home and the work file systems to be shared (unless the home file system doubles as the working one). You might also want to share things like /opt (or wherever you store cluster-wide software packages) in order to simplify cluster administration.
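
As a concrete sketch (the subnet, hostname and mount options below are placeholders, not a recommendation for your actual layout), the head node could export /home read-write and /opt read-only, and each compute node would mount them:

# /etc/exports on the head node (hypothetical subnet):
/home  192.168.1.0/24(rw,sync,no_subtree_check)
/opt   192.168.1.0/24(ro,sync,no_subtree_check)

# /etc/fstab on each compute node (assuming the head node is named "head01"):
head01:/home  /home  nfs  defaults,hard  0 0
head01:/opt   /opt   nfs  defaults,ro    0 0

After editing /etc/exports, running exportfs -ra on the head node activates the exports.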

It is hard to point you to a definitive source, since there are about as many "best practices" as there are cluster installations around the world. Just stick with a working setup and tune it iteratively until you reach convergence. Installing a batch system such as TORQUE is a good start.
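
Once TORQUE is in place, users submit jobs from the head node rather than invoking mpirun by hand. A minimal job script might look like the following sketch (the job name, queue name, solver binary and input file are made up for the example):

#!/bin/bash
#PBS -N cfd_run                # job name
#PBS -l nodes=4:ppn=16         # request 4 compute nodes, 16 cores each
#PBS -l walltime=02:00:00      # wall-clock time limit
#PBS -q batch                  # hypothetical queue name

cd $PBS_O_WORKDIR              # start in the directory the job was submitted from
# Some MPI installations need the node list passed explicitly, e.g.:
# mpirun -np 64 -machinefile $PBS_NODEFILE ./solver input.cfg
mpirun -np 64 ./solver input.cfg

Users then submit with qsub and TORQUE places the run on the requested compute nodes.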

OTHER TIPS

I had the same question. But come to think of it, heterogeneity is the norm: a GPU is a different processor architecture compared to a CPU. During cross-compilation, the exact target architecture should be specified, and the compiler will then create a binary for exactly that target architecture.

When compiling for a GPU, I have seen compiler flags that specify the target architecture options explicitly.

For example:

# Each -gencode arch=compute_XX,code=sm_XX pair embeds device code
# for one GPU generation in the resulting binary:
/usr/local/cuda/bin/nvcc -ccbin /opt/anaconda3/bin/x86_64-conda_cos6-linux-gnu-gcc \
    -I../../../Common -m64 --std=c++11 \
    -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 \
    -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 \
    -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 \
    -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 \
    -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 \
    -gencode arch=compute_86,code=compute_86 \
    -o deviceQuery.o -c deviceQuery.cpp