
Achieving high performance

In our master/slave model, the master is responsible only for I/O operations and for communicating with the slave processors; it does not do the tracking for most of the elements. As a result, we recommend using 10 or more processors when running simulations with Pelegant. To run simulations efficiently, we also suggest that, where possible, the user arrange all serial elements in one contiguous sequence, which minimizes the communication overhead of gathering and scattering particles. This will become unnecessary once all of the elements are parallelized.
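To illustrate why grouping serial elements helps, here is a minimal Python sketch that counts how often tracking would switch between parallel and serial elements in one pass. It assumes that each switch costs one gather or scatter of the particles, which is a simplification and not a description of Pelegant's internals. The function name, the element names, and the parallel/serial flags are hypothetical.

```python
from itertools import groupby

def count_mode_switches(beamline):
    """Count boundaries between runs of parallel and serial elements.

    beamline is a list of (element_name, is_parallel) tuples in tracking order.
    Assumption: each boundary corresponds to one gather or scatter of particles.
    """
    # Collapse consecutive elements with the same flag into a single run.
    runs = [flag for flag, _ in groupby(is_par for _, is_par in beamline)]
    return max(len(runs) - 1, 0)

# Hypothetical beamlines: the same five elements in two different orders.
interleaved = [("Q1", True), ("S1", False), ("Q2", True), ("S2", False), ("Q3", True)]
grouped = [("Q1", True), ("Q2", True), ("Q3", True), ("S1", False), ("S2", False)]

print(count_mode_switches(interleaved))  # 4 switches
print(count_mode_switches(grouped))      # 1 switch
```

Reordering elements is only an option when it does not change the physics of the lattice, which is why the suggestion above applies only where possible.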

By default, Pelegant is built to do load balancing after each pass through the accelerator. This is particularly important when the user does not have exclusive use of the nodes. In an environment where only one user is allowed to run a job on a compute node at a time, Pelegant can be optimized by defining the compiler flag CHECKFLAGS=1 in Makefile.OAG. In this case, the load balance is checked only after the first turn or when the number of particles changes, instead of after every turn.
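The two checking policies described above can be compared with a small Python sketch. It models only the decision of when a check happens, as stated in the text; it is not taken from the Pelegant source, and the function name, its parameters, and the particle counts are hypothetical.

```python
def should_check_load_balance(turn, n_particles, prev_n_particles, exclusive_nodes):
    """Return True if the load balance would be checked after this turn.

    turn is 1-based. exclusive_nodes=True models a build with CHECKFLAGS=1;
    exclusive_nodes=False models the default build.
    """
    if not exclusive_nodes:
        return True  # default: check after every pass
    # Optimized: check after the first turn or when the particle count changes.
    return turn == 1 or n_particles != prev_n_particles

# Hypothetical run: five turns, with some particles lost on turn 3.
counts = [1000, 1000, 990, 990, 990]

def checked_turns(exclusive_nodes):
    turns, prev = [], counts[0]
    for turn, n in enumerate(counts, start=1):
        if should_check_load_balance(turn, n, prev, exclusive_nodes):
            turns.append(turn)
        prev = n
    return turns

print(checked_turns(exclusive_nodes=False))  # [1, 2, 3, 4, 5]
print(checked_turns(exclusive_nodes=True))   # [1, 3]
```

On shared nodes the per-turn check lets the work distribution follow changes in the load from other users, which is why it remains the default.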

For ANL users who need to run simulations that would normally take several weeks or months with serial elegant, we can help perform runs on the Jazz cluster (350 nodes, each with a 2.4 GHz Pentium Xeon) or the BlueGene/L supercomputer (1024 nodes, each with dual 700 MHz PowerPC 440 processors and 512 MB of memory) at ANL. Pelegant is pre-built and available on both systems.


Yusong Wang 2007-04-03