K. McManus, C. Walshaw, S. Johnson, M. Cross
Unstructured mesh codes for modelling continuum physics phenomena have evolved to provide the facility to model complex interacting systems. Parallelisation of such codes using single Program Multi Data (SPMD) domain decomposition techniques implemented with message passing has been demonstrated to provide high parallel effciency, scalability to large numbers of processors P and portability across a wide range of parallel platforms. High effciency, especially for large P requires that load balance is achieved in each parallel loop. For a code in which loops span a variety of mesh entity types, for example, elements, faces and vertices, some compromise is required between load balance for each entity type and the quantity of inter-processor communication required to satisfy data dependence between processors.