to represent coupled 3D flow, heat and mass calculations of compressible gases, complex homogemeous and heterogenous chemical reactions, rotation of the susceptor, deposition on reactor walls and heat transport in the gas as well as efficient numerical methods for threedimensional process simulation of the above parameters.
Development of algorithmic speedup techniques
Usually mathematical models are discreeted - in order to reduce the degrees of freedom - along a regular grid. In the sparse grid model however, grid nodes without effect on the solution need not be computed and are dropped, greatly reducing the complexity of the model. Furthermore, adaptive methods, i.e. adapting the solver algorithm to the momentary residuum during the computing cycles, can lead to an earlier stable equilibrium of the algorithm.
Parallelization of the simulation program
Applying parallellism in order to reduce the computing and memory demands, enabling faster simulation of large problem domains.
As the most resource consuming part of the overall simulation the solver of the simulation program FASTEST 3D has been choosen to be re-implemented in preparation of algorithmic and parallel speedup techniques.
A sparse grids based Solver for Navier-Stokes equations has been developed and successfully tested on two- and threedimensional Problems. Hand in hand with the implementation a parallelization concept was developed.
Following the above concept the sparse grid solver is being parallelized for support of a wide range of hardware platforms.
Project Progress
Sparse Grids are a numerically very efficient way to solve partial differential equations (PDE). They need significantly less grid points for the computation of a given function than other common methods leading to regular full grids.
Sparse Grids are a subset of full grid points. If we divide the support space into individual subspaces it becomes obvious that all subspaces on the same diogonal possess the same nummber of grid points and -- as conceivable in a short proof ([griebel91,zenger90]) -- the same contribution to the overall error of the approximation. If we combine the highest contributing subspaces and neglect the subspaces with low bias on the overall error, we receive the prototypical Sparse Grid. By skipping low error subspaces Sparse Grids achieve same order approximation quality with substantially less points than regualr full grids.
If the target function satisfies certain conditions of smoothness - which are often present in the numerical practice - the approximation error increases only slightly, so Sparse Grids prove an effective way to decrease the computing and memory demands of PDE solvers.
The code differs in a variety of points from the usual numerical packages being parallelized and presented on conferences and in literature. First the code does not consist of an application but of a library of Sparse Grid based operators, offered as a toolkit to build and implement methods to solve common numerical problems like Burger, Poisson, Navier-Stokes or the shown Conjugate Gradient iteration ([vdv92]). The operators are based on the method of Finite Differences as Finite Elements on Sparse Grids provided unsuitable for convective terms ([schiekofer98]). This enables a modular approach for adapting and implementing both old and new methods based on Finite Differences for use with Sparse Grids. Moreover the modular design allows the use of parallel operators without knowledge about the inherent parallelism by simply using the parallelized operators instead of the sequential ones ([griebel97]).
cg_iteration
BEGIN
LAPLACE_FINITE_DIFFERENZEN() // Ax
CG_ADD() // d := b - Ax
COPY_HASH_VALUES() // g := d
SKALTMULT_HASH(-1.0) // g := -g
WHILE (r*r < epsilon)
LAPLACE_FINITE_DIFFERENZEN() // A*d
nenner := CG_SKALARPRODUKT() // d*Ad
zaehler := CG_SKALARPRODUKT() // r - r*r
alpha := zaehler / nenner
CG_ADD() // x := x - alpha_id_i
nenner := CG_SKALARPRODUKT() // r*r
CG_ADD() // r := r + alpha Ad
zaehler := CG_SKALARPRODUKT() // r + q*r + 1
beta := zaehler / nenner
CG_ADD() // d := d + beta
i := i+1
END
END
Second the code is not a Fortran legacy code that was developed and maintained over several generations of software engineers and reanimated for another life cycle on a parallel plattform. The SPL is well programmed and documented in the programming language C and is based on dynamic data structures reflecting the recursive structure of Sparse Grids and contrasting to the usual array based structures of Fortran code. Hence the program structure of the Sparse Grids operators is not iterative but recursive.
The usual data structures used for Sparse Grids are systems of binary trees. The algorithms based on these data structures are hence coined by their recursive structure. The Sparse Grid operators pass through the grid in depth-first or breadth-first order and apply their functionality -- an operation local to every grid point, i.e. hierarchization or differentiation -- to the grid as a side effect of the procedure. This preorder (see code example for a twodimensional case \footnote[the operator dispatches two recursive calls for the left and right son BEFORE applying the local operation]) or postorder dependency of course much affects the parallelization approach.
PROC preorder_wrapper(bintree node)
BEGIN
IF is_leaf(node)
THEN return
FI
preorder_wrapper(node.leftson)
preorder_wrapper(node.rightson)
local_node_operation(node)
END
Multithreading in the field of scientific computing
As elegant and compact recursive algorithms may be, they are however not well suited for parallelization with the usual approaches. The partitioning of instructions or data for execution on message coupled massively parallel processors (MPP) or networks of workstations (NOW) is not feasible because of the strong dependencies of data and program structures. And as the operators posess hardly any iterative code there is also no potential for vectorization. Hence a parallelization by means of message passing for distributed memory systems is out of question as the synchronisation effort. i.e. the message size and frequency, would be too high.
PROC preorder_wrapper(bintree node)
BEGIN
IF is_leaf(node)
THEN return
FI
local_node_operation(node)
IF (node.depth < MaxDepth)
THEN
lock(LIFO)
insert(LIFO,node)
unlock(LIFO)
return
ELSE
preorder_wrapper(node.leftson)
preorder_wrapper(node.rightson)
FI
END
We chose to handle the recursive data structures and operators by using shared memory as communication and synchronisation mechanism and the POSIX Pthread API as a plattform independent interface to it. The only (visible) problem remained to partition the work load on the threads. An apparent approach is to have every recursive call handled by an individual thread. As this leads to the creation of a high number of active threads when the corresponding Sparse Grid's binary tree is deep, this idea can be further optimized by using a workpile model to manage the active threads. A master thread dispatches recursive calls to (static) idle worker threads by means of an exclusive LIFO (preorder) or FIFO (postorder) queue. The preorder wrapper procedure keeps putting recursive call for node
into the mutex-protected queue LIFO until a maximal depth of MaxDepth is reached. Recursive calls deeper then MaxDepth are handled by the calling thread itself as the small amount of work in the remaining subtree doesn't justify a referral to another thread. The worker threads constantly try to extract a recursive call from the LIFO queue for execution.
PROC worker
BEGIN
WHILE (TRUE)
lock(LIFO)
extract(LIFO,node)
unlock(LIFO)
preorder_wrapper(node)
END
END
Publications
Michael May and Thomas Schiekofer,
An abstract Data Type for parallel Simulations based on Sparse Grids.
In Third European PVM Users' Group Meeting, EuroPVM 96,
München, Germany, October 1996, Springer Verlag, Berlin/Heidelberg, Lecture Notes in Computer Science, 1997.
Michael May,
Parallel distributed Representation of Sparse Grids using Process Arrays.
In Proceedings of the Workshop on Applied Parallel computing in Industrial Problems and Optimization, Lyngby, Denmark, August 1996,
Springer Verlag, Berlin/Heidelberg, Lecture Notes in Computer Science, 1997.
M.Dauelsberg, F.Durst, L.Kadinski, Y.Makarov, G.Strauch, H.Jürgensen, T.Schiekofer, M.Griebel, A.Bode, P.Luksch, M.May,
PAR-CVD: Entwicklung leistungsfähiger paralleler Berechnungsverfahren zur Untersuchung und Optimierung von CVD-Prozessen
In Proceedings of BMWF Symposium HPSC97, München 1997.
Schiekofer Thomas
Griebel Michael.
An adaptive sparse grid navier-stokes solver in 3d based on the finite
difference method.
In Proceedings ENUMATH97, 1997.
Zumbusch Gerd
Schiekofer Thomas.
Software concepts of a sparse grid finite difference code.
In G. Wittum W. Hackbusch, editor, Proceedings of the 14th GAMM-Seminar
Kiel on Concepts of Numerical Software, Notes on Numerical Fluid
Mechanics. Vieweg, 1998.