|
|
Master Seminar SS10
|
| By email to Prof. Dr. Michael Gerndt (gerndt@in.tum.de). Please select one of the topics. |
|
Topics |
Student | Advisor | Date |
|
Molecular Dynamics Simulation and Charm++ parallel programming language
Charm++ is a machine-independent parallel programming system. Programs written using this system will run unchanged on MIMD machines with or without shared memory. It provides high-level mechanisms and strategies to facilitate the development of even highly complex parallel applications. Molecular dynamics (MD) simulation is a form of computer simulation in which atoms and molecules are allowed to interact for a period of time under approximations of known physics, giving a view of the motion of the particles. Many parallel methods have been applied to such applications, which require substantial computing resources. NAMD is a state-of-the-art parallel molecular dynamics code, based on the Charm++ parallel language, designed for high-performance simulation of large bio-molecular systems. Currently, NAMD scales to hundreds of processors on high-end parallel platforms and tens of processors on commodity clusters using Gigabit Ethernet. (more information) |
Haowei Huang | ||
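To make the MD description above concrete, here is a minimal sketch of one common integration scheme (velocity Verlet) for two particles in one dimension. It is not how NAMD or Charm++ implement it. The Lennard-Jones pair potential, the reduced units, and all function names and parameter values are assumptions chosen for illustration only.

```python
# Minimal velocity-Verlet sketch: two particles in 1D interacting through a
# Lennard-Jones pair potential. Reduced units; every value is an assumption.

def lj_force(r, epsilon=1.0, sigma=1.0):
    """Pair force magnitude at distance r > 0 (positive means repulsive)."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r

def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones potential energy at distance r > 0."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def forces(x):
    """Forces on both particles; equal and opposite (Newton's third law)."""
    r = x[1] - x[0]
    f = lj_force(abs(r))
    sign = 1.0 if r > 0 else -1.0
    return [-sign * f, sign * f]

def simulate(x, v, mass=1.0, dt=0.001, steps=1000):
    """Advance positions x and velocities v by `steps` velocity-Verlet steps."""
    x, v = list(x), list(v)
    f = forces(x)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]  # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]               # drift
        f = forces(x)                                            # new forces
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]  # half kick
    return x, v

def total_energy(x, v, mass=1.0):
    """Kinetic plus potential energy of the two-particle system."""
    kinetic = sum(0.5 * mass * vi * vi for vi in v)
    return kinetic + lj_energy(abs(x[1] - x[0]))

if __name__ == "__main__":
    x0, v0 = [0.0, 1.5], [0.0, 0.0]
    x1, v1 = simulate(x0, v0)
    print("energy before:", total_energy(x0, v0))
    print("energy after: ", total_energy(x1, v1))
```

In a real MD code the force evaluation over many particles dominates the run time, which is the part that parallel methods distribute across processors.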
| High performance computing using accelerators
In the last few years, computational accelerators have emerged and gained a firm foothold. Accelerators are computing components containing functional units, together with memory and control systems, that can easily be added to computers to speed up portions of an application. They can also be aggregated into groups to accelerate large problem sizes. Accelerators are not a new phenomenon: in the 1980s, for instance, Floating Point Systems sold attached processors like the AP120-B with a peak performance of 12 Mflop/s, easily 10 times faster than the general-purpose systems they were connected to. Accelerators come in various types, such as General-Purpose Graphics Processing Units (GPGPUs), Field Programmable Gate Arrays (FPGAs), and ClearSpeed's floating-point accelerators. A GPU is a dedicated processor for rendering graphics. There are two dominant producers of high-performance GPU chips: NVIDIA and ATI. Most GPU programs are written in a shading language such as OpenGL's GLSL (Linux, Windows) or HLSL (Windows). FPGAs have a long history in embedded processing and specialized computing. These areas include DSP, ASIC prototyping, medical imaging, and other specialized compute-intensive areas. An important differentiator between FPGAs and other accelerators is that they are programmable. The dominant FPGA chip vendors are Xilinx and Altera. ClearSpeed Technology produces a board that is designed to accelerate floating-point calculations. This board plugs into a PCI-X slot, runs at a clock frequency of 500 MHz, and contains 96 floating-point functional units that can each perform a double-precision multiply-add in one cycle. |
Shaveta Tatwani | ||
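The ClearSpeed figures quoted above imply a theoretical peak rate, which the following short sketch works out. It takes the numbers in the text at face value and assumes the usual convention of counting a multiply-add as two floating-point operations. The function name is hypothetical, and sustained performance on real applications would be lower than this peak.

```python
def peak_flops(units, clock_hz, flops_per_unit_per_cycle):
    """Theoretical peak rate: every unit is busy in every cycle."""
    return units * clock_hz * flops_per_unit_per_cycle

# Figures from the description above: 96 units at 500 MHz, one multiply-add
# per cycle each. Counting a multiply-add as 2 flops is an assumed convention.
clearspeed_peak = peak_flops(units=96, clock_hz=500e6, flops_per_unit_per_cycle=2)

if __name__ == "__main__":
    print(clearspeed_peak / 1e9, "Gflop/s")  # prints: 96.0 Gflop/s
```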
|
Performance Analysis of Petascale Applications with the HPCToolkit
Cutting-edge science and engineering applications require petascale computing. It is, however, a significant challenge to use petascale computing platforms effectively. Consequently, there is a critical need for performance tools that enable scientists to understand impediments to performance on emerging petascale systems. In this paper, we describe HPCToolkit, a suite of multi-platform tools that supports sampling-based analysis of application performance on emerging petascale platforms. HPCToolkit uses sampling to pinpoint and quantify both scaling and node performance bottlenecks. We study several emerging petascale applications on the Cray XT and IBM BlueGene/P platforms and use HPCToolkit to identify specific source lines, in their full calling context, associated with performance bottlenecks in these codes. Such information is exactly what application developers need to know to improve their applications and take full advantage of the power of petascale systems. |
Michael Gerndt | ||
| Dynamic instrumentation infrastructure for Grid/Cloud services
Grid/Cloud computing expounds the vision of applications having on-demand, ubiquitous access to distributed services running on diverse, managed resources such as computation, storage, instruments, and networks, which are owned by multiple administrators. Many Grid workflow middleware services require knowledge about the performance behavior of Grid applications and services in order to effectively select, compose, and execute workflows in dynamic and complex Grid systems. Moreover, Grid workflows introduce multiple levels of abstraction, and all levels must be taken into account in order to understand the performance behavior of a workflow. Hence, any instrumentation infrastructure for Grid workflows should assist the user or tool in conducting the monitoring and analysis in a specific way. As an outcome of this seminar, the student should understand how to approach performance analysis for Grid/Cloud applications.
|
Shajulin Benedict | ||
|
Partitioned Global Address Space Programming Model
An overview of the concepts, the languages that implement it, its advantages and disadvantages compared to MPI/OpenMP, and tricks and pitfalls in getting good performance.
|
Falco Cescolini | Ventsislav Petkov | |
| Performance tuning techniques for CUDA
CUDA is a parallel computing architecture developed by NVIDIA for its GPUs that allows access to the native instruction set and memory of the parallel computational elements in the GPUs. GPUs are parallel many-core processing units capable of running thousands of threads simultaneously. GPUs offer a high performance gain for applications that are suited to that architecture. Because GPUs are intrinsically different from CPUs, development techniques should differ between the two architectures. Briefly outline the differences between the two architectures, emphasizing the advantages and limitations of CUDA. List performance tuning techniques for CUDA, with a supporting explanation of each. Performance techniques should constitute at least 60% of your report/presentation. |
Claudia Simion | Houssam Haitof | |
|
Performance analysis on GPGPU based Architectures
The use of modern graphics processing units (GPUs) for non-graphical high performance computing has recently become more and more popular. The trend is motivated by their high computational speed and attractive cost/performance ratio. Although modern GPUs permit high throughput and exploit more parallelism, achieving appropriate application performance is a complicated task. The performance analysis procedure itself is also more difficult. This seminar topic will provide insight into the performance analysis of general-purpose applications running on GPUs (GPGPU), the performance analysis tools available, and important optimization issues. |
Shulei Zhu | Yury Oleynik | |
| Programming Models for Scalable Multicore Architectures
To meet the need for ever more computational power, the underlying hardware is becoming more and more parallel. The number of computational cores per chip is growing fast, and the use of heterogeneous systems is soaring. This also affects High Performance Computing, where parallelization on clusters and supercomputers has already been in heavy use for many years. |
Marcel Meyer | ||
| HPC - made for applications: splitting the productivity and efficiency layers
"Writing programs that scale with increasing numbers of cores should be as easy as writing programs for sequential computers." This is the challenge that industry now poses to academia as multicore architectures and HPC enter the market. The need for more computing power is clearly visible in existing large applications. The question now is whether researchers can meet this requirement by providing suitable programming models and frameworks along with the new architectures. Projects like ParLab and ACES III have already taken first steps towards a solution. Whether reviving the Programming Patterns paradigm or sticking to high-level languages, the key to the problem seems to lie in splitting the productivity and efficiency layers. Whether this is possible, how it could be realized, and what advances have been made so far are to be researched and discussed in this topic. |
Anca Berariu |
More information: gerndt@in.tum.de.