Tuesday, October 21, 2008

Distributed Revision System Mercurial

Converting CVS to HG

To get hands-on experience with a distributed revision system like Mercurial,
just export one of your CVS repositories to a test HG repository. As with any repository migration, the history should stay intact (and hopefully will)!

A more complete guide can be found here:

Create the repository folder and change into it:
mkdir -p /path/to/hg/repo
cd /path/to/hg/repo

Generate the config file:

tailor -v --source-kind cvs --target-kind hg --repository /path/to/CVS/REP --module YourModuleName -r INITIAL >Config.tailor

For SSH access to the repository, change /path/to/CVS/REP to :

Adapt the config file to your needs:
vi Config.tailor

At a minimum, you will need to change subdir from . to your module name (MODULENAME), and remove the trailing /MODULENAME from root-directory in the Config.tailor file (if it is really there).

Add the line:

patch-name-format =
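
The subdir and root-directory changes can also be scripted. The following is only a minimal sketch: the function name fix_tailor_config is made up for illustration, and it assumes that the generated file contains plain "key = value" lines such as "subdir = ." and that the module name contains no characters that are special to sed. Check the result by hand before running tailor.

```shell
# Hypothetical helper (not part of tailor): apply the two edits described above.
# Assumes "key = value" lines and a module name without sed metacharacters.
fix_tailor_config() {
    config_file=$1
    module=$2
    # 1) subdir: "." becomes the module name
    # 2) root-directory: strip a trailing "/module", if it is really there
    sed -e "s|^subdir *= *\.\$|subdir = ${module}|" \
        -e "/^root-directory *=/ s|/${module}\$||" \
        "$config_file" > "$config_file.new" &&
        mv "$config_file.new" "$config_file"
}

# Usage (file and module names as in the commands above):
# fix_tailor_config Config.tailor YourModuleName
```

The sketch writes to a temporary file and renames it instead of using sed -i, since in-place editing behaves differently between sed implementations.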

Generate the Mercurial project:

tailor --configfile Config.tailor

Cloning repositories with ssh

The repository can easily be cloned over SSH.
Just type hg clone ssh://yourlogin@yourhost/
or enter ssh://yourlogin@yourhost// in your client program as the source path.
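
Note the difference between one and two slashes after the host name: with a single slash, Mercurial interprets the path relative to the remote user's home directory; with a double slash, it is an absolute path. A small sketch of this, where the helper hg_ssh_url and the example paths are made up for illustration (login, host, and paths are placeholders, not real locations):

```shell
# Hypothetical helper: build an ssh:// URL for hg clone.
# A repo path starting with "/" yields "//" after the host (absolute path);
# any other path is taken relative to the remote user's home directory.
hg_ssh_url() {
    login=$1
    host=$2
    repo_path=$3
    printf 'ssh://%s@%s/%s\n' "$login" "$host" "$repo_path"
}

# Usage (placeholder paths):
# hg clone "$(hg_ssh_url yourlogin yourhost projects/repo)"   # relative to home
# hg clone "$(hg_ssh_url yourlogin yourhost /srv/hg/repo)"    # absolute path
```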

Thursday, October 16, 2008

Co-array Fortran and UPC

CAF and UPC are Fortran and C extensions for the Partitioned Global Address Space (PGAS) model.
Regardless of hardware restrictions, each processor can access (read and write) data on other processors without additional communication libraries such as MPI.

HLRS offered an introductory course on this topic.
At the current stage of development, I do not see a clear benefit for production codes. For testing purposes, however, some ideas might be implemented more quickly with these paradigms than with plain MPI.

Monday, October 13, 2008


  • Johannes Habich: Performance Evaluation of Numeric Compute Kernels on NVIDIA GPUs, Master's Thesis, RRZE-Erlangen, LSS-Erlangen, 2008.

  • Johannes Habich: Improving computational efficiency of Lattice Boltzmann methods on complex geometries, Bachelor's Thesis, RRZE-Erlangen, LSS-Erlangen, 2006.

Other publications (not fully reviewed)

  • G. Hager, J. Treibig, J. Habich, and G. Wellein: Exploring performance and power properties of modern multicore chips via simple machine models. Submitted. Preprint: arXiv:1208.2908

  • J. Habich, C. Feichtinger, G. Wellein: GPGPU implementation of the LBM: Architectural Requirements and Performance Result,
    Parallel CFD Conference 2011, BSC, Barcelona, Spain, May 2011.

  • G. Wellein, J. Habich, G. Hager, T. Zeiser: Node-level performance of the lattice Boltzmann method on recent multicore CPUs,
    Parallel CFD Conference 2011, BSC, Barcelona, Spain, May 2011.

  • C. Feichtinger, J. Habich, H. Köstler, U. Rüde, G. Wellein: WaLBerla: Heterogeneous Simulation of Particulate Flows on GPU Clusters,
    Parallel CFD Conference 2011, BSC, Barcelona, Spain, May 2011.

  • J. Habich, C. Feichtinger, G. Hager, G. Wellein: Poster: Parallelizing Lattice Boltzmann Simulations on Heterogeneous GPU&CPU Clusters. 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing '10, New Orleans, November 13-19, 2010), 2010.

  • J. Habich, T. Zeiser, G. Hager, G. Wellein: Enabling temporal blocking for a lattice Boltzmann flow solver through multicore aware wavefront parallelization. Parallel CFD Conference 2009, NASA Ames, Moffett Field (CA, USA), May 2009.

  • S. Donath, T. Zeiser, G. Hager, J. Habich, G. Wellein: Optimizing performance of the lattice Boltzmann method for complex geometries on cache-based architectures, (In: F. Hülsemann, M. Kowarschik, U. Rüde (editors), Frontiers in Simulation -- Simulationstechnique, 18th Symposium in Erlangen, September 2005 (ASIM)), SCS Publishing, Fortschritte in der Simulationstechnik, ISBN 3-936150-41-9, (2005) 728-735.

Given or co-authored talks and presentations (see also section on lectures below)

  • J. Habich, C. Feichtinger, G. Wellein, waLBerla: MPI parallele Implementierung eines LBM Lösers auf dem Tsubame 2.0 GPU Cluster, Seminar Talk, Leibniz Rechenzentrum, München, Germany, Feb. 29th 2012.

  • J. Habich, C. Feichtinger, G. Wellein, Hochskalierbarer Lattice Boltzmann Löser für GPGPU Cluster, High Performance Computing Workshop, Leogang, Austria, Feb. 27th 2012.

  • G. Wellein, J. Habich, G. Hager, T. Zeiser, Node-level performance of the lattice Boltzmann method on recent multicore CPUs I,
    Parallel CFD Conference 2011, Barcelona, Spain, May 2011.

  • G. Wellein, J. Habich, G. Hager, T. Zeiser, Node-level performance of the lattice Boltzmann method on recent multicore CPUs II,
    Parallel CFD Conference 2011, Barcelona, Spain, May 2011.

  • J. Habich, C. Feichtinger, G. Wellein, GPGPU implementation of the LBM: Architectural Requirements and Performance Result,
    Parallel CFD Conference 2011, Barcelona, Spain, May 2011.

  • C. Feichtinger, J. Habich, H. Köstler, U. Rüde, G. Wellein, WaLBerla: Heterogeneous Simulation of Particulate Flows on GPU Clusters,
    Parallel CFD Conference 2011, Barcelona, Spain, May 2011.

  • J. Habich, Ch. Feichtinger and G. Wellein, GPU optimizations at RRZE,
    invited talk, ZISC GPU Workshop, Erlangen, Germany, April 2011.

  • G. Wellein, G. Hager and J. Habich, The Lattice Boltzmann Method: Basic Performance Characteristics and Performance Modeling,
    invited minisymposium talk, SIAM CSE 2011, Reno, Nevada, USA, March 2011.

  • J. Habich and Ch. Feichtinger, Performance Optimizations for Heterogeneous and Hybrid 3D Lattice Boltzmann Simulations on Highly Parallel On-Chip Architectures,
    invited minisymposium talk, SIAM CSE 2011, Reno, Nevada, USA, March 2011.

  • J. Habich, Ch. Feichtinger, T. Zeiser, G. Wellein, Optimizations on Highly Parallel On-Chip Architectures: GPUs vs. Multi-Core CPUs (for stencil codes),
    iRMB TU-Braunschweig, invited seminar talk, Braunschweig, Germany, July 2010.

  • J. Habich, Ch. Feichtinger, T. Zeiser, G. Hager, G. Wellein, Performance Modeling and Optimization for 3D Lattice Boltzmann Simulations on Highly Parallel On-Chip Architectures: GPUs Vs. Multi-Core CPUs,
    ECCOMAS CFD Lisboa, Lisbon, Portugal, June 2010.

  • J. Habich, T. Zeiser, G. Hager, G. Wellein, Performance Modeling and Multicore-aware Optimization for 3D Parallel Lattice Boltzmann Simulations,
    Facing the Multicore-Challenge, Heidelberger Akademie der Wissenschaften, Heidelberg, Germany, March 2010.

  • J. Habich, T. Zeiser, G. Hager, G. Wellein: Performance Evaluation of Numerical Compute Kernels on GPUs,
    First International Workshop on Computational Engineering - Special Topic Fluid-Structure Interaction, Herrsching am Ammersee, Germany, October, 2009.

  • J. Habich, T. Zeiser, G. Hager, G. Wellein: Towards multicore-aware wavefront parallelization of a lattice Boltzmann flow solver,
    5th Erlangen High-End-Computing Symposium, Erlangen, Germany, June 2009.

  • J. Habich, T. Zeiser, G. Hager, G. Wellein: Enabling temporal blocking for a lattice Boltzmann flow solver through multicore-aware wavefront parallelization, submitted to Parallel CFD Conference,
    Moffett Field, California, USA, May 18-22, 2009.

  • J. Habich, T. Zeiser, G. Hager, G. Wellein: Speeding up a Lattice Boltzmann Kernel on nVIDIA GPUs,
    First International Conference on Parallel, Distributed and Grid Computing for Engineering (PARENG09-S01), Pecs, Hungary, April 2009.

  • J. Habich, G. Hager: Erfahrungsbericht Windows HPC in Erlangen,
    WindowsHPC User Group 2nd Meeting, Dresden, Germany, March 2009.

  • J. Habich, G. Hager: Windows CCS im Produktionsbetrieb und erste Erfahrungen mit HPC Server 2008,
    WindowsHPC User Group 1st Meeting, Aachen, Germany, April 2008.

  • T. Zeiser, J. Habich, G. Hager, G. Wellein: Vector computers in a world of commodity clusters, massively parallel systems and many-core many-threaded CPUs: recent experience based on advanced lattice Boltzmann flow solvers,
    HLRS Results and Review Workshop, Stuttgart, Germany, September 2008.

  • S. Donath, T. Zeiser, G. Hager, J. Habich, G. Wellein: On cache-optimized implementations of the lattice Boltzmann method on complex geometries,
    ASIM, Erlangen, Germany, September 2005.

Conference, workshop and tutorial participation without own presentation

  • WindowsHPC User Group 3rd Meeting, St. Augustin, March 2010.

  • WindowsHPC User Group 2nd Meeting, Dresden, March 2009.

  • Introduction to Unified Parallel C (UPC) and Co-array Fortran (CAF), HLRS, October 2008.

  • Course on Microfluidics, University of Erlangen-Nuremberg, Computer Science 10 (System Simulation), October 2008.

  • IBM Power6 Programming Workshop at RZG, September 2008.

  • PRACE Petascale Summer School (P2S2), Stockholm, Sweden, August, 2008.