Thursday, December 9, 2010

Win2008 HPC Server and CUDA TCC revisited

The release of the stable NVIDIA driver 260.83 broke my Windows CUDA programming environment.
With the currently newest driver, 263.06, I gave it another shot. Initially the CUDA SDK sample programs did not recognize the GPU as CUDA capable and only complained vaguely about a mismatch between driver and toolkit (TK).
This time, however, a web search led me to an IBM web page with a solution for their servers running Windows 2008 R2.
I tried it on Win2008 and it works like a charm:

  • Open the registry editor by typing regedit in the Run dialog and navigate to:


[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E968-E325-11CE-BFC1-08002BE10318}\

  • You will find subkeys named 0001, 0002 and so on, depending on the number of GPUs in your system.

  • For each card on which you want to enable CUDA, go to its 000X subkey and add the following registry value (a 32-bit DWORD worked for me):


"AdapterType"=dword:00000002

If you access the system via RDP, read my blog entry on Using nvidia-smi for TCC to see how to set this up!

This information comes from IBM; further reference and more details can be found here: IBM Support Site

Saturday, November 6, 2010

Embedding fonts in LaTeX / XMgrace / PDF toolchain

How to embed fonts in PDF files generated from LaTeX and EPS sources under Linux is nicely described here:
http://www.wkiri.com/today/?p=60

Be sure to disable the field "use device fonts" in the xmgrace printer settings when printing to EPS.

Friday, November 5, 2010

GIT: Distributed Revision System

Distributed Revision Systems have the advantage that, in contrast to CVS or SVN, no central server is necessary. Furthermore, commits and, even more important, diffs against other versions can be made with a local repository only. This is an advantage when working offline, e.g. while traveling or on a plane. A new branch for testing can also be made quickly by simply cloning the repository once more and working in the new directory.

To set up the git client on your Ubuntu Linux, just type aptitude install git-core and you're done.

For windows you need to download 2 packages:

MSysGIT
Tortoise GIT

If you install them in this order, everything should be fine. If Tortoise later complains about not finding git, go to Tortoise's settings and point the git path to the directory where you installed msysgit plus \bin, e.g. c:\Program Files\git\bin

Working together on a project can be done much like with CVS and SVN. In contrast to these centralized systems, however, you do not commit your files directly to the central repository but first to your local repository.

The central repository is then updated by a so-called push.

To get the updates from the central server, you do a pull.

Basic tasks are:

  1. Clone the repository to your local workstation: git clone git@someserver.com:projectname

  2. Add new files: git add xyz

  3. Commit the new files or changes to already added files: git commit xyz

  4. Send changes to server: git push

  5. Get changes from server into the local repository only: git fetch

  6. Get changes from server into the local repository and update checked-out files: git pull
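
The six tasks above can be tried out without any server. The following is a minimal sketch, assuming git is installed; the names git-demo, alice, bob and notes.txt are made up for illustration, and a local bare repository stands in for git@someserver.com:projectname.

```shell
#!/bin/sh
# Sketch of the clone / add / commit / push / fetch / pull cycle.
# Assumptions: git is installed; "git-demo", "alice", "bob" and "notes.txt"
# are made-up names; a local bare repository plays the role of the server.
set -e

demo="$PWD/git-demo"
rm -rf "$demo"
mkdir "$demo"

# The "central" repository (bare: history only, no checked-out files).
git init --quiet --bare "$demo/central.git"

# 1. Clone it to a first workstation (the empty-repository warning is hidden).
git clone --quiet "$demo/central.git" "$demo/alice" 2>/dev/null
git -C "$demo/alice" config user.name "Alice Example"
git -C "$demo/alice" config user.email "alice@example.com"

# 2. and 3. Add a new file and commit it to the local repository.
echo "first line" > "$demo/alice/notes.txt"
git -C "$demo/alice" add notes.txt
git -C "$demo/alice" commit --quiet -m "Add notes"

# 4. Send the commit to the central repository.
git -C "$demo/alice" push --quiet origin HEAD

# A second workstation clones the project, then the first one pushes again.
git clone --quiet "$demo/central.git" "$demo/bob"
echo "second line" >> "$demo/alice/notes.txt"
git -C "$demo/alice" commit --quiet -m "Extend notes" notes.txt
git -C "$demo/alice" push --quiet origin HEAD

# 5. fetch updates bob's repository only; his checked-out file is unchanged.
git -C "$demo/bob" fetch --quiet
wc -l < "$demo/bob/notes.txt"

# 6. pull also updates the checked-out files.
git -C "$demo/bob" pull --quiet
wc -l < "$demo/bob/notes.txt"
```

The two line counts show the difference between steps 5 and 6: after the fetch the checked-out file still has one line, and only after the pull does it have two.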


This is just enough to get started with git; please consult the manuals and documentation for more advanced topics.
Thanks to Thomas for this quick-start information.

Another Distributed Revision System is Mercurial, for which you can find a small tutorial here: MercurialHG

Friday, October 22, 2010

[Winhpcug] Invitation: Windows HPC in Economics and Finance on November 5 in Cologne

On November 5, 2010, a one-day meeting of the user group (10:00 - 17:00) will take place in Cologne,
with a focus on economics and finance.
New technologies such as Windows HPC Server 2008 R2 will also be discussed.
More information can be found in the attached flyer and on
the website, which also offers a way to register.

Subscribe to the WINHPCUG mailing list:
Winhpcug@lists.rwth-aachen.de
https://mailman.rwth-aachen.de/mailman/listinfo/winhpcug

Friday, October 15, 2010

Lectures and seminars of the HPC group

Interested in topics in and around HPC for your studies?


Then have a look at the official homepage.


Find all lectures and seminars here.

Wednesday, October 13, 2010

NVIDIA CUDA TCC Driver Released 260.83

Just today Nvidia released the WHQL-certified Tesla Compute Cluster (TCC) driver 260.83 for use with, e.g., Windows 2008 Server/HPC.
Until now only a beta version was available.
With this special driver you can use GPGPU compute resources via RDP or via Windows HPC batch processing mode.

Download the driver here


/edit:
Installing this driver actually broke my working environment, so be sure to keep a backup of the system. Even reinstalling the beta version did not solve the problem.

Tuesday, October 12, 2010

Win2008 HPC Server and CUDA TCC

Nvidia now provides a beta driver called Tesla Compute Cluster (TCC) for using CUDA GPUs within a Windows cluster environment, not only remotely via RDP but also in batch processing. Until now, HPC Server lacked this ability, as Windows did not start the graphics driver inside the limited batch logon mode.

My first steps with TCC took a little longer than expected.

First of all, it is not possible to run an NVIDIA GPU side by side with an AMD or Intel GPU, as Windows needs to use one unified WDDM display driver, and that comes from either one vendor or the other. This was completely new to me.

After this first minor setback, with the machine re-equipped with only the Tesla C2050, the BIOS did not finish its startup, so be sure your BIOS revision is up to date.
Adding another NVIDIA card was the quick fix on my side.

Next comes the setup. Install the 260 (currently beta) drivers and the exclamation mark in the Device Manager should vanish.
After that, install the toolkit and the SDK if you like.
With the nvidia-smi tool, which you will find in one of the countless NVIDIA folders that now exist, you can check whether the card is initially recognized correctly.
Also set the TCC mode of the Tesla card to enabled if you want remote CUDA capabilities:

nvidia-smi -s --> shows the current status
nvidia-smi -g 0 -c 1 --> enables TCC on GPU 0


Next, test the deviceQuery sample that comes with the SDK.
If it runs and everything looks fine, consider yourself lucky!

Nothing ran on my setup. So first I tried to build the SDK example myself. To do so, first build the CUDA utilities, which live somewhere in the SDK in the folder "common".
Depending on the Nsight or toolkit version you have installed, you may get an error when opening the VS project files. In that case you need to edit the Visual Studio project file with a text editor of your choice and replace the outdated build rule with the one that is actually installed.

  • In the error message, find the folder in which VS cannot find the file.

  • Copy the path and go there with your file browser.

  • Find the file whose name is closest to the one in the VS error message.

  • Once found, open the VS project file and replace the wrong filename there with the correct one.

  • VS should now open the project.



In order to compile, add the correct include and library directories to the VS project.
Finally you can build deviceQuery or any other program.

Still, this setup gave me the same error as the precompiled deviceQuery:
cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.

With the help of Dependency Walker I found out that a missing DLL was the problem, namely:
linkinfo.dll.

You can get it by adding the feature named "Desktop Experience" through the Server Manager.
Once it was installed and the machine rebooted, deviceQuery worked.

Friday, September 3, 2010

TinyGPU offers new hardware



TinyGPU has new hardware: tg010. Its hardware configuration and the currently deployed software differ from the non-Fermi nodes:

  • Ubuntu 10.04 LTS (instead of 8.04 LTS) as OS.
    Note: To use the Intel compiler <= 11.1 locally on tg010, you currently have to load the gcc/3.3.6 module. Otherwise libstdc++.so.5 is missing, as Ubuntu 10.04 no longer ships this version. This is only necessary for compilation; compiled Intel binaries will run as expected.

  • /home/hpc and /home/vault are mounted [only] through NFS  (and natively via GPFS-Cross-Cluster-Mount)

  • Dual-socket system with Intel Westmere X5650 (2.66 GHz) processors, 6 native cores per socket (instead of a dual-socket system with Intel Nehalem X5550 (2.66 GHz), 4 native cores per socket)

  • 48 GB DDR3 RAM (instead of 24 GB DDR3 RAM)

  • 1x NVidia Tesla C2050 (“Fermi” with 3 GB GDDR5 featuring ECC)

  • 1x NVidia GTX 280 (consumer card with 1 GB RAM, formerly known as F22)

  • 2 further PCIe 2.0 16x slots will be equipped with NVidia C2070 cards (“Fermi” with 6 GB GDDR5 featuring ECC) in Q4, instead of 2x NVidia Tesla M1060 (“Tesla” with 4 GB RAM) as in the remaining cluster nodes

  • SuperServer 7046GT-TRF / X8DTG-QF with dual Intel 5520 (Tylersburg) chipset instead of SuperServer 6016GT-TF-TM2 / X8DTG-DF with Intel 5520 (Tylersburg) chipset


To allocate the Fermi node, specify :ppn=24 with your job (instead of :ppn=16) and explicitly submit to the TinyGPU queue fermi. The wallclock limit is set to the default of 24h. The ECC memory status is shown on job startup.
This article is an attempted translation of the original posted here: Zuwachs im TinyGPU-Cluster
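
As an illustration of the allocation described above, here is a rough sketch that writes a job script. It assumes a Torque/PBS-style batch system (which the :ppn= syntax suggests); the file name fermi_job.sh and the job name fermi-test are made up, and the exact submit command on the cluster may differ, so check the cluster documentation before relying on it.

```shell
#!/bin/sh
# Sketch only: assumes a Torque/PBS-style batch system. The file name
# fermi_job.sh and the job name fermi-test are made up for illustration.
cat > fermi_job.sh <<'EOF'
#!/bin/sh
#PBS -N fermi-test
# One node with ppn=24 (the non-Fermi nodes use ppn=16).
#PBS -l nodes=1:ppn=24
# Matches the default wallclock limit of 24h.
#PBS -l walltime=24:00:00
# Explicitly select the queue named fermi.
#PBS -q fermi

# Start in the directory the job was submitted from, then run the real work.
cd "$PBS_O_WORKDIR"
hostname
EOF

echo "wrote fermi_job.sh; submit it with the cluster's qsub command"
```

The #PBS lines are plain comments to the shell, so the generated file can be inspected or dry-run locally before it is handed to the scheduler.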