Friday, October 22, 2010

[Winhpcug] Invitation: Windows HPC in Economics and Finance on November 5 in Cologne

On November 5, 2010, a one-day meeting (10:00 to 17:00) of the user group
will take place in Cologne, with a focus on economics and finance. The agenda
also includes a discussion of new technologies such as Windows HPC
Server 2008 R2. More information is available in the attached flyer and on
the website, which also offers a way to register.

Subscribe to the WINHPCUG mailing list:
Winhpcug@lists.rwth-aachen.de
https://mailman.rwth-aachen.de/mailman/listinfo/winhpcug

Friday, October 15, 2010

Lectures and seminars of the HPC group

Interested in topics in and around HPC for your studies?


Then have a look at the official homepage.


Find all lectures and seminars here.

Wednesday, October 13, 2010

NVIDIA CUDA TCC Driver 260.83 Released

Just today NVIDIA released the WHQL-certified Tesla Compute Cluster (TCC) driver 260.83 for use on, e.g., Windows Server 2008 / HPC Server.
Until now only a beta version was available.
With this special driver you can use GPGPU compute resources via RDP or in Windows HPC batch processing mode.

Download the driver here


/edit:
Actually, installing this driver broke my working environment, so be sure to keep a backup of the system. Even reinstalling the beta version did not solve the problem.

Tuesday, October 12, 2010

Win2008 HPC Server and CUDA TCC

NVIDIA now provides a beta driver called Tesla Compute Cluster (TCC) that makes CUDA GPUs usable within a Windows cluster environment, not only remotely via RDP but also in batch processing. Until now, HPC Server lacked this ability, because Windows did not start the graphics driver inside the limited batch logon mode.

My first steps with TCC took a little longer than expected.

First of all, it is not possible to run an NVIDIA GPU side by side with an AMD or Intel GPU, because Windows needs to use one unified WDDM display driver, and that comes from either one vendor or the other. This was completely new to me.

After this first minor setback, and re-equipped with only the Tesla C2050, the machine did not get past the BIOS, so make sure your BIOS revision is up to date.
Using another NVIDIA card was the quick fix on my side.

Next comes the setup. Install the 260 (currently beta) drivers, and the exclamation mark in the Device Manager should vanish.
After that, install the toolkit and the SDK if you like.
With the nvidia-smi tool, which you will find in one of the countless NVIDIA folders that now exist, you can check whether the card is recognized correctly.
Also set the TCC mode of the Tesla card to enabled if you want remote CUDA capabilities:

nvidia-smi -s --> shows the current status
nvidia-smi -g 0 -c 1 --> enables TCC on GPU 0


Next, test the deviceQuery sample that comes with the SDK.
If it runs and everything looks fine, consider yourself lucky!
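
As a quick cross-check that does not depend on the SDK samples, the driver itself can be asked for the device count. The following is only a minimal sketch in Python, assuming the CUDA driver library is installed under its usual name (nvcuda.dll on Windows, libcuda.so.1 on Linux); it calls the driver API functions cuInit and cuDeviceGetCount through ctypes. The helper name cuda_device_count is made up for this illustration.

```python
import ctypes
import sys


def cuda_device_count():
    """Return the number of CUDA devices, or None if the driver is unusable."""
    # Assumption: the driver library uses its usual name on this platform.
    if sys.platform.startswith("win"):
        # The driver API uses the stdcall convention on Windows.
        loader, name = ctypes.WinDLL, "nvcuda.dll"
    else:
        loader, name = ctypes.CDLL, "libcuda.so.1"
    try:
        driver = loader(name)
    except OSError:
        return None  # driver library missing or not loadable

    # cuInit(0) has to succeed before any other driver API call.
    if driver.cuInit(0) != 0:
        return None

    count = ctypes.c_int(0)
    if driver.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    return count.value


if __name__ == "__main__":
    n = cuda_device_count()
    print("CUDA driver not usable" if n is None else "CUDA devices: %d" % n)
```

Run inside an RDP session or as a batch job, this shows whether the GPU is visible in that logon mode. It only checks the driver, so it cannot detect a driver/runtime mismatch.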

Nothing ran on my setup, so I first tried to build the SDK example myself. To do that, start by building the CUDA utilities, which are located somewhere in the SDK within the folder "common".
Depending on the Nsight or toolkit version you have installed, you get an error when opening the VS project files. In that case you need to edit the Visual Studio project file with a text editor of your choice and replace the outdated build rule with the one that is actually installed.

  • In the error message, look up the folder in which VS cannot find the file.

  • Copy the path and go there with your file browser.

  • Find the file whose name is closest to the one in the VS error message.

  • Once found, open the VS project file and replace the wrong filename there with the correct one.

  • VS should now open the project.
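
The steps above can also be scripted. This is a rough sketch, not a tested tool: it assumes the error message gives the full path of the missing build rule file, picks the most similarly named file in that folder with difflib, and swaps the name inside the project file after writing a .bak copy. The function names and the file names used in the usage note are hypothetical.

```python
import difflib
import os


def closest_rule_file(missing_path):
    """Return the file name in the same folder most similar to the missing one."""
    folder, wanted = os.path.split(missing_path)
    candidates = os.listdir(folder)
    matches = difflib.get_close_matches(wanted, candidates, n=1, cutoff=0.6)
    return matches[0] if matches else None


def patch_project(project_file, missing_path):
    """Replace the missing file name in the project file; return the new name."""
    wanted = os.path.basename(missing_path)
    replacement = closest_rule_file(missing_path)
    if replacement is None:
        return None
    # Work on bytes so the project file's encoding and line endings stay as-is.
    # Assumption: the file names themselves are plain ASCII.
    with open(project_file, "rb") as fh:
        data = fh.read()
    with open(project_file + ".bak", "wb") as fh:
        fh.write(data)  # backup of the untouched project file
    with open(project_file, "wb") as fh:
        fh.write(data.replace(wanted.encode("ascii"), replacement.encode("ascii")))
    return replacement
```

A call would look like patch_project("demo.vcproj", r"C:\some\folder\cuda.v1.rules"), with both arguments replaced by the project file and the path from your own error message. Check the chosen replacement before trusting it, since the closest name is not necessarily the right build rule.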



In order to compile, add the correct include and library directories to the VS project.
Finally you can build deviceQuery or any other program.

Still, my own build gave me the same error as the precompiled deviceQuery:
cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.

With the help of Dependency Walker I found out that a missing DLL was the problem, namely:
linkinfo.dll.

You can get this DLL by adding the feature named "Desktop Experience" through the Server Manager.
Once the feature was installed and the machine rebooted, the device query worked.
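
To check for the DLL without opening Dependency Walker, a short script can simply try to load it. This is a minimal sketch in Python using ctypes; the helper name can_load is made up here. It only reports whether the loader can find and load a library by name, and it does not inspect the import table of an executable the way Dependency Walker does.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and load the library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # linkinfo.dll is the DLL that was missing in the post above.
    for name in ("linkinfo.dll",):
        print(name, "loadable" if can_load(name) else "MISSING")
```

Running this before and after adding "Desktop Experience" should show the difference on an affected server.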