Wednesday, May 20, 2009
OpenMP Fortran
This will cause abnormal program termination, segfaults and undefined behavior.
However, declaring the variable as PRIVATE works, and so does SHARED, of course.
Hopefully a small code snippet will provide more insight.
Wednesday, April 29, 2009
Ganglia 3.1.2 for Windows HPC2008
Recent tests of the Windows port of Ganglia, obtained from the APR Consulting web page, on Microsoft Windows HPC 2008 revealed a problem.
After a few minutes of runtime, the ganglia executable consumes more and more memory until the system starts to swap, becomes unstable, and finally crashes or is no longer reachable.
Unable to deploy Ganglia to the cluster, I tested different releases from APR. None of them had the problem on Win2003 x64, but on HPC2008 x64 all of them either showed the same memory leak or did not work at all.
So finally we compiled our own Cygwin-based gmond.exe binary and came up with a pretty stable version, with just one flaw:
So far, installation as a service does not work, neither with gmondservice.exe from APR Consulting nor with the Windows native tool sc.exe.
However, installing it with schtasks.exe as a scheduled task that runs once on startup and then daemonizes (a daemon is what Linux calls a service) works fine.
In addition, simply swapping the executable or the config file now yields an updated Ganglia once the node reboots or a task restart is triggered, with no need to remove and reinstall a service.
All steps of deployment can be easily done with the clusrun extension, which is essential for cluster administration.
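The scheduled-task installation described above can be sketched as follows. This is only a minimal illustration, not the exact command used on our cluster: the task name, the install paths and the helper name build_schtasks_command are hypothetical, and it assumes gmond.exe accepts -c to point at its config file. The script only prints the command line, so it can be reviewed before it is run on a node.

```python
import subprocess


def build_schtasks_command(task_name, exe_path, conf_path):
    """Build a schtasks.exe call that starts gmond once at system startup.

    Assumption: the paths contain no spaces. Paths with spaces would need
    additional quoting inside the /TR value.
    """
    task_run = f"{exe_path} -c {conf_path}"
    return [
        "schtasks.exe", "/Create",
        "/TN", task_name,   # name of the scheduled task
        "/TR", task_run,    # program (plus arguments) the task runs
        "/SC", "ONSTART",   # trigger: run at system startup
        "/RU", "SYSTEM",    # run under the local SYSTEM account
        "/F",               # overwrite the task if it already exists
    ]


# Hypothetical install location, for illustration only.
command = build_schtasks_command(
    "gmond", r"C:\ganglia\gmond.exe", r"C:\ganglia\gmond.conf"
)

# Print the command line instead of executing it.
print(subprocess.list2cmdline(command))
```

Because the task only references the executable and the config file by path, replacing either file on disk is enough for the next task start to pick up the new version.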
Small tutorial
(all links are below, drop a comment if something is missing/wrong)
- Download a ganglia version (3.1.2 Langley worked very well indeed)
- Download and install Cygwin with the gcc and g++ compilers and the additional packages mentioned in the README.WIN file of the ganglia package
- Do: ./configure, make and make install in the root directory of the confuse lib
- You may have to exclude the examples from the build:
replace line: SUBDIRS = m4 po src examples tests doc with
SUBDIRS = m4 po src tests doc
They threw an error on my system.
- Do: ./configure --with-libconfuse=/usr/local --enable-static-build and make in the root of ganglia
- With some additional DLL files from Cygwin, your release is now runnable. Just start gmond.exe, check the Event Viewer to see which DLL is missing, and place the missing files in the same folder or in a folder which is in the PATH.
currently:
libapr1, expat, diffutils, gcc, make, python, sharutils, sunrpc
and for libconfuse:
libiconv
Please note that this is a 32-bit x86 binary and not x64, because Cygwin is not x64.
It should, however, be possible to build a native x64 ganglia with the Windows Services for UNIX.
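The SUBDIRS change from the tutorial can also be scripted instead of edited by hand. The sketch below is one possible way to do it, under assumptions: the helper name drop_subdir is made up, and it works on the text of whichever makefile contains the SUBDIRS line in your confuse tree (the tutorial does not name the file, so reading and writing it is left out).

```python
import re


def drop_subdir(makefile_text, name):
    """Remove one directory name from every 'SUBDIRS = ...' line."""

    def strip_name(match):
        # Keep every listed directory except the one to exclude.
        kept = [d for d in match.group(2).split() if d != name]
        return match.group(1) + " ".join(kept)

    # (?m) lets ^ and $ match at line boundaries inside the file text.
    return re.sub(r"(?m)^(SUBDIRS[ \t]*=[ \t]*)(.*)$", strip_name, makefile_text)


original = "SUBDIRS = m4 po src examples tests doc\n"
print(drop_subdir(original, "examples"), end="")
# prints: SUBDIRS = m4 po src tests doc
```

Lines that do not start with SUBDIRS are left untouched, so the function can be applied to the full file content.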
Links:
Corresponding discussion in HPC2008 MS Forum
Cygwin
Ganglia
confuse library
APR Consulting web page
Thursday, January 22, 2009
Where is my "My Desktop" button
In order to get the "My Desktop" button back, e.g. on Windows Terminal Servers, just execute the following command:
regsvr32 /n /i:U shell32
After the next reboot or a restart of the Quick Launch bar, the icon should appear.
Tuesday, December 16, 2008
Windows CCS Cluster Upgrade
One of the initial nodes rejoined the cluster and there are now 28 Opteron cores available again.
Due to the usage of CFD for production runs, the user home was recently upgraded and the quota was extended to 10 GB per user.
Furthermore, for special purposes and for a limited time, an extra project home with up to 120 GB of space is available for extensive usage.
Monday, December 15, 2008
PCI express revisited
Blocked copies, however, climb up to 4.5 GB/s when writing data to GPU memory.
Copying data back to the host is still relatively slow at 2 GB/s.
Link to first article
Monday, December 8, 2008
Fast Network, Fast disconnects (Linksys WRT610N)
Looking forward to fast streaming of HD media over my new wireless router (WRT610N), I had serious trouble getting a stable connection at all.
Having set up my network for WPA2 and TKIP for compatibility reasons, I got random disconnects of the whole 5 GHz band, while 2.4 GHz performed flawlessly. Searching the internet, I stumbled across some serious accusations that the WRT610N is a flawed design and overheats a lot.
Whether this is right or not I cannot say for sure; however, I expected much more from Linksys and a home premium line product.
Searching a little more, I came across another user's experience that changing from TKIP encryption to AES solved the problem of the disconnects.
And voilà, the problem seems to be solved.
So for everyone who can live with AES-only encryption on the 5 GHz 11n band and TKIP or AES on the 2.4 GHz 11g band, the router is a perfect catch in both performance and appearance.
Monday, November 24, 2008
Yeehhaa: NVIDIA GT200 rocks
Some preliminary figures show the great improvement of this new generation, as I expected from the data sheets. Soon I will post some verified results here, along with some notes on the changes from the G80 generation to the current GT200 chip.
Friday, November 7, 2008
Running MPI Jobs on Windows CCS
As a consequence, four MPI processes are started.
In order to remove the redundant hostnames, call your program the following way from inside the scheduler:
mpiexec.exe -hosts %CCP_NODES: 4= 1%
%CCP_NODES: 4= 1% removes three out of four lines, which reduces each hostname down to one occurrence, as the same hostnames are always consecutive.
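The %VAR:old=new% form is plain cmd.exe string substitution: every occurrence of old in the variable's value is replaced by new, so here " 4" becomes " 1". A small Python emulation may make the effect easier to see. This is a sketch under assumptions: the sample CCP_NODES value (a node count followed by hostname and core-count pairs) and the helper name cmd_substitute are hypothetical, and the real value on your cluster may look different.

```python
import re


def cmd_substitute(value, old, new):
    """Emulate cmd.exe's %VAR:old=new% expansion.

    Every occurrence of 'old' is replaced by 'new'. The search string is
    matched case-insensitively, as cmd.exe does.
    """
    return re.sub(re.escape(old), lambda _match: new, value, flags=re.IGNORECASE)


# Hypothetical value: 2 nodes, each listed with 4 cores.
ccp_nodes = "2 NODE01 4 NODE02 4"

hosts_argument = cmd_substitute(ccp_nodes, " 4", " 1")
print(hosts_argument)
# prints: 2 NODE01 1 NODE02 1
```

Note that the search string starts with a space, so a leading node count of 4 at the very beginning of the value would not be touched.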