Craigslist donated a number of computers to BSOE in the summer of 2009. We have built a compute cluster out of them and are allowing students, staff, and faculty at UCSC to use the cluster for its planned lifetime (until January 1st, 2013). It is a Rocks cluster running Sun Grid Engine (SGE) to queue job submissions. It has 60 nodes, each with 16GB of RAM. Half the nodes have 2 cores and half have 4 cores, giving 180 processors.
Campusrocks currently uses the same account and password as BSOE's VPN and CruzNetSecure. Go to http://vpn.soe.ucsc.edu to set your password. Once you have a password, you can log in to the system via SSH at campusrocks.soe.ucsc.edu.
The cluster has a three-year life cycle and will be unplugged at the end unless
funding appears to rejuvenate it. Its drop-dead date is January 1st, 2013.
The cluster came about from an idea ISSDM had to build a campus cluster
and allow others to use it. The idea is to have a shared resource while
allowing those doing computer research to look at REAL file systems
and see how people use them. File system data will therefore be
available to ISSDM, SSRC, and other units that need
access to it.
For queuing, we are using Sun Grid Engine (SGE). Its documentation is also available on the cluster:
http://campusrocks.soe.ucsc.edu/roll-documentation/sge/5.2/
qsub --> Submits a job (create a shell script, then run qsub shellscript)
qdel --> Delete a job
qlogin --> Interactive login
qstat --> See the status of jobs in the queue
qmon --> Graphical user interface (GUI)
NOTE: The web site is restricted to on-campus access.
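As a minimal sketch of the qsub workflow in the command list above, the following writes a small job script and shows, in comments, how it might be submitted and monitored. The file name hello_job.sh, the job name, and the output file are hypothetical. The "#$" lines are standard SGE embedded options, but check the on-cluster documentation for what this installation accepts.

```shell
#!/bin/sh
# Write a tiny SGE job script. File name and job name are hypothetical.
cat > hello_job.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -N hello_job
#$ -cwd
#$ -j y
#$ -o hello_job.out
echo "Running on $(hostname)"
EOF

# On the cluster, the job would then be handled roughly like this:
#   qsub hello_job.sh      # submit; qsub reports the job id
#   qstat                  # see the job's state in the queue
#   qdel <job id>          # delete the job if something went wrong
echo "wrote hello_job.sh"
```

Here -cwd runs the job from the directory it was submitted from, and -j y merges standard error into the output file named by -o.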
To run an MPI job, compile the code with mpicc:
/opt/openmpi/bin/mpicc
Sample C code is in /opt/mpi-tests/src.
Then create a shell script like this (mpitest16):
#!/bin/csh
unsetenv SGE_ROOT
/opt/openmpi/bin/mpirun -np 16 -machinefile $TMPDIR/machines /opt/mpi-tests/bin/mpi-ring
Finally, submit it to the SGE queue like this:
% qsub -pe mpi 16 mpitest16
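The slot count appears twice in the example above: once in the -np argument to mpirun and once in the -pe mpi request to qsub. The sketch below keeps the two in step by generating the job script from a single number. make_mpi_job is a hypothetical helper, not something installed on the cluster, and it only writes the script and prints the submit command rather than calling qsub itself.

```shell
#!/bin/sh
# make_mpi_job SLOTS
# Hypothetical helper: writes a csh job script like mpitest16 above,
# but for an arbitrary number of slots, and prints the matching qsub line.
make_mpi_job() {
    slots=$1
    script="mpitest${slots}"
    # \$TMPDIR is escaped so it is expanded when the job runs, not now.
    cat > "$script" <<EOF
#!/bin/csh
unsetenv SGE_ROOT
/opt/openmpi/bin/mpirun -np ${slots} -machinefile \$TMPDIR/machines /opt/mpi-tests/bin/mpi-ring
EOF
    # -pe and -np must request the same count.
    echo "qsub -pe mpi ${slots} ${script}"
}

make_mpi_job 8
```

Run as written, this creates a file named mpitest8 and prints "qsub -pe mpi 8 mpitest8", which can then be run on the cluster.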
Ganglia is running on the server. From on campus, open the following URL to see the state of the cluster:
http://campusrocks.soe.ucsc.edu/ganglia/
The small queue is for jobs that will not run for a long time. It has a 48-hour wall-clock limit and an 800-hour CPU limit (the CPU limit matters if you do multi-threaded operations).
You can see a queue's configuration with the command qconf -sq small.q
The small.q currently has 2 dedicated boxes with 48 processors each.
(The all.q has 64 computers with a total of 200 processors.)
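To see how the two small-queue limits interact, here is a rough arithmetic sketch. It assumes CPU time accumulates as threads times wall-clock hours when every thread is fully busy, which is a simplification. fits_small_queue is a hypothetical helper written for illustration.

```shell
#!/bin/sh
# Limits quoted above for the small queue.
WALL_LIMIT_HOURS=48
CPU_LIMIT_HOURS=800

# fits_small_queue THREADS HOURS
# Hypothetical helper: succeeds if a job keeping THREADS cores busy for
# HOURS of wall-clock time stays within both limits.
fits_small_queue() {
    threads=$1
    hours=$2
    cpu_hours=$((threads * hours))
    [ "$hours" -le "$WALL_LIMIT_HOURS" ] && [ "$cpu_hours" -le "$CPU_LIMIT_HOURS" ]
}

for t in 1 16 17; do
    if fits_small_queue "$t" 48; then
        echo "$t threads x 48 h = $((t * 48)) CPU hours: fits"
    else
        echo "$t threads x 48 h = $((t * 48)) CPU hours: exceeds a limit"
    fi
done
```

Under that assumption, a job running for the full 48 hours can keep up to 16 threads busy (768 CPU hours), while 17 threads (816 CPU hours) would pass the 800-hour CPU limit before the wall-clock limit is reached.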
We will install RPMs that are in the yum repository for the OS we are running, or you can compile code yourself in your home directory.
To request a known RPM, put in an ITRequest ticket.
The /campus directory (your home directory) is copied with rsync daily. Look in the /backups directory to find the copies. We also keep snapshots of the /backups directory (cd /backups/.zfs/snapshots): 4 daily snapshots and 3 monthly snapshots.
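A minimal sketch of pulling a single file back out of a snapshot follows. restore_from_snapshot is a hypothetical helper, and the layout it assumes (snapshot directory, then snapshot name, then a path relative to /backups) has not been checked against the cluster, so list the snapshot directory first to see the real names.

```shell
#!/bin/sh
# restore_from_snapshot SNAPSHOT RELATIVE_PATH DEST [SNAPSHOT_ROOT]
# Hypothetical helper: copies one file out of a snapshot directory.
# The layout SNAPSHOT_ROOT/SNAPSHOT/RELATIVE_PATH is an assumption.
restore_from_snapshot() {
    snapshot=$1
    relpath=$2
    dest=$3
    root=${4:-/backups/.zfs/snapshots}
    src="$root/$snapshot/$relpath"
    if [ ! -f "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi
    # -p keeps the original modification time and permissions.
    cp -p "$src" "$dest"
}

# Example (all names are placeholders):
#   ls /backups/.zfs/snapshots
#   restore_from_snapshot <snapshot name> <your username>/results.txt ./results.txt
```

Copying to a new destination instead of overwriting the live file makes it possible to compare the two versions before replacing anything.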
Craigslist donated the following equipment to BSOE. We intend to combine the best pieces and make a compute cluster for research.