Campus Rocks

What is it?

Craigslist donated a batch of computers to BSOE in the summer of 2009. We have built a compute cluster out of them and are allowing students, staff, and faculty at UCSC to use it for its planned lifetime (until January 1st, 2013). It is a Rocks cluster running Sun Grid Engine (SGE) to queue job submissions. It has 60 nodes, each with 16GB of RAM. Half the nodes have 2 cores and half have 4 cores, giving 180 processors.

How do I log in?

Campusrocks currently uses the same account and password as BSOE's VPN and CruzNetSecure. Go to http://vpn.soe.ucsc.edu to set your password. Once you have a password, you can log in to the system via SSH at campusrocks.soe.ucsc.edu.
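As a minimal sketch, the login step from a terminal could look like the following. Only the host name comes from this page; the username jdoe is a placeholder and the helper name campusrocks_target is made up for illustration.

```shell
#!/bin/sh
# Host name taken from this page; the username below is a placeholder.
CLUSTER_HOST="campusrocks.soe.ucsc.edu"

# Hypothetical helper: build the user@host string that ssh expects.
campusrocks_target() {
    printf '%s@%s\n' "$1" "$CLUSTER_HOST"
}

# Print the command to run; replace jdoe with your own account name.
echo "ssh $(campusrocks_target jdoe)"
```

Running the printed command opens the SSH session and prompts for the password you set on the VPN page.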

How long will it be available?

It has a 3-year life cycle and will be unplugged at the end unless
funding appears to rejuvenate it. Its drop-dead date is January 1st, 2013.

What are some of the limitations (privacy, disk space)?

The cluster grew out of an idea ISSDM had to build a campus cluster
and allow others to use it. The idea is to have a shared resource, but
also to allow those doing computer research to look at REAL file
systems and see how people use them. So file system data
will be available to both ISSDM and SSRC, and to other units that need
access to it.

How do I use the cluster (submit, cancel, review jobs)?

For queuing, we are using Sun Grid Engine (SGE). Its documentation is also available on the cluster:

http://campusrocks.soe.ucsc.edu/roll-documentation/sge/5.2/

qsub --> Submits a job (create a shell script, then run qsub shellscript)
qdel --> Deletes a job
qlogin --> Starts an interactive login session
qstat --> Shows the status of jobs in the queue
qmon --> Opens the GUI

NOTE: The documentation web site is restricted to on-campus access.
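The commands above fit together roughly as follows. This is only a sketch: the script name hello_job.sh and the job name hello_job are made up for illustration, and the submission step is skipped when the SGE tools are not installed.

```shell
#!/bin/sh
# Write a small SGE job script. Lines starting with "#$" are read by qsub as options.
cat > hello_job.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -N hello_job
#$ -cwd
#$ -j y
echo "Running on $(hostname) as job $JOB_ID"
EOF

# Submit only where the SGE tools are installed (that is, on the cluster).
if command -v qsub >/dev/null 2>&1; then
    qsub hello_job.sh        # prints the assigned job ID
    qstat -u "$USER"         # lists your pending and running jobs
    # qdel <job-id>          # cancels a job, using the ID printed by qsub
else
    echo "qsub not found; run this on campusrocks"
fi
```

Here -S selects the shell, -N names the job, -cwd runs it in the submission directory, and -j y merges standard error into the output file.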

How can I run MPI jobs on it?

Compile the code with mpicc:

/opt/openmpi/bin/mpicc

Sample C code is in /opt/mpi-tests/src

Then create a shell script like this (mpitest16):
#!/bin/csh
unsetenv SGE_ROOT
/opt/openmpi/bin/mpirun -np 16 -machinefile $TMPDIR/machines /opt/mpi-tests/bin/mpi-ring

Finally, submit it to the SGE queue like this:
% qsub -pe mpi 16 mpitest16
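If you prefer sh to csh, a rough equivalent of the script above might look like this. It is a sketch under two assumptions: that the mpi parallel environment writes $TMPDIR/machines just as it does for the csh version, and that the file name mpitest.sh (made up here) suits you. It uses $NSLOTS, which SGE sets to the number of slots granted, so the process count follows whatever you pass to -pe.

```shell
#!/bin/sh
# Write an sh version of the MPI job script shown above.
cat > mpitest.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
unset SGE_ROOT
# NSLOTS is set by SGE to the slot count requested with "-pe mpi N".
/opt/openmpi/bin/mpirun -np "$NSLOTS" -machinefile "$TMPDIR/machines" /opt/mpi-tests/bin/mpi-ring
EOF

# Submit only where the SGE tools are installed (that is, on the cluster).
if command -v qsub >/dev/null 2>&1; then
    qsub -pe mpi 16 mpitest.sh
else
    echo "qsub not found; submit mpitest.sh from campusrocks"
fi
```

With this version, changing 16 on the qsub line is enough; the script itself does not need editing.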

How can I see how busy the cluster is?

Ganglia is running on the server. From on campus, open the following URL to see the cluster:

http://campusrocks.soe.ucsc.edu/ganglia/
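If you are already logged in, the standard SGE tools give a similar picture from the command line. The sketch below wraps two of them in a helper; the function name cluster_load is made up for illustration, and the output format depends on the SGE version installed.

```shell
#!/bin/sh
# Hypothetical helper: print a quick summary of cluster usage.
cluster_load() {
    if command -v qstat >/dev/null 2>&1; then
        qstat -g c   # per-queue summary of used, available and total slots
        qhost        # per-node load average and memory use
    else
        echo "SGE tools not found; run this on campusrocks"
    fi
}

cluster_load
```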

What is the Small Queue, and what are its limitations?

The Small queue is for jobs that will not run for a long time. There is a 48-hour wall clock limit and an 800-hour CPU limit (relevant if you do multi-threaded operations).

You can see the queue configuration with the command qconf -sq small.q

The small.q currently has 2 dedicated boxes with 48 processors each.

(The all.q has 64 computers with a total of 200 processors)
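To see how the two small.q limits interact, here is a rough calculation. It assumes CPU time adds up across threads and that every thread stays fully busy, which real jobs rarely do; the helper name effective_hours is made up for illustration.

```shell
#!/bin/sh
# Limits quoted on this page for small.q.
WALL_LIMIT_HOURS=48
CPU_LIMIT_HOURS=800

# Hypothetical helper: hours until the first limit is reached for a job
# that keeps the given number of threads fully busy (integer division).
effective_hours() {
    threads=$1
    cpu_bound=$(( $CPU_LIMIT_HOURS / $threads ))
    if [ "$cpu_bound" -lt "$WALL_LIMIT_HOURS" ]; then
        echo "$cpu_bound"
    else
        echo "$WALL_LIMIT_HOURS"
    fi
}

effective_hours 4    # 800/4 = 200, so the 48-hour wall clock limit applies first
effective_hours 32   # 800/32 = 25, so the CPU limit is reached first
```

Under these assumptions, the CPU limit only becomes the binding one above 16 busy threads, since 800/16 is 50 hours. To send a job to this queue, pass -q small.q to qsub.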

How do I load Software on it?

We will load RPMs that are in the yum repository for the OS we are running, or you can compile code yourself in your home directory.

Put in an ITRequest ticket for known RPMs.
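For code you compile yourself, one common pattern is to install under a prefix in your home directory and put it on your PATH. The sketch below assumes the package uses an autotools-style build; the $HOME/local prefix is an arbitrary choice, not a cluster convention.

```shell
#!/bin/sh
# Install prefix inside your home directory (an arbitrary choice).
PREFIX="$HOME/local"

# Typical autotools build; run these inside the unpacked source tree:
#   ./configure --prefix="$PREFIX"
#   make
#   make install

# Make programs installed under the prefix visible to your shell and jobs.
PATH="$PREFIX/bin:$PATH"
export PATH
echo "$PATH"
```

Add the PATH lines to your shell startup file if you want them to apply to every login.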

Are there Backups?

The /campus directory (your home directory) is copied with rsync daily. Look in the /backups directory to find the copies. We also keep snapshots of the /backups directory: cd /backups/.zfs/snapshots (we keep 4 daily snapshots and 3 monthly snapshots).
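Pulling a file back out of a snapshot could be sketched as below. The snapshot directory is the one given on this page; the snapshot names and the layout inside each snapshot are assumptions, so list the directory first to see what is actually there. The helper name restore_file is made up for illustration.

```shell
#!/bin/sh
# Snapshot directory as given on this page; can be overridden for testing.
SNAP_ROOT="${SNAP_ROOT:-/backups/.zfs/snapshots}"

# Hypothetical helper: copy one file out of a named snapshot.
# Usage: restore_file <snapshot-name> <path-inside-snapshot> <destination>
restore_file() {
    src="$SNAP_ROOT/$1/$2"
    if [ ! -e "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi
    cp -p "$src" "$3"
}

# List the available snapshots (only works on the cluster).
ls "$SNAP_ROOT" 2>/dev/null || echo "no snapshot directory at $SNAP_ROOT"
```

Copying to a new file name rather than over the current file lets you compare the two before replacing anything.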

What Was Donated

Craigslist donated the following equipment to BSOE. We intend to combine the best pieces and make a compute cluster for research.

  • 4 Rackable File Servers
  • 93 SuperMicro 1U, 4GB RAM, 3x36GB disk, 2xdual core AMD Opteron 252 2GHz
  • 8 SuperMicro 3U File Servers 14x26GB SCSI
  • 2 SuperMicro File Servers 3U (Older)
  • 10 Lantronix Terminal Consoles
  • 6 HP 2650
  • 37 Rackable 1U 2xDual Core 16 or 32GB
  • 27 network-controlled PDUs
  • Misc pile (older stuff) 1 Rackable, 2 SuperMicro 1u, 3 SuperMicro 1u, 1 Sony AIT tape robot, 1 SuperMicro file server, 2 old white file servers (look like desktops)
  • 4U Rackable, 2GHz x4, 12x73GB, 16GB RAM