Craigslist originally donated a number of computers to BSOE in the summer of 2009. We built a computer cluster out of them and have made it available to students, staff, and faculty at UCSC. That original hardware is now retired.
The new cluster consists of one head node and 7 compute nodes, with a total of 587 cores on which to execute processes. There are 4 queues:
Individuals may run up to 40 jobs/cores at any time, but may queue up as many as they need, within reason.
We are using Sun Grid Engine (SGE), and its documentation is available on the cluster:
The above URL is NOT accessible from outside the UCSC IP space.
|qsub||Submit a job (create a shell script, then run qsub shellscript)|
|qdel||Delete a job|
|qstat||See the status of jobs in the queue|
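As a sketch of the basic workflow, using only the commands above (the script name myjob.sh and its contents are placeholders; substitute your own commands):

```shell
# Hypothetical minimal job script -- the name and contents are
# placeholders; replace the echo with your actual work.
cat > myjob.sh <<'EOF'
#!/bin/sh
echo "Hello from $(hostname)"
EOF

qsub myjob.sh     # submit the job; SGE reports the assigned job ID
qstat             # check the job's status in the queue
qdel <job_id>     # delete the job, using the ID reported by qsub/qstat
```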
Compile the code with mpicc; sample C code is in /opt/mpi-tests/src.
Then create a shell script like this (mpitest16):
/opt/openmpi/bin/mpirun -np 16 -machinefile $TMPDIR/machines /opt/mpi-tests/bin/mpi-ring
Finally, submit it to the SGE queue like this:
% qsub -pe mpi 16 mpitest16
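Putting the steps above together, a complete session might look like the sketch below. The source filename mpi-ring.c and the `#$ -cwd` directive are assumptions for illustration; only the mpirun line and the `-pe mpi 16` submission flag come from this page.

```shell
# Compile the sample ring program (sample sources live in /opt/mpi-tests/src;
# the exact filename here is an assumption)
mpicc -o mpi-ring /opt/mpi-tests/src/mpi-ring.c

# mpitest16: job script that launches 16 MPI ranks across the
# machinefile SGE generates in $TMPDIR for the job
cat > mpitest16 <<'EOF'
#!/bin/sh
#$ -cwd   # run from the submission directory (assumed, commonly used SGE option)
/opt/openmpi/bin/mpirun -np 16 -machinefile $TMPDIR/machines /opt/mpi-tests/bin/mpi-ring
EOF

# Submit under the "mpi" parallel environment with 16 slots
qsub -pe mpi 16 mpitest16
```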
qhost shows the load averages of each of the exec hosts.
qstat -g c gives a count of the number of jobs running on each queue.
You may also use the hummingbird.ucsc.edu URL to access a number of links to graphs. The latest resource is the "PHPQstat Graph" link, which lists the queues, the cores assigned to each queue, and the cores being used/available. The "Ganglia Graphs" link provides per-node and overall cluster usage.
The Small queue is for jobs that will not run for a long time: there is a 72-hour wall-clock limit and an 800-hour CPU limit (which matters if you do multi-threaded operations). You can see queue configurations with the command qconf -sq small.q.
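To check the limits yourself, you can inspect the queue configuration directly. In standard SGE, h_rt and h_cpu are the attribute names for the hard wall-clock and CPU-time limits:

```shell
# Show the full configuration of the small queue
qconf -sq small.q

# Or filter for just the time limits
# (h_rt = hard wall-clock limit, h_cpu = hard CPU-time limit)
qconf -sq small.q | grep -E 'h_rt|h_cpu'
```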
We will install RPMs that are in the yum repository for the OS we are running, or you can compile code yourself in your home directory.
The best method for placing needed software or files/data into your home directory on the cluster is to access the storage unit directly via sftp: "sftp campusrocks-store-01.soe.ucsc.edu". This lets you move the necessary data without adding overhead on the cluster head node.
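A typical transfer session might look like this (the file names are placeholders):

```shell
# Connect directly to the storage unit, bypassing the head node
sftp campusrocks-store-01.soe.ucsc.edu

# At the sftp> prompt (example commands; the file names are placeholders):
#   put mydata.tar.gz     # upload a file into your home directory
#   get results.tar.gz    # download results back to your local machine
#   bye                   # close the session
```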
If you have questions about a package (known RPMs) or the file transfer method/procedure, please put in an ITRequest ticket.
We keep seven daily snapshots of your home directory. Look in /campusdata/.zfs/snapshot/ to find the snapshots. From there, you can simply copy files back to your home directory as needed. The Hummingbird storage unit is currently backed up to an off-site external storage unit, which allows for data recovery should the main storage unit experience a catastrophic failure.
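For example, to restore a file from a snapshot (the snapshot name, path under the snapshot, and file name below are placeholders; list the snapshot directory to see the actual names):

```shell
# List the available snapshots (the names you see will differ)
ls /campusdata/.zfs/snapshot/

# Copy a lost file back from a snapshot into your home directory
# (snapshot name and path components here are hypothetical)
cp /campusdata/.zfs/snapshot/<snapshot-name>/<your-username>/myfile.txt ~/myfile.txt
```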
Campusrocks has been invaluable for my bioinformatics research with marine metagenomics data. The cluster has enabled me to investigate new ways of assembling and annotating 40-50 of these large datasets, with great speed (due to fast cores, lots of memory, and parallelization) and reliable backup of scripts and results. I could not have done the same experiments in a reasonable time on my laptop, which would have been unusable for other research tasks had I tried. Finally, the cluster computing skills I have developed by working on campusrocks — my first such experience — will be essential for my bioinformatics work after graduate school. Thanks for maintaining such an important resource!
I use Campus Rocks for the multi-core version of Stata that is installed, and for computationally intensive work in Python (usually estimation of statistical models). Let me know if you need more details.
I use the cluster for my research in medical imaging. I run simulations in a program called Geant4, which simulates particle interactions in an imaging system. Each simulation requires modeling the behavior and interactions of approximately 200 million proton events and requires at least 90 cores; there is no other resource on campus that allows me to do these simulations in a timely manner. Likewise, it is essential to my work that the cluster function efficiently, since time is of the essence. Some of the machines (namely 02 and 04) function at 1/3 to 1/5 the speed of some of the other machines, which is extremely frustrating.
I use this cluster for parallelized Monte Carlo simulations of high energy particle physics processes, especially related to dark matter. I also utilize the cluster for simulating and processing large sets of gamma-ray data in order to search for astrophysical signatures of dark matter. A significant expansion of these resources would be of great value to UCSC's research programs.
I am an undergraduate working with Dr. Camps in METX. My cluster utilization involved analyzing co-variation in cancer databases, namely cbioportal, to provide functional context clues for an orphan gene. Proper analysis requires using the entire genome as a query set, which can be computationally intensive. CampusRocks is a great resource, thanks for your work.
I am a graduate student in Scott Lokey's lab. We use the Campus Rocks cluster for running molecular dynamics simulations on virtual libraries consisting of thousands of members. While each individual simulation is fairly brief and computationally inexpensive, the numbers mandate parallelism. The Campus Rocks cluster provides a wonderful and free resource for running these simulations. We greatly value its functionality and will continue to use it in whatever capacity we can.
campusrocks has been a tremendously helpful resource for me this year (I only recently joined the UCSC faculty). I have used the system to run DFT calculations and to model molecular reactivity, photophysical properties, and NMR chemical shifts. In the future, as my lab continues to grow, we will also be conducting bioinformatics analyses (ChIP-seq and RNA-seq); one of my research interests is the transcription factor NF kappa B. Continued access to the cluster will play a highly important role in my research.
I use the cluster for assembling genome sequence data from bacteria we isolate in extreme environments high in arsenic. Part of the research we do in my lab involves isolating and characterizing new bacterial species that can grow on the toxic metal arsenic, which is naturally occurring at high levels in places like Mono Lake, CA and other soda lakes in Nevada. The cluster is essential to the genome assembly process because I need a computer system with a lot of power. The programs I use work much better on the cluster. It's been really nice to have this service available.