Spartan is a High Performance Computing (HPC) system operated by Research Platform Services (ResPlat) at The University of Melbourne. It combines high-performance bare-metal compute with flexible cloud infrastructure and GPGPUs to suit a wide range of use cases.

If your computing jobs take too long on your desktop computer, or are simply not possible due to a lack of speed and memory, an HPC system like Spartan can help.

Use of this service is governed by the University's general regulations for IT resources and our HPC Support Service Policy.

Spartan Daily Weather Report (20190819)

  • CephFS usage: 1033.12TB used, 345.63TB free (74% used)
  • Spartan is very busy on the cloud partition, with close to 96% node allocation. Total pending/queued: 4281
  • Spartan is very busy on the physical partition, with close to 100% node allocation. Total pending/queued: 4096
  • Spartan is less busy on the snowy partition, with close to 83% node allocation. Total pending/queued: 435
  • Spartan is busy on the GPGPU partition, with close to 93% node allocation. Total pending/queued: 109
  • GPGPU usage in the [ gpgpu ] partition: 211 / 232 cards in use (90.94%)
  • Some nodes (35) are out of service, primarily due to reservations, future use, RAM replacements, BIOS checks, or failed job kills.

Getting Help

Training

We run regular one-day courses on HPC, shell scripting, parallel programming, and GPU programming. ResPlat also offers training in a wide range of other digital tools to accelerate your research.

Signup here: http://melbourne.resbaz.edu.au/participate

Helpdesk

If you can't find an answer here, need advice, or are otherwise stuck, you can contact our support team at hpc-support@unimelb.edu.au.

Please submit one topic per ticket. If you require assistance with a separate matter, compose a new ticket; do not reply to existing or closed tickets.

For password resets, please see the FAQ or contact University Services on +61 3 8344 0999 (ext 40999), or email service-centre@unimelb.edu.au.

Specifications

Spartan has a number of partitions available for general usage. A full list of partitions can be viewed with the command sinfo -s.

Partition   Nodes   Cores/node   Memory/node   Processor                                     Extra notes
cloud       165     12           100GB         Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz
longcloud   2       12           100GB         Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz      Max walltime of 90 days
physical    19      12           254GB         Intel(R) Xeon(R) CPU E5-2643 v3 @ 3.40GHz     Group = physg1
            6       32           508GB         Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz     Group = physg3
            12      72           1540GB        Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz      Group = physg4
bigmem      2       36           1540GB        Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz     Group = physg2
phi         4       256          190GB         Intel(R) Xeon Phi(TM) CPU 7230 @ 1.30GHz      Based on the Xeon Phi Knights Landing architecture
snowy       31      32           127GB         Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz
mig         12      32           127GB         Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz     Reserved for Melbourne Integrative Genomics users
gpgpu       73      24           127GB         Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz     4 P100 Nvidia GPUs per node
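
To check partition details yourself from the login node, sinfo can report nodes, cores, and memory per partition. The format string below is just one illustrative choice:

    # Summary of all partitions (node counts, states, time limits)
    sinfo -s

    # Nodes, CPUs per node, and memory per node for one partition
    sinfo -p physical -o "%P %D %c %m"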

Cloud

This partition is best suited for general-purpose single-node jobs. Multiple node jobs will work, but communication between nodes will be comparatively slow.
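
For illustration, a minimal single-node submit script for the cloud partition might look like the sketch below; the resource requests and program name are placeholders, not recommendations:

    #!/bin/bash
    #SBATCH --partition=cloud
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=4        # placeholder; request what your program can use
    #SBATCH --time=02:00:00          # placeholder walltime
    #SBATCH --job-name=cloud-example

    ./my_program                     # my_program is a placeholder for your own executable

Submit the script with sbatch, e.g. sbatch my_job.slurm (the filename is also a placeholder).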

Physical

Each node is connected by high-speed 25Gb networking with 1.15 µsec latency, making this partition suited to multi-node jobs (e.g. those using OpenMPI).

You can constrain your jobs to particular groups of nodes (e.g. just the Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz nodes) by adding #SBATCH --constraint=physg4 to your submit script.
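
As a sketch only (the module name and version are illustrative; check module avail for what is actually installed), a multi-node OpenMPI job constrained to the physg4 nodes could look like:

    #!/bin/bash
    #SBATCH --partition=physical
    #SBATCH --constraint=physg4      # only the Gold 6154 nodes
    #SBATCH --nodes=2                # placeholder node count
    #SBATCH --ntasks-per-node=72     # matches the physg4 core count in the table above
    #SBATCH --time=04:00:00          # placeholder walltime

    module load OpenMPI              # illustrative module name; pick one from 'module avail'
    srun ./my_mpi_program            # my_mpi_program is a placeholder executable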

GPGPU

See here for more details.
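
A hedged sketch of a GPU job follows; the --gres request and any required project or QoS settings are assumptions here, so check the GPGPU documentation referred to above before relying on them:

    #!/bin/bash
    #SBATCH --partition=gpgpu
    #SBATCH --gres=gpu:2             # request 2 of the 4 P100s on a node (gres name assumed)
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --time=06:00:00          # placeholder walltime

    module load CUDA                 # illustrative module name; check 'module avail'
    ./my_gpu_program                 # placeholder executable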

bigmem

This partition is suited to memory-intensive single-node workloads.
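
For example (all values are placeholders), a bigmem job asking for a large slice of a node's 1540GB might look like:

    #!/bin/bash
    #SBATCH --partition=bigmem
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --mem=512G               # placeholder; request the memory you actually need
    #SBATCH --time=12:00:00          # placeholder walltime

    ./my_memory_intensive_program    # placeholder executable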

Other Partitions

There are also special partitions which sit outside the normal walltime constraints. In particular, shortcloud and shortgpgpu should be used for quick test cases; these partitions have a maximum walltime of one hour.

By adding --time=00:59:00 to your cloud partition job, your job will run in a dedicated reservation, which generally means it will start sooner.
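
A minimal sketch of a quick test job kept under the one-hour limit (the partition choice and script name are placeholders):

    #!/bin/bash
    #SBATCH --partition=shortcloud
    #SBATCH --time=00:59:00          # stay under the one-hour cap
    #SBATCH --ntasks=1

    ./quick_test                     # placeholder executable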

Storage system

Spartan uses a storage system called CephFS. CephFS is a highly scalable, parallel and robust filesystem.

The total Spartan storage on CephFS is broken up into 2 areas:

Location       Capacity   Disk type
/data/cephfs   1760TB     7.2K SAS
/scratch       650TB      SanDisk flash

/home is on the University's NetApp NFS platform, backed by SSD.
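
As an illustrative pattern only (the directory layout under /scratch and /data/cephfs shown here is an assumption, and all paths are placeholders), a job can write its temporary output to /scratch and copy the results it wants to keep back to project storage:

    #!/bin/bash
    #SBATCH --partition=cloud
    #SBATCH --ntasks=1
    #SBATCH --time=01:00:00                  # placeholder walltime

    # Placeholder paths; your project directories will differ
    SCRATCHDIR=/scratch/my_project/run_$SLURM_JOB_ID
    mkdir -p "$SCRATCHDIR"

    # Run from the submission directory, writing temporary output to scratch
    ./my_program --output "$SCRATCHDIR"      # placeholder executable and flag

    # Copy the results you want to keep back to project storage
    cp -r "$SCRATCHDIR" /data/cephfs/my_project/results/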

Citing Spartan

If you use Spartan to obtain results for a publication, we'd appreciate it if you'd cite our service, including the DOI below. This makes it easy for us to demonstrate research impact, helping to secure ongoing funding for expansion and user support.

Lev Lafayette, Greg Sauter, Linh Vu, Bernard Meade, "Spartan Performance and Flexibility: An HPC-Cloud Chimera", OpenStack Summit, Barcelona, October 27, 2016. doi.org/10.4225/49/58ead90dceaaa

If you are using the LIEF GPGPU cluster for a publication, please include the following citation in the acknowledgements section of your paper:

This research was undertaken using the LIEF HPC-GPGPU Facility hosted at the University of Melbourne. This Facility was established with the assistance of LIEF Grant LE170100200.

Other Resources

Spartan is just one of many research IT resources offered by The University of Melbourne, or available from other institutions.

Nectar

Nectar is a national initiative to provide cloud-based Infrastructure as a Service (IaaS) resources to researchers. It's based on OpenStack, and allows researchers on-demand access to computation instances, storage, and a variety of application platforms and Virtual Laboratories.

Spartan runs some of its computation resources in the Nectar cloud.

Multi-modal Australian ScienceS Imaging and Visualisation Environment (MASSIVE)

MASSIVE is an HPC system at Monash University and the Australian Synchrotron which is optimized for imaging and visualization. It can run batch jobs, as well as provide a desktop environment for interactive work.