Transferring Files to and from Spartan
Home and Projects
There are two locations on Spartan where users can temporarily copy and store data: their home directory (/home/$username/) and their project directory (/data/projects/$projectid). The home directory has a soft limit of 50 GB of data and the shared project directory 500 GB. Spartan is not designed to be a permanent storage solution.
For ease of navigation it is recommended that users create a symbolic link from their home directory to their project directory, e.g.,
ln -s /data/projects/$projectid $projectid
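The plain `ln -s` command reports an error if the link already exists. A minimal sketch of a re-runnable helper follows. The function name `link_project` and its arguments are hypothetical, and the paths in the usage comment assume the `/data/projects/$projectid` layout described above.

```shell
#!/bin/sh
# link_project TARGET LINK
# Create (or refresh) a symbolic link LINK pointing at TARGET.
# Hypothetical helper for illustration; not part of Spartan's tooling.
link_project() {
    target=$1
    link=$2
    if [ ! -d "$target" ]; then
        echo "link_project: $target is not a directory" >&2
        return 1
    fi
    # -s symbolic link, -f replace an existing link, -n do not follow an
    # existing link that already points at a directory
    ln -sfn "$target" "$link"
}

# Example (run on Spartan, with projectid set to your project's ID):
#   link_project "/data/projects/$projectid" "$HOME/$projectid"
```

Because the helper replaces an existing link rather than failing, it can be run again safely, for example from a setup script.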
Nodes for Transferring
The $HOME and $PROJECT directories are mounted across all compute nodes, as well as the management and login nodes. In addition, there is a special I/O node specifically established for transferring files. You can access that node via ssh from the login node.
[lev@spartan ~]$ ssh spartan-io
Last login: Fri Jan 27 15:00:40 2017 from spartan.hpc.unimelb.edu.au
Users should use the I/O node for copying data between systems (e.g., their desktop system and Spartan). Use of the login node is possible for such transfers, but not recommended. Copying data from the Internet to a compute node is not possible (compute nodes only have internal IP addresses). For example:
Transfer between the Internet and a Compute Node
[lev@spartan-rc021 ~]$ ping google.com
PING google.com (126.96.36.199) 56(84) bytes of data.
--- google.com ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
This will not work. Likewise (and more obviously):
[lev@cricetomys ~]$ ping spartan-rc021.spartan.hpc.unimelb.edu.au
ping: spartan-rc021.spartan.hpc.unimelb.edu.au: Name or service not known
This also will not work.
Transfer between the Internet and the Login Node
[lev@spartan ~]$ wget http://levlafayette.com/files/2017luv-openstack.pdf
--2017-02-08 14:01:43-- http://levlafayette.com/files/2017luv-openstack.pdf
2017-02-08 14:01:44 (5.64 MB/s) - ‘2017luv-openstack.pdf’ saved [948704/948704]
This works, but is not recommended. If it is used, it should be only for the smallest transfers.
Transfer between the Internet and the I-O Node
[lev@spartan-io ~]$ wget http://levlafayette.com/files/2017luv-openstack.pdf
This is the recommended approach. The I/O node is specifically designed for transferring files.
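The three cases above can be summarised in a small guard that a download script might run first. This is only a sketch: the function names are hypothetical, and the hostname patterns (`spartan-io`, `spartan-rc*`, `spartan`) are inferred from the shell prompts shown in the examples rather than from any official naming scheme.

```shell
#!/bin/sh
# node_role HOSTNAME
# Classify a hostname by the patterns seen in the prompts above.
# The patterns are an assumption inferred from those examples.
node_role() {
    case "$1" in
        spartan-io*) echo "io" ;;       # dedicated transfer node
        spartan-rc*) echo "compute" ;;  # compute node, internal IP only
        spartan*)    echo "login" ;;    # login node, small transfers only
        *)           echo "unknown" ;;
    esac
}

# check_transfer_node HOSTNAME
# Succeeds only on the I/O node; otherwise explains why not.
check_transfer_node() {
    role=$(node_role "$1")
    case "$role" in
        io)      return 0 ;;
        compute) echo "compute nodes cannot reach the Internet; use spartan-io" >&2 ;;
        login)   echo "the login node works but is not recommended; use spartan-io" >&2 ;;
        *)       echo "not a recognised Spartan node: $1" >&2 ;;
    esac
    return 1
}

# Example: check_transfer_node "$(hostname -s)" && wget "$url"
```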
Applications for Copying Files
There are numerous client applications for transferring files between systems. Among applications with a graphical user interface, there is WinSCP for MS-Windows, and Cyberduck and Fugu for Mac OS X. Nearly every Linux file browser can also use ssh/sftp to connect to networked filesystems via the SSH Filesystem. FileZilla can be used on each of these operating systems with a similar interface.
Transfers with FTP and SFTP
Like many other places, Spartan does not allow for applications and protocols that transmit passwords in plain-text (e.g., FTP, rcp, and rlogin).
[lev@cricetomys LUV-OpenStack]$ ftp firstname.lastname@example.org
ftp: email@example.com: Name or service not known
This will not work, as the insecure service is not enabled.
[lev@cricetomys LUV-OpenStack]$ sftp firstname.lastname@example.org
Connected to spartan-io.hpc.unimelb.edu.au.
This secure service does work. SFTP (the SSH File Transfer Protocol) provides FTP-like file transfer over an SSH connection. In other respects it mostly has the same commands as FTP (e.g., cd to change directory, chmod to change permissions, get and put for file transfers, etc.). See man sftp for a full list.
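For repeated transfers, `sftp` can also read its commands from a file with the `-b` option instead of an interactive prompt. The sketch below writes such a batch file. The function name, the file names, and the `username` placeholder are hypothetical, and batch mode generally needs non-interactive authentication such as SSH keys.

```shell
#!/bin/sh
# write_sftp_batch BATCHFILE REMOTE_DIR UPLOAD DOWNLOAD
# Write sftp commands to BATCHFILE: change to REMOTE_DIR, upload one
# file, download another. All names are placeholders for illustration.
write_sftp_batch() {
    batchfile=$1
    cat > "$batchfile" <<EOF
cd $2
put $3
get $4
bye
EOF
}

# Example (replace username with your Spartan account):
#   write_sftp_batch transfer.sftp mpiexample input.dat results.out
#   sftp -b transfer.sftp username@spartan-io.hpc.unimelb.edu.au
```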
Transfers with SCP
Secure Copy (SCP) provides a more limited range of commands than SFTP, but in most cases it is sufficient for the purpose of transferring files between different systems. As SFTP is to FTP, SCP is to RCP, wrapping the protocol in a secure shell. The core command is similar to the `cp` command, that is:
scp [options] source destination
It is common to make use of relative paths (e.g., `.` for current working directory), recursion for directories (`-r`), and the colon (`:`) to separate the remote host from the path.
Note that the destination must have an IP address that is reachable from the source. It is not possible, for example, to "push" files from Spartan to a local system (unless that system has a public IP address). Instead, users need to "get" files from Spartan by running the command on the local system. For example:
[lev@spartan-io ~]$ ls -lad mpiexample
drwxr-xr-x 2 lev unimelb 4096 Sep 28 16:30 mpiexample
[lev@spartan-io ~]$ scp -r mpiexample lev@cricetomys:
This will not work. Spartan has no idea what or where the host `cricetomys` is. However, as Spartan is a public system, cricetomys can route to it.
[lev@cricetomys ~]$ scp -r email@example.com:mpiexample .
This command copies the folder mpiexample (`:mpiexample`) and recursively all its contents (`-r`) to the current working directory on the local system `cricetomys` (`.`).
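The "pull from the local side" pattern can be wrapped in a small helper. The following is a sketch only: the function name `pull_from_remote`, the `DRY_RUN` variable, and the `username` placeholder are all hypothetical. With `DRY_RUN=1` it prints the `scp` command it would run, which is a convenient way to check the source and destination before starting a large transfer.

```shell
#!/bin/sh
# pull_from_remote REMOTE_SPEC [LOCAL_DIR]
# Run on the LOCAL machine: recursively copy user@host:path into
# LOCAL_DIR (default: current directory).
# With DRY_RUN=1 the command is printed instead of executed.
pull_from_remote() {
    remote=$1
    dest=${2:-.}
    case "$remote" in
        *:*) ;;  # the colon separates the host from the remote path
        *)   echo "pull_from_remote: expected user@host:path, got $remote" >&2
             return 1 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scp -r $remote $dest"
    else
        scp -r "$remote" "$dest"
    fi
}

# Example (username is a placeholder):
#   DRY_RUN=1 pull_from_remote username@spartan.hpc.unimelb.edu.au:mpiexample .
```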
Rsync for Synchronising Files and Folders
SFTP and SCP are very fine applications and protocols for copying files; however, their design is less than optimal for synchronisation. For example, imagine a user starts with a large local dataset which they then upload to Spartan for processing. The processing involves changes to the dataset itself, which they then wish to copy to the original local directory. Using SCP or SFTP would require them to copy the entire directory and all files even if only minimal changes had occurred. This can be very time consuming.
A popular alternative is `rsync`, which is a file copying and synchronisation tool. If it finds similar files in the destination directory it will only copy the changes (both in terms of the files themselves and content within the files). It can be used within a system (like the `cp` command) and between systems (like the `scp` command). The basic command is similar:
rsync [options] source destination
Continuing the example from the `scp` command above, the options applied here are archive (`-a`: recurse and preserve symbolic links, permissions, user and group ownerships, and timestamps), verbose (`-v`), and compress on copy (`-z`). The command copies the local directory `mpiexample` to the directory on Spartan with the same name.
[lev@cricetomys ~]$ cd mpiexample/
[lev@cricetomys mpiexample]$ touch rsync.slurm
[lev@cricetomys mpiexample]$ cd ..
[lev@cricetomys ~]$ rsync -avz mpiexample/ firstname.lastname@example.org:mpiexample/
sending incremental file list
The reverse can be simply used for copying files from Spartan to a local system.
[lev@cricetomys ~]$ rsync -avz email@example.com:mpiexample/ .
Many also use `rsync` to synchronise files within a system. In this case, consider preserving timestamps, symbolic links, etc. with the `-a` (archive) option.
[lev@cricetomys ~]$ cd ~/Desktop/
[lev@cricetomys Desktop]$ rsync -avz ~/mpiexample/ .
sent 120083 bytes received 300 bytes 240766.00 bytes/sec
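The trailing slash on the source matters in the examples above: `mpiexample/` means "the contents of mpiexample", while `mpiexample` without the slash means "the directory itself". The sketch below shows the difference using scratch directories on a single machine. The function name and the directory names are made up for illustration, and it assumes `rsync` is installed.

```shell
#!/bin/sh
# rsync_demo WORKDIR
# Demonstrate rsync's trailing-slash rule locally in a scratch directory.
# Directory and file names are illustrative only.
rsync_demo() {
    work=$1
    mkdir -p "$work/src" "$work/contents" "$work/nested"
    echo "job script" > "$work/src/rsync.slurm"

    # Trailing slash: copy the CONTENTS of src into contents/
    rsync -a "$work/src/" "$work/contents/"

    # No trailing slash: copy the directory itself, giving nested/src/
    rsync -a "$work/src" "$work/nested/"

    # -n (dry run) with -i (itemize changes) lists what would be sent
    # without copying anything.
    rsync -ain "$work/src/" "$work/contents/"
}

# Example: rsync_demo "$(mktemp -d)"
```

Running the dry-run form (`-n`) before a real synchronisation is a cheap way to confirm that the source and destination paths are the intended ones.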