Moving data to and from the cluster

Please read the section Common files organization before going through this section.

Rules of thumb

:!: Please read this carefully. :!:

When moving data to the shared folders, please follow these common sense rules:

Data transfer solutions

Here are some solutions for moving data to the cluster. Solutions 1-3 are generic data transfer tools. Solutions 4-5 are GRID-oriented data transfer tools (mostly for particle physicists).

Those marked with 8-) are my favourites (Florido Paganelli, 2013/08/27 20:20).

Generic storage

Solution 1: scp, sftp, lsftp

Example:

Moving ubuntu-12.04.2-desktop-amd64.iso from my local machine to the n12.iridium shared folders:

  scp ubuntu-12.04.2-desktop-amd64.iso n12.iridium:/nfs/shared/pp/
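
The same tool also works in the other direction and for whole directories. The sketch below is only an illustration: the host and remote path are taken from the example above, localdir is a placeholder directory name, and run is a hypothetical helper that prints each command instead of executing it unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Values from the example above; adjust them to your own account and paths.
host="n12.iridium"
remote_dir="/nfs/shared/pp"
iso="ubuntu-12.04.2-desktop-amd64.iso"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run scp "$iso" "${host}:${remote_dir}/"        # local machine -> cluster
run scp "${host}:${remote_dir}/${iso}" .       # cluster -> current local directory
run scp -r localdir "${host}:${remote_dir}/"   # a whole directory, recursively
```

Printing the commands first makes it easy to check the paths before anything is written to the shared folders.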

Solution 2: rsync

8-)

Syntax:

  rsync -avz -e 'ssh -l <username>' --progress source destination

However, the progress indicator is not very informative and usually slows the transfer down, because of the time spent writing to standard output. I therefore suggest that you either redirect standard output and standard error to a file:

  rsync -avz -e 'ssh -l <username>' --progress source destination &> rsyncoutput.log

Or, even better, use rsync's own log file instead:

  rsync -avz -e 'ssh -l <username>' --log-file=rsyncoutput.log source destination

Check the contents of the log file now and then to see the status:

  tail rsyncoutput.log

Examples:

Moving ubuntu-12.04.2-desktop-amd64.iso from my local machine to the pptest-iridium shared folders:

  rsync -avz -e 'ssh -l pflorido' --progress ubuntu-12.04.2-desktop-amd64.iso pptest-iridium.lunarc.lu.se:/nfs/software/pp/

Note on the trailing slashes /:

A source without a trailing slash will create localdir remotely:

  rsync -avz -e 'ssh -l pflorido' --progress localdir pptest-iridium.iridium:/nfs/software/pp/

A source with a trailing slash will NOT create localdir remotely, but will copy the contents of localdir into the remote directory:

  rsync -avz -e 'ssh -l pflorido' --progress localdir/ pptest-iridium.iridium:/nfs/software/pp/

A trailing slash on the destination has no effect in the examples above.
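
The trailing-slash rule can be written down as a small shell function. This is only a sketch of the rule described above for directory sources, not part of rsync itself; rsync_target is a hypothetical helper name, and the paths are the ones from the examples.

```shell
#!/bin/sh
# Hypothetical helper: print the remote directory that will end up holding
# the contents of a source directory, following the trailing-slash rule above.
rsync_target() {
  src=$1
  dest=${2%/}            # a trailing slash on the destination is ignored
  case $src in
    */) printf '%s\n' "$dest" ;;                            # contents go straight into dest
    *)  printf '%s/%s\n' "$dest" "$(basename "$src")" ;;    # dest/localdir is created
  esac
}

rsync_target localdir  /nfs/software/pp/   # prints /nfs/software/pp/localdir
rsync_target localdir/ /nfs/software/pp/   # prints /nfs/software/pp
```

If in doubt about a real transfer, rsync's -n (--dry-run) option lists what would be copied without writing anything.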

Solution 3: FileZilla

More about it: https://filezilla-project.org/download.php?type=client

GRID storage

Solution 4: NorduGrid ARC tools (arccp, arcls, arcrm)

See also http://www.hep.lu.se/grid/localgroupdisk.html for more information on how to use Lund local GRID storage.

Example:

To copy files to/from the storage, use the srm: protocol and the arccp tool:

  arccp srm://srm.swegrid.se/atlas/disk/atlaslocalgroupdisk/lund/data11_7TeV/NTUP_SUSY/f354_m765_p486/data11_7TeV.00178109.physics_JetTauEtmiss.merge.NTUP_SUSY.f354_m765_p486_tid292683_00/NTUP_SUSY.292683._000131.root.1 file:///tmp/NTUP_SUSY.292683._000131.root.1
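
Because SRM URLs are long, it can help to build them from shorter pieces. The sketch below reuses the URL from the example above, split into variables purely for readability (the variable names are made up for this illustration). It only prints the arcls and arccp commands; remove the leading echo to run them. Whether you may write to this storage location is an assumption you need to check for your own account.

```shell
#!/bin/sh
# Pieces of the URL from the example above; the variable names are illustrative.
srm_base="srm://srm.swegrid.se/atlas/disk/atlaslocalgroupdisk/lund"
dataset_dir="data11_7TeV/NTUP_SUSY/f354_m765_p486/data11_7TeV.00178109.physics_JetTauEtmiss.merge.NTUP_SUSY.f354_m765_p486_tid292683_00"
file_name="NTUP_SUSY.292683._000131.root.1"

remote_url="${srm_base}/${dataset_dir}/${file_name}"
local_url="file:///tmp/${file_name}"

# Printed rather than executed; drop the leading echo to run them.
echo arcls "${srm_base}/${dataset_dir}/"   # list the files in the dataset directory
echo arccp "$remote_url" "$local_url"      # storage -> local disk
echo arccp "$local_url" "$remote_url"      # local disk -> storage (assumes write access)
```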

Solution 5: Rucio or dq2 tools

If you have an ATLAS dataset, it is best to transfer it to the local Lund GRID storage first, and then to the cluster directly if needed. To do that, you need to submit a DaTRi request.

This page contains all you need to know on how to use the local storage: http://www.hep.lu.se/grid/localgroupdisk.html

To move the dataset from any ATLAS grid storage to Iridium, we recommend Rucio, the successor of DQ2.

To enable the Rucio tools, you'll need to:

  1. copy and configure your GRID certificate on Iridium.
  2. run setupATLAS
  3. run localSetupRucioClients
  4. log in to the GRID using arcproxy -S atlas or voms-proxy-init, as one would do on lxplus.cern.ch.
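
Before running steps 2-4, it can be useful to confirm that the certificate from step 1 is in place. The sketch below assumes the conventional grid layout of ~/.globus/usercert.pem and ~/.globus/userkey.pem; this page does not state the location, so treat that path as an assumption. check_grid_cert is a hypothetical helper name, not part of any grid toolkit.

```shell
#!/bin/sh
# Hypothetical helper: check that a certificate and key are readable in a
# directory. Defaults to ~/.globus, which is an assumed, conventional location.
check_grid_cert() {
  dir=${1:-$HOME/.globus}
  for f in usercert.pem userkey.pem; do
    if [ ! -r "$dir/$f" ]; then
      echo "missing: $dir/$f"
      return 1
    fi
  done
  echo "certificate files found in $dir"
}

# Only a reminder of the next steps from the list above; nothing is executed.
if check_grid_cert; then
  echo "next: setupATLAS, localSetupRucioClients, arcproxy -S atlas"
fi
```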

The official Rucio documentation is here: http://rucio.cern.ch/cli_examples.html

If you still want to use dq2 tools, here's how:

To enable dq2 tools, you'll need to:

  1. copy and configure your GRID certificate on Iridium.
  2. run setupATLAS
  3. run localSetupDQ2Client
  4. log in to the GRID using arcproxy or voms-proxy-init, as one would do on lxplus.cern.ch.

Information about dq2 is available on the CERN TWiki (only visible if you have a CERN account): https://twiki.cern.ch/twiki/bin/view/AtlasComputing/DQ2ClientsHowTo