aurora_cluster:moving_data

Differences

This shows you the differences between two versions of the page.

aurora_cluster:moving_data [2017/07/31 19:38]
florido [Uploading/Downloading data to/from an external source to Aurora]
aurora_cluster:moving_data [2017/07/31 20:12] (current)
florido [Downloading/Uploading data to/from the GRID from Aurora]
Line 10: Line 10:
   * Don't remove other users' files unless you advise them and they're ok with it. This is a shared facility.
   * Don't expect the contents of any ''scratch'' folder to always be there. At the moment, however, there is no deletion policy for them.
-===== Moving data for users of Mathematical Physics and generic Lunarc users ===== 
  
-Users of **Mathematical Physics**, as well as any other Lunarc user, can use their favorite tool to download and upload either from your own workstationthe Aurora front-end or the Aurora computing nodes.
+===== Moving small files for users of Mathematical Physics and all Lunarc users =====
+
+Small files (below 10GB in size) can be moved using Aurora's frontend.
+
+Users of **Mathematical Physics**, as well as any other Lunarc user, can use their favorite tool to move data between their own workstation or laptop and the Aurora front-end or the Aurora computing nodes.
 You can read about some of those tools on the [[iridium_cluster:data| Move data to and from the Iridium Cluster]] pages.
-===== Moving data for users of Nuclear, Theoretical and Particle Physics ===== 
  
-Users of these division can access the special node //fs2-hep// to be used for downloads or uploads.
+For example, from your laptop: <code bash>scp myfile aurora.lunarc.lu.se:/projects/hep/fs2/shared/np/myfolder/myfile</code>
  
-These users (in particular Particle and Theorerical Physics) might need to download huge amount of data and therefore it was our objective to offload the Lunarc internal network and the usage of computing nodes as mere downloader nodes.
+===== Moving big data for users of Nuclear, Theoretical and Particle Physics =====
  
-//fs2-hep// has direct very fast connection to the internet for downloads and uploads.
+For big files (hundreds of gigabytes and up) you should use ''fs2-hep'' as described below. Aurora is not a storage facility and is therefore not meant to be accessed by external sources for data movement. Moving big data via the Aurora frontend is extremely slow and will slow down your colleagues' work. Also, the Aurora frontend managers might interrupt your transfers if they see that they are taking too much time.
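The size guidance above can be sketched as a small shell helper. This is only an illustration: the function name ''suggest_route'' is hypothetical, the 10GB limit is read as 10 * 1000^3 bytes, and sending everything at or above that limit to ''fs2-hep'' is an assumption, since the page only names "below 10GB" and "hundreds of gigabytes and up".

```shell
# Hypothetical helper: suggest a transfer route from a file size in bytes.
# Assumption: "10GB" is read as 10 * 1000^3 bytes, and anything at or
# above that limit is routed to fs2-hep.
SMALL_LIMIT_BYTES=$((10 * 1000 * 1000 * 1000))

suggest_route() {
    # $1: file size in bytes
    if [ "$1" -lt "$SMALL_LIMIT_BYTES" ]; then
        echo "frontend"
    else
        echo "fs2-hep"
    fi
}
```

On a system with GNU coreutils, the size of a local file could be passed in with something like ''suggest_route "$(stat -c %s myfile)"''.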
  
-:!:**NOTE**:!: **incoming connections from the internet are rejected**. This node can download FROM and upload TO the internet but cannot be accessed directly as a server to retrieve or upload data from OUTSIDE Lunarc. In other words, **it is not possible to directly connect TO** ''fs2-hep'' from the internet via ''sftp''/''ssh''/''rsync''. You can only run those on ''fs2-hep'' itself. Read more about this in [[#Uploading/Downloading data to/from Aurora from your laptop or workstation]].
+If handling big data, I strongly recommend running an ssh/ftp server on your own laptop or workstation, or asking the sysadmin for a convenient form of online storage that can be accessed **from** Aurora as described below.
+
+//fs2-hep// has a direct, very fast 10GB connection to the internet for downloads and uploads.
+
+Users of Particle and Theoretical Physics might need to download huge amounts of data, so our goal was to offload the Lunarc internal network and to avoid using computing nodes as mere downloader nodes.
+
+:!:**NOTE**:!: **incoming connections from the internet to fs2-hep are rejected**. This node can download FROM and upload TO the internet but cannot be accessed directly as a server to retrieve or upload data from OUTSIDE Lunarc. In other words, **it is not possible to directly connect TO** ''fs2-hep'' from the internet via ''sftp''/''ssh''/''rsync''.
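Because connections can only be opened from ''fs2-hep'' outwards, a transfer has to be started on ''fs2-hep'' and pull from (or push to) the external machine. The sketch below only prints an ''rsync'' command for such a pull instead of running it. The function name ''build_pull_command'' and the host ''mylaptop.example.org'' are placeholders, and it assumes ''rsync'' is installed on ''fs2-hep'' and that your external machine accepts ssh logins.

```shell
# Hypothetical helper: print (do not run) an rsync command that pulls data
# from an external host into a local folder. Intended to be used on fs2-hep,
# since fs2-hep cannot be reached from outside.
# Assumption: neither argument contains spaces.
build_pull_command() {
    # $1: remote source, e.g. user@mylaptop.example.org:/data/run1/ (placeholder)
    # $2: local destination, e.g. a folder under /projects/hep/fs2
    printf 'rsync -av --partial --progress %s %s\n' "$1" "$2"
}

# Example: show the command for pulling a folder from a placeholder host.
build_pull_command "user@mylaptop.example.org:/data/run1/" "/projects/hep/fs2/shared/"
```

The ''--partial'' flag keeps partially transferred files, which helps when a large download is interrupted and has to be restarted.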
  
 An overview of the upload/download components is shown in the slide below:
Line 38: Line 46:
 The picture below shows the various steps. {{ :aurora_cluster:datamovementflowchartbiggfx.png?600 |}}
  
-==== Uploading/Downloading data to/from an external source from Aurora ====
+==== A. Uploading/Downloading data to/from an external source from Aurora ====
-  - Use your favourite download software. Some suggestions are available at [[iridium_cluster:data|Moving data to and from Iridium]]
+  - Use your favourite download software. Some suggestions are available at [[iridium_cluster:data|Moving data to and from Iridium]]. If the tool is not available, please ask Florido to install it on ''fs2-hep''.
   - Use your home folder or one of the ''/projects/hep'' folders as a destination folder. Any other path is not writable by your user. The ''/tmp'' folder is emptied regularly, so you should not use it. ''/projects/hep/fs2'' is accessible by everyone, while ''/projects/hep/fs3'' and ''/projects/hep/fs4'' are dedicated storage for the ATLAS project.
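The destination rule in the list above can be checked before starting a long transfer. The following is a minimal sketch: the function name ''is_recommended_destination'' is hypothetical, the check is purely textual (it does not test real permissions or resolve ''..''), and it does not know whether you belong to the ATLAS project for ''fs3'' and ''fs4''.

```shell
# Hypothetical helper: succeed (exit status 0) if a path is the home folder,
# inside it, or inside one of the /projects/hep folders; fail otherwise.
# Purely textual: no permission check, no path normalisation.
is_recommended_destination() {
    case "$1" in
        /projects/hep/*) return 0 ;;
    esac
    if [ -n "${HOME:-}" ]; then
        case "$1" in
            "$HOME"|"$HOME"/*) return 0 ;;
        esac
    fi
    return 1
}

# Example: warn before using /tmp as a destination.
if ! is_recommended_destination "/tmp/mydownload"; then
    echo "not a recommended destination: /tmp/mydownload"
fi
```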
  
-==== Uploading/Downloading data to/from Aurora from your laptop or workstation ====
+==== B. Downloading/Uploading data to/from the GRID from Aurora ====
-
-You should avoid doing this. Aurora is not a storage facility, therefore is not meant to be accessed by external sources to do data movement. It is possible to do that through Aurora frontend but this is extremely slow and will slow down your colleagues work. Also, Aurora frontend managers might interrupt your transfers if they see it is taking too much time. I strongly recommend to follow the instructions at [[#Uploading/Downloading data to/from an external source to Aurora]] above instead, and eventually run an ssh/ftp server on your own laptop or workstation, or ask the sysadmin for a convenient form of online storage.
-
-For resources that can be stored on the GRID, you should definitely stage them on the Lund GRID storage instead, a few ways described under [[#Downloading/Uploading data to/from the GRID to Aurora]], so that you can access them from all over the world in the fastest way possible.
 
-==== Downloading/Uploading data to/from the GRID to Aurora ====
+For resources that can be stored on the GRID, you should definitely stage them on the Lund GRID storage instead, using one of the ways described on the page linked below, so that you can access them from all over the world in the fastest way possible.
  
 Please read the dedicated page [[aurora_cluster:moving_data:grid|Moving data between GRID and Aurora]]
aurora_cluster/moving_data.1501522689.txt.gz · Last modified: 2017/07/31 19:38 by florido