In this section I try to explain how your job requests are treated on the ''hep'' partition.

===== Users within the same project are as one single user =====

Inside a project, say HEP 2016/1-4, it doesn't matter who you are: if you are a member of the project, your requests will be processed FIFO, first in, first out. The first to submit will be the first to have their jobs processed.

As long as the SLURM scheduler can find resources to match the ones requested by a job, the job will be started soon after being submitted.

But if there are no resources available, the job will have to wait in line, and any job submitted earlier for the same project will have higher priority due to its longer waiting time.
There are some modifications of priority based on job size, but **there is no priority or fair share among members of the same project**.

It's as if all members were a single person submitting jobs, with many names.

The exception to this rule is that there are limits to how many jobs a single user can submit and how many resources a single user can utilise simultaneously. Thus, if a user has reached any of these limits, other members of the same project are still able to submit and run jobs within the bounds of similar, but higher, limits for the project as a whole.
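The following is a minimal sketch of this behaviour (my own illustration in Python, not Lunarc's actual scheduler code; user names, times and core counts are invented): within one project, pending jobs are effectively ordered by submission time alone, regardless of which member submitted them.

<code python>
from dataclasses import dataclass

@dataclass
class Job:
    user: str         # which project member submitted the job
    submit_time: int  # e.g. minutes since the start of the day
    cores: int

# Hypothetical pending jobs from three members of the same project.
queue = [
    Job(user="alice", submit_time=100, cores=4),
    Job(user="bob",   submit_time=90,  cores=16),
    Job(user="carol", submit_time=110, cores=1),
]

# FIFO within the project: only the submission time matters,
# the submitting user is irrelevant.
for job in sorted(queue, key=lambda j: j.submit_time):
    print(f"{job.user}: {job.cores} cores, submitted at t={job.submit_time}")

# bob's job is first because it was submitted first, even though
# alice and carol are different users with their own jobs pending.
</code>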
===== Considerations on interactive sessions =====

If the cluster is busy, requesting an interactive session may take time and fail. The scheduler will happily schedule resources for a user, but if the user asks for an interactive session with, say, 6 cores and there is no machine with 6 free cores, the scheduler cannot fulfil the request at that moment.

The scheduler treats an interactive job the same as a batch job, queueing it with the FIFO strategy described above. However, an exception to the FIFO scheduling appears when a parallel job is waiting for other jobs to finish and release resources: a short job can then be promoted ahead of the queue if it fits into an empty slot that is reserved for later use by the parallel job, so-called //backfill//.

Thus, interactive jobs queued with shorter wall times have a higher probability of starting earlier.
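The idea behind backfill can be illustrated with a toy example (my own sketch, not the actual SLURM backfill algorithm; all numbers are invented): a waiting parallel job has a reservation in the near future, and a smaller job may jump ahead only if it ends before that reservation starts and fits in the currently idle cores.

<code python>
# Toy backfill check. All numbers are invented.
free_cores = 6               # cores idle right now
reservation_starts_in = 2.0  # hours until the waiting parallel job can start

def can_backfill(requested_cores, requested_walltime_h):
    """A job may be promoted ahead of the queue only if it uses no more
    than the currently idle cores and finishes before the reservation
    made for the waiting parallel job begins."""
    return (requested_cores <= free_cores
            and requested_walltime_h <= reservation_starts_in)

print(can_backfill(4, 1.5))  # True: fits in the idle cores and ends in time
print(can_backfill(4, 8.0))  # False: would still be running when the
                             #        parallel job's reservation begins
</code>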
===== Fairness among projects running on the hep partition =====

Fairness is maintained among the three projects using the hep partition (HEP 2016/1-3, HEP 2016/1-4, HEP 2016/1-5) by allocating each of them 1/3 of the computing power (calculated as core hours per month). This is the basis for fair share: the priority of jobs from one project is calculated with respect to how much of the target 1/3 has been used by the project over the last 30 days. A project that has used a large portion of the allocated time (or more than the allocation) will have a lower priority than a project that has used a small part.

:!: If more memory per core is used than the total memory of the node divided by the number of cores of the node, this is **equivalent to using more cores** in the calculation of usage.
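As a rough, worked example of that memory rule (the numbers are hypothetical; check the actual node specifications before relying on them): on a node with 64 GB of memory and 20 cores, each core corresponds to 3.2 GB, so a single-core job requesting 12.8 GB is charged as if it used 4 cores.

<code python>
# Hypothetical node; the numbers are examples, not actual Aurora hardware specs.
node_cores = 20
node_memory_gb = 64.0
memory_per_core_gb = node_memory_gb / node_cores  # 3.2 GB per core

def equivalent_cores(requested_cores, requested_memory_gb):
    """Charge a job for whichever is larger: the cores it asks for, or the
    number of cores whose memory share it effectively occupies."""
    cores_worth_of_memory = requested_memory_gb / memory_per_core_gb
    return max(requested_cores, cores_worth_of_memory)

print(equivalent_cores(1, 12.8))  # 4.0 -> a 1-core, 12.8 GB job counts as 4 cores
print(equivalent_cores(8, 12.8))  # 8   -> here the core request dominates
</code>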
===== Suggestions to self-regulate the usage inside a project =====

  * The project members should interact on a regular basis to understand what their expected computing needs are;
  * Those negotiated needs should be translated into **expected resource requests**;
  * These requests should be documented somewhere FIXME on the cluster, for example in a file (a possible sketch is shown below);
  * All users should honour the expected resource requests from the above file when submitting;
  * In order to preserve the possibility to use all of the nodes when needed, these requests should be flexible enough to be changed on the fly according to the needs of the members of the project.
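Purely as an illustration of what such a shared agreement could look like (the format, names and numbers below are invented, not an existing Lunarc convention), a small file or script kept on the cluster could record the agreed limits and let members sanity-check a submission against them:

<code python>
# Invented example of "expected resource requests" agreed inside a project.
# Neither the format nor the numbers are prescribed by Lunarc.
expected_requests = {
    # user: (max concurrent cores, max wall time in hours)
    "alice": (40, 48),
    "bob":   (20, 12),
    "carol": (60, 6),
}

def within_agreement(user, cores, walltime_h):
    """Check a planned submission against what the project agreed on."""
    max_cores, max_walltime_h = expected_requests[user]
    return cores <= max_cores and walltime_h <= max_walltime_h

print(within_agreement("bob", 16, 10))  # True: inside the agreed request
print(within_agreement("bob", 40, 10))  # False: time to renegotiate with
                                        #        the other project members
</code>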

===== FAQ =====

//These answers are true for both the HEP and SNIC partitions; it is the overall Lunarc policy. The difference is that on the HEP nodes you only compete with users from Particle, Nuclear and Theoretical physics instead of with the whole Lunarc user base on SNIC.//

1) If I have an allocation and I don't run for months, will I get a higher priority when I then run?
> No, unused computing cycles within a billing month are lost. Time not spent is gone and does not add to a "favour bank". However, there is a sliding 30-day window, which means that you have zero usage when you resume after a long silence and will get a higher priority than those that have been running within the 30-day window. How much this priority is reduced once you are running again depends on the allocation.
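A minimal sketch of what the sliding 30-day window means in practice (my own simplification in Python, not the scheduler's actual accounting; the job history is invented): only the core hours consumed within the last 30 days count, so after a long pause the counted usage drops back to zero.

<code python>
WINDOW_DAYS = 30

# (days since the job ran, core hours it consumed) -- invented history
job_history = [(45, 5000), (40, 3000), (10, 200), (2, 800)]

def windowed_usage(history, window_days=WINDOW_DAYS):
    """Sum only the core hours consumed within the sliding window."""
    return sum(core_hours for days_ago, core_hours in history
               if days_ago <= window_days)

print(windowed_usage(job_history))  # 1000: the heavy running 40+ days ago
                                    # no longer counts against you
</code>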

2) If I submit a job and no one else is running, can I get more resources than the guaranteed/allocated ones?
> Yes, you can use more time than allocated. There are several soft limits and a hard one with respect to aggregated usage (30 days), but the hard one is four times the allocation for small and medium allocations. There are also restrictions triggered by concurrent usage of cores, but they are all soft. There are soft limits that set the allowed number of cores to a low number, such as 20 per user and 60 per project, which become hard if a job is too wide.

3) If I submit a job at the end of the billing month and it ends up in the queue, will I get a higher priority at the beginning of the next month?
> Not really. The 30-day window is sliding, which means that it is not tied to a calendar month. However, if you were running intensely 30 days ago, waiting in the queue for a few days will erase the memory of that, and if you are close to a limit, the priority can jump. The only time you can get a clean slate despite having used the system recently is when one project ends and a new one starts, because usage is not carried over when a project is renewed and gets a different project code, which typically happens once a year.