RAC transition FAQ: Difference between revisions

From Alliance Doc
 
(13 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{Draft}}
<languages />


Allocations from the 2019 Resource Allocation Competition come into effect on 2019 April 4. 
<translate>
Here are some notes on how we expect the transition from 2018 to 2019 allocations to go.


=== Storage ===
<!--T:1-->
* There will be 30 days of overlap between 2018 and 2019 storage allocations, starting on 2019 April 4.
Allocations from the 2021 Resource Allocation Competition come into effect on April 1, 2021. 
* On a given system the largest of the two quotas (2018, 2019) will be adopted during the transition period.
Here are some notes on how we expect the transition to go.
* If an allocation has moved from one site to another, users are expected to transfer the data by themselves (via globus, scp, rsync, ''etc.''; see [[Transferring data]]). For large amounts of data (''e.g.'', 200TB or more) please [[Technical support|contact support]] for advice or assistance to manage the transfer.
 
* Groups with an allocation that has been moved to [[Béluga]] are encouraged to start migrating their data '''now.'''  Béluga storage is already accessible via Globus
=== Storage === <!--T:2-->
* Contributed storage systems have different dates of activation and decommissioning. For these we'll be doing the SUM(2018, 2019) for quotas during the 30 days transition period.
* There will be 30 days of overlap between 2020 and 2021 storage allocations, starting on April 1, 2021.
* For every other PI we use default quotas.
* On a given system, the larger of the two quotas (2020, 2021) will be adopted during the transition period.
* If an allocation has moved from one site to another, users are expected to transfer the data by themselves (via Globus, scp, rsync, ''etc.''; see [[Transferring data]]). For large amounts of data (''e.g.'', 200 TB or more) please [[Technical support|contact support]] for advice or assistance on managing the transfer.
* Contributed storage systems have different dates of activation and decommissioning. For these, we'll be doing the SUM(2020, 2021) for quotas during the 30-day transition period.
* For every other PI, we will use default quotas.
* After the transition period, the quotas on the original sites from which data has been migrated will also be set to default. Users are expected to delete data from those original sites if the usage levels are above the new (default) quota. If usage remains above the new quota after the overlap period, staff may choose to delete everything.
* After the transition period, the quotas on the original sites from which data has been migrated will also be set to default. Users are expected to delete data from those original sites if the usage levels are above the new (default) quota. If usage remains above the new quota after the overlap period, staff may choose to delete everything.
* Reasonable requests for extension of the overlap period will be honored, but such an extension may be impossible or severely constrained if the original cluster is being defunded.
* Reasonable requests for extension of the overlap period will be honoured, but such an extension may be impossible or severely constrained if the original cluster is being defunded.


=== Job scheduling ===
=== Job scheduling === <!--T:3-->
* The scheduler team is planning to archive and compact the Slurm database on April 3 before implementing the new allocations on April 4. We hope to schedule the archive and compaction during off-peak hours. During this time the database may be unresponsive. Specifically, <tt>sacct</tt> and <tt>sacctmgr</tt> may be unresponsive.
* The scheduler team is planning to archive and compact the Slurm database on March 31 before implementing the new allocations on April 1. We hope to schedule the archiving and compaction during off-peak hours. During this time the database may be unresponsive; in particular, <tt>sacct</tt> and <tt>sacctmgr</tt> may not respond.
* We expect to begin replacing 2018 allocations with 2019 allocations on April 4.  
* We expect to begin replacing 2020 allocations with 2021 allocations on April 1.  
* Job priority may be inconsistent during the allocation cutover.  Specifically, default allocations may face decreased priority.
* Job priority may be inconsistent during the allocation cutover.  Specifically, default allocations may face decreased priority.
* Jobs already in the system will be retained.  Running jobs will not be stopped.  Waiting jobs may be held.  
* Jobs already in the system will be retained.  Running jobs will not be stopped.  Waiting jobs may be held.  
* Waiting jobs attributed to an allocation which has been moved or not renewed may not schedule after the cutover.  Advice on how to detect and handle such jobs will be forthcoming.
* Waiting jobs attributed to an allocation which has been moved or not renewed may not schedule after the cutover.  Advice on how to detect and handle such jobs will be forthcoming.
</translate>

Latest revision as of 17:48, 11 March 2021


Allocations from the 2021 Resource Allocation Competition come into effect on April 1, 2021. Here are some notes on how we expect the transition to go.

Storage

  • There will be 30 days of overlap between 2020 and 2021 storage allocations, starting on April 1, 2021.
  • On a given system, the larger of the two quotas (2020, 2021) will be adopted during the transition period.
  • If an allocation has moved from one site to another, users are expected to transfer the data by themselves (via Globus, scp, rsync, etc.; see Transferring data). For large amounts of data (e.g., 200 TB or more) please contact support for advice or assistance on managing the transfer.
  • Contributed storage systems have different dates of activation and decommissioning. For these, the quota during the 30-day transition period will be the sum of the 2020 and 2021 quotas.
  • For every other PI, we will use default quotas.
  • After the transition period, the quotas on the original sites from which data has been migrated will also be set to default. Users are expected to delete data from those original sites if the usage levels are above the new (default) quota. If usage remains above the new quota after the overlap period, staff may choose to delete everything.
  • Reasonable requests for extension of the overlap period will be honoured, but such an extension may be impossible or severely constrained if the original cluster is being defunded.
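
The quota rules above can be sketched as a small function. This is only an illustration of how we read the bullets, not the tooling staff actually use: the function name, the parameters, the use of terabytes as the unit, and the treatment of a missing allocation as zero are all assumptions.

```python
from typing import Optional


def transition_quota(
    quota_2020: Optional[float],
    quota_2021: Optional[float],
    default_quota: float,
    contributed: bool = False,
) -> float:
    """Return the storage quota (assumed to be in TB) during the overlap period.

    Hypothetical helper: None means the PI has no allocation for that year
    on this system.
    """
    # PIs with no allocation in either year get the default quota.
    if quota_2020 is None and quota_2021 is None:
        return default_quota

    # Assumption: a missing allocation for one year counts as zero.
    old = quota_2020 if quota_2020 is not None else 0.0
    new = quota_2021 if quota_2021 is not None else 0.0

    # Contributed systems: the two quotas are added together.
    if contributed:
        return old + new

    # All other systems: the larger of the two quotas applies.
    return max(old, new)
```

For example, under these assumptions a group with 100 TB in 2020 and 40 TB in 2021 would keep 100 TB on a regular system during the overlap, and would have 140 TB on a contributed system.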

Job scheduling

  • The scheduler team is planning to archive and compact the Slurm database on March 31 before implementing the new allocations on April 1. We hope to schedule the archiving and compaction during off-peak hours. During this time the database may be unresponsive; in particular, sacct and sacctmgr may not respond.
  • We expect to begin replacing 2020 allocations with 2021 allocations on April 1.
  • Job priority may be inconsistent during the allocation cutover. Specifically, default allocations may face decreased priority.
  • Jobs already in the system will be retained. Running jobs will not be stopped. Waiting jobs may be held.
  • Waiting jobs attributed to an allocation that has been moved or not renewed may not be scheduled after the cutover. Advice on how to detect and handle such jobs will be forthcoming.
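
Until that advice is published, one possible way to spot affected jobs is to list your pending jobs together with the account each one is charged to. The sketch below is not the official procedure. It assumes that `squeue` is available on the cluster, and the helper names and the example account names are hypothetical.

```python
import subprocess
from typing import List, Set, Tuple


def parse_pending(squeue_output: str) -> List[Tuple[str, str]]:
    """Parse 'jobid account' lines, as printed by squeue --format '%i %a'."""
    jobs = []
    for line in squeue_output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            jobs.append((parts[0], parts[1]))
    return jobs


def jobs_on_accounts(jobs: List[Tuple[str, str]], accounts: Set[str]) -> List[str]:
    """Return the IDs of jobs charged to any of the given accounts."""
    return [job_id for job_id, account in jobs if account in accounts]


def query_pending(user: str) -> List[Tuple[str, str]]:
    """Ask Slurm for the user's pending jobs (requires squeue on PATH)."""
    result = subprocess.run(
        ["squeue", "--noheader", "--user", user,
         "--states", "PENDING", "--format", "%i %a"],
        capture_output=True, text=True, check=True,
    )
    return parse_pending(result.stdout)


# Example with made-up output; 'rrg-example' stands in for an allocation
# that was moved or not renewed.
sample = "1001 rrg-example\n1002 def-example\n1003 rrg-example\n"
affected = jobs_on_accounts(parse_pending(sample), {"rrg-example"})
```

Jobs identified this way could then be cancelled and resubmitted under an account that is still active on the cluster.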